Back to Articles
November 18, 2024 11 min

Database Indexing Best Practices for MySQL

#Database
#MySQL
#Tutorial

Database Indexing Best Practices for MySQL

Proper indexing can make the difference between a 2-second query and a 0.02-second query. Let’s dive into MySQL indexing strategies.

Understanding Indexes

An index is like a book’s table of contents - it helps MySQL find data without scanning every row.

How Indexes Work

-- Without index: Full table scan
SELECT * FROM users WHERE email = '[email protected]';
-- Scans all 1,000,000 rows ��

-- With index: Direct lookup
CREATE INDEX idx_email ON users(email);
-- Finds row in ~10 lookups ✅

Index Types

1. Primary Key Index

Automatically created, enforces uniqueness:

CREATE TABLE users (
  id INT PRIMARY KEY AUTO_INCREMENT,
  email VARCHAR(255),
  name VARCHAR(100)
);

2. Unique Index

Enforces uniqueness:

CREATE UNIQUE INDEX idx_email ON users(email);

3. Regular Index

Most common type:

CREATE INDEX idx_name ON users(name);

4. Composite Index

Multiple columns:

CREATE INDEX idx_name_email ON users(name, email);

5. Full-Text Index

For text search:

CREATE FULLTEXT INDEX idx_content ON posts(title, body);

When to Create Indexes

✅ Create indexes for:

  1. WHERE clauses:
-- Index on status
SELECT * FROM orders WHERE status = 'pending';
  1. JOIN conditions:
-- Index on user_id
SELECT * FROM posts JOIN users ON posts.user_id = users.id;
  1. ORDER BY clauses:
-- Index on created_at
SELECT * FROM posts ORDER BY created_at DESC;
  1. GROUP BY clauses:
-- Index on category_id
SELECT category_id, COUNT(*) FROM products GROUP BY category_id;

Composite Index Strategy

Column Order Matters!

Rule: Most selective columns first

-- Good: status has few values, user_id is more selective
CREATE INDEX idx_user_status ON orders(user_id, status);

-- Bad: status first reduces effectiveness
CREATE INDEX idx_status_user ON orders(status, user_id);

Leftmost Prefix Rule

Composite index (a, b, c) can be used for:

  • WHERE a = 1
  • WHERE a = 1 AND b = 2
  • WHERE a = 1 AND b = 2 AND c = 3
  • WHERE b = 2 (doesn’t use index!)
  • WHERE c = 3 (doesn’t use index!)
CREATE INDEX idx_composite ON orders(user_id, status, created_at);

-- Uses index
EXPLAIN SELECT * FROM orders WHERE user_id = 123;
EXPLAIN SELECT * FROM orders WHERE user_id = 123 AND status = 'pending';

-- Doesn't use index
EXPLAIN SELECT * FROM orders WHERE status = 'pending';

Covering Indexes

An index that contains all columns needed by a query.

-- Query needs: user_id, status, total
SELECT user_id, status, total
FROM orders
WHERE user_id = 123 AND status = 'pending';

-- Covering index: includes all queried columns
CREATE INDEX idx_covering ON orders(user_id, status, total);

Benefit: MySQL doesn’t need to access the table, only the index!

Analyzing Query Performance

EXPLAIN Command

EXPLAIN SELECT * FROM users WHERE email = '[email protected]';

Key columns to watch:

ColumnWhat to Look For
typeALL = bad (full scan), ref/const = good
possible_keysShows available indexes
keyWhich index MySQL actually uses
rowsFewer is better
Extra”Using filesort” or “Using temporary” = bad

Example Analysis

-- Bad: Full table scan
EXPLAIN SELECT * FROM orders WHERE status = 'pending';
-- type: ALL, rows: 1000000

-- Add index
CREATE INDEX idx_status ON orders(status);

-- Good: Index scan
EXPLAIN SELECT * FROM orders WHERE status = 'pending';
-- type: ref, rows: 50000, key: idx_status

Real-World Performance Example

Before Indexing

SELECT posts.*, users.name
FROM posts
JOIN users ON posts.user_id = users.id
WHERE posts.status = 'published'
  AND posts.created_at > '2024-01-01'
ORDER BY posts.created_at DESC
LIMIT 20;

-- Execution time: 2.3 seconds
-- Full table scan on posts (5 million rows)

After Indexing

-- Create composite index
CREATE INDEX idx_status_date ON posts(status, created_at);

-- Ensure foreign key is indexed
CREATE INDEX idx_user_id ON posts(user_id);

-- Same query now:
-- Execution time: 0.018 seconds (127x faster!)
-- Uses idx_status_date, scans only ~1000 rows

Common Mistakes

❌ Too Many Indexes

-- Bad: Redundant indexes
CREATE INDEX idx_email ON users(email);
CREATE INDEX idx_email_name ON users(email, name);
-- The second index can handle email-only queries too!

Impact: Slower INSERT/UPDATE/DELETE operations

❌ Indexing Low-Selectivity Columns

-- Bad: gender typically has only 2-3 values
CREATE INDEX idx_gender ON users(gender);

Rule: Only index columns with high cardinality (many unique values)

❌ Not Using LIMIT

-- Bad: Returns millions of rows
SELECT * FROM posts ORDER BY created_at DESC;

-- Good: Use LIMIT
SELECT * FROM posts ORDER BY created_at DESC LIMIT 20;

❌ Using Functions on Indexed Columns

-- Bad: Function on left side prevents index usage
SELECT * FROM users WHERE YEAR(created_at) = 2024;

-- Good: Rewrite to use index
SELECT * FROM users 
WHERE created_at >= '2024-01-01' 
  AND created_at < '2025-01-01';

Index Maintenance

Check Index Usage

-- MySQL 8.0+
SELECT * FROM sys.schema_unused_indexes;

Remove Unused Indexes

-- Drop index if not used
DROP INDEX idx_rarely_used ON users;

Rebuild Indexes

-- Rebuild table and indexes
OPTIMIZE TABLE users;

Advanced Techniques

Index Hints

Force MySQL to use specific index:

SELECT * FROM users USE INDEX (idx_email)
WHERE email = '[email protected]';

Invisible Indexes (MySQL 8.0+)

Test removing an index before actually dropping it:

ALTER TABLE users ALTER INDEX idx_name INVISIBLE;

-- Test queries...

-- If performance is fine, drop it
DROP INDEX idx_name ON users;

Prefix Indexes

For long VARCHAR columns:

-- Index only first 10 characters
CREATE INDEX idx_url ON pages(url(10));

Best Practices Checklist

  • ✅ Index foreign keys
  • ✅ Index WHERE clause columns
  • ✅ Index JOIN columns
  • ✅ Use composite indexes wisely (most selective first)
  • ✅ Consider covering indexes for frequent queries
  • ✅ Monitor index usage with EXPLAIN
  • ✅ Remove unused indexes
  • ❌ Don’t over-index (slows writes)
  • ❌ Don’t index low-cardinality columns
  • ❌ Don’t use functions on indexed columns

Performance Benchmarks

Real results from production database (5 million rows):

Query TypeWithout IndexWith IndexImprovement
WHERE email2.1s0.003s700x faster
JOIN users3.5s0.012s291x faster
ORDER BY date4.2s0.025s168x faster
Complex WHERE5.8s0.045s128x faster

Conclusion

Proper indexing is the single most effective way to improve database performance:

  1. Analyze slow queries with EXPLAIN
  2. Create targeted indexes for WHERE, JOIN, ORDER BY
  3. Use composite indexes with correct column order
  4. Remove unused indexes to speed up writes
  5. Monitor and maintain indexes regularly

With these strategies, you can achieve 10x-100x performance improvements on slow queries!

Further Reading