Database Indexing Best Practices for MySQL
Database Indexing Best Practices for MySQL
Proper indexing can make the difference between a 2-second query and a 0.02-second query. Let’s dive into MySQL indexing strategies.
Understanding Indexes
An index is like a book’s table of contents - it helps MySQL find data without scanning every row.
How Indexes Work
-- Without index: Full table scan
SELECT * FROM users WHERE email = '[email protected]';
-- Scans all 1,000,000 rows ��
-- With index: Direct lookup
CREATE INDEX idx_email ON users(email);
-- Finds row in ~10 lookups ✅
Index Types
1. Primary Key Index
Automatically created, enforces uniqueness:
CREATE TABLE users (
id INT PRIMARY KEY AUTO_INCREMENT,
email VARCHAR(255),
name VARCHAR(100)
);
2. Unique Index
Enforces uniqueness:
CREATE UNIQUE INDEX idx_email ON users(email);
3. Regular Index
Most common type:
CREATE INDEX idx_name ON users(name);
4. Composite Index
Multiple columns:
CREATE INDEX idx_name_email ON users(name, email);
5. Full-Text Index
For text search:
CREATE FULLTEXT INDEX idx_content ON posts(title, body);
When to Create Indexes
✅ Create indexes for:
- WHERE clauses:
-- Index on status
SELECT * FROM orders WHERE status = 'pending';
- JOIN conditions:
-- Index on user_id
SELECT * FROM posts JOIN users ON posts.user_id = users.id;
- ORDER BY clauses:
-- Index on created_at
SELECT * FROM posts ORDER BY created_at DESC;
- GROUP BY clauses:
-- Index on category_id
SELECT category_id, COUNT(*) FROM products GROUP BY category_id;
Composite Index Strategy
Column Order Matters!
Rule: Most selective columns first
-- Good: status has few values, user_id is more selective
CREATE INDEX idx_user_status ON orders(user_id, status);
-- Bad: status first reduces effectiveness
CREATE INDEX idx_status_user ON orders(status, user_id);
Leftmost Prefix Rule
Composite index (a, b, c) can be used for:
- ✅
WHERE a = 1 - ✅
WHERE a = 1 AND b = 2 - ✅
WHERE a = 1 AND b = 2 AND c = 3 - ❌
WHERE b = 2(doesn’t use index!) - ❌
WHERE c = 3(doesn’t use index!)
CREATE INDEX idx_composite ON orders(user_id, status, created_at);
-- Uses index
EXPLAIN SELECT * FROM orders WHERE user_id = 123;
EXPLAIN SELECT * FROM orders WHERE user_id = 123 AND status = 'pending';
-- Doesn't use index
EXPLAIN SELECT * FROM orders WHERE status = 'pending';
Covering Indexes
An index that contains all columns needed by a query.
-- Query needs: user_id, status, total
SELECT user_id, status, total
FROM orders
WHERE user_id = 123 AND status = 'pending';
-- Covering index: includes all queried columns
CREATE INDEX idx_covering ON orders(user_id, status, total);
Benefit: MySQL doesn’t need to access the table, only the index!
Analyzing Query Performance
EXPLAIN Command
EXPLAIN SELECT * FROM users WHERE email = '[email protected]';
Key columns to watch:
| Column | What to Look For |
|---|---|
| type | ALL = bad (full scan), ref/const = good |
| possible_keys | Shows available indexes |
| key | Which index MySQL actually uses |
| rows | Fewer is better |
| Extra | ”Using filesort” or “Using temporary” = bad |
Example Analysis
-- Bad: Full table scan
EXPLAIN SELECT * FROM orders WHERE status = 'pending';
-- type: ALL, rows: 1000000
-- Add index
CREATE INDEX idx_status ON orders(status);
-- Good: Index scan
EXPLAIN SELECT * FROM orders WHERE status = 'pending';
-- type: ref, rows: 50000, key: idx_status
Real-World Performance Example
Before Indexing
SELECT posts.*, users.name
FROM posts
JOIN users ON posts.user_id = users.id
WHERE posts.status = 'published'
AND posts.created_at > '2024-01-01'
ORDER BY posts.created_at DESC
LIMIT 20;
-- Execution time: 2.3 seconds
-- Full table scan on posts (5 million rows)
After Indexing
-- Create composite index
CREATE INDEX idx_status_date ON posts(status, created_at);
-- Ensure foreign key is indexed
CREATE INDEX idx_user_id ON posts(user_id);
-- Same query now:
-- Execution time: 0.018 seconds (127x faster!)
-- Uses idx_status_date, scans only ~1000 rows
Common Mistakes
❌ Too Many Indexes
-- Bad: Redundant indexes
CREATE INDEX idx_email ON users(email);
CREATE INDEX idx_email_name ON users(email, name);
-- The second index can handle email-only queries too!
Impact: Slower INSERT/UPDATE/DELETE operations
❌ Indexing Low-Selectivity Columns
-- Bad: gender typically has only 2-3 values
CREATE INDEX idx_gender ON users(gender);
Rule: Only index columns with high cardinality (many unique values)
❌ Not Using LIMIT
-- Bad: Returns millions of rows
SELECT * FROM posts ORDER BY created_at DESC;
-- Good: Use LIMIT
SELECT * FROM posts ORDER BY created_at DESC LIMIT 20;
❌ Using Functions on Indexed Columns
-- Bad: Function on left side prevents index usage
SELECT * FROM users WHERE YEAR(created_at) = 2024;
-- Good: Rewrite to use index
SELECT * FROM users
WHERE created_at >= '2024-01-01'
AND created_at < '2025-01-01';
Index Maintenance
Check Index Usage
-- MySQL 8.0+
SELECT * FROM sys.schema_unused_indexes;
Remove Unused Indexes
-- Drop index if not used
DROP INDEX idx_rarely_used ON users;
Rebuild Indexes
-- Rebuild table and indexes
OPTIMIZE TABLE users;
Advanced Techniques
Index Hints
Force MySQL to use specific index:
SELECT * FROM users USE INDEX (idx_email)
WHERE email = '[email protected]';
Invisible Indexes (MySQL 8.0+)
Test removing an index before actually dropping it:
ALTER TABLE users ALTER INDEX idx_name INVISIBLE;
-- Test queries...
-- If performance is fine, drop it
DROP INDEX idx_name ON users;
Prefix Indexes
For long VARCHAR columns:
-- Index only first 10 characters
CREATE INDEX idx_url ON pages(url(10));
Best Practices Checklist
- ✅ Index foreign keys
- ✅ Index WHERE clause columns
- ✅ Index JOIN columns
- ✅ Use composite indexes wisely (most selective first)
- ✅ Consider covering indexes for frequent queries
- ✅ Monitor index usage with EXPLAIN
- ✅ Remove unused indexes
- ❌ Don’t over-index (slows writes)
- ❌ Don’t index low-cardinality columns
- ❌ Don’t use functions on indexed columns
Performance Benchmarks
Real results from production database (5 million rows):
| Query Type | Without Index | With Index | Improvement |
|---|---|---|---|
| WHERE email | 2.1s | 0.003s | 700x faster |
| JOIN users | 3.5s | 0.012s | 291x faster |
| ORDER BY date | 4.2s | 0.025s | 168x faster |
| Complex WHERE | 5.8s | 0.045s | 128x faster |
Conclusion
Proper indexing is the single most effective way to improve database performance:
- Analyze slow queries with EXPLAIN
- Create targeted indexes for WHERE, JOIN, ORDER BY
- Use composite indexes with correct column order
- Remove unused indexes to speed up writes
- Monitor and maintain indexes regularly
With these strategies, you can achieve 10x-100x performance improvements on slow queries!