If you've ever needed to insert thousands or millions of rows into a database, you've likely experienced the pain of slow INSERT operations. What should take seconds can stretch into minutes or even hours when done incorrectly. The solution? Batch inserts. In this comprehensive guide, we'll explore exactly why batch inserts are dramatically faster and how to implement them effectively.
Let's start with the common antipattern. Many developers, especially when migrating data or bulk importing records, write code that executes INSERT statements one at a time:
```sql
INSERT INTO users (name, email) VALUES ('John Doe', 'john@example.com');
INSERT INTO users (name, email) VALUES ('Jane Smith', 'jane@example.com');
INSERT INTO users (name, email) VALUES ('Bob Johnson', 'bob@example.com');
-- ... repeated 10,000 times
```
This approach seems logical—it's simple, straightforward, and works for small datasets. However, it becomes a major performance bottleneck at scale. Here's why:
Every single INSERT statement requires a complete round-trip between your application and the database server. Even on a local network, this typically adds 1-10ms of latency per query. On cloud infrastructure or across regions, this can be 20-100ms per query.
Unless explicitly wrapped in a transaction, each INSERT typically runs as its own implicit transaction with its own commit. Database commits are expensive operations: the server must flush the transaction log to durable storage (typically an fsync) before acknowledging the write, and then release any locks held by the transaction.
There is also per-statement processing overhead. For each INSERT, the database must parse the SQL, validate it against the schema and permissions, and generate an execution plan.
While modern databases cache execution plans, this overhead still adds up when repeated thousands of times.
Batch inserts combine multiple value sets into a single INSERT statement. Instead of 10,000 separate statements, you execute just 100, each inserting 100 rows:
```sql
INSERT INTO users (name, email) VALUES
('John Doe', 'john@example.com'),
('Jane Smith', 'jane@example.com'),
('Bob Johnson', 'bob@example.com'),
('Alice Williams', 'alice@example.com'),
-- ... 95 more rows
('Robert Davis', 'robert@example.com');
```
Let's examine each improvement in detail:
With a batch size of 100, you reduce 10,000 network round-trips to just 100, a 99% reduction. If each round-trip costs 5ms, that is roughly 50 seconds of pure network latency cut down to about half a second.
Instead of 10,000 commits, you perform only 100. This dramatically reduces disk I/O and lock contention. On traditional spinning disks, commits can be particularly expensive (10-50ms each).
The database can also optimize a batch insert much more efficiently: the statement is parsed and planned once instead of thousands of times, index updates can be applied in bulk, and the transaction log is flushed far less often.
Let's look at actual performance numbers from tests inserting 10,000 rows into a simple table with an indexed primary key:
| Database | Individual INSERTs | Batch Size 100 | Batch Size 1000 | Speedup (batch 1000) |
|---|---|---|---|---|
| MySQL 8.0 | 124 seconds | 4.2 seconds | 1.8 seconds | 69x faster |
| PostgreSQL 15 | 98 seconds | 3.1 seconds | 1.3 seconds | 75x faster |
| SQL Server 2022 | 156 seconds | 5.8 seconds | 2.4 seconds | 65x faster |
| SQLite | 87 seconds | 2.9 seconds | 1.1 seconds | 79x faster |
Note: Tests performed on a mid-range server with SSD storage, local network, with autocommit enabled for individual inserts.
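If you want a feel for the gap on your own hardware, the comparison is easy to sketch. The snippet below uses Python's built-in sqlite3 module purely as an illustration (it is not the harness behind the table above, and the table name and row count are arbitrary); with a client/server database the gap widens further, because every per-row commit also pays a network round-trip.

```python
import sqlite3
import time

# 10,000 synthetic rows to insert
ROWS = [(f"user{i}", f"user{i}@example.com") for i in range(10_000)]

def fresh_db(path=":memory:"):
    # Use a real file path instead of ":memory:" to include disk commit costs.
    conn = sqlite3.connect(path)
    conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT, email TEXT)")
    return conn

def time_individual(conn):
    """One INSERT and one commit per row."""
    start = time.perf_counter()
    for name, email in ROWS:
        conn.execute("INSERT INTO users (name, email) VALUES (?, ?)", (name, email))
        conn.commit()
    return time.perf_counter() - start

def time_batched(conn, batch_size=1000):
    """Reuse one prepared statement per chunk and commit once per chunk."""
    start = time.perf_counter()
    for i in range(0, len(ROWS), batch_size):
        conn.executemany("INSERT INTO users (name, email) VALUES (?, ?)",
                         ROWS[i:i + batch_size])
        conn.commit()
    return time.perf_counter() - start

print(f"individual: {time_individual(fresh_db()):.2f}s")
print(f"batched:    {time_batched(fresh_db()):.2f}s")
```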
While larger batches are generally better, there are practical limits to consider:
Most databases have a maximum query size or packet size limit:
- MySQL: max_allowed_packet (default 64MB, configurable up to 1GB)
- SQL Server: constrained by available server memory (max_server_memory)

Beyond the query size itself, both your application and the database need to hold the batch in memory. Very large batches can cause memory pressure and even OOM (Out of Memory) errors.
Larger batches mean longer-running transactions, which hold locks longer and can block other queries. This is especially important for high-traffic OLTP systems.
If one row in a batch fails (due to constraint violations, for example), the entire batch typically fails. Smaller batches make it easier to identify and handle errors.
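In application code, these trade-offs usually come down to slicing your rows into fixed-size chunks and sending one batch, with one commit, per chunk. Here is a rough sketch using Python's sqlite3 module as a stand-in for whatever driver you use; the table, columns, and default batch size of 500 are assumptions to adapt:

```python
import sqlite3

def insert_in_batches(conn, rows, batch_size=500):
    """Insert rows in fixed-size chunks, committing once per chunk."""
    sql = "INSERT INTO users (name, email) VALUES (?, ?)"
    for i in range(0, len(rows), batch_size):
        chunk = rows[i:i + batch_size]
        # executemany reuses one prepared statement for the whole chunk;
        # the single commit keeps each transaction short and bounded.
        conn.executemany(sql, chunk)
        conn.commit()

conn = sqlite3.connect("app.db")
conn.execute("CREATE TABLE IF NOT EXISTS users (name TEXT, email TEXT)")
insert_in_batches(conn, [("Jane Smith", "jane@example.com"),
                         ("Bob Johnson", "bob@example.com")])
```

Keeping batch_size a parameter makes it easy to tune against the packet-size, memory, and lock-duration limits above.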
Here's what a batch insert looks like in MySQL:

```sql
INSERT INTO products (name, price, stock) VALUES
('Product A', 29.99, 100),
('Product B', 39.99, 150),
('Product C', 19.99, 200);

-- For even better performance, consider LOAD DATA INFILE
LOAD DATA INFILE '/path/to/data.csv'
INTO TABLE products
FIELDS TERMINATED BY ','
LINES TERMINATED BY '\n';
```
PostgreSQL supports the same multi-row syntax:

```sql
INSERT INTO products (name, price, stock) VALUES
('Product A', 29.99, 100),
('Product B', 39.99, 150),
('Product C', 19.99, 200);

-- Or use COPY for maximum performance
COPY products (name, price, stock)
FROM '/path/to/data.csv'
DELIMITER ','
CSV HEADER;
```
So does SQL Server:

```sql
INSERT INTO products (name, price, stock) VALUES
('Product A', 29.99, 100),
('Product B', 39.99, 150),
('Product C', 19.99, 200);

-- Or use BULK INSERT
BULK INSERT products
FROM 'C:\data.csv'
WITH (
    FIELDTERMINATOR = ',',
    ROWTERMINATOR = '\n',
    FIRSTROW = 2
);
```
If you split a large load across several batch statements, wrap them in a single explicit transaction so they share one commit:

```sql
BEGIN TRANSACTION;
INSERT INTO users (name, email) VALUES (...);
INSERT INTO users (name, email) VALUES (...);
-- more batches
COMMIT;
```
For very large bulk loads, consider temporarily disabling non-critical indexes:
```sql
-- MySQL
ALTER TABLE users DISABLE KEYS;
-- insert data
ALTER TABLE users ENABLE KEYS;

-- PostgreSQL: drop and recreate indexes
DROP INDEX idx_user_email;
-- insert data
CREATE INDEX idx_user_email ON users(email);
```
When possible, use prepared statements to avoid repeated query parsing:
```sql
-- Most database drivers support this
PREPARE stmt FROM 'INSERT INTO users VALUES (?, ?), (?, ?), ...';
EXECUTE stmt USING @val1, @val2, @val3, @val4, ...;
```
Always benchmark with your specific workload; a quick timing sketch follows the list below. Factors that affect the optimal batch size include:

- Average row size (wide rows hit packet and memory limits sooner)
- Network latency between the application and the database
- Number of indexes, triggers, and constraints on the target table
- Concurrent read/write load on the table
- Available memory on both the application and database servers
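For the benchmarking itself, a quick sweep over candidate batch sizes against a representative sample of your data is usually enough to find the sweet spot. A minimal sketch, again with Python's sqlite3 as a placeholder for your real database and driver (the row count and candidate sizes are arbitrary):

```python
import sqlite3
import time

ROWS = [(f"user{i}", f"user{i}@example.com") for i in range(50_000)]

def time_batch_size(batch_size):
    conn = sqlite3.connect(":memory:")  # swap in your real database/driver
    conn.execute("CREATE TABLE users (name TEXT, email TEXT)")
    start = time.perf_counter()
    for i in range(0, len(ROWS), batch_size):
        conn.executemany("INSERT INTO users (name, email) VALUES (?, ?)",
                         ROWS[i:i + batch_size])
        conn.commit()
    return time.perf_counter() - start

for size in (10, 100, 1000, 10000):
    print(f"batch size {size:>5}: {time_batch_size(size):.2f}s")
```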
When building batch INSERT statements dynamically, always use parameterized queries or proper escaping to avoid SQL injection and syntax errors.
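As an illustration, here is one way to compose a multi-row INSERT from placeholders instead of string-formatting values into the SQL, sketched with Python's sqlite3 module; the table and the ? placeholder style are assumptions, so swap in whatever your driver expects (for example %s):

```python
import sqlite3

def batch_insert_users(conn, rows):
    """Build one multi-row INSERT from placeholders; the driver handles escaping."""
    if not rows:
        return
    placeholders = ", ".join(["(?, ?)"] * len(rows))  # one group per row
    sql = "INSERT INTO users (name, email) VALUES " + placeholders
    params = [value for row in rows for value in row]  # flatten [(name, email), ...]
    conn.execute(sql, params)
    conn.commit()

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT, email TEXT)")
batch_insert_users(conn, [
    ("Jane O'Brien", "jane@example.com"),  # embedded quote is handled safely
    ("Bob Johnson", "bob@example.com"),
])
```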
Don't blindly assume all batches will succeed. Implement retry logic and graceful degradation for failed batches.
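One common pattern, sketched below with Python's sqlite3 module, is to retry the whole batch a couple of times and then fall back to row-by-row inserts so a single bad row doesn't sink the entire chunk. The retry count and the choice to collect failed rows rather than raise are assumptions to adapt to your pipeline:

```python
import sqlite3

def insert_batch_with_fallback(conn, rows, retries=2):
    """Try the whole batch; on repeated failure, isolate the bad rows individually."""
    sql = "INSERT INTO users (name, email) VALUES (?, ?)"
    for _ in range(retries):
        try:
            with conn:  # one transaction per attempt, rolled back on error
                conn.executemany(sql, rows)
            return []   # success: nothing failed
        except sqlite3.Error:
            pass        # e.g. a transient lock error; fall through and retry
    # The batch keeps failing: insert row by row to find the offenders.
    failed = []
    for row in rows:
        try:
            with conn:
                conn.execute(sql, row)
        except sqlite3.Error:
            failed.append(row)  # log or dead-letter these instead of losing the batch
    return failed
```

The per-row fallback is slow, but it only runs for the (hopefully rare) batches that actually contain a bad row.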
Edge cases in real data (special characters, NULL values, very long strings) can cause batch failures. Test thoroughly.
Use our free SQL Batch Insert Optimizer to automatically convert your individual INSERT statements into optimized batches.
Batch inserts are one of the most impactful optimizations you can make for database write performance. By reducing network overhead, minimizing transaction commits, and enabling database optimizations, you can achieve 10-50x performance improvements (and sometimes more, as the benchmarks above show) with minimal code changes.
The key is finding the right batch size for your specific use case, typically 100 to 1,000 rows, and implementing proper error handling. Whether you're migrating data, importing CSV files, or processing high-volume transactions, batch inserts should be in every developer's performance toolkit.