Large-Scale Data Import and Export
Note: “Large-scale” here refers to importing and exporting 100,000+ records from text files (including Office documents such as docx, xlsx).
MySQL has two very common table types: MyISAM and InnoDB. There are many comparisons of these two types online. The main difference is that InnoDB supports transactions and row-level locking, while MyISAM does not. MyISAM is relatively more efficient, and each has its own advantages and disadvantages.
Use Cases for MyISAM and InnoDB
MyISAM is suitable for:
- Frequent COUNT calculations;
- Infrequent inserts but frequent queries;
- No transactions needed.
InnoDB is suitable for:
- High reliability requirements or transaction support needed;
- Frequent table updates and queries with high likelihood of table locking.
Differences Between MyISAM and InnoDB
- MyISAM does not support advanced features like transactions, while InnoDB does.
- MyISAM tables do not support foreign keys.
- During write operations (insert, update, delete), MyISAM locks the entire table, while InnoDB locks only the affected rows.
- When your database has heavy writes, updates, and deletions with relatively few queries, or when data integrity requirements are high, choose InnoDB. When your database is primarily read-heavy with relatively few updates and writes, and data integrity requirements are less strict, choose MyISAM, as its query performance is faster than InnoDB.
In summary, InnoDB is a good choice for this scenario.
Optimization Methods for Inserting Large Amounts of Data
When PHP needs to insert large amounts of data, inserting one record at a time results in severe I/O performance bottlenecks.
Demo Code:
| |
There are two methods to solve this problem:
1. Using Transactions (InnoDB supported)
For example, you can commit a transaction every 10,000 or 100,000 records, which greatly reduces I/O.
| |
2. SQL Statement Concatenation
This is a MySQL feature that uses optimized SQL statements: concatenate SQL statements using insert into table () values (),(),(),() and insert them all at once. If the string is too long, you need to configure MySQL by running:
| |
| |
The first method is more commonly used, while the second is mostly for test data.
By default, PHP has a memory limit of 128MB. When ThinkPHP exceeds this memory limit, you can either increase PHP’s default maximum memory or simply remove the APP_DEBUG constant definition in the entry file or set it to false. This will save a lot of memory. Of course, you can also use transaction mode to commit every n records. ThinkPHP’s addAll function can also be used for batch inserts, though it has quite a few bugs.
Starting and Committing Transactions in ThinkPHP
| |
When working with Office files, PHPWord and PHPExcel are recommended.