Wednesday, February 27, 2013
MySQL : Allocating memory for caches
If the server is dedicated to MySQL, any memory you don’t reserve for the operating system or for query processing is available for caches.
MySQL needs more memory for caches than anything else. It uses caches to avoid disk access, which are orders of magnitude slower than accessing data in memory.
The operating system may cache some data on MySQL’s behalf (especially for MyISAM), but MySQL needs lots of memory for itself too.
The following are the most important caches to consider for the majority of installations:
• The operating system caches for MyISAM data
• MyISAM key caches
• The InnoDB buffer pool
• The query cache there is other caches, but they generally don’t use much memory.
It is much easier to tune a server if you’re using only one storage engine.
If you’re using only MyISAM tables, you can disable InnoDB completely, and if you’re using only InnoDB, you need to allocate only minimal resources for MyISAM (MySQL uses MyISAM tables internally for some operations). But if you’re using a mixture of storage engines, it can be very hard to figure out the right balance between them.
The MyISAM key caches are also referred to as key buffers; there is one by default, but you can create more. Unlike InnoDB and some other storage engines, MyISAM itself caches only indexes, not data (it lets the operating system cache the data).
If you use mostly MyISAM, you should allocate a lot of memory to the key caches. The most important option is the key_buffer_size, which you should try setting to between 25% and 50% of the amount of memory you reserved for caches.
The remainder will be available for the operating system caches, which the operating system will usually fill with data from MyISAM’s.
MySQL 5.0 has a hard upper limit of 4 GB for this variable, no matter what architecture you’re running. (MySQL 5.1 allows larger sizes. Check the current documentation for your version of the server.) By default MyISAM caches all indexes in the default key buffer, but you can create multiple named key buffers. This lets you keep more than 4 GB of indexes in memory at once. To create key buffers named key_buffer_1 and key_buffer_2, each sized at 1 GB, place the following in the configuration file:
key_buffer_size = 1G key_buffer_2.key_buffer_size = 1G Now there are three key buffers: the two explicitly created by those lines and the default buffer. You can use the CACHE INDEX command to map tables to caches. You can also tell MySQL to use key_buffer_1 for the indexes from tables t1 and t2 with the following SQL statement:
>CACHE INDEX t1, t2 IN key_buffer_1;
Now when MySQL reads blocks from the indexes on these tables, it will cache the blocks in the specified buffer. You can also preload the tables’ indexes into the cache with the LOAD INDEX command:
>LOAD INDEX INTO CACHE t1, t2;
You can place this SQL into a file that’s executed when MySQL starts up. The filename must be specified in the init_file option, and the file can include multiple SQL commands, each on a single line (no comments are allowed).
Any indexes you don’t explicitly map to a key buffer will be assigned to the default buffer the first time MySQL needs to access the .
You can monitor the performance and usage of the key buffers with information from SHOW STATUS and SHOW VARIABLES. You can calculate the hit ratio and the percentage of the buffer in use with these equations:
Cache hit ratio 100 - ( (Key_reads * 100) / Key_read_requests )
Percentage of buffer in use
100 - ( (Key_blocks_unused * key_cache_block_size) * 100 / key_buffer_size )
It’s good to know the cache hit rate, but this number can be misleading.
For example, the difference between 99% and 99.9% looks small, but it really represents a tenfold increase. The cache hit rate is also application-dependent: some applications might work fine at 95%, whereas others might be I/O-bound at 99.9%.
You might even be able to get a 99.99% hit rate with properly sized caches. The number of cache misses per second is generally much more empirically useful.
Suppose you have a single hard drive that can do 100 random reads per second. Five misses per second will not cause your workload to be I/O-bound, but 80 per second will likely cause problems.
You can use the following equation to calculate this value cache misses:
Key_reads / Uptime Calculate the number of misses incrementally over intervals of 10 to 100 seconds, so you can get an idea of the current performance.
The following command will show the incremental values every 10 seconds:
$ mysqladmin extended-status -r -i 10 | grep Key_reads
When you’re deciding how much memory to allocate to the key caches, it might help to know how much space your MyISAM indexes are actually using on disk.
You don’t need to make the key buffers larger than the data they will cache. Space your MyISAM indexes are actually using on disk $ du -sch `find /path/to/mysql/data/directory/ -name "*.MYI"`
Remember that MyISAM uses the operating system cache for the data files, which are often larger than the indexes.
Therefore, it often makes sense to leave more memory for the operating system cache than for the key caches. Finally, even if you don’t have any MyISAM tables, bear in mind that you still need to set key_buffer_size to a small amount of memory, such as 32M.
The MySQL server sometimes uses MyISAM tables for internal purposes, such as temporary tables for GROUP BY queries. The MyISAM key block size The key block size is important (especially for write-intensive workloads) because of the way it causes MyISAM, the operating system cache, and the filesystem to interact. If the key block size is too small, you may encounter read-around writes, which are writes that the operating system cannot perform without first reading some data from the disk. Here’s how a read-around write happens, assuming the operating system’s page size is 4 KB (typically true on the x86 architecture) and the key block size is 1 KB:
1. MyISAM requests a 1 KB key block from disk.
2. The operating system reads 4 KB of data from the disk and caches it, then passes the desired 1 KB of data to MyISAM.
3. The operating system discards the cached data in favor of some other data.
4. MyISAM modifies the 1 KB key block and asks the operating system to write it back to disk.
5. The operating system reads the same 4 KB of data from the disk into the operating system cache, modifies the 1 KB that MyISAM changed, and writes the entire 4 KB back to disk.
The read-around write happened in step 5, when MyISAM asked the operating system to write only part of a 4 KB page. If MyISAM’s block size had matched the operating system’s, the disk read in step 5 could have been avoided.
Unfortunately, in MySQL 5.0 and earlier, there’s no way to configure the key block size.
However, in MySQL 5.1 and later, you can avoid read-around writes by making MyISAM’s key block size the same as the operating system’s.
The myisam_block_size variable controls the key block size. You can also specify the size for each key with the KEY_BLOCK_SIZE option in a
CREATE TABLE or CREATE INDEX statement,
but because all keys are stored in the same file, you really need all of them to have blocks as large as or larger than the operating system’s to avoid alignment issues that could still cause read-around writes. (For example, if one key has 1 KB blocks and another has 4 KB blocks, the 4 KB block boundaries might not match the operating system’s page boundaries.)
The InnoDB Buffer Pool If you use mostly InnoDB tables, the InnoDB buffer pool probably needs more memory than anything else.
Unlike the MyISAM key cache, the InnoDB buffer pool doesn’t just cache indexes:
it also holds row data, the adaptive hash index, the insert buffer, locks, and other internal structures.
InnoDB also uses the buffer pool to help it delay writes, so it can merge many writes together and perform them sequentially. In short, InnoDB relies heavily on the buffer pool, and you should be sure to allocate enough memory to it.
The MySQL manual suggests using up to 80% of the machine’s physical memory for the buffer pool on a dedicated server; in reality, you can use more than that if the machine has a lot of memory.
As with the MyISAM key buffers, you can use variables from SHOW commands or tools such as innotop to monitor your InnoDB buffer pool’s memory usage and performance.
There’s no equivalent of LOAD INDEX INTO CACHE for InnoDB tables.
However, if you’re trying to warm up a server and get it ready to handle a heavy load, you can issue queries that perform full table scans or full index scans. In most cases, you should make the InnoDB buffer pool as large as your available memory allows. However, in rare circumstances, very large buffer pools (say, 50 GB) can cause long stalls.
For example, a large buffer pool may become slow during checkpoints or insert buffer merge operations, and concurrency can drop as a result of locking.
If you experience these problems, you may have to reduce the buffer pool size. You can change the innodb_max_dirty_pages_pct variable to instruct InnoDB to keep more or fewer dirty (modified) pages in the buffer pool.
If you allow a lot of dirty pages, InnoDB can take a long time to shut down, because it writes the dirty pages to the data files upon shutdown. You can monitor the number of dirty pages by watching the Innodb_buffer_pool_pages_dirty server status variable or using innotop to monitor SHOW INNODB STATUS.
Lowering the value of the innodb_max_dirty_pages_pct variable doesn’t actually guarantee that InnoDB will keep fewer dirty pages in the buffer pool. Instead, it controls the threshold at which InnoDB stops being “lazy.” InnoDB’s default behavior is to flush dirty pages with a background thread, merging writes together and performing them sequentially for efficiency. This behavior is called “lazy” because it lets InnoDB delay flushing dirty pages in the buffer pool, unless it needs to use the space for some other data. When the percentage of dirty pages exceeds the threshold, InnoDB will flush pages as quickly as it can to try to keep the dirty page count lower.
The variable’s default value is 90, so by default InnoDB will flush lazily until the buffer pool is 90% full of dirty pages. You can tweak the threshold for your workload if you wish to spread out the writes a bit more. For example, lowering it to 50 will generally cause InnoDB to do more write operations, because it will flush pages sooner and therefore be unable to batch the writes as well. However, if your workload has a lot of write spikes, using a lower value may help InnoDB absorb the spikes better: it will have more “spare” memory to hold dirty pages, so it won’t have to wait for other dirty pages to be flushed to disk.
The Thread Cache The thread cache holds threads that aren’t currently associated with a connection but are ready to serve new connections. When there’s a thread in the cache and a new connection is created, MySQL removes the thread from the cache and gives it to the new connection. When the connection is closed, MySQL places the thread back into the cache, if there’s room. If isn’t room, MySQL destroys the thread. As long as MySQL has a free thread in the cache, it can respond very rapidly to connect requests, because it doesn’t have to create a new thread for each connection. The thread_cache_size variable specifies the number of threads MySQL can keep in the cache. You probably won’t need to tune this value, unless your server gets many connection requests.
To check whether the thread cache is large enough, watch the Threads_created status variable. We generally try to keep the thread cache large enough that we see fewer than 10 new threads created each second, but it’s often pretty easy to get this number lower than 1 per second. A good approach is to watch the
Threads_connected variable and try to set thread_cache_size large enough to handle the typical fluctuation in your workload. For example, if Threads_connected usually stays between 100 and 200, you can set the cache size to 100. If it stays between 500 and 700, a thread cache of 200 should be large enough. Think of it this way: at 700 connections, there are probably no threads in the cache; at 500 connections, there are 200 cached threads ready to be used if the load increases to 700 again. Making the thread cache very large is probably not necessary for most uses, but keeping it small doesn’t save much memory, so there’s little benefit in doing so. Each thread that’s in the thread cache or sleeping typically uses around 256 KB of memory.
This is very little compared to the amount of memory a thread can use when a connection is actively processing a query. In general, you should keep your thread cache large enough that Threads_created doesn’t increase very often. If this is a very large number, however (e.g., many thousand threads), you might want to set it lower because some operating systems don’t handle very large numbers of threads well, even when most of them are sleeping.
The Table Cache The table cache is similar in concept to the thread cache, but it stores objects that represent tables. Each object in the cache contains the associated table’s parsed .frm file, plus other data. Exactly what else is in the object depends on the table’s storage engine. For example, for MyISAM, it holds the table data and/or index file descriptors. For merge tables it may hold many file descriptors, because merge tables can have many underlying tables. The table cache can help you reuse resources.
For instance, when a query requests access to a MyISAM table, MySQL might be able to give it a file descriptor from the cached object instead of opening the file.
The table cache can also help avoid some of the I/O required for marking a MyISAM table as “in use” in the index headers.*The table cache’s design is a little MyISAM-centric—this is one of the areas where the separation between the server and the storage engines is not completely clean, for historical reasons. The table cache is a little less important for InnoDB, because InnoDB doesn’t rely on it for as many purposes (such as holding file descriptors; it has its own version of a table cache for this purpose). However, even InnoDB benefits from caching the parsed .frm files.
In MySQL 5.1, the table cache is separated into two parts: a cache of open tables and a table definition cache (configured via the table_open_cache and table_definition_cache variables). Thus, the table definitions (the parsed .frm files) are separated from the other resources, such as file descriptors. Opened tables are still per-thread, pertable-used, but the table definitions are global and can be shared among all connections efficiently. You can generally set table_definition_cache high enough to cache all your table definitions. Unless you have tens of thousands of tables, this is likely to be the easiest approach. If the Opened_tables status variable is large or increasing, the table cache isn’t large enough, and you should increase the table_cache system variable (or table_open_cache, in MySQL 5.1).
The only real downside to making the table cache very large is that it might cause longer shutdown times when your server has a lot of MyISAM tables, because the key blocks have to be flushed and the tables have to be marked as no longer open.
It can also make FLUSH TABLES WITH READ LOCK take a long time complete, for the same reason. If you get errors indicating that MySQL can’t open any more files (use the perror utility to check what the error number means), you might also need to increase the number of files MySQL is allowed to keep open. You can do this with the open_files_limit server variable in your my.cnf file.
The thread and table caches don’t really use much memory, and they are beneficial because they conserve resources. Although creating a new thread and opening a new file aren’t really expensive compared to other things MySQL might do, the overhead can add up quickly under a high-concurrency workload. Caching threads and tables can improve efficiency.