1.. _improved_memory_engine:
2
3====================================
4 ``Improved MEMORY`` Storage Engine
5====================================
6
7As of ``MySQL`` 5.5.15, a *Fixed Row Format* (``FRF``) is still being used in the ``MEMORY`` storage engine. The fixed row format imposes restrictions on the type of columns as it assigns on advance a limited amount of memory per row. This renders a ``VARCHAR`` field in a ``CHAR`` field in practice and makes impossible to have a ``TEXT`` or ``BLOB`` field with that engine implementation.
8
9To overcome this limitation, the *Improved MEMORY Storage Engine* is introduced in this release for supporting **true** ``VARCHAR``, ``VARBINARY``, ``TEXT`` and ``BLOB`` fields in ``MEMORY`` tables.
10
11This implementation is based on the *Dynamic Row Format* (``DFR``) introduced by the `mysql-heap-dynamic-rows <http://code.google.com/p/mysql-heap-dynamic-rows/>`_ patch.
12
13``DFR`` is used to store column values in a variable-length form, thus helping to decrease memory footprint of those columns and making possible ``BLOB`` and ``TEXT`` fields and real ``VARCHAR`` and ``VARBINARY``.
14
15Unlike the fixed implementation, each column value in ``DRF`` only uses as much space as required. This is, for variable-length values, up to 4 bytes is used to store the actual value length, and then only the necessary number of blocks is used to store the value.
16
17Rows in ``DFR`` are represented internally by multiple memory blocks, which means that a single row can consist of multiple blocks organized into one set. Each row occupies at least one block, there can not be multiple rows within a single block. Block size can be configured when creating a table (see below).
18
19This ``DFR`` implementation has two caveats regarding to ordering and indexes.
20
21Caveats
22=======
23
24Ordering of Rows
25----------------
26
27In the absence of ``ORDER BY``, records may be returned in a different order than the previous ``MEMORY`` implementation.
28
29This is not a bug. Any application relying on a specific order without an ``ORDER BY`` clause may deliver unexpected results. A specific order without ``ORDER BY`` is a side effect of a storage engine and query optimizer implementation which may and will change between minor |MySQL| releases.
30
31
32Indexing
33--------
34
35It is currently impossible to use indexes on ``BLOB`` columns due to some limitations of the *Dynamic Row Format*. Trying to create such an index will fail with the following error: ::
36
37  BLOB column '<name>' can't be used in key specification with the used table type.
38
39Restrictions
40============
41
42For performance reasons, a mixed solution is implemented: the fixed format is used at the beginning of the row, while the dynamic one is used for the rest of it.
43
44The size of the fixed-format portion of the record is chosen automatically on ``CREATE TABLE`` and cannot be changed later. This, in particular, means that no indexes can be created later with ``CREATE INDEX`` or ``ALTER TABLE`` when the dynamic row format is used.
45
46All values for columns used in indexes are stored in fixed format at the first block of the row, then the following columns are handled with ``DRF``.
47
48This sets two restrictions to tables:
49
50  * the order of the fields and therefore,
51
52  * the minimum size of the block used in the table.
53
54Ordering of Columns
55-------------------
56
57The columns used in fixed format must be defined before the dynamic ones in the ``CREATE TABLE`` statement. If this requirement is not met, the engine will not be able to add blocks to the set for these fields and they will be treated as fixed.
58
59Minimum Block Size
60------------------
61
62The block size has to be big enough to store all fixed-length information in the first block. If not, the ``CREATE TABLE`` or ``ALTER TABLE`` statements will fail (see below).
63
64Limitations
65===========
66
67|MyISAM| tables are still used for query optimizer internal temporary tables where the ``MEMORY`` tables could be used now instead: for temporary tables containing large ``VARCHAR``s, ``BLOB``, and ``TEXT`` columns.
68
69Setting Row Format
70==================
71
72Taking the restrictions into account, the *Improved MEMORY Storage Engine* will choose ``DRF`` over ``FRF`` at the moment of creating the table according to following criteria:
73
74  * There is an implicit request of the user in the column types **OR**
75
76  * There is an explicit request of the user **AND** the overhead incurred by ``DFR`` is beneficial.
77
78Implicit Request
79----------------
80
81The implicit request by the user is taken when there is at least one ``BLOB`` or ``TEXT`` column in the table definition. If there are none of these columns and no relevant option is given, the engine will choose ``FRF``.
82
83For example, this will yield the use of the dynamic format: ::
84
85  mysql> CREATE TABLE t1 (f1 VARCHAR(32), f2 TEXT, PRIMARY KEY (f1)) ENGINE=HEAP;
86
87While this will not: ::
88
89  mysql> CREATE TABLE t1 (f1 VARCHAR(16), f2 VARCHAR(16), PRIMARY KEY (f1)) ENGINE=HEAP;
90
91Explicit Request
92----------------
93
94The explicit request is set with one of the following options in the ``CREATE TABLE`` statement:
95
96  * ``KEY_BLOCK_SIZE = <value>``
97
98    * Requests the DFR with the specified block size (in bytes)
99
100  * ``ROW_FORMAT = DYNAMIC``
101
102    * Requests the dynamic format with the default block size (256 bytes)
103
104Despite its name, the ``KEY_BLOCK_SIZE`` option refers to a block size used to store data rather then indexes. The reason for this is that an existing ``CREATE TABLE`` option is reused to avoid introducing new ones.
105
106*The Improved MEMORY Engine* checks whether the specified block size is large enough to keep all key column values. If it is too small, table creation will abort with an error.
107
108After ``DRF`` is requested explicitly and there are no ``BLOB`` or ``TEXT`` columns in the table definition, the *Improved MEMORY Engine* will check if using the dynamic format provides any space saving benefits as compared to the fixed one:
109
110  * if the fixed row length is less than the dynamic block size (plus the dynamic row overhead - platform dependent) **OR**
111
112  * there isn't any variable-length columns in the table or ``VARCHAR`` fields are declared with length 31 or less,
113
114the engine will revert to the fixed format as it is more space efficient in such case. The row format being used by the engine can be checked using ``SHOW TABLE STATUS``.
115
116Examples
117========
118
119On a 32-bit platform: ::
120
121  mysql> CREATE TABLE t1 (f1 VARCHAR(32), f2 VARCHAR(32), f3 VARCHAR(32), f4 VARCHAR(32),
122                          PRIMARY KEY (f1)) KEY_BLOCK_SIZE=124 ENGINE=HEAP ROW_FORMAT=DYNAMIC;
123
124  mysql> SHOW TABLE STATUS LIKE 't1';
125  Name	Engine	Version	Row_format	Rows	Avg_row_length	Data_length	Max_data_length	Index_length	Data_free	Auto_increment	Create_time	Update_time	Check_time	Collation	Checksum	Create_options	Comment
126  t1	MEMORY	10	Dynamic	0	X	0	X	0	0	NULL	NULL	NULL	NULL	latin1_swedish_ci	NULL	row_format=DYNAMIC KEY_BLOCK_SIZE=124
127
128On a 64-bit platform: ::
129
130  mysql> CREATE TABLE t1 (f1 VARCHAR(32), f2 VARCHAR(32), f3 VARCHAR(32), f4 VARCHAR(32),
131                          PRIMARY KEY (f1)) KEY_BLOCK_SIZE=124 ENGINE=HEAP ROW_FORMAT=DYNAMIC;
132
133  mysql> SHOW TABLE STATUS LIKE 't1';
134  Name	Engine	Version	Row_format	Rows	Avg_row_length	Data_length	Max_data_length	Index_length	Data_free	Auto_increment	Create_time	Update_time	Check_time	Collation	Checksum	Create_options	Comment
135  t1	MEMORY	10	Fixed	0	X	0	X	0	0	NULL	NULL	NULL	NULL	latin1_swedish_ci	NULL	row_format=DYNAMIC KEY_BLOCK_SIZE=124
136
137Implementation Details
138======================
139
140|MySQL| *MEMORY* tables keep data in arrays of fixed-size chunks. These chunks are organized into two groups of ``HP_BLOCK`` structures:
141
142  * ``group1`` contains indexes, with one ``HP_BLOCK`` per key (part of ``HP_KEYDEF``),
143
144  * ``group2`` contains record data, with a single ``HP_BLOCK`` for all records.
145
146While columns used in indexes are usually small, other columns in the table may need to accommodate larger data. Typically, larger data is placed into ``VARCHAR`` or ``BLOB`` columns.
147
148*The Improved MEMORY Engine* implements the concept of dataspace, ``HP_DATASPACE``, which incorporates the ``HP_BLOCK`` structures for the record data, adding more information for managing variable-sized records.
149
150Variable-size records are stored in multiple “chunks”, which means that a single record of data (a database “row”) can consist of multiple chunks organized into one “set”, contained in ``HP_BLOCK`` structures.
151
152In variable-size format, one record is represented as one or many chunks depending on the actual data, while in fixed-size mode, one record is always represented as one chunk. The index structures would always point to the first chunk in the chunkset.
153
154Variable-size records are necessary only in the presence of variable-size columns. The *Improved Memory Engine* will be looking for ``BLOB`` or ``VARCHAR`` columns with a declared length of 32 or more. If no such columns are found, the table will be switched to the fixed-size format. You should always put such columns at the end of the table definition in order to use the variable-size format.
155
156Whenever data is being inserted or updated in the table, the *Improved Memory Engine* will calculate how many chunks are necessary.
157
158For ``INSERT`` operations, the engine only allocates new chunksets in the recordspace. For ``UPDATE`` operations it will modify the length of the existing chunkset if necessary, unlinking unnecessary chunks at the end, or allocating and adding more if a larger length is needed.
159
160When writing data to chunks or copying data back to a record, fixed-size columns are copied in their full format, while ``VARCHAR`` and ``BLOB`` columns are copied based on their actual length, skipping any ``NULL`` values.
161
162When allocating a new chunkset of N chunks, the engine will try to allocate chunks one-by-one, linking them as they become allocated. For allocating a single chunk, it will attempt to reuse a deleted (freed) chunk. If no free chunks are available, it will try to allocate a new area inside a ``HP_BLOCK``.
163
164When freeing chunks, the engine will place them at the front of a free list in the dataspace, each one containing a reference to the previously freed chunk.
165
166The allocation and contents of the actual chunks varies between fixed and variable-size modes:
167
168  * Format of a fixed-size chunk:
169
170    * ``uchar[]``
171
172      * With ``sizeof=chunk_dataspace_length``, but at least ``sizeof(uchar*)`` bytes. It keeps actual data or pointer to the next deleted chunk, where ``chunk_dataspace_length`` equals to full record length
173
174    * ``uchar``
175
176      * Status field (1 means “in use”, 0 means “deleted”)
177
178  * Format of a variable-size chunk:
179
180      * ``uchar[]``
181
182        * With ``sizeof=chunk_dataspace_length``, but at least ``sizeof(uchar*)`` bytes. It keeps actual data or pointer to the next deleted chunk, where ``chunk_dataspace_length`` is set according to table's ``key_block_size``
183
184    * ``uchar*``
185
186      * Pointer to the next chunk in this chunkset, or NULL for the last chunk
187
188    * ``uchar``
189
190      * Status field (1 means “first”, 0 means “deleted”, 2 means “linked”)
191
192Total chunk length is always aligned to the next ``sizeof(uchar*)``.
193
194See Also
195========
196
197  * `Dynamic row format for MEMORY tables <http://www.mysqlperformanceblog.com/2011/09/06/dynamic-row-format-for-memory-tables/>`_
198