XBZRLE (Xor Based Zero Run Length Encoding)
===========================================

Using XBZRLE (Xor Based Zero Run Length Encoding) reduces VM downtime and the
total live-migration time of virtual machines.
It is particularly useful for virtual machines running memory-write-intensive
workloads that are typical of large enterprise applications such as SAP ERP
Systems, and generally speaking for any application that uses a sparse memory
update pattern.

Instead of sending the changed guest memory page, this solution sends a
compressed version of the updates, thus reducing the amount of data sent during
live migration.
In order to be able to calculate the update, the previous memory pages need to
be stored on the source. Those pages are stored in a dedicated cache
(hash table) and are accessed by their address.
The larger the cache size, the better the chances are that the page has already
been stored in the cache.
A small cache size will result in a high cache miss rate.
Cache size can be changed before and during migration.

Format
======

The compression format performs an XOR between the previous and current content
of the page, where zero represents an unchanged value.
The page data delta is represented by zero and non-zero runs.
A zero run is represented by its length (in bytes).
A non-zero run is represented by its length (in bytes) and the new data.
The run length is encoded using ULEB128 (http://en.wikipedia.org/wiki/LEB128).

There can be more than one valid encoding; the sender may send a longer encoding
for the benefit of reducing computation cost.

page = zrun nzrun
       | zrun nzrun page

zrun = length

nzrun = length byte...

length = uleb128 encoded integer
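
As an illustration of the length field, here is a minimal C sketch of ULEB128
encoding and decoding for values up to 32 bits. The function names are made up
for this example and are not taken from the QEMU sources. Each output byte
carries seven bits of the value, least significant group first, and the top bit
is set on every byte except the last.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Encode n as ULEB128 into out and return the number of bytes written.
 * out must have room for 5 bytes, which is enough for any 32-bit value.
 * (Hypothetical helper, for illustration only.)
 */
static size_t uleb128_encode(uint32_t n, uint8_t *out)
{
    size_t len = 0;

    assert(out != NULL);
    do {
        uint8_t byte = n & 0x7f;    /* low seven bits of the value */
        n >>= 7;
        if (n != 0) {
            byte |= 0x80;           /* continuation bit: more bytes follow */
        }
        out[len++] = byte;
    } while (n != 0);

    return len;
}

/*
 * Decode a ULEB128 value from in (at most in_len bytes) into *n.
 * Returns the number of bytes consumed, or 0 if the input is truncated
 * or longer than 5 bytes.
 */
static size_t uleb128_decode(const uint8_t *in, size_t in_len, uint32_t *n)
{
    uint32_t value = 0;
    size_t i;

    assert(in != NULL && n != NULL);
    for (i = 0; i < in_len && i < 5; i++) {
        value |= (uint32_t)(in[i] & 0x7f) << (7 * i);
        if ((in[i] & 0x80) == 0) {  /* last byte of this integer */
            *n = value;
            return i + 1;
        }
    }
    return 0;
}
```

With this sketch a run length of 1001 encodes to the two bytes e9 07, and any
length below 128 fits in a single byte.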

On the sender side XBZRLE is used as a compact delta encoding of page updates,
retrieving the old page content from the cache (default size of 64MB). The
receiving side uses the existing page's content and XBZRLE to decode the new
page's content.
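
The sender side can be sketched in C as follows. This is a straightforward,
unoptimized illustration of the grammar above and not the QEMU implementation;
the function names, the return convention (-1 when the output buffer is too
small) and the choice to leave out a trailing zero run are assumptions of this
sketch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Append len as ULEB128 at out[*pos]; returns 0, or -1 if out is full. */
static int put_uleb128(uint8_t *out, size_t cap, size_t *pos, size_t len)
{
    do {
        uint8_t byte = len & 0x7f;
        len >>= 7;
        if (len != 0) {
            byte |= 0x80;
        }
        if (*pos >= cap) {
            return -1;
        }
        out[(*pos)++] = byte;
    } while (len != 0);
    return 0;
}

/*
 * Encode the delta between old_buf and new_buf (both page_len bytes) as a
 * sequence of zrun/nzrun pairs. Returns the encoded length, 0 if the pages
 * are identical, or -1 if the encoding does not fit in out_cap bytes.
 */
static long xbzrle_encode_sketch(const uint8_t *old_buf, const uint8_t *new_buf,
                                 size_t page_len, uint8_t *out, size_t out_cap)
{
    size_t i = 0, pos = 0;

    assert(old_buf != NULL && new_buf != NULL && out != NULL);
    while (i < page_len) {
        size_t start = i;

        /* zero run: bytes whose XOR is zero, i.e. unchanged bytes */
        while (i < page_len && (old_buf[i] ^ new_buf[i]) == 0) {
            i++;
        }
        if (i == page_len) {
            break;                  /* trailing zero run is not emitted */
        }
        if (put_uleb128(out, out_cap, &pos, i - start) < 0) {
            return -1;
        }

        /* non-zero run: length followed by the new bytes themselves */
        start = i;
        while (i < page_len && (old_buf[i] ^ new_buf[i]) != 0) {
            i++;
        }
        if (put_uleb128(out, out_cap, &pos, i - start) < 0 ||
            out_cap - pos < i - start) {
            return -1;
        }
        memcpy(out + pos, new_buf + start, i - start);
        pos += i - start;
    }
    return (long)pos;
}
```

Because the non-zero run carries the new bytes rather than the XOR result, the
receiver only has to copy them over its existing copy of the page.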

This work was originally based on research results published at
VEE 2011: Evaluation of Delta Compression Techniques for Efficient Live
Migration of Large Virtual Machines by Benoit, Svard, Tordsson and Elmroth.
The XBRLE delta encoder described there was further improved, resulting in
XBZRLE.

XBZRLE has a sustained bandwidth of 2-2.5 GB/s for typical workloads, making it
ideal for in-line, real-time encoding such as is needed for live migration.

Example
old buffer:
1001 zeros
05 06 07 08 09 0a 0b 0c 0d 0e 0f 10 11 12 13 68 00 00 6b 00 6d
3074 zeros

new buffer:
1001 zeros
01 02 03 04 05 06 07 08 09 0a 0b 0c 0d 0e 0f 68 00 00 67 00 69
3074 zeros

encoded buffer:

encoded length 24
e9 07 0f 01 02 03 04 05 06 07 08 09 0a 0b 0c 0d 0e 0f 03 01 67 01 01 69
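
To see how the receiver turns these 24 bytes back into the new buffer, here is
a minimal decoder sketch in C. It is an illustration of the format only, not
the QEMU code; the function names and the error convention (-1 on malformed
input) are assumptions. The page passed in holds the old content and is
patched in place: zero runs are skipped, non-zero runs are copied over.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Read one ULEB128 length at src[*pos]; returns 0 on success, -1 on error. */
static int get_uleb128(const uint8_t *src, size_t src_len, size_t *pos,
                       uint32_t *val)
{
    uint32_t v = 0;
    int shift = 0;

    while (*pos < src_len && shift < 32) {
        uint8_t b = src[(*pos)++];

        v |= (uint32_t)(b & 0x7f) << shift;
        if ((b & 0x80) == 0) {      /* no continuation bit: done */
            *val = v;
            return 0;
        }
        shift += 7;
    }
    return -1;                      /* truncated or too long */
}

/*
 * Apply the encoded delta src (src_len bytes) to page, which must contain
 * the old page content. Returns 0 on success, -1 if the input is malformed
 * or would write past page_len.
 */
static int xbzrle_decode_sketch(const uint8_t *src, size_t src_len,
                                uint8_t *page, size_t page_len)
{
    size_t pos = 0, off = 0;
    uint32_t run;

    assert(src != NULL && page != NULL);
    while (pos < src_len) {
        /* zrun: these bytes are unchanged, so just skip them */
        if (get_uleb128(src, src_len, &pos, &run) < 0 ||
            run > page_len - off) {
            return -1;
        }
        off += run;

        /* nzrun: a length followed by that many new bytes */
        if (get_uleb128(src, src_len, &pos, &run) < 0 ||
            run > page_len - off || run > src_len - pos) {
            return -1;
        }
        memcpy(page + off, src + pos, run);
        off += run;
        pos += run;
    }
    return 0;
}
```

Applied to the example, the first pair skips 1001 bytes (e9 07) and overwrites
the next 15 (0f), and the remaining two pairs patch the bytes 67 and 69.
Everything after the last non-zero run is left as it was.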

Cache update strategy
=====================
Keeping the hot pages in the cache is effective for decreasing cache
misses. XBZRLE uses a counter as the age of each page. The counter will
increase after each ram dirty bitmap sync. When a cache conflict is
detected, XBZRLE will only evict pages in the cache that are older than
a threshold.
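
The eviction rule can be sketched in C like this. Everything here is
hypothetical and chosen for illustration: the direct-mapped layout, the slot
count, the threshold value and all names are assumptions, not the QEMU page
cache. The sketch only shows the decision itself: on a conflict, the cached
page is replaced only if it is older than the threshold.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define CACHE_SLOTS   8     /* hypothetical; must be a power of 2 */
#define AGE_THRESHOLD 2     /* hypothetical age threshold, in bitmap syncs */

struct cache_slot {
    bool     valid;
    uint64_t addr;          /* guest page address held in this slot */
    uint64_t age;           /* sync counter value when the page was stored */
};

struct page_cache {
    struct cache_slot slot[CACHE_SLOTS];
};

/* Direct-mapped lookup: 4 KiB pages, low bits of the page number. */
static size_t slot_index(uint64_t addr)
{
    return (size_t)((addr >> 12) & (CACHE_SLOTS - 1));
}

/*
 * Try to cache the page at addr. sync_count is the counter that increases
 * after each dirty bitmap sync. Returns true if the page is cached
 * afterwards, false if the slot holds a different page that is not yet
 * older than the threshold and is therefore kept.
 */
static bool cache_insert(struct page_cache *c, uint64_t addr,
                         uint64_t sync_count)
{
    struct cache_slot *s;

    assert(c != NULL);
    s = &c->slot[slot_index(addr)];
    if (s->valid && s->addr != addr &&
        sync_count - s->age <= AGE_THRESHOLD) {
        return false;       /* conflict with a young page: do not evict */
    }
    s->valid = true;
    s->addr = addr;
    s->age = sync_count;
    return true;
}
```

A page that was refused is simply treated as a cache miss in this sketch; once
enough syncs have passed, a later insert for the same slot succeeds.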

Usage
=====
1. Verify the destination QEMU version is able to decode the new format.
    {qemu} info migrate_capabilities
    {qemu} xbzrle: off , ...

2. Activate xbzrle on both source and destination:
    {qemu} migrate_set_capability xbzrle on

3. Set the XBZRLE cache size (on source only). The cache size is in MBytes and
should be a power of 2. The cache default value is 64MBytes.
    {qemu} migrate_set_parameter xbzrle-cache-size 256m

4. Start outgoing migration
    {qemu} migrate -d tcp:destination.host:4444
    {qemu} info migrate
    capabilities: xbzrle: on
    Migration status: active
    transferred ram: A kbytes
    remaining ram: B kbytes
    total ram: C kbytes
    total time: D milliseconds
    duplicate: E pages
    normal: F pages
    normal bytes: G kbytes
    cache size: H bytes
    xbzrle transferred: I kbytes
    xbzrle pages: J pages
    xbzrle cache miss: K pages
    xbzrle cache miss rate: L
    xbzrle encoding rate: M
    xbzrle overflow: N

xbzrle cache miss: the number of cache misses to date. A high cache miss rate
indicates that the cache size is set too low.
xbzrle overflow: the number of overflows, i.e. pages whose delta could not be
compressed. This can happen if the changes in the pages are too large or there
are many short changes; for example, changing every second byte
(half a page).
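
The "every second byte" case can be checked with a little arithmetic: each
changed byte costs one byte for the zero run length, one byte for the non-zero
run length and one data byte, so 2048 changed bytes in a 4096-byte page need
2048 * 3 = 6144 bytes, more than the page itself. The C sketch below computes
the encoded size for any pair of buffers; the function names are hypothetical
and the trailing zero run is assumed not to be sent.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Number of bytes a ULEB128 encoding of n occupies. */
static size_t uleb128_size(size_t n)
{
    size_t len = 1;

    while (n >= 0x80) {
        n >>= 7;
        len++;
    }
    return len;
}

/*
 * Size in bytes of the zrun/nzrun encoding of the delta between old_buf
 * and new_buf (both page_len bytes), without producing the encoding.
 */
static size_t xbzrle_encoded_size(const uint8_t *old_buf,
                                  const uint8_t *new_buf, size_t page_len)
{
    size_t i = 0, size = 0;

    assert(old_buf != NULL && new_buf != NULL);
    while (i < page_len) {
        size_t start = i;

        /* zero run: unchanged bytes, only the length is sent */
        while (i < page_len && old_buf[i] == new_buf[i]) {
            i++;
        }
        if (i == page_len) {
            break;              /* trailing zero run is not counted */
        }
        size += uleb128_size(i - start);

        /* non-zero run: the length plus the changed bytes themselves */
        start = i;
        while (i < page_len && old_buf[i] != new_buf[i]) {
            i++;
        }
        size += uleb128_size(i - start) + (i - start);
    }
    return size;
}
```

Comparing the result against the page size tells whether a given update
pattern would overflow, in which case sending the plain page is cheaper.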

Testing: live migration with XBZRLE completed in 110 seconds, whereas without
it the migration was not able to complete.

A simple synthetic memory r/w load generator:
    #include <stdlib.h>
    #include <stdio.h>

    int main(void)
    {
        /* 4096 pages of 4096 bytes each, zero-initialized */
        char *buf = calloc(4096, 4096);

        if (buf == NULL) {
            return 1;
        }
        while (1) {
            int i;
            /* dirty one byte every 1024 bytes, i.e. four bytes per page */
            for (i = 0; i < 4096 * 4; i++) {
                buf[i * 4096 / 4]++;
            }
            printf(".");
        }
    }