---
pageid: bser
title: BSER Binary Protocol
layout: docs
section: Internals
permalink: docs/bser.html
---

The basic JSON protocol in watchman allows quick and easy integration.
Applications with higher performance requirements may want to consider the
binary protocol instead.

The binary protocol is enabled by the client sending the byte sequence
"\x00\x01".

## PDU

A PDU is prefixed by its length expressed as an encoded integer. This allows
the peer to determine how much storage is required to read and decode it.

## Arrays

Arrays are indicated by a `0x00` byte value followed by an integer value to
indicate how many items follow. Then each item is encoded one after the other.

## Objects

Objects are indicated by a `0x01` byte value followed by an integer value to
indicate the number of properties in the object. Then each key/value pair is
encoded one after the other.

## Strings

Strings are indicated by a `0x02` byte value followed by an integer value to
indicate the number of bytes in the string, followed by the bytes of the
string.

### Encoding

Unlike JSON, strings are not defined as having any particular encoding; they
are transmitted as binary strings. This is because the underlying filesystem
APIs don't define any particular encoding for names.

*Exception:* Keys in objects that are defined by watchman commands are always
ASCII. In general, keys in objects are always UTF-8.

*Rationale:* Several programming languages like Python 3 expect all text to be
in a particular encoding and make it inconvenient to pass in bytestrings or
other encodings. Also, the primary purpose of not defining an encoding is that
filenames don't always have one, and filenames are unlikely to show up as keys.

## Integers

All integers are signed and transmitted in the host byte order of the system
running the watchman daemon.

 * `0x03` indicates an int8_t. It is followed by the int8_t value.
 * `0x04` indicates an int16_t. It is followed by the int16_t value.
 * `0x05` indicates an int32_t. It is followed by the int32_t value.
 * `0x06` indicates an int64_t. It is followed by the int64_t value.

## Real

A real number is indicated by a `0x07` byte followed by an 8-byte double value.

## Boolean

 * `0x08` indicates boolean true
 * `0x09` indicates boolean false

## Null

`0x0a` indicates the null value.

## Array of Templated Objects

`0x0b` indicates that a compact array of objects follows. Some of the bigger
data structures returned by watchman are tabular data expressed as an array
of objects. This serialization type factors out the repeated object keys
into a header array listing the keys, followed by an array containing
all the values of the objects.

To represent missing keys in templated arrays, the `0x0c` encoding value may
be present. If encountered, it is interpreted as meaning that there is no value
for the key that would have been decoded in this position. This is distinct
from the null value.

For example:

```
[
  {"name": "fred", "age": 20},
  {"name": "pete", "age": 30},
  {"age": 25 }
]
```

is represented similarly to:

```
["name", "age"],
[
   "fred", 20,
   "pete", 30,
   0x0c, 25
]
```

The precise sequence is:

```
0b          template
00          array -- start prop names
0302        int, 2 -- two prop names
02          string -- first prop "name"
0304        int, 4
6e616d65    "name"
02          string -- 2nd prop "age"
0303        int, 3
616765      "age"
0303        int, 3 -- there are 3 objects
02          string -- object 1, prop 1 name=fred
0304        int, 4
66726564    "fred"
0314        int, 0x14 -- object 1, prop 2 age=20
02          string -- object 2, prop 1 name=pete
0304        int, 4
70657465    "pete"
031e        int, 0x1e -- object 2, prop 2 age=30
0c          skip -- object 3, prop 1, not set
0319        int, 0x19 -- object 3, prop 2 age=25
```
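The type tags described on this page can be exercised with a small decoder. The following Python sketch is illustrative only and is not the watchman client implementation: the names `decode` and `_MISSING` are hypothetical, little-endian byte order is an assumption (the protocol uses the daemon host's byte order), and PDU framing is left out. It decodes the byte sequence from the templated example above.

```python
import struct

# Type tags as listed in the sections of this page.
ARRAY, OBJECT, STRING = 0x00, 0x01, 0x02
INT_FORMATS = {0x03: "b", 0x04: "h", 0x05: "i", 0x06: "q"}  # signed ints
REAL, TRUE, FALSE, NULL, TEMPLATE, SKIP = 0x07, 0x08, 0x09, 0x0A, 0x0B, 0x0C

_MISSING = object()  # hypothetical sentinel for the 0x0c "no value" marker


def decode(buf, pos=0, order="<"):
    """Decode one value starting at buf[pos]; return (value, next_pos).

    `order` is a struct byte-order prefix. "<" (little-endian) is an
    assumption made for this sketch.
    """
    tag = buf[pos]
    pos += 1
    if tag in INT_FORMATS:
        fmt = order + INT_FORMATS[tag]
        (value,) = struct.unpack_from(fmt, buf, pos)
        return value, pos + struct.calcsize(fmt)
    if tag == REAL:
        (value,) = struct.unpack_from(order + "d", buf, pos)
        return value, pos + 8
    if tag == STRING:
        length, pos = decode(buf, pos, order)
        # Strings carry no defined encoding, so they stay as bytes.
        return bytes(buf[pos:pos + length]), pos + length
    if tag == ARRAY:
        count, pos = decode(buf, pos, order)
        items = []
        for _ in range(count):
            item, pos = decode(buf, pos, order)
            items.append(item)
        return items, pos
    if tag == OBJECT:
        count, pos = decode(buf, pos, order)
        obj = {}
        for _ in range(count):
            key, pos = decode(buf, pos, order)
            value, pos = decode(buf, pos, order)
            obj[key.decode("utf-8")] = value  # object keys are UTF-8
        return obj, pos
    if tag == TEMPLATE:
        keys, pos = decode(buf, pos, order)   # header array of key names
        count, pos = decode(buf, pos, order)  # number of objects that follow
        rows = []
        for _ in range(count):
            row = {}
            for key in keys:
                value, pos = decode(buf, pos, order)
                if value is not _MISSING:     # 0x0c: leave the key out
                    row[key.decode("utf-8")] = value
            rows.append(row)
        return rows, pos
    if tag == SKIP:
        return _MISSING, pos
    if tag == TRUE:
        return True, pos
    if tag == FALSE:
        return False, pos
    if tag == NULL:
        return None, pos
    raise ValueError(f"unknown type tag 0x{tag:02x}")


# The "precise sequence" from the templated example, as one hex string.
example = bytes.fromhex(
    "0b" "00" "0302"
    "02" "0304" "6e616d65"
    "02" "0303" "616765"
    "0303"
    "02" "0304" "66726564" "0314"
    "02" "0304" "70657465" "031e"
    "0c" "0319"
)
rows, end = decode(example)
print(rows)
# [{'name': b'fred', 'age': 20}, {'name': b'pete', 'age': 30}, {'age': 25}]
```

String values come back as `bytes` because the protocol does not define an encoding for them; only object keys are decoded as UTF-8. The third object has no `name` key at all, which is how this sketch keeps the `0x0c` marker distinct from a null value.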