---
layout: page
title: C++ Serialization
---

# C++ Serialization

The Cap'n Proto C++ runtime implementation provides an easy-to-use interface for manipulating
messages backed by fast pointer arithmetic.  This page discusses the serialization layer of
the runtime; see [C++ RPC](cxxrpc.html) for information about the RPC layer.

## Example Usage

For the Cap'n Proto definition:

{% highlight capnp %}
struct Person {
  id @0 :UInt32;
  name @1 :Text;
  email @2 :Text;
  phones @3 :List(PhoneNumber);

  struct PhoneNumber {
    number @0 :Text;
    type @1 :Type;

    enum Type {
      mobile @0;
      home @1;
      work @2;
    }
  }

  employment :union {
    unemployed @4 :Void;
    employer @5 :Text;
    school @6 :Text;
    selfEmployed @7 :Void;
    # We assume that a person is only one of these.
  }
}

struct AddressBook {
  people @0 :List(Person);
}
{% endhighlight %}

You might write code like:

{% highlight c++ %}
#include "addressbook.capnp.h"
#include <capnp/message.h>
#include <capnp/serialize-packed.h>
#include <iostream>

void writeAddressBook(int fd) {
  ::capnp::MallocMessageBuilder message;

  AddressBook::Builder addressBook = message.initRoot<AddressBook>();
  ::capnp::List<Person>::Builder people = addressBook.initPeople(2);

  Person::Builder alice = people[0];
  alice.setId(123);
  alice.setName("Alice");
  alice.setEmail("alice@example.com");
  // Type shown for explanation purposes; normally you'd use auto.
  ::capnp::List<Person::PhoneNumber>::Builder alicePhones =
      alice.initPhones(1);
  alicePhones[0].setNumber("555-1212");
  alicePhones[0].setType(Person::PhoneNumber::Type::MOBILE);
  alice.getEmployment().setSchool("MIT");

  Person::Builder bob = people[1];
  bob.setId(456);
  bob.setName("Bob");
  bob.setEmail("bob@example.com");
  auto bobPhones = bob.initPhones(2);
  bobPhones[0].setNumber("555-4567");
  bobPhones[0].setType(Person::PhoneNumber::Type::HOME);
  bobPhones[1].setNumber("555-7654");
  bobPhones[1].setType(Person::PhoneNumber::Type::WORK);
  bob.getEmployment().setUnemployed();

  writePackedMessageToFd(fd, message);
}

void printAddressBook(int fd) {
  ::capnp::PackedFdMessageReader message(fd);

  AddressBook::Reader addressBook = message.getRoot<AddressBook>();

  for (Person::Reader person : addressBook.getPeople()) {
    std::cout << person.getName().cStr() << ": "
              << person.getEmail().cStr() << std::endl;
    for (Person::PhoneNumber::Reader phone: person.getPhones()) {
      const char* typeName = "UNKNOWN";
      switch (phone.getType()) {
        case Person::PhoneNumber::Type::MOBILE: typeName = "mobile"; break;
        case Person::PhoneNumber::Type::HOME: typeName = "home"; break;
        case Person::PhoneNumber::Type::WORK: typeName = "work"; break;
      }
      std::cout << "  " << typeName << " phone: "
                << phone.getNumber().cStr() << std::endl;
    }
    Person::Employment::Reader employment = person.getEmployment();
    switch (employment.which()) {
      case Person::Employment::UNEMPLOYED:
        std::cout << "  unemployed" << std::endl;
        break;
      case Person::Employment::EMPLOYER:
        std::cout << "  employer: "
                  << employment.getEmployer().cStr() << std::endl;
        break;
      case Person::Employment::SCHOOL:
        std::cout << "  student at: "
                  << employment.getSchool().cStr() << std::endl;
        break;
      case Person::Employment::SELF_EMPLOYED:
        std::cout << "  self-employed" << std::endl;
        break;
    }
  }
}
{% endhighlight %}

## C++ Feature Usage:  C++11, Exceptions

This implementation makes use of C++11 features.  If you are using GCC, you will need at least
version 4.7 to compile Cap'n Proto.  If you are using Clang, you will need at least version 3.2.
These compilers require the flag `-std=c++11` to enable C++11 features -- any code of yours that
`#include`s Cap'n Proto headers will need to be compiled with this flag.  Other compilers have not
been tested at this time.

This implementation prefers to handle errors using exceptions.  Exceptions are only used in
circumstances that should never occur in normal operation.  For example, exceptions are thrown
on assertion failures (indicating bugs in the code), network failures, and invalid input.
Exceptions thrown by Cap'n Proto are never part of the interface and never need to be caught in
correct usage.  The purpose of throwing exceptions is to allow higher-level code a chance to
recover from unexpected circumstances without disrupting other work happening in the same process.
For example, a server that handles requests from multiple clients should, on exception, return an
error to the client that caused the exception and close that connection, but should continue
handling other connections normally.

When Cap'n Proto code might throw an exception from a destructor, it first checks
`std::uncaught_exception()` to ensure that this is safe.  If another exception is already active,
the new exception is assumed to be a side-effect of the main exception, and is either silently
swallowed or reported on a side channel.

In recognition of the fact that some teams prefer not to use exceptions, and that even enabling
exceptions in the compiler introduces overhead, Cap'n Proto allows you to disable them entirely
by registering your own exception callback.  The callback will be called in place of throwing an
exception.  The callback may abort the process, and is required to do so in certain circumstances
(when a fatal bug is detected).  If the callback returns normally, Cap'n Proto will attempt
to continue by inventing "safe" values.  This will lead to garbage output, but at least the program
will not crash.  Your exception callback should set some sort of a flag indicating that an error
occurred, and somewhere up the stack you should check for that flag and cancel the operation.
See the header `kj/exception.h` for details on how to register an exception callback.
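
The flag-and-check pattern described above can be sketched without any Cap'n Proto code.  What
follows is a minimal model of the idea, not the KJ API: `ErrorFlag`, `readAt`, and `sumFirstThree`
are hypothetical names invented for this illustration, and registering a real callback is still
done as documented in `kj/exception.h`.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for the state an exception callback would update.
struct ErrorFlag {
  bool failed = false;
  std::string firstMessage;

  // Records only the first error; later ones are often side effects of it.
  void report(const std::string& message) {
    if (!failed) {
      failed = true;
      firstMessage = message;
    }
  }
};

// Hypothetical low-level read: on bad input it reports the error and
// returns a "safe" value (0) instead of throwing.
int readAt(const std::vector<int>& data, size_t index, ErrorFlag& errors) {
  if (index >= data.size()) {
    errors.report("index out of range");
    return 0;
  }
  return data[index];
}

// Higher-level operation: runs to completion, then checks the flag and
// discards the (garbage) result if anything went wrong.
bool sumFirstThree(const std::vector<int>& data, ErrorFlag& errors, int& out) {
  int sum = readAt(data, 0, errors) + readAt(data, 1, errors) +
            readAt(data, 2, errors);
  if (errors.failed) return false;
  out = sum;
  return true;
}

// Usage example and self-check.
void checkErrorFlag() {
  ErrorFlag ok;
  int total = 0;
  assert(sumFirstThree({1, 2, 3}, ok, total) && total == 6);

  ErrorFlag bad;
  int untouched = -1;
  assert(!sumFirstThree({1, 2}, bad, untouched));
  assert(bad.failed && untouched == -1);
}
```

The point of the design is that the low-level code never decides what to do about an error; it
only records it, and the caller that owns the whole operation decides whether to cancel.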

## KJ Library

Cap'n Proto is built on top of a basic utility library called KJ.  The two were actually developed
together -- KJ is simply the stuff which is not specific to Cap'n Proto serialization, and may be
useful to others independently of Cap'n Proto.  For now, the two are distributed together.  The
name "KJ" has no particular meaning; it was chosen to be short and easy-to-type.

As of v0.3, KJ is distributed with Cap'n Proto but built as a separate library.  You may need
to explicitly link against both libraries:  `-lcapnp -lkj`

## Generating Code

To generate C++ code from your `.capnp` [interface definition](language.html), run:

    capnp compile -oc++ myproto.capnp

This will create `myproto.capnp.h` and `myproto.capnp.c++` in the same directory as `myproto.capnp`.

To use this code in your app, you must link against both `libcapnp` and `libkj`.  If you use
`pkg-config`, Cap'n Proto provides the `capnp` module to simplify discovery of compiler and linker
flags.

If you use [RPC](cxxrpc.html) (i.e., your schema defines [interfaces](language.html#interfaces)),
then you will additionally need to link against `libcapnp-rpc` and `libkj-async`, or use the
`capnp-rpc` `pkg-config` module.

### Setting a Namespace

You probably want your generated types to live in a C++ namespace.  You will need to import
`/capnp/c++.capnp` and use the `namespace` annotation it defines:

{% highlight capnp %}
using Cxx = import "/capnp/c++.capnp";
$Cxx.namespace("foo::bar::baz");
{% endhighlight %}

Note that `capnp/c++.capnp` is installed in `$PREFIX/include` (`/usr/local/include` by default)
when you install the C++ runtime.  The `capnp` tool automatically searches `/usr/include` and
`/usr/local/include` for imports that start with a `/`, so it should "just work".  If you installed
somewhere else, you may need to add it to the search path with the `-I` flag to `capnp compile`,
which works much like the compiler flag of the same name.

## Types

### Primitive Types

Primitive types map to the obvious C++ types:

* `Bool` -> `bool`
* `IntNN` -> `intNN_t`
* `UIntNN` -> `uintNN_t`
* `Float32` -> `float`
* `Float64` -> `double`
* `Void` -> `::capnp::Void` (An empty struct; its only value is `::capnp::VOID`)

### Structs

For each struct `Foo` in your interface, a C++ type named `Foo` is generated.  This type itself is
really just a namespace; it contains two important inner classes:  `Reader` and `Builder`.

`Reader` represents a read-only instance of `Foo` while `Builder` represents a writable instance
(usually, one that you are building).  Both classes behave like pointers, in that you can pass them
by value and they do not own the underlying data that they operate on.  In other words,
`Foo::Builder` is like a pointer to a `Foo` while `Foo::Reader` is like a const pointer to a `Foo`.

For every field `bar` defined in `Foo`, `Foo::Reader` has a method `getBar()`.  For primitive types,
`get` just returns the type, but for structs, lists, and blobs, it returns a `Reader` for the
type.

{% highlight c++ %}
// Example Reader methods:

// myPrimitiveField @0 :Int32;
int32_t getMyPrimitiveField();

// myTextField @1 :Text;
::capnp::Text::Reader getMyTextField();
// (Note that Text::Reader may be implicitly cast to const char* and
// std::string.)

// myStructField @2 :MyStruct;
MyStruct::Reader getMyStructField();

// myListField @3 :List(Float64);
::capnp::List<double>::Reader getMyListField();
{% endhighlight %}

`Foo::Builder`, meanwhile, has several methods for each field `bar`:

* `getBar()`:  For primitives, returns the value.  For composites, returns a Builder for the
  composite.  If a composite field has not been initialized (i.e. this is the first time it has
  been accessed), it will be initialized to a copy of the field's default value before returning.
* `setBar(x)`:  For primitives, sets the value to x.  For composites, sets the value to a deep copy
  of x, which must be a Reader for the type.
* `initBar(n)`:  Only for lists and blobs.  Sets the field to a newly-allocated list or blob
  of size n and returns a Builder for it.  The elements of the list are initialized to their empty
  state (zero for numbers, default values for structs).
* `initBar()`:  Only for structs.  Sets the field to a newly-allocated struct and returns a
  Builder for it.  Note that the newly-allocated struct is initialized to the default value for
  the struct's _type_ (i.e., all-zero) rather than the default value for the field `bar` (if it
  has one).
* `hasBar()`:  Only for pointer fields (e.g. structs, lists, blobs).  Returns true if the pointer
  has been initialized (non-null).  (This method is also available on readers.)
* `adoptBar(x)`:  Only for pointer fields.  Adopts the orphaned object x, linking it into the field
  `bar` without copying.  See the section on orphans.
* `disownBar()`:  Disowns the value pointed to by `bar`, setting the pointer to null and returning
  its previous value as an orphan.  See the section on orphans.

{% highlight c++ %}
// Example Builder methods:

// myPrimitiveField @0 :Int32;
int32_t getMyPrimitiveField();
void setMyPrimitiveField(int32_t value);

// myTextField @1 :Text;
::capnp::Text::Builder getMyTextField();
void setMyTextField(::capnp::Text::Reader value);
::capnp::Text::Builder initMyTextField(size_t size);
// (Note that Text::Reader is implicitly constructible from const char*
// and std::string, and Text::Builder can be implicitly cast to
// these types.)

// myStructField @2 :MyStruct;
MyStruct::Builder getMyStructField();
void setMyStructField(MyStruct::Reader value);
MyStruct::Builder initMyStructField();

// myListField @3 :List(Float64);
::capnp::List<double>::Builder getMyListField();
void setMyListField(::capnp::List<double>::Reader value);
::capnp::List<double>::Builder initMyListField(size_t size);
{% endhighlight %}
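
The statement that `Reader` and `Builder` behave like pointers can be made concrete with a small
hand-written model.  This is only a sketch: `FooStorage` and `FooModel` are hypothetical names, and
real generated classes wrap message memory rather than a plain struct.  The model shows that
copying a builder copies the view, not the data it points at.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical backing data; in real code this lives inside a message.
struct FooStorage {
  int32_t bar = 0;
};

struct FooModel {
  // Read-only view, comparable to a const pointer.
  class Reader {
  public:
    explicit Reader(const FooStorage* target) : target_(target) {}
    int32_t getBar() const { return target_->bar; }
  private:
    const FooStorage* target_;
  };

  // Writable view, comparable to a non-const pointer.
  class Builder {
  public:
    explicit Builder(FooStorage* target) : target_(target) {}
    int32_t getBar() const { return target_->bar; }
    void setBar(int32_t value) { target_->bar = value; }
    Reader asReader() const { return Reader(target_); }
  private:
    FooStorage* target_;
  };
};

// Usage example and self-check.
void checkViews() {
  FooStorage storage;             // the owner of the data
  FooModel::Builder a(&storage);
  FooModel::Builder b = a;        // copies the view, not the data
  b.setBar(42);
  assert(a.getBar() == 42);       // both views see the same storage
  assert(a.asReader().getBar() == 42);
}
```

As with raw pointers, a view in this model must not outlive the storage it refers to.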

### Groups

Groups look a lot like a combination of a nested type and a field of that type, except that you
cannot set, adopt, or disown a group -- you can only get and init it.

### Unions

A named union (as opposed to an unnamed one) works just like a group, except with some additions:

* For each field `foo`, the union reader and builder have a method `isFoo()` which returns true
  if `foo` is the currently-set field in the union.
* The union reader and builder also have a method `which()` that returns an enum value indicating
  which field is currently set.
* Calling the set, init, or adopt accessors for a field makes it the currently-set field.
* Calling the get or disown accessors on a field that isn't currently set will throw an
  exception in debug mode or return garbage when `NDEBUG` is defined.

Unnamed unions differ from named unions only in that the accessor methods from the union's members
are added directly to the containing type's reader and builder, rather than generating a nested
type.

See the [example](#example-usage) at the top of the page for an example of unions.
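
To make the accessor rules listed above easier to picture, here is a hand-written model of the
`employment` union from the example schema.  `EmploymentModel` is a hypothetical class, not
generated code, and it simplifies one point: its getters always throw on a mismatched field,
whereas the real accessors behave as described in the list above.

```cpp
#include <cassert>
#include <cstdint>
#include <stdexcept>
#include <string>
#include <utility>

// Hypothetical model of a named union with two Void and two Text members.
class EmploymentModel {
public:
  enum Which : uint16_t { UNEMPLOYED, EMPLOYER, SCHOOL, SELF_EMPLOYED };

  Which which() const { return which_; }
  bool isUnemployed() const { return which_ == UNEMPLOYED; }
  bool isEmployer() const { return which_ == EMPLOYER; }
  bool isSchool() const { return which_ == SCHOOL; }
  bool isSelfEmployed() const { return which_ == SELF_EMPLOYED; }

  // Each setter makes its field the currently-set one.
  void setUnemployed() { which_ = UNEMPLOYED; text_.clear(); }
  void setSelfEmployed() { which_ = SELF_EMPLOYED; text_.clear(); }
  void setEmployer(std::string name) {
    which_ = EMPLOYER;
    text_ = std::move(name);
  }
  void setSchool(std::string name) {
    which_ = SCHOOL;
    text_ = std::move(name);
  }

  // Getters reject access to a field that is not currently set.
  const std::string& getEmployer() const { require(EMPLOYER); return text_; }
  const std::string& getSchool() const { require(SCHOOL); return text_; }

private:
  void require(Which expected) const {
    if (which_ != expected) throw std::logic_error("union field not set");
  }

  Which which_ = UNEMPLOYED;
  std::string text_;
};

// Usage example and self-check.
void checkEmployment() {
  EmploymentModel employment;
  employment.setSchool("MIT");
  assert(employment.isSchool());
  assert(employment.which() == EmploymentModel::SCHOOL);
  assert(employment.getSchool() == "MIT");

  bool threw = false;
  try {
    employment.getEmployer();  // wrong field: SCHOOL is currently set
  } catch (const std::logic_error&) {
    threw = true;
  }
  assert(threw);
}
```

Checking `which()` or `isFoo()` before calling a getter, as the example at the top of the page
does with a `switch`, avoids the mismatched-field case entirely.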

### Lists

Lists are represented by the type `capnp::List<T>`, where `T` is any of the primitive types,
any Cap'n Proto user-defined type, `capnp::Text`, `capnp::Data`, or `capnp::List<U>`
(to form a list of lists).

The type `List<T>` itself is not instantiatable, but has two inner classes: `Reader` and `Builder`.
As with structs, these types behave like pointers to read-only and read-write data, respectively.

Both `Reader` and `Builder` implement `size()`, `operator[]`, `begin()`, and `end()`, as good C++
containers should.  Note, though, that `operator[]` is read-only -- you cannot use it to assign
the element, because that would require returning a reference, which is impossible because the
underlying data may not be in your CPU's native format (e.g., wrong byte order).  Instead, to
assign an element of a list, you must use `builder.set(index, value)`.

For `List<Foo>` where `Foo` is a non-primitive type, the type returned by `operator[]` and
`iterator::operator*()` is `Foo::Reader` (for `List<Foo>::Reader`) or `Foo::Builder`
(for `List<Foo>::Builder`).  The builder's `set` method takes a `Foo::Reader` as its second
parameter.

For lists of lists or lists of blobs, the builder also has a method `init(index, size)` which sets
the element at the given index to a newly-allocated value with the given size and returns a builder
for it.  Struct lists do not have an `init` method because all elements are initialized to empty
values when the list is created.
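
The reason `operator[]` on a primitive list returns a value rather than a reference, and why
assignment goes through `set(index, value)`, can be illustrated with a small model.  This is a
sketch under stated assumptions: `U32ListModel` is a hypothetical class, not `capnp::List`, and it
stores each element as little-endian bytes, which is the byte order of the Cap'n Proto encoding.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical model of a list of 32-bit values kept in wire byte order.
class U32ListModel {
public:
  explicit U32ListModel(size_t count) : bytes_(count * 4, 0) {}

  size_t size() const { return bytes_.size() / 4; }

  // Returns a decoded copy.  The buffer holds no native uint32_t object
  // that a reference could safely point at.
  uint32_t operator[](size_t index) const {
    assert(index < size());
    const uint8_t* p = &bytes_[index * 4];
    return static_cast<uint32_t>(p[0]) |
           (static_cast<uint32_t>(p[1]) << 8) |
           (static_cast<uint32_t>(p[2]) << 16) |
           (static_cast<uint32_t>(p[3]) << 24);
  }

  // Encodes the value into little-endian bytes, whatever the host order is.
  void set(size_t index, uint32_t value) {
    assert(index < size());
    uint8_t* p = &bytes_[index * 4];
    p[0] = static_cast<uint8_t>(value);
    p[1] = static_cast<uint8_t>(value >> 8);
    p[2] = static_cast<uint8_t>(value >> 16);
    p[3] = static_cast<uint8_t>(value >> 24);
  }

  const std::vector<uint8_t>& rawBytes() const { return bytes_; }

private:
  std::vector<uint8_t> bytes_;
};

// Usage example and self-check.
void checkList() {
  U32ListModel list(2);
  list.set(1, 0x01020304u);
  assert(list[0] == 0u);
  assert(list[1] == 0x01020304u);
  assert(list.rawBytes()[4] == 0x04);  // least significant byte comes first
}
```

Because reads and writes both pass through explicit decode and encode steps, the same code gives
the same bytes on little-endian and big-endian hosts.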

### Enums

Cap'n Proto enums become C++11 "enum classes".  That means they behave like any other enum, but
the enum's values are scoped within the type.  E.g. for an enum `Foo` with value `bar`, you must
refer to the value as `Foo::BAR`.

To match prevailing C++ style, an enum's value names are converted to UPPERCASE_WITH_UNDERSCORES
(whereas in the schema language you'd write them in camelCase).

Keep in mind when writing `switch` blocks that an enum read off the wire may have a numeric
value that is not listed in its definition.  This may be the case if the sender is using a newer
version of the protocol, or if the message is corrupt or malicious.  In C++11, enums are allowed
to have any value that is within the range of their base type, which for Cap'n Proto enums is
`uint16_t`.
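
A `switch` that tolerates unlisted values might look like the sketch below.  `PhoneType` here is a
hand-written stand-in declared with a `uint16_t` base type so that the example compiles on its
own; it is not the generated `Person::PhoneNumber::Type`, and `phoneTypeName` is a hypothetical
helper.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Stand-in for a generated enum, written by hand for this example.
enum class PhoneType : uint16_t { MOBILE = 0, HOME = 1, WORK = 2 };

// Returns a printable name, falling back to the raw number for values
// that are not listed in the enum definition.
std::string phoneTypeName(PhoneType type) {
  switch (type) {
    case PhoneType::MOBILE: return "mobile";
    case PhoneType::HOME:   return "home";
    case PhoneType::WORK:   return "work";
  }
  // Reached only for unlisted values, e.g. from a newer protocol version
  // or a corrupt or malicious message.
  return "unknown(" + std::to_string(static_cast<uint16_t>(type)) + ")";
}

// Usage example and self-check.
void checkPhoneTypeNames() {
  assert(phoneTypeName(PhoneType::HOME) == "home");
  assert(phoneTypeName(static_cast<PhoneType>(7)) == "unknown(7)");
}
```

Leaving out a `default:` label and handling the fallback after the `switch` keeps compiler
warnings about unhandled enumerators useful when the schema gains new values.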

### Blobs (Text and Data)

Blobs are manipulated using the classes `capnp::Text` and `capnp::Data`.  These classes are,
again, just containers for inner classes `Reader` and `Builder`.  These classes are iterable and
implement `size()` and `operator[]` methods.  `Builder::operator[]` even returns a reference
(unlike with `List<T>`).  `Text::Reader` additionally has a method `cStr()` which returns a
NUL-terminated `const char*`.

As a special convenience, if you are using GCC 4.8+ or Clang, `Text::Reader` (and its underlying
type, `kj::StringPtr`) can be implicitly converted to and from `std::string` format.  This is
accomplished without actually `#include`ing `<string>`, since some clients do not want to rely
on this rather-bulky header.  In fact, any class which defines a `.c_str()` method will be
implicitly convertible in this way.  Unfortunately, this trick doesn't work on GCC 4.7.

### Interfaces

[Interfaces (RPC) have their own page.](cxxrpc.html)

### Generics

[Generic types](language.html#generic-types) become templates in C++. The outer type (the one whose
name matches the schema declaration's name) is templatized; the inner `Reader` and `Builder` types
are not, because they inherit the parameters from the outer type. Similarly, template parameters
should refer to outer types, not `Reader` or `Builder` types.

For example, given:

{% highlight capnp %}
struct Map(Key, Value) {
  entries @0 :List(Entry);
  struct Entry {
    key @0 :Key;
    value @1 :Value;
  }
}

struct People {
  byName @0 :Map(Text, Person);
  # Maps names to Person instances.
}
{% endhighlight %}

You might write code like:

{% highlight c++ %}
void processPeople(People::Reader people) {
  Map<Text, Person>::Reader reader = people.getByName();
  capnp::List<Map<Text, Person>::Entry>::Reader entries =
      reader.getEntries();
  for (auto entry: entries) {
    processPerson(entry);
  }
}
{% endhighlight %}

Note that all template parameters will be specified with a default value of `AnyPointer`.
Therefore, the type `Map<>` is equivalent to `Map<capnp::AnyPointer, capnp::AnyPointer>`.

### Constants

Constants are exposed with their names converted to UPPERCASE_WITH_UNDERSCORES naming style
(whereas in the schema language you’d write them in camelCase).  Primitive constants are just
`constexpr` values.  Pointer-type constants (e.g. structs, lists, and blobs) are represented
using a proxy object that can be converted to the relevant `Reader` type, either implicitly or
using the unary `*` or `->` operators.

## Messages and I/O

To create a new message, you must start by creating a `capnp::MessageBuilder`
(`capnp/message.h`).  This is an abstract type which you can implement yourself, but most users
will want to use `capnp::MallocMessageBuilder`.  Once your message is constructed, write it to
a file descriptor with `capnp::writeMessageToFd(fd, builder)` (`capnp/serialize.h`) or
`capnp::writePackedMessageToFd(fd, builder)` (`capnp/serialize-packed.h`).

To read a message, you must create a `capnp::MessageReader`, which is another abstract type.
Implementations are specific to the data source.  You can use `capnp::StreamFdMessageReader`
(`capnp/serialize.h`) or `capnp::PackedFdMessageReader` (`capnp/serialize-packed.h`)
to read from file descriptors; both take the file descriptor as a constructor argument.

Note that if your stream contains additional data after the message, `PackedFdMessageReader` may
accidentally read some of that data, since it does buffered I/O.  To make this work correctly, you
will need to set up a multi-use buffered stream.  Buffered I/O may also be a good idea with
`StreamFdMessageReader` and also when writing, for performance reasons.  See `capnp/io.h` for
details.

There is an [example](#example-usage) of all this at the beginning of this page.

### Using mmap

Cap'n Proto can be used together with `mmap()` (or Win32's `MapViewOfFile()`) for extremely fast
reads, especially when you only need to use a subset of the data in the file.  Currently,
Cap'n Proto is not well-suited for _writing_ via `mmap()`, only reading, but this is only because
we have not yet invented a mutable segment framing format -- the underlying design should
eventually work for both.

To take advantage of `mmap()` at read time, write your file in regular serialized (but NOT packed)
format -- that is, use `writeMessageToFd()`, _not_ `writePackedMessageToFd()`.  Now, `mmap()` in
the entire file, and then pass the mapped memory to the constructor of
`capnp::FlatArrayMessageReader` (defined in `capnp/serialize.h`).  That's it.  You can use the
reader just like a normal `StreamFdMessageReader`.  The operating system will automatically page
in data from disk as you read it.

`mmap()` works best when reading from flash media, or when the file is already hot in cache.
It works less well with slow rotating disks.  Here, disk seeks make random access relatively
expensive.  Also, if I/O throughput is your bottleneck, then the fact that mmaped data cannot
be packed or compressed may hurt you.  However, it all depends on what fraction of the file you're
actually reading -- if you only pull one field out of one deeply-nested struct in a huge tree, it
may still be a win.  The only way to know for sure is to do benchmarks!  (But be careful to make
sure your benchmark is actually interacting with disk and not cache.)
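
The mapping step described in this section might be sketched as follows on a POSIX system.  This
covers only the `mmap()` half and uses no Cap'n Proto headers: `MappedWords`, `mapFileAsWords`,
`unmapWords`, and `checkMapping` are hypothetical helpers, and the sketch assumes the file holds
unpacked data whose size is a whole number of 8-byte words.  Handing the mapped memory to
`capnp::FlatArrayMessageReader` is left out because it depends on the library.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <fcntl.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

// Hypothetical result type: a read-only mapping viewed as 8-byte words.
struct MappedWords {
  const uint64_t* words = nullptr;  // start of the mapping
  size_t wordCount = 0;             // length in 8-byte words
};

// Maps the whole file read-only.  Returns false if the file cannot be
// opened, is empty, or is not a whole number of words long.
bool mapFileAsWords(const char* path, MappedWords& out) {
  int fd = open(path, O_RDONLY);
  if (fd < 0) return false;

  struct stat info;
  bool usable = fstat(fd, &info) == 0 && info.st_size > 0 &&
                info.st_size % 8 == 0;
  void* base = MAP_FAILED;
  if (usable) {
    base = mmap(nullptr, static_cast<size_t>(info.st_size), PROT_READ,
                MAP_PRIVATE, fd, 0);
  }
  close(fd);  // the mapping stays valid after the descriptor is closed
  if (base == MAP_FAILED) return false;

  out.words = static_cast<const uint64_t*>(base);
  out.wordCount = static_cast<size_t>(info.st_size) / 8;
  return true;
}

// Releases a mapping created by mapFileAsWords().
void unmapWords(MappedWords& mapping) {
  if (mapping.words != nullptr) {
    munmap(const_cast<uint64_t*>(mapping.words), mapping.wordCount * 8);
    mapping = MappedWords();
  }
}

// Usage example and self-check: writes two words to a scratch file,
// maps it, and reads the words back through the mapping.
void checkMapping(const char* scratchPath) {
  const uint64_t sample[2] = {1, 2};
  int fd = open(scratchPath, O_WRONLY | O_CREAT | O_TRUNC, 0600);
  assert(fd >= 0);
  ssize_t written = write(fd, sample, sizeof(sample));
  assert(written == static_cast<ssize_t>(sizeof(sample)));
  close(fd);

  MappedWords mapping;
  bool mapped = mapFileAsWords(scratchPath, mapping);
  assert(mapped && mapping.wordCount == 2 && mapping.words[1] == 2);
  unmapWords(mapping);
  unlink(scratchPath);
}
```

The mapping must stay alive for as long as any reader built on top of it is in use, so the unmap
call belongs after the last access to the message.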

## Dynamic Reflection

Sometimes you want to write generic code that operates on arbitrary types, iterating over the
fields or looking them up by name.  For example, you might want to write code that encodes
arbitrary Cap'n Proto types in JSON format.  This requires something like "reflection", but C++
does not offer reflection.  Also, you might even want to operate on types that aren't compiled
into the binary at all, but only discovered at runtime.

The C++ API supports inspecting schemas at runtime via the interface defined in
`capnp/schema.h`, and dynamically reading and writing instances of arbitrary types via
`capnp/dynamic.h`.  Here's the example from the beginning of this file rewritten in terms
of the dynamic API:

{% highlight c++ %}
#include "addressbook.capnp.h"
#include <capnp/message.h>
#include <capnp/serialize-packed.h>
#include <iostream>
#include <capnp/schema.h>
#include <capnp/dynamic.h>

using ::capnp::DynamicValue;
using ::capnp::DynamicStruct;
using ::capnp::DynamicEnum;
using ::capnp::DynamicList;
using ::capnp::List;
using ::capnp::Schema;
using ::capnp::StructSchema;
using ::capnp::EnumSchema;

using ::capnp::Void;
using ::capnp::Text;
using ::capnp::MallocMessageBuilder;
using ::capnp::PackedFdMessageReader;

void dynamicWriteAddressBook(int fd, StructSchema schema) {
  // Write a message using the dynamic API to set each
  // field by text name.  This isn't something you'd
  // normally want to do; it's just for illustration.

  MallocMessageBuilder message;

  // Types shown for explanation purposes; normally you'd
  // use auto.
  DynamicStruct::Builder addressBook =
      message.initRoot<DynamicStruct>(schema);

  DynamicList::Builder people =
      addressBook.init("people", 2).as<DynamicList>();

  DynamicStruct::Builder alice =
      people[0].as<DynamicStruct>();
  alice.set("id", 123);
  alice.set("name", "Alice");
  alice.set("email", "alice@example.com");
  auto alicePhones = alice.init("phones", 1).as<DynamicList>();
  auto phone0 = alicePhones[0].as<DynamicStruct>();
  phone0.set("number", "555-1212");
  phone0.set("type", "mobile");
  alice.get("employment").as<DynamicStruct>()
       .set("school", "MIT");

  auto bob = people[1].as<DynamicStruct>();
  bob.set("id", 456);
  bob.set("name", "Bob");
  bob.set("email", "bob@example.com");

  // Some magic:  We can convert a dynamic sub-value back to
  // the native type with as<T>()!
  List<Person::PhoneNumber>::Builder bobPhones =
      bob.init("phones", 2).as<List<Person::PhoneNumber>>();
  bobPhones[0].setNumber("555-4567");
  bobPhones[0].setType(Person::PhoneNumber::Type::HOME);
  bobPhones[1].setNumber("555-7654");
  bobPhones[1].setType(Person::PhoneNumber::Type::WORK);
  bob.get("employment").as<DynamicStruct>()
     .set("unemployed", ::capnp::VOID);

  writePackedMessageToFd(fd, message);
}

void dynamicPrintValue(DynamicValue::Reader value) {
  // Print an arbitrary message via the dynamic API by
  // iterating over the schema.  Look at the handling
  // of STRUCT in particular.

  switch (value.getType()) {
    case DynamicValue::VOID:
      std::cout << "";
      break;
    case DynamicValue::BOOL:
      std::cout << (value.as<bool>() ? "true" : "false");
      break;
    case DynamicValue::INT:
      std::cout << value.as<int64_t>();
      break;
    case DynamicValue::UINT:
      std::cout << value.as<uint64_t>();
      break;
    case DynamicValue::FLOAT:
      std::cout << value.as<double>();
      break;
    case DynamicValue::TEXT:
      std::cout << '\"' << value.as<Text>().cStr() << '\"';
      break;
    case DynamicValue::LIST: {
      std::cout << "[";
      bool first = true;
      for (auto element: value.as<DynamicList>()) {
        if (first) {
          first = false;
        } else {
          std::cout << ", ";
        }
        dynamicPrintValue(element);
      }
      std::cout << "]";
      break;
    }
    case DynamicValue::ENUM: {
      auto enumValue = value.as<DynamicEnum>();
      KJ_IF_MAYBE(enumerant, enumValue.getEnumerant()) {
        std::cout <<
            enumerant->getProto().getName().cStr();
      } else {
        // Unknown enum value; output raw number.
        std::cout << enumValue.getRaw();
      }
      break;
    }
    case DynamicValue::STRUCT: {
      std::cout << "(";
      auto structValue = value.as<DynamicStruct>();
      bool first = true;
      for (auto field: structValue.getSchema().getFields()) {
        if (!structValue.has(field)) continue;
        if (first) {
          first = false;
        } else {
          std::cout << ", ";
        }
        std::cout << field.getProto().getName().cStr()
                  << " = ";
        dynamicPrintValue(structValue.get(field));
      }
      std::cout << ")";
      break;
    }
    default:
      // There are other types, we aren't handling them.
      std::cout << "?";
      break;
  }
}

void dynamicPrintMessage(int fd, StructSchema schema) {
  PackedFdMessageReader message(fd);
  dynamicPrintValue(message.getRoot<DynamicStruct>(schema));
  std::cout << std::endl;
}
{% endhighlight %}
627
628Notes about the dynamic API:
629
630* You can implicitly cast any compiled Cap'n Proto struct reader/builder type directly to
631  `DynamicStruct::Reader`/`DynamicStruct::Builder`.  Similarly with `List<T>` and `DynamicList`,
632  and even enum types and `DynamicEnum`.  Finally, all valid Cap'n Proto field types may be
633  implicitly converted to `DynamicValue`.
634
635* You can load schemas dynamically at runtime using `SchemaLoader` (`capnp/schema-loader.h`) and
636  use the Dynamic API to manipulate objects of these types.  `MessageBuilder` and `MessageReader`
637  have methods for accessing the message root using a dynamic schema.
638
639* While `SchemaLoader` loads binary schemas, you can also parse directly from text using
640  `SchemaParser` (`capnp/schema-parser.h`).  However, this requires linking against `libcapnpc`
641  (in addition to `libcapnp` and `libkj`) -- this code is bulky and not terribly efficient.  If
642  you can arrange to use only binary schemas at runtime, you'll be better off.
643
644* Unlike with Protobufs, there is no "global registry" of compiled-in types.  To get the schema
645  for a compiled-in type, use `capnp::Schema::from<MyType>()`.
646
647* Unlike with Protobufs, the overhead of supporting reflection is small.  Generated `.capnp.c++`
648  files contain only some embedded const data structures describing the schema, no code at all,
649  and the runtime library support code is relatively small.  Moreover, if you do not use the
650  dynamic API or the schema API, you do not even need to link their implementations into your
651  executable.
652
653* The dynamic API performs type checks at runtime.  In case of error, it will throw an exception.
654  If you compile with `-fno-exceptions`, it will crash instead.  Correct usage of the API should
655  never throw, but bugs happen.  Enabling and catching exceptions will make your code more robust.
656
657* Loading user-provided schemas has security implications: it greatly increases the attack
658  surface of the Cap'n Proto library.  In particular, it is easy for an attacker to trigger
659  exceptions.  To protect yourself, you are strongly advised to enable exceptions and catch them.
660
661## Orphans
662
663An "orphan" is a Cap'n Proto object that is disconnected from the message structure.  That is,
664it is not the root of a message, and there is no other Cap'n Proto object holding a pointer to it.
665Thus, it has no parents.  Orphans are an advanced feature that can help avoid copies and make it
666easier to use Cap'n Proto objects as part of your application's internal state.  Typical
667applications probably won't use orphans.
668
669The class `capnp::Orphan<T>` (defined in `<capnp/orphan.h>`) represents a pointer to an orphaned
670object of type `T`.  `T` can be any struct type, `List<T>`, `Text`, or `Data`.  E.g.
671`capnp::Orphan<Person>` would be an orphaned `Person` structure.  `Orphan<T>` is a move-only class,
672similar to `std::unique_ptr<T>`.  This prevents two different objects from adopting the same
673orphan, which would result in an invalid message.
674
675An orphan can be "adopted" by another object to link it into the message structure.  Conversely,
676an object can "disown" one of its pointers, causing the pointed-to object to become an orphan.
677Every pointer-typed field `foo` provides builder methods `adoptFoo()` and `disownFoo()` for these
678purposes.  Again, these methods use C++11 move semantics.  To use them, you will need to be
679familiar with `std::move()` (or the equivalent but shorter-named `kj::mv()`).
680
681Even though an orphan is unlinked from the message tree, it still resides inside memory allocated
682for a particular message (i.e. a particular `MessageBuilder`).  An orphan can only be adopted by
683objects that live in the same message.  To move objects between messages, you must perform a copy.
684If the message is serialized while an `Orphan<T>` living within it still exists, the orphan's
685content will be part of the serialized message, but the only way the receiver could find it is by
686investigating the raw message; the Cap'n Proto API provides no way to detect or read it.
687
688To construct an orphan from scratch (without having some other object disown it), you need an
689`Orphanage`, which is essentially an orphan factory associated with some message.  You can get one
690by calling the `MessageBuilder`'s `getOrphanage()` method, or by calling the static method
691`Orphanage::getForMessageContaining(builder)` and passing it any struct or list builder.
692
Note that when an `Orphan<T>` goes out of scope without being adopted, the underlying memory that
694it occupied is overwritten with zeros.  If you use packed serialization, these zeros will take very
695little bandwidth on the wire, but will still waste memory on the sending and receiving ends.
696Generally, you should avoid allocating message objects that won't be used, or if you cannot avoid
697it, arrange to copy the entire message over to a new `MessageBuilder` before serializing, since
698only the reachable objects will be copied.
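
To see why an abandoned orphan wastes space and why copying helps, here is a toy arena.  It is a
sketch under simplifying assumptions, not Cap'n Proto's real memory layout: `ToyArena`,
`ToyObject`, and `copyReachable` are hypothetical names, and "reachability" is supplied by the
caller as an explicit list rather than discovered by following pointers.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Toy arena, NOT capnp's real layout: objects are appended and never freed.
struct ToyArena {
  std::vector<std::uint8_t> bytes;

  // Appends `size` bytes filled with `fill` and returns their offset.
  std::size_t allocate(std::size_t size, std::uint8_t fill) {
    std::size_t offset = bytes.size();
    bytes.insert(bytes.end(), size, fill);
    return offset;
  }

  // Abandoning an object only zeroes it; the arena does not shrink.
  void abandon(std::size_t offset, std::size_t size) {
    auto first = bytes.begin() + static_cast<std::ptrdiff_t>(offset);
    std::fill(first, first + static_cast<std::ptrdiff_t>(size), 0);
  }
};

// Location of one object inside an arena.
struct ToyObject {
  std::size_t offset;
  std::size_t size;
};

// Copies only the listed (reachable) objects into a fresh arena.
ToyArena copyReachable(const ToyArena& from,
                       const std::vector<ToyObject>& reachable) {
  ToyArena to;
  for (const ToyObject& obj : reachable) {
    auto first = from.bytes.begin() + static_cast<std::ptrdiff_t>(obj.offset);
    to.bytes.insert(to.bytes.end(), first,
                    first + static_cast<std::ptrdiff_t>(obj.size));
  }
  return to;
}
```

Allocating a 16-byte object and a 32-byte object, then abandoning the second, leaves the toy arena
at 48 bytes with 32 of them zeroed.  Copying only the first object into a fresh arena yields 16
bytes.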
699
700## Reference
701
702The runtime library contains lots of useful features not described on this page.  For now, the
703best reference is the header files.  See:
704
705    capnp/list.h
706    capnp/blob.h
707    capnp/message.h
708    capnp/serialize.h
709    capnp/serialize-packed.h
710    capnp/schema.h
711    capnp/schema-loader.h
712    capnp/dynamic.h
713
714## Tips and Best Practices
715
716Here are some tips for using the C++ Cap'n Proto runtime most effectively:
717
718* Accessor methods for primitive (non-pointer) fields are fast and inline.  They should be just
719  as fast as accessing a struct field through a pointer.
720
721* Accessor methods for pointer fields, on the other hand, are not inline, as they need to validate
722  the pointer.  If you intend to access the same pointer multiple times, it is a good idea to
723  save the value to a local variable to avoid repeating this work.  This is generally not a
724  problem given C++11's `auto`.
725
726  Example:
727
728      // BAD
729      frob(foo.getBar().getBaz(),
730           foo.getBar().getQux(),
731           foo.getBar().getCorge());
732
733      // GOOD
734      auto bar = foo.getBar();
735      frob(bar.getBaz(), bar.getQux(), bar.getCorge());
736
737  It is especially important to use this style when reading messages, for another reason:  as
738  described under the "security tips" section, below, every time you `get` a pointer, Cap'n Proto
739  increments a counter by the size of the target object.  If that counter hits a pre-defined limit,
740  an exception is thrown (or a default value is returned, if exceptions are disabled), to prevent
741  a malicious client from sending your server into an infinite loop with a specially-crafted
742  message.  If you repeatedly `get` the same object, you are repeatedly counting the same bytes,
743  and so you may hit the limit prematurely.  (Since Cap'n Proto readers are backed directly by
744  the underlying message buffer and do not have anywhere else to store per-object information, it
745  is impossible to remember whether you've seen a particular object already.)
746
747* Internally, all pointer fields start out "null", even if they have default values.  When you have
748  a pointer field `foo` and you call `getFoo()` on the containing struct's `Reader`, if the field
749  is "null", you will receive a reader for that field's default value.  This reader is backed by
750  read-only memory; nothing is allocated.  However, when you call `get` on a _builder_, and the
751  field is null, then the implementation must make a _copy_ of the default value to return to you.
752  Thus, you've caused the field to become non-null, just by "reading" it.  On the other hand, if
753  you call `init` on that field, you are explicitly replacing whatever value is already there
754  (null or not) with a newly-allocated instance, and that newly-allocated instance is _not_ a
755  copy of the field's default value, but just a completely-uninitialized instance of the
756  appropriate type.
757
758* It is possible to receive a struct value constructed from a newer version of the protocol than
759  the one your binary was built with, and that struct might have extra fields that you don't know
760  about.  The Cap'n Proto implementation tries to avoid discarding this extra data.  If you copy
  the struct from one message to another (e.g. by calling a `set()` method on a parent object), the
762  extra fields will be preserved.  This makes it possible to build proxies that receive messages
763  and forward them on without having to rebuild the proxy every time a new field is added.  You
764  must be careful, however:  in some cases, it's not possible to retain the extra fields, because
765  they need to be copied into a space that is allocated before the expected content is known.
766  In particular, lists of structs are represented as a flat array, not as an array of pointers.
767  Therefore, all memory for all structs in the list must be allocated upfront.  Hence, copying
768  a struct value from another message into an element of a list will truncate the value.  Because
769  of this, the setter method for struct lists is called `setWithCaveats()` rather than just `set()`.
770
771* Messages are built in "arena" or "region" style:  each object is allocated sequentially in
772  memory, until there is no more room in the segment, in which case a new segment is allocated,
773  and objects continue to be allocated sequentially in that segment.  This design is what makes
774  Cap'n Proto possible at all, and it is very fast compared to other allocation strategies.
775  However, it has the disadvantage that if you allocate an object and then discard it, that memory
776  is lost.  In fact, the empty space will still become part of the serialized message, even though
777  it is unreachable.  The implementation will try to zero it out, so at least it should pack well,
778  but it's still better to avoid this situation.  Some ways that this can happen include:
779  * If you `init` a field that is already initialized, the previous value is discarded.
780  * If you create an orphan that is never adopted into the message tree.
781  * If you use `adoptWithCaveats` to adopt an orphaned struct into a struct list, then a shallow
782    copy is necessary, since the struct list requires that its elements are sequential in memory.
783    The previous copy of the struct is discarded (although child objects are transferred properly).
784  * If you copy a struct value from another message using a `set` method, the copy will have the
785    same size as the original.  However, the original could have been built with an older version
786    of the protocol which lacked some fields compared to the version your program was built with.
787    If you subsequently `get` that struct, the implementation will be forced to allocate a new
788    (shallow) copy which is large enough to hold all known fields, and the old copy will be
789    discarded.  Child objects will be transferred over without being copied -- though they might
790    suffer from the same problem if you `get` them later on.
791  Sometimes, avoiding these problems is too inconvenient.  Fortunately, it's also possible to
792  clean up the mess after-the-fact:  if you copy the whole message tree into a fresh
793  `MessageBuilder`, only the reachable objects will be copied, leaving out all of the unreachable
794  dead space.
795
796  In the future, Cap'n Proto may be improved such that it can re-use dead space in a message.
  However, this will only improve things, not fix them entirely: fragmentation could still leave
798  dead space.
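
The interaction between repeated `get` calls and the traversal limit can be illustrated with a toy
counter.  This is a sketch, not the library's implementation: `ToyReader` and `survives` are
hypothetical names, and the model assumes a single pointer field whose target has a fixed size.

```cpp
#include <cassert>
#include <cstdint>
#include <stdexcept>

// Toy model, NOT the capnp implementation: it only tracks a traversal budget.
class ToyReader {
public:
  ToyReader(std::uint64_t limitWords, std::uint64_t barSizeWords)
      : remainingWords_(limitWords), barSizeWords_(barSizeWords) {}

  // Following the pointer charges the target's size on every call, because
  // the reader has nowhere to remember that it has seen the object before.
  void getBar() {
    if (barSizeWords_ > remainingWords_) {
      throw std::runtime_error("traversal limit exceeded");
    }
    remainingWords_ -= barSizeWords_;
  }

  std::uint64_t remainingWords() const { return remainingWords_; }

private:
  std::uint64_t remainingWords_;
  std::uint64_t barSizeWords_;
};

// Calls getBar() `times` times and reports whether the budget survived.
bool survives(ToyReader reader, int times) {
  try {
    for (int i = 0; i < times; ++i) reader.getBar();
    return true;
  } catch (const std::runtime_error&) {
    return false;
  }
}
```

With a budget of 100 words and a 40-word target, fetching the pointer once and reusing the result
leaves 60 words of budget, whereas fetching it three times exhausts the budget even though the
message itself is small.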
799
800### Build Tips
801
802* If you are worried about the binary footprint of the Cap'n Proto library, consider statically
803  linking with the `--gc-sections` linker flag.  This will allow the linker to drop pieces of the
804  library that you do not actually use.  For example, many users do not use the dynamic schema and
805  reflection APIs, which contribute a large fraction of the Cap'n Proto library's overall
806  footprint.  Keep in mind that if you ever stringify a Cap'n Proto type, the stringification code
807  depends on the dynamic API; consider only using stringification in debug builds.
808
809  If you are dynamically linking against the system's shared copy of `libcapnp`, don't worry about
810  its binary size.  Remember that only the code which you actually use will be paged into RAM, and
811  those pages are shared with other applications on the system.
812
813  Also remember to strip your binary.  In particular, `libcapnpc` (the schema parser) has
814  excessively large symbol names caused by its use of template-based parser combinators.  Stripping
815  the binary greatly reduces its size.
816
817* The Cap'n Proto library has lots of debug-only asserts that are removed if you `#define NDEBUG`,
818  including in headers.  If you care at all about performance, you should compile your production
819  binaries with the `-DNDEBUG` compiler flag.  In fact, if Cap'n Proto detects that you have
820  optimization enabled but have not defined `NDEBUG`, it will define it for you (with a warning),
821  unless you define `DEBUG` or `KJ_DEBUG` to explicitly request debugging.
822
823### Security Tips
824
825Cap'n Proto has not yet undergone security review.  It most likely has some vulnerabilities.  You
826should not attempt to decode Cap'n Proto messages from sources you don't trust at this time.
827
828However, assuming the Cap'n Proto implementation hardens up eventually, then the following security
829tips will apply.
830
831* It is highly recommended that you enable exceptions.  When compiled with `-fno-exceptions`,
832  Cap'n Proto categorizes exceptions into "fatal" and "recoverable" varieties.  Fatal exceptions
833  cause the server to crash, while recoverable exceptions are handled by logging an error and
834  returning a "safe" garbage value.  Fatal is preferred in cases where it's unclear what kind of
835  garbage value would constitute "safe".  The more of the library you use, the higher the chance
836  that you will leave yourself open to the possibility that an attacker could trigger a fatal
837  exception somewhere.  If you enable exceptions, then you can catch the exception instead of
838  crashing, and return an error just to the attacker rather than to everyone using your server.
839
840  Basic parsing of Cap'n Proto messages shouldn't ever trigger fatal exceptions (assuming the
841  implementation is not buggy).  However, the dynamic API -- especially if you are loading schemas
842  controlled by the attacker -- is much more exception-happy.  If you cannot use exceptions, then
843  you are advised to avoid the dynamic API when dealing with untrusted data.
844
845* If you need to process schemas from untrusted sources, take them in binary format, not text.
846  The text parser is a much larger attack surface and not designed to be secure.  For instance,
847  as of this writing, it is trivial to deadlock the parser by simply writing a constant whose value
848  depends on itself.
849
850* Cap'n Proto automatically applies two artificial limits on messages for security reasons:
  a limit on nesting depth, and a limit on total bytes traversed.
852
853  * The nesting depth limit is designed to prevent stack overflow when handling a deeply-nested
854    recursive type, and defaults to 64.  If your types aren't recursive, it is highly unlikely
855    that you would ever hit this limit, and even if they are recursive, it's still unlikely.
856
  * The traversal limit is designed to defend against maliciously-crafted messages which use
    pointer cycles or overlapping objects to make a message appear much larger than it actually is
    on the wire.  While cycles and overlapping objects are illegal, they are hard to detect reliably.
860    Instead, Cap'n Proto places a limit on how many bytes worth of objects you can _dereference_
861    before it throws an exception.  This limit is assessed every time you follow a pointer.  By
862    default, the limit is 64MiB (this may change in the future).  `StreamFdMessageReader` will
863    actually reject upfront any message which is larger than the traversal limit, even before you
864    start reading it.
865
866    If you need to write your code in such a way that you might frequently re-read the same
867    pointers, instead of increasing the traversal limit to the point where it is no longer useful,
868    consider simply copying the message into a new `MallocMessageBuilder` before starting.  Then,
869    the traversal limit will be enforced only during the copy.  There is no traversal limit on
870    objects once they live in a `MessageBuilder`, even if you use `.asReader()` to convert a
871    particular object's builder to the corresponding reader type.
872
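  Both limits may be increased using `capnp::ReaderOptions`, defined in `capnp/message.h`, via its
  `nestingLimit` and `traversalLimitInWords` fields.  Note that the traversal limit is expressed in
  8-byte words rather than bytes.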
874
875* Remember that enums on the wire may have a numeric value that does not match any value defined
876  in the schema.  Your `switch()` statements must always have a safe default case.
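
As a sketch of the enum advice above, the snippet below handles an out-of-range value safely.
`PhoneType` is a stand-in declared here so the example is self-contained; in real code you would
use the generated `Person::PhoneNumber::Type` instead.  `describePhoneType` is a hypothetical
helper.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Stand-in for the generated enum; real code would include the generated
// header and use Person::PhoneNumber::Type.
enum class PhoneType : std::uint16_t { MOBILE = 0, HOME = 1, WORK = 2 };

// Maps a phone type to a label, tolerating values unknown to this binary.
std::string describePhoneType(PhoneType type) {
  switch (type) {
    case PhoneType::MOBILE: return "mobile";
    case PhoneType::HOME:   return "home";
    case PhoneType::WORK:   return "work";
    default:
      // A newer schema or a malicious sender may supply any 16-bit value.
      return "unknown";
  }
}
```

A value such as `static_cast<PhoneType>(7)`, which a peer built against a newer schema could
legitimately send, falls through to the default case instead of causing undefined behavior
downstream.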
877
878## Lessons Learned from Protocol Buffers
879
The author of Cap'n Proto's C++ implementation previously wrote version 2 of Google's
Protocol Buffers.  As a result, Cap'n Proto's implementation benefits from a number of lessons
learned the hard way:
883
884* Protobuf generated code is enormous due to the parsing and serializing code generated for every
885  class.  This actually poses a significant problem in practice -- there exist server binaries
886  containing literally hundreds of megabytes of compiled protobuf code.  Cap'n Proto generated code,
887  on the other hand, is almost entirely inlined accessors.  The only things that go into `.capnp.o`
888  files are default values for pointer fields (if needed, which is rare) and the encoded schema
889  (just the raw bytes of a Cap'n-Proto-encoded schema structure).  The latter could even be removed
890  if you don't use dynamic reflection.
891
892* The C++ Protobuf implementation used lots of dynamic initialization code (that runs before
893  `main()`) to do things like register types in global tables.  This proved problematic for
894  programs which linked in lots of protocols but needed to start up quickly.  Cap'n Proto does not
895  use any dynamic initializers anywhere, period.
896
897* The C++ Protobuf implementation makes heavy use of STL in its interface and implementation.
898  The proliferation of template instantiations gives the Protobuf runtime library a large footprint,
899  and using STL in the interface can lead to weird ABI problems and slow compiles.  Cap'n Proto
900  does not use any STL containers in its interface and makes sparing use in its implementation.
901  As a result, the Cap'n Proto runtime library is smaller, and code that uses it compiles quickly.
902
903* The in-memory representation of messages in Protobuf-C++ involves many heap objects.  Each
904  message (struct) is an object, each non-primitive repeated field allocates an array of pointers
905  to more objects, and each string may actually add two heap objects.  Cap'n Proto by its nature
906  uses arena allocation, so the entire message is allocated in a few contiguous segments.  This
907  means Cap'n Proto spends very little time allocating memory, stores messages more compactly, and
908  avoids memory fragmentation.
909
910* Related to the last point, Protobuf-C++ relies heavily on object reuse for performance.
911  Building or parsing into a newly-allocated Protobuf object is significantly slower than using
912  an existing one.  However, the memory usage of a Protobuf object will tend to grow the more times
913  it is reused, particularly if it is used to parse messages of many different "shapes", so the
914  objects need to be deleted and re-allocated from time to time.  All this makes tuning Protobufs
915  fairly tedious.  In contrast, enabling memory reuse with Cap'n Proto is as simple as providing
916  a byte buffer to use as scratch space when you build or read in a message.  Provide enough scratch
917  space to hold the entire message and Cap'n Proto won't allocate any memory.  Or don't -- since
918  Cap'n Proto doesn't do much allocation in the first place, the benefits of scratch space are
919  small.
920