1Flow Tutorial 2============= 3 4 * [Using Flow](#using-flow) 5 * [Keywords/primitives](#keywordsprimitives) 6 * [Promise, Future](#promise-future) 7 * [Network traversal](#network-traversal) 8 * [wait()](#wait) 9 * [ACTOR](#actor) 10 * [State Variables](#state-variables) 11 * [Void](#void) 12 * [PromiseStream<>, FutureStream<>](#promisestream-futurestream) 13 * [waitNext()](#waitnext) 14 * [choose / when](#choose--when) 15 * [Future composition](#future-composition) 16 * [Design Patterns](#design-patterns) 17 * [RPC](#rpc) 18 * [ACTOR return values](#actor-return-values) 19 * [“gotchas”](#gotchas) 20 * [Actor compiler](#actor-compiler) 21 * [Switch statements](#switch-statements) 22 * [try/catch with no wait()](#trycatch-with-no-wait) 23 * [ACTOR cancellation](#actor-cancellation) 24 * [Memory Management](#memory-management) 25 * [Reference Counting](#reference-counting) 26 * [Potential Gotchas](#potential-gotchas) 27 * [Reference Cycles](#reference-cycles) 28 * [Arenas](#arenas) 29 * [Potential Gotchas](#potential-gotchas-1) 30 * [Function Creating and Returning a non-Standalone Ref Object](#function-creating-and-returning-a-non-standalone-ref-object) 31 * [Assigning Returned Standalone Object to non Standalone Variable](#assigning-returned-standalone-object-to-non-standalone-variable) 32 * [Use of Standalone Objects in ACTOR Functions](#use-of-standalone-objects-in-actor-functions) 33 34# Using Flow 35 36Flow introduces some new keywords and flow controls. Combining these into workable units 37also introduces some new design patterns to C++ programmers. 38 39## Keywords/primitives 40 41The essence of Flow is the capability of passing messages asynchronously between 42components. The basic data types that connect asynchronous senders and receivers are 43`Promise<>` and `Future<>`. The sender holds a `Promise<X>` to, sometime in the future, deliver 44a value of type `X` to the holder of the `Future<X>`. A receiver, holding a `Future<X>`, at some point 45needs the `X` to continue computation, and invokes the `wait(Future<> f)` statement to pause 46until the value is delivered. To use the `wait()` statement, a function needs to be declared as an 47ACTOR function, a special flow keyword which directs the flow compiler to create the necessary 48internal callbacks, etc. Similarly, When a component wants to deal not with one asynchronously 49delivered value, but with a series, there are `PromiseStream<>` and `FutureStream<>`. These 50two constructs allow for “reliable delivery” of messages, and play an important role in the design 51patterns. 52 53### Promise<T>, Future<T> 54 55`Promise<T>` and `Future<T>` are intrinsically linked (they go in pairs) and are two wrappers 56around a construct called `SingleAssignmentVar`, a variable that can be set only once. A 57Promise is a handle to a `SingleAssignmentVar` that allows for a one-time set of the value; a 58Future is a read-only handle to the variable that only allows reading of the value. 59 60The following example uses these two simple types: 61 62```c++ 63Promise<int> p; 64Future<int> f = p.getFuture(); 65p.send( 4 ); 66printf( "%d\n", f.get() ); // f is already set 67``` 68 69### Network traversal 70 71Promises and futures can be used within a single process, but their real strength in a distributed 72system is that they can traverse the network. For example, one computer could create a 73promise/future pair, then send the promise to another computer over the network. The promise 74and future will still be connected, and when the promise is fulfilled by the remote computer, the 75original holder of the future will see the value appear. 76 77[TODO: network delivery guarantees] 78 79### wait() 80 81Wait allows for the calling code to pause execution while the value of a `Future` is set. This 82statement is called with a `Future<T>` as its parameter and returns a `T`; the eventual value of the 83`Future`. Errors that are generated in the code that is setting the `Future`, will be thrown from 84the location of the `wait()`, so `Error`s must be caught at this location. 85 86The following example shows a snippet (from an ACTOR) of waiting on a `Future`: 87 88```c++ 89Future<int> f = asyncCalculation(); // defined elsewhere 90int count = wait( f ); 91printf( "%d\n", count ); 92``` 93 94It is worth nothing that, although the function `wait()` is declared in [actorcompiler.h](actorcompiler.h), this 95“function” is compiled by the Actor Compiler into a complex set of integrated statements and 96callbacks. It is therefore never present in generated code or at link time. 97**Note** : because of the way that the actor compiler is built, `wait()` must always assign the 98resulting value to a _newly declared variable._ 99 100From 6.1, `wait()` on `Void` actors shouldn't assign the resulting value. So, the following code 101 102```c++ 103Future<Void> asyncTask(); //defined elsewhere 104wait(asyncTask()); 105``` 106 107becomes 108 109```c++ 110Future<Void> asyncTask(); //defined elsewhere 111wait(asyncTask()); 112``` 113 114### ACTOR 115 116The only code that can call the `wait()` function are functions that are themselves labeled with 117the “ACTOR” tag. This is the essential unit of asynchronous work that can be chained together 118to create complex message-passing systems. 119An actor, although declared as returning a `Future<T>`, simply returns a `T`. Because an actor 120may wait on the results of other actors, an actor must return either a `Future `or `void`. In most 121cases returning `void `is less advantageous than returning a `Future`, since there are 122implications for actor cancellation. See the Actor Cancellation section for details. 123 124The following simple actor function waits on the `Future` to be ready, when it is ready adds `offset` and returns the result: 125 126```c++ 127ACTOR Future<int> asyncAdd(Future<int> f, int offset) { 128 int value = wait( f ); 129 return value + offset; 130} 131``` 132 133### State Variables 134 135Since ACTOR-labeled functions are compiled into a c++ class and numerous supporting 136functions, the variable scoping rules that normally apply are altered. The differences arise out 137of the fact that control flow is broken at wait() statements. Generally the compiled code is 138broken into chunks at wait statements, so scoping variables so that they can be seen in multiple 139“chunks” requires the `state `keyword. 140The following function waits on two inputs and outputs the sum with an offset attached: 141 142```c++ 143ACTOR Future<int> asyncCalculation(Future<int> f1, Future<int> f2, int offset ) { 144 state int value1 = wait( f1 ); 145 int value2 = wait( f2 ); 146 return value1 + value2 + offset; 147} 148``` 149 150### Void 151 152The `Void `type is used as a signalling-only type for coordination of asynchronous processes. 153The following function waits on an input, send an output to a `Promise`, and signals completion: 154 155```c++ 156ACTOR Future<Void> asyncCalculation(Future<int> f, Promise<int> p, int offset ) { 157 int value = wait( f ); 158 p.send( value + offset ); 159 return Void(); 160} 161``` 162 163### PromiseStream<>, FutureStream<> 164 165PromiseStream and `FutureStream` are groupings of a series of asynchronous messages. 166 167 168These allow for two important features: multiplexing and network reliability, discussed later. 169They can be waited on with the `waitNext()` function. 170 171### waitNext() 172 173Like `wait()`, `waitNext()` pauses program execution and awaits the next value in a 174`FutureStream`. If there is a value ready in the stream, execution continues without delay. The 175following “server” waits on input, send an output to a `PromiseStream`: 176 177```c++ 178ACTOR void asyncCalculation(FutureStream<int> f, PromiseStream<int> p, int offset ) { 179 while( true ) { 180 int value = waitNext( f ); 181 p.send( value + offset ); 182 } 183} 184``` 185 186### choose / when 187 188The `choose / when` construct allows an Actor to wait for multiple `Future `events at once in a 189ordered and predictable way. Only the `when` associated with the first future to become ready 190will be executed. The following shows the general use of choose and when: 191 192```c++ 193choose { 194 when( int number = waitNext( futureStreamA ) ) { 195 // clause A 196 } 197 when( std::string text = wait( futureB ) ) { 198 // clause B 199 } 200} 201``` 202 203You can put this construct in a loop if you need multiple `when` clauses to execute. 204 205### Future composition 206 207Futures can be chained together with the result of one depending on the output of another. 208 209```c++ 210ACTOR Future<int> asyncAddition(Future<int> f, int offset ) { 211 int value = wait( f ); 212 return value + offset; 213} 214 215ACTOR Future<int> asyncDivision(Future<int> f, int divisor ) { 216 int value = wait( f ); 217 return value / divisor; 218} 219 220ACTOR Future<int> asyncCalculation( Future<int> f ) { 221 int value = wait( asyncDivision( 222 asyncAddition( f, 10 ), 2 ) ); 223 return value; 224} 225``` 226 227 228## Design Patterns 229 230### RPC 231 232Many of the “servers” in FoundationDB that communicate over the network expose their interfaces as a struct of PromiseStreams--one for each request type. For instance, a logical server that keeps a count could look like this: 233 234```c++ 235struct CountingServerInterface { 236 PromiseStream<int> addCount; 237 PromiseStream<int> subtractCount; 238 PromiseStream<Promise<int>> getCount; 239 240 // serialization code required for use on a network 241 template <class Ar> 242 void serialize( Ar& ar ) { 243 serializer(ar, addCount, subtractCount, getCount); 244 } 245}; 246``` 247 248Clients can then pass messages to the server with calls such as this: 249 250```c++ 251CountingServerInterface csi = ...; // comes from somewhere 252csi.addCount.send(5); 253csi.subtractCount.send(2); 254Promise<int> finalCount; 255csi.getCount.send(finalCount); 256int value = wait( finalCount.getFuture() ); 257``` 258 259There is even a utility function to take the place of the last three lines: [TODO: And is necessary 260when sending requests over a real network to ensure delivery] 261 262```c++ 263CountingServerInterface csi = ...; // comes from somewhere 264csi.addCount.send(5); 265csi.subtractCount.send(2); 266int value = wait( csi.getCount.getReply<int>() ); 267``` 268 269Canonically, a single server ACTOR that implements the interface is a loop with a choose 270statement between all of the request types: 271 272```c++ 273ACTOR void serveCountingServerInterface(CountingServerInterface csi) { 274 state int count = 0; 275 loop { 276 choose { 277 when (int x = waitNext(csi.addCount.getFuture())){ 278 count += x; 279 } 280 when (int x = waitNext(csi.subtractCount.getFuture())){ 281 count -= x; 282 } 283 when (Promise<int> r = waitNext(csi.getCount.getFuture())){ 284 r.send( count ); // goes to client 285 } 286 } 287 } 288} 289``` 290 291In this example, the add and subtract interfaces modify the count itself, stored with a state 292variable. The get interface is a bit more complicated, taking a `Promise<int>` instead of just an 293int. In the interface class, you can see a `PromiseStream<Promise<int>>`. This is a common 294construct that is analogous to sending someone a self-addressed envelope. You send a 295promise to a someone else, who then unpacks it and send the answer back to you, because 296you are holding the corresponding future. 297 298### ACTOR return values 299 300An actor can have only one returned Future, so there is a case that one actor wants to perform 301some operation more than once: 302 303```c++ 304ACTOR Future<Void> periodically(PromiseStream<Void> ps, int seconds) { 305 loop { 306 wait( delay( seconds ) ); 307 ps.send(Void()); 308 } 309} 310``` 311 312In this example, the `PromiseStream `is actually a way for the actor to return data from some 313operation that it ongoing. 314 315## “gotchas” 316 317### Actor compiler 318 319There are some things about the actor compiler that can confuse and may change over time 320 321#### Switch statements 322 323Do not use these with wait statements inside! 324 325#### try/catch with no wait() 326 327When a `try/catch` block does not `wait()` the blocks are still decomposed into separate 328functions. This means that variables that you want to access both before and after such a block 329will need to be declared state. 330 331### ACTOR cancellation 332 333When the reference to the returned `Future` of an actor is dropped, that actor will be cancelled. 334Cancellation of an actor means that any `wait()`s that were currently active (the callback was 335currently registered) will be delivered an exception (`actor_cancelled`). In almost every case 336this exception should not be caught, though there are cetainly exceptions! 337 338# Memory Management 339 340## Reference Counting 341 342The FoundationDB solution uses reference counting to manage the lifetimes of many of its 343constituent classes. In order for a class `T` to be reference counted, the following two globally 344defined methods must be defined (see [FastRef.h](FastRef.h)): 345 346 347```c++ 348void addref(T*); 349void delref(T*); 350``` 351 352The easiest way to implement these methods is by making your class a descendant of 353`ReferenceCounted`. 354 355NOTE: Any descendants of `ReferenceCounted` should either have virtual destructors or be 356sealed. If you fail to meet these criteria, then references to descendants of your class will never 357be deleted. 358 359If you choose not to inherit from `ReferenceCounted`, you will have to manage the reference 360count yourself. One way this can be done is to define `void addref()` and `void delref()` 361methods on your class, which will make it compatible with the existing global `addref` and 362`delref` methods. Otherwise, you will need to create the global `addref` and `delref` methods 363for your class, as mentioned above. In either case, you will need to manage the reference 364count on your object and delete it when the count reaches 0. Note that the reference count 365should usually be initialized to 1, as the `addRef(T*)` function is not called when the object is 366created. 367 368To create a reference counted instance of a class `T`, you instantiate a `Reference<T>` on the 369stack with a pointer to your `T` object: 370 371```c++ 372Reference<T> refCountedInstance(new T()); 373``` 374The `Reference<T>` class automatically calls addref on your `T` instance every time it is copied 375(such as by argument passing or assignment), but not when the object is initially created 376(consequently, `ReferenceCounted` classes are initialized with reference count 1). It will call 377`delref` on your `T` instance whenever a particular `Reference<T>` instance gets deleted (usually 378by going out of scope). When no more instances of `Reference<T>` holding a particular `T` 379instance exist, then that `T` instance will be destroyed. 380 381### Potential Gotchas 382 383#### Reference Cycles 384 385You must be cautious about creating reference cycles when using reference counting. For 386example, if two `Reference<T>` objects refer to each other, then without specific intervention 387their reference counts will never reach 0 and the objects will never be deleted. 388 389## Arenas 390 391 392In addition to using reference counting, the FoundationDB solution also uses memory pools to 393allocate buffers. In this scheme, buffers are allocated from a common pool, called an `Arena`, 394and remain valid for the entire lifetime of that `Arena`. When the `Arena` is destroyed, all of the 395memory it held for the buffers is deallocated along with it. As a general convention, types which 396can use these `Arenas` and do not manage their own memory are given the "`Ref`" suffix. When 397a `*Ref` object is being used, consideration should be given to how its buffers are being 398managed (much in the same way that you would consider memory management when you see 399a `T*`). 400 401As an example, consider the `StringRef` class. A `StringRef` is an object which contains a 402pointer to a sequence of bytes, but does not actually manage that buffer. Thus, if a `StringRef` 403is deallocated, the data remains intact. Conversely, if the data is deallocated, the `StringRef` 404becomes invalid. In order for the `StringRef` to manage its own buffer, we need to create an 405instance of the `Standalone<StringRef>` class: 406 407```c++ 408Standalone<StringRef> str("data"); 409``` 410 411A `Standalone<T>` object has its own arena (technically, it is an `Arena`), and for classes like 412`StringRef` which support the use of arenas, the memory buffers used by the class are 413allocated from that arena. `Standalone<T>` is also a subclass of `T`, and so for all other purposes 414operates just like a `T`. 415 416There are a number of classes which support the use of arenas, and some which have 417convenience types for their `Standalone` versions (not a complete list): 418 419| T | Standalone<T> alias | 420|:----------------:|:-------------------:| 421| StringRef | N/A | 422| LiteralStringRef | N/A | 423| KeyRef | Key | 424| ValueRef | Value | 425| KeyValueRef | KeyValue | 426| KeyRangeRef | KeyRange | 427| KeySelectorRef | KeySelector | 428| VectorRef | N/A | 429 430The `VectorRef<T>` class is an `std::vector`-like object which is used to manage a list of these 431`*Ref` objects. A `Standalone<VectorRef<T>>` has its own arena which can be used to store 432the buffers held by its constituents. In order for that to happen, one of the two deep insertion 433methods (`push_back_deep` or `append_deep`) should be used when placing items in the vector. 434The shallow insertion methods will hold the objects only; any arena-managed memory is not 435copied. Thus, the `Standalone<VectorRef<T>>` will hold the `T` objects without managing their 436memory. Note that the arena(s) used by the `VectorRef` need not be its own (and cannot be 437unless the `VectorRef` is a `Standalone` object), and are determined by arguments to the 438functions that insert items. 439 440`VectorRef<T>` can also be used with types besides the standard `Ref` types, in which case the 441deep copy methods should not be used. In this case, the `VectorRef<T>` object holds the items 442in an arena much like a normal vector would hold the items in its buffer. Again, the arena used 443by the `VectorRef<T>` need not be its own. 444 445When a `Standalone<T>` is copied (e.g. by argument passing or assignment) to another 446`Standalone<T>`, they will share the same memory. The actual memory contents of the arena 447are stored in a reference counted structure (`ArenaBlock`), so the memory will persist until all 448instances of `Arena` holding that memory are destroyed. If instead a `T` object is copied to a 449`Standalone<T>`, then its entire contents are copied into the arena of the new `Standalone<T>` 450object using a deep copy. Thus, it is generally more efficient to consistently use `*Ref` objects 451and manage the memory with something external, or to consistently use `Standalone<T>` 452objects (where assignments just increment reference counters) to avoid memory copies. 453 454### Potential Gotchas 455 456#### Function Creating and Returning a non-Standalone Ref Object 457 458A function which creates a `Ref` object should generally return a `Standalone` version of that 459object. Otherwise, make certain that the `Arena` on which that `Ref` object was created still exists 460when the caller uses the returned `Ref`. 461 462#### Assigning Returned Standalone Object to non Standalone Variable 463 464A caller which receives a `Standalone` return value should assign that return value to a 465`Standalone` variable. Consider the following example: 466 467```c++ 468Standalone<StringRef> foo() { 469 return Standalone<StringRef>("string"); 470} 471 472void bar() { 473 StringRef val = foo(); 474} 475``` 476 477When `val` is copy-assigned in `bar`, its data is stored in the `Arena` of the `StringRef` that was 478returned from `foo`. When this returned `StringRef` is subsequently deallocated, `val` will no 479longer be valid. 480 481 482#### Use of Standalone Objects in ACTOR Functions 483 484Special care needs to be taken when using using `Standalone` values in actor functions. 485Consider the following example: 486 487``` 488ACTOR Future<void> foo(StringRef param) 489{ 490 //Do something 491 return Void(); 492} 493 494ACTOR Future<Void> bar() 495{ 496 Standalone<StringRef> str("string"); 497 wait(foo(str)); 498 return Void(); 499} 500``` 501 502Although it appears at first glance that `bar` keeps the `Arena` for `str` alive during the call to `foo`, 503it will actually go out of scope in the class generated by the actor compiler. As a result, `param` in 504`foo` will become invalid. To prevent this, either declare `param` to be of type 505`Standalone<StringRef>` or make `str` a state variable. 506