# RAET (Reliable Asynchronous Event Transport) Protocol

## Motivation

Modern large scale distributed application architectures, wherein components are
distributed across the internet on multiple hosts and multiple CPU cores, are often
based on a messaging or event bus that allows the various distributed components
to communicate asynchronously with each other. Typically the messaging bus is
some form of messaging queue service such as AMQP or ZeroMQ. The message bus supports
what is commonly referred to as a publish/subscribe methodology for information
exchange.

While there are many advantages to a full featured message queuing service,
one of the disadvantages is the inability to manage performance at scale.

A message queuing service performs two distinct but complementary functions.

- The first is asynchronous transport of messages over the internet.
- The second is message queue management, that is, the identification, tracking,
storage, and distribution of messages between publishers and subscribers via queues.

One of the advantages of a message queuing service for many applications is that
the service hides the complexities of queue management from the clients behind an API.
The disadvantage is that at scale, where the volume of messages, the
timing of messages, and the associated demands on memory, network, and cpu capacity
become critical, the client has little ability to tune the service for performance.
Often MQ services become bottlenecks for the distributed application.
The more complicated MQ services, like AMQP, tend to be unreliable under load.

Separating the function of network transport of asynchronous events from the
function of message queue management allows independent tuning at scale of each function.

Most if not all of the MQ services are based on TCP/IP for transport.
TCP/IP adds significant latency to the network communications and is therefore
not well suited for the asynchronous nature of distributed event driven application
communications. This is primarily due to the way TCP/IP handles connection setup
and teardown as well as failed connections in order to support streams. Fundamentally
TCP/IP is optimized for sending large contiguous data streams, not many small
asynchronous events or messages. While not a problem for small scale systems,
the differences in the associated traffic characteristics can become problematic
at scale.

Because UDP/IP has lower latency and is connectionless, it is much better suited
to many small asynchronous messages and scales better. The drawback of bare UDP/IP
is that it is not reliable. What is needed, therefore, is a tuned transport
protocol that adds reliability to UDP/IP without sacrificing latency and scalability.
A transactioned protocol is much more appropriate for providing reliability to
asynchronous event transport than a streaming protocol.

Moreover, because most MQ services are based on TCP/IP they tend to also use
HTTP and therefore TLS/SSL for secure communications. While using HTTP provides
easy integration with web based systems, it can become problematic for high performance systems.
Furthermore, TLS is also problematic as a security system both from performance
and vulnerability aspects.

Elliptic Curve Cryptography, on the other hand, provides increases in security
with lower performance requirements relative to other approaches.
LibSodium provides an open source Elliptic Curve Cryptographic library with support
for both authentication and encryption. The CurveCP protocol is based on LibSodium
and provides a handshake protocol for bootstrapping secure network exchanges of information.

Finally, one of the best ways to manage and fine tune processor resources
(cpu, memory, network) in distributed concurrent event driven applications is to use
something called micro-threads. A micro-thread is typically an in-language feature
that allows logical concurrency with no more overhead than a function call.
Micro-threading uses cooperative multi-tasking instead of threads and/or processes
and avoids many of the complexities of resource contention, context switching,
and interprocess communications while providing much higher total performance.

Because all the cooperative micro-threads run in one process, a simple micro-threaded
application is limited to one CPU core. To enable full utilization of all CPU
cores, the application needs to be able to run at least one process per CPU core.
This requires same host inter-process communications. But unlike the conventional
approach to multi-processing where there is one process per logical concurrent
function, a micro-threaded multi-process application has instead one micro-thread
per logical concurrent function and the total number of micro-threads
is distributed amongst a minimal number of processes, no more than the number of
cpu cores. This optimizes the use of the cpu power while minimizing the overhead of
process context switching.

An example of a framework that uses this type of micro-threaded but multi-process
architecture is Erlang. Indeed, the success of the Erlang model provided
support for the viability of the RAET approach.
Indeed, one might ask, why not use Erlang? Unfortunately, the Erlang ecosystem is
somewhat limited in comparison to Python's and the language itself uses what one
might describe as a very unfortunate syntax.
One of the design objectives behind RAET was to leverage existing Python expertise
and the richness of the Python ecosystem but still be able to develop distributed
applications using a micro-threaded multi-process architectural model. The goal was
to combine the best of both worlds.

RAET is designed to provide secure reliable scalable asynchronous message/event
transport over the internet in a micro-threaded multi-process application framework
that uses UDP for interhost communication, LibSodium for authentication and encryption,
and the CurveCP handshake for secure bootstrap.

The queue management and micro-threaded application support is provided by Ioflo.
RAET is a complementary project to Ioflo in that RAET enables multiple Ioflo
applications to work together over a network as part of a distributed application.

The primary use case and motivating problem that resulted in the development of RAET
was the need to enable SaltStack to scale better. SaltStack is a remote execution
and configuration management platform written in Python. SaltStack uses ZeroMQ (0MQ)
as its message bus or message queuing service. ZeroMQ is based on TCP/IP so suffers from
the aforementioned latency and non-asynchronicity issues of TCP/IP based architectures.
Moreover, because ZeroMQ integrates queue management and transport in a monolithic way
with special "sockets", tuning the performance of the queuing independent of the transport
at scale becomes problematic. Tracing down bugs can also be problematic.


## Installation

Currently RAET is provided as a PyPI package. To install on a Unix based system
use pip.

``` bash
$ pip install raet
```

On OS X:

``` bash
$ sudo pip install raet
```

## Introduction

Currently RAET supports two types of communication.

- Host to host communication over UDP/IP sockets
- Same host interprocess communication over Unix Domain (UXD) Sockets

The architecture of a RAET based application is shown in the figure below:

![Diagram 1](docs/images/RaetMetaphor.png?raw=true)

## Naming Metaphor for Components

The following naming metaphor is designed to be consistent with, but not conflict with, Ioflo.

### Road, Estates, Main Estate

- The UDP channel is a “Road”
- The members of a Road are “Estates” (as in real estate lots that front the road)
- Each Estate has a unique UDP Host Port address “ha”, a unique string “name”
    and a unique numerical ID “eid”.
- One Estate on the Road is the “Main” Estate
- The Main Estate is responsible for allowing other estates to join the Road
    via the Join (Key Exchange) and Allow (CurveCP) transactions
- The Main Estate is also responsible for routing messages between other Estates

### Lane, Yards, Main Yard

- Within each Estate may be a “Lane”. This is a UXD channel.
- The members of a Lane are “Yards” (as in subdivision plots within the Estate)
- Each Yard on a Lane has a unique UXD file name host address “ha”, and a unique string “name”.
    There is also a numerical Yard ID “yid” that the class uses to generate yard
    names but it is not an attribute of the Yard instance.
- The Lane name is also used with the Yard Name to form a unique Filename that
    is the ha of the UXD
- One Yard on a Lane is the “Main” Yard
- The Main Yard is responsible for forming the Lane and permitting other Yards
    to be on the lane. There is as yet no formal process for this.
    Currently there is a flag that will drop packets from any Yard that is not
    already in the list of Yards maintained by the Main yard.
    Also file permissions can be used to prevent spurious Yards from communicating
    with the Main Yard.
- The Main Yard is responsible for routing messages between other yards on the Lane

## IoFlo Execution

- Each Estate UDP interface is run via a RoadStack (UDP sockets)
    which is run within the context of an IoFlo House
    (so think of the House that runs the UDP Stack as the Manor House of the Estate)
- Each Yard UXD interface is run via a LaneStack (Unix domain socket)
    which is run within the context of an IoFlo House
    (so think of Houses that run UXD stacks as accessory Houses (Tents, Shacks) on the Estate)
- The "Manor" House is special in that it runs both the UDP stack for the Estate
    and the UXD Stack for the Main Yard
- The House that runs the Main Estate UDP Stack can be thought of as the Mayor’s House
- Within the context of a House is a Data Store. Shares in the Store are addressed
    by the unique Share Name, which is a dotted path

## Routing

Given the Ioflo execution architecture described above, routing is performed as follows:

- In order to address a specific Estate, the Estate Name is required
- In order to address a specific Yard within an Estate, the Yard Name is required
- In order to address a specific Queue within a House, the Share Name is required

The name lookups are spread across the components:

- The UDP stack maps Estate Name to UDP HA and Estate ID
- The UXD stack maps Yard Name to UXD HA
- The Store of any IoFlo behavior maps Share Name to Share reference

Therefore routing

- from: a source identified by a queue in a source Share, in a source Yard,
    in a source Estate
- to: a destination identified by a queue in a destination Share, in a destination Yard,
    in a destination Estate

requires two Triples, one for Source and one for Destination:

- Source: (Estate Name, Yard Name, Share Name)
- Destination: (Estate Name, Yard Name, Share Name)

If any element of the Triple is None or Empty then a Default is used.

Below is an example of a Message Body that has the Routing information in it.


```python
# stacking, yarding, odict and Timer are assumed to be already imported
estate = 'minion1'
stack0 = stacking.StackUxd(name='lord', lanename='cherry', yid=0)
stack1 = stacking.StackUxd(name='serf', lanename='cherry', yid=1)
yard = yarding.Yard(name=stack0.yard.name, prefix='cherry')
stack1.addRemoteYard(yard)

# Route from the serf's yard to the lord's yard on the same estate
src = (estate, stack1.yard.name, None)
dst = (estate, stack0.yard.name, None)
route = odict(src=src, dst=dst)
msg = odict(route=route, stuff="Serf to my lord. Feed me!")
stack1.transmit(msg=msg)

# Service both stacks for half a second so the message can be delivered
timer = Timer(duration=0.5)
timer.restart()
while not timer.expired:
    stack0.serviceAll()
    stack1.serviceAll()

# lord Received Message
# {
#     'route':
#     {
#         'src': ['minion1', 'yard1', None],
#         'dst': ['minion1', 'yard0', None]
#     },
#     'stuff': 'Serf to my lord. Feed me!'
# }
```

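
The rule that a None or empty element of a route Triple falls back to a default can be sketched as follows. This is a minimal illustration under stated assumptions, not RAET's own code: the helpers `fill_triple` and `make_route` and the values in `DEFAULTS` are hypothetical names chosen for this example only.

```python
from collections import OrderedDict

# Hypothetical defaults for illustration; a real stack would supply its own names.
DEFAULTS = ("minion1", "yard0", "inbox")


def fill_triple(triple, defaults=DEFAULTS):
    """Return (estate, yard, share) with None or empty elements replaced by defaults."""
    return tuple(value if value else default
                 for value, default in zip(triple, defaults))


def make_route(src, dst, defaults=DEFAULTS):
    """Build the 'route' mapping carried in a message body."""
    return OrderedDict(src=fill_triple(src, defaults),
                       dst=fill_triple(dst, defaults))


route = make_route(src=(None, "yard1", None), dst=("", "yard0", None))
print(route["src"])  # ('minion1', 'yard1', 'inbox')
print(route["dst"])  # ('minion1', 'yard0', 'inbox')
```
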
## Details of UDP/IP Raet Protocol

The UDP Raet protocol is based on a coding metaphor naming convention, that is, of
estates attached to a road. The core objects are provided in the following package:
raet.road

### Road Raet Production UDP/IP Ports

- Manor Estate: 4505
- Other Estates: 4510

### Packet Data Format

The data used to initialize a packet is an ordered dict with several fields.
Most of the fields are shared with the header data format below so only the
unique fields are shown here.

### Unique Packet data fields

    sh: source host ip address (ipv4)
    sp: source ip port
    dh: destination host ip address (ipv4)
    dp: destination host ip port

### Header Data Format

The .data in the packet header is an ordered dict which is used either to
create a packet to transmit
or to hold the fields from a received packet.
Which fields are included in a header depends on the header kind.

### Header encoding

There are three header encoding formats currently supported.

- RAET Native. This is a minimized ASCII text format that optimizes the tradeoff between
easy readability and size. This is the default.

- JSON. This is the most verbose format but has the advantage of compatibility.

- Binary. This is not yet implemented. Once the protocol reaches a more mature state
and header changes are unlikely (or very infrequent), then
a binary format that minimizes size will be provided.

When the head kind is json = 0, then certain optimizations are used to minimize
the header length.

- The header field keys are two bytes long
- If a header field value is the default then the field is not included
- Lengths are encoded as hex strings
- The flags are encoded as a double char hex string in field 'fg'


### Header Data Fields

    ri: raet id Default 'RAET'
    vn: Version (Version) Default 0
    pk: Packet Kind (PcktKind)
    pl: Packet Length (PcktLen)
    hk: Header kind   (HeadKind) Default 0
    hl: Header length (HeadLen) Default 0

    se: Source Estate ID (SEID)
    de: Destination Estate ID (DEID)
    cf: Correspondent Flag (CrdtFlag) Default 0
    bf: BroadCast Flag (BcstFlag)  Default 0

    si: Session ID (SID) Default 0
    ti: Transaction ID (TID) Default 0
    tk: Transaction Kind (TrnsKind)

    dt: Datetime Stamp  (Datetime) Default 0
    oi: Order index (OrdrIndx)   Default 0

    wf: Waiting Ack Flag    (WaitFlag) Default 0
        Next segment or ordered packet is waiting for ack to this packet
    ml: Message Length (MsgLen)  Default 0
        Length of message only (unsegmented)
    sn: Segment Number (SgmtNum) Default 0
    sc: Segment Count  (SgmtCnt) Default 1
    sf: Segment Flag  (SgmtFlag) Default 0
        This packet is part of a segmented message
    af: All Flag (AllFlag) Default 0
        Resend all segments not just one

    bk: Body kind   (BodyKind) Default 0
    ck: Coat kind   (CoatKind) Default 0
    fk: Footer kind   (FootKind) Default 0
    fl: Footer length (FootLen) Default 0

    fg: flags  packed (Flags) Default '00' hs
         2 char Hex string with bits (0, 0, af, sf, 0, wf, bf, cf)
         Zeros are TBD flags


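
The packed 'fg' field can be illustrated with a short sketch. It assumes the bit tuple (0, 0, af, sf, 0, wf, bf, cf) is listed most significant bit first and that the hex string is lowercase; both are assumptions of this example, and the helper names `pack_flags` and `unpack_flags` are hypothetical rather than part of RAET.

```python
# Bit positions assume the tuple (0, 0, af, sf, 0, wf, bf, cf) is listed
# most significant bit first. That ordering is an assumption of this sketch.
FLAG_BITS = {"af": 5, "sf": 4, "wf": 2, "bf": 1, "cf": 0}


def pack_flags(**flags):
    """Pack truthy flags into a two character hex string."""
    value = 0
    for name, bit in FLAG_BITS.items():
        if flags.get(name):
            value |= 1 << bit
    return "{:02x}".format(value)


def unpack_flags(fg):
    """Unpack a two character hex string into a dict of 0/1 flags."""
    value = int(fg, 16)
    return {name: (value >> bit) & 1 for name, bit in FLAG_BITS.items()}


print(pack_flags())                  # 00
print(pack_flags(sf=True, wf=True))  # 14
print(unpack_flags("14"))
```
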
### Body Data Format

The Body .data is a Mapping that is serialized using either JSON or MSGPACK

### Packet Parts

Each packet has 4 parts, some of which may be empty. These are:

- Head
- Body
- Coat
- Tail

The Head is mandatory and provides the header fields that are needed to process the
packet.

The Tail provides the authentication signature that is used to verify the source of
the packet and that its contents have not been tampered with.

The Body is the contents of the packet. Some packets such as Acks and Nacks don't
need a body. The Body is a serialized Python dictionary, typically an ordered dictionary
so that parsing and debugging have a consistent view of the ordering of the fields
in the body.

The Coat is the encrypted version of the body. The encryption type is CurveCP based.
If the Coat is provided then the Body is encapsulated within the Coat Part.


### Header Details

#### JSON Encoding

Header is the ASCII Safe JSON encoding of a Python ordered dictionary.
Header termination is an empty line given by a double pair of carriage-return linefeed
characters, shown here as escapes, decimal, hex, and binary:

    \r\n\r\n
    13 10 13 10
    0D 0A 0D 0A
    00001101 00001010 00001101 00001010

Carriage-return and newline characters cannot appear in a JSON encoded
string unless they are escaped with backslash, so this 4 byte combination is illegal
in valid JSON that does not have multi-byte unicode characters. That makes it a
uniquely identifiable header termination.

This means the header must be ASCII safe, so no multibyte utf-8 strings are
allowed in the header.

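
A minimal sketch of how the terminator lets a receiver separate the JSON header from the rest of the packet is given below. The helper names `pack_head` and `split_head` are hypothetical and the field values are made up for illustration; only the `\r\n\r\n` terminator and the ASCII safe JSON encoding come from the description above.

```python
import json
from collections import OrderedDict

HEAD_END = b"\r\n\r\n"  # empty line that terminates the JSON header


def pack_head(data):
    """Serialize header fields as ASCII safe JSON followed by the terminator."""
    text = json.dumps(data, ensure_ascii=True, separators=(",", ":"))
    return text.encode("ascii") + HEAD_END


def split_head(packet):
    """Split raw packet bytes into (header dict, remaining bytes)."""
    head, sep, rest = packet.partition(HEAD_END)
    if not sep:
        raise ValueError("header terminator not found")
    return json.loads(head.decode("ascii"), object_pairs_hook=OrderedDict), rest


packet = pack_head(OrderedDict(ri="RAET", vn=0)) + b"body bytes"
head, rest = split_head(packet)
print(head["ri"], rest)  # RAET b'body bytes'
```

Because `json.dumps` escapes carriage-return and newline characters inside strings, the raw terminator bytes can only occur at the end of the header.
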
#### Native Encoding

The header consists of newline delimited lines. Each header line consists
of a two character field identifier followed by a space followed by the value
of the field as ASCII hex encoded binary followed by newline.
The Header end is indicated by a blank line, that is, two consecutive newline characters.

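
The line format described for the native encoding can be sketched roughly as follows. This sketch assumes integer field values only, written as lowercase hex without padding; RAET's actual field widths and its handling of non-integer fields are not specified here, and `pack_native` and `parse_native` are hypothetical names.

```python
from collections import OrderedDict

HEAD_END = "\n\n"  # newline ending the last field line plus one blank line


def pack_native(fields):
    """Encode integer header fields as 'kk hexvalue' lines ending with a blank line."""
    lines = ["{} {:x}".format(key, value) for key, value in fields.items()]
    return "\n".join(lines) + HEAD_END


def parse_native(text):
    """Decode a native style header back into an ordered dict of integers."""
    head, sep, _rest = text.partition(HEAD_END)
    if not sep:
        raise ValueError("header terminator not found")
    fields = OrderedDict()
    for line in head.split("\n"):
        if not line:
            continue  # tolerate an empty header
        key, _, value = line.partition(" ")
        fields[key] = int(value, 16)
    return fields


packed = pack_native(OrderedDict([("pk", 1), ("pl", 74), ("se", 2), ("de", 1)]))
print(repr(packed))  # 'pk 1\npl 4a\nse 2\nde 1\n\n'
print(parse_native(packed))
```
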
#### Binary Encoding

The header consists of a defined set of fixed length fields.


### Session

Sessions are important for security. The intent is to open one session and then
run multiple transactions within that session.

The Session ID is variously written as SID, sid, and si (the header field key).


### Session Bootstrap



## Layering

OSI Layers

    7: Application: Format: Data (Stack to Application interface buffering etc)
    6: Presentation: Format: Data (Encrypt-Decrypt convert to machine independent format)
    5: Session: Format: Data (Interhost communications. Authentication. Groups)
    4: Transport: Format: Segments (Reliable delivery of Message, Transactions, Segmentation, Error checking)
    3: Network: Format: Packets/Datagrams (Addressing Routing)
    2: Link: Format: Frames (Reliable per frame communications connection, Media access controller)
    1: Physical: Bits (Transceiver communication connection not reliable)

- Link is hidden from Raet

- Network is IP host address and UDP Port

- Transport is Raet transaction and packet authentication via tail signature that
   provides reliable transport.

- Session is session id key exchange for signing. Grouping is Road

- Presentation is Encrypt-Decrypt Body and Serialize-Deserialize Body

- Application is Body data dictionary

Packet signing could technically be in either the Transport or Session layers.

## UXD Message

RAET UXD Messages are limited in size to the same maximum (pre segmented)
RAET UDP message size (about 16 Mb).

UXD Messages have the following format: a header followed by a serialized
message body dict. Currently only JSON has been implemented.

1) JSON Header: “RAET\njson\n\n” followed by a jsonified message body dict

2) msgpack Header: “RAET\npack\n\n” followed by a msgpackified message body dict

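
The JSON variant of the UXD message format can be sketched as below. The helper names `pack_uxd` and `parse_uxd` are hypothetical and this is not the LaneStack implementation; only the header string and the JSON body come from the description above. The msgpack variant is left out because it is described as not yet implemented.

```python
import json
from collections import OrderedDict

JSON_HEAD = "RAET\njson\n\n"  # header for a JSON serialized body


def pack_uxd(body):
    """Prefix a JSON serialized body dict with the JSON header."""
    return JSON_HEAD + json.dumps(body, separators=(",", ":"))


def parse_uxd(message):
    """Return the body dict of a JSON UXD message, rejecting other kinds."""
    head, sep, body = message.partition("\n\n")
    if not sep:
        raise ValueError("missing header terminator")
    if head.split("\n") != ["RAET", "json"]:
        raise ValueError("unsupported header: {!r}".format(head))
    return json.loads(body, object_pairs_hook=OrderedDict)


route = OrderedDict(src=["minion1", "yard1", None], dst=["minion1", "yard0", None])
msg = pack_uxd(OrderedDict(route=route, stuff="Serf to my lord. Feed me!"))
print(parse_uxd(msg)["stuff"])  # Serf to my lord. Feed me!
```
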