.\"	$OpenBSD: rsync.5,v 1.14 2023/04/12 08:32:27 claudio Exp $
.\"
.\" Copyright (c) 2019 Kristaps Dzonsons <kristaps@bsd.lv>
.\"
.\" Permission to use, copy, modify, and distribute this software for any
.\" purpose with or without fee is hereby granted, provided that the above
.\" copyright notice and this permission notice appear in all copies.
.\"
.\" THE SOFTWARE IS PROVIDED "AS IS" AND THE AUTHOR DISCLAIMS ALL WARRANTIES
.\" WITH REGARD TO THIS SOFTWARE INCLUDING ALL IMPLIED WARRANTIES OF
.\" MERCHANTABILITY AND FITNESS. IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR
.\" ANY SPECIAL, DIRECT, INDIRECT, OR CONSEQUENTIAL DAMAGES OR ANY DAMAGES
.\" WHATSOEVER RESULTING FROM LOSS OF USE, DATA OR PROFITS, WHETHER IN AN
.\" ACTION OF CONTRACT, NEGLIGENCE OR OTHER TORTIOUS ACTION, ARISING OUT OF
.\" OR IN CONNECTION WITH THE USE OR PERFORMANCE OF THIS SOFTWARE.
.\"
.Dd $Mdocdate: April 12 2023 $
.Dt RSYNC 5
.Os
.Sh NAME
.Nm rsync
.Nd rsync wire protocol
.Sh DESCRIPTION
The
.Nm
protocol described in this document relates to the BSD-licensed
.Xr openrsync 1 ,
a re-implementation of the GPL-licensed reference utility
.Xr rsync 1 .
It is compatible with version 27 of the reference.
.Pp
In this document, the
.Qq client process
refers to the utility as run on the operator's local computer.
The
.Qq server process
is run either on the local or remote computer, depending upon the
file locations given on the command line.
.Pp
There are a number of options in the protocol that are dictated by command-line
flags.
These will be noted as
.Fl D
for devices,
.Fl g
for group ids,
.Fl l
for links,
.Fl n
for dry-run,
.Fl o
for user ids,
.Fl r
for recursion,
.Fl v
for verbose, and
.Fl -delete
for deletion (before).
.Ss Data types
The binary protocol encodes all data in little-endian format.
Integers are signed 32-bit, shorts are signed 16-bit, bytes are unsigned
8-bit.
A long is variable-length.
For values less than the maximum integer, the value is transmitted and
read as a 32-bit integer.
For values greater, the value is transmitted first as a maximum integer,
then a 64-bit signed integer.
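.Pp
The integer and long encodings above can be sketched as follows.
This is a minimal illustration, not the openrsync code: the function names are
hypothetical, and it assumes that
.Qq maximum integer
means 0x7fffffff and that negative longs always take the extended form.

```python
import struct

INT32_MAX = 0x7FFFFFFF  # assumed meaning of "maximum integer"


def encode_int(value: int) -> bytes:
    """Encode a signed 32-bit integer in little-endian order."""
    return struct.pack("<i", value)


def encode_long(value: int) -> bytes:
    """Encode a long: 4 bytes when small, otherwise marker + 8 bytes."""
    if 0 <= value < INT32_MAX:
        return encode_int(value)
    # Marker (maximum integer) followed by the full signed 64-bit value.
    # Sending negative values this way is an assumption of this sketch.
    return encode_int(INT32_MAX) + struct.pack("<q", value)


def decode_long(buf: bytes) -> tuple[int, int]:
    """Return (value, bytes consumed) for a long at the start of buf."""
    (first,) = struct.unpack_from("<i", buf, 0)
    if first != INT32_MAX:
        return first, 4
    (value,) = struct.unpack_from("<q", buf, 4)
    return value, 12
```

A value equal to the marker itself has to use the extended form, otherwise the
reader would mistake it for the marker.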
.Pp
There are three types of checksums: long (slow), short (fast), and
whole-file.
The fast checksum is a derivative of Adler-32.
The slow checksum is MD4,
made over the checksum seed first (serialised in little-endian format),
then the data.
The whole-file checksum applies MD4 to the file first, then the checksum
seed at the end (also serialised in little-endian format).
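.Pp
The difference between the slow and whole-file checksums is only the order in
which the seed and the data are hashed.
The sketch below shows that ordering.
The function names are hypothetical, and the hash constructor is passed in as a
parameter because MD4 is not available in every hashlib build.

```python
import hashlib
import struct


def md4():
    """Hypothetical helper: MD4 via hashlib, if the build provides it."""
    return hashlib.new("md4")


def slow_checksum(new_hash, seed: int, block: bytes) -> bytes:
    """Per-block checksum: little-endian seed first, then the data."""
    h = new_hash()
    h.update(struct.pack("<i", seed))
    h.update(block)
    return h.digest()


def whole_file_checksum(new_hash, seed: int, data: bytes) -> bytes:
    """Whole-file checksum: file data first, then the little-endian seed."""
    h = new_hash()
    h.update(data)
    h.update(struct.pack("<i", seed))
    return h.digest()
```

Where MD4 is supported, slow_checksum(md4, seed, block) would give the
16-byte digest; any other hashlib constructor can stand in for experiments.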
.Ss Multiplexing
Most
.Nm
transmissions are wrapped in a multiplexing envelope protocol.
It is composed as follows:
.Pp
.Bl -enum -compact
.It
envelope header (4 bytes)
.It
envelope payload (arbitrary length)
.El
.Pp
The first byte of the envelope header consists of a tag.
If the tag is 7, the payload is normal data.
Otherwise, the payload is an out-of-band server message.
If the tag is 1, it is an error on the sender's part and must trigger an
exit.
The remaining three bytes of the header hold the payload length, which
limits message payloads to 24-bit integer size,
.Li 0x00ffffff .
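.Pp
A rough sketch of packing and unpacking one envelope follows.
The exact header layout is an assumption here: the header is treated as a
little-endian 32-bit value with the tag in its most significant byte and the
payload length in the low 24 bits.
The function names are hypothetical.

```python
import struct

TAG_DATA = 7  # tag for normal data, per the description above
MAX_PAYLOAD = 0x00FFFFFF  # 24-bit length limit


def pack_envelope(tag: int, payload: bytes) -> bytes:
    """Prefix payload with a 4-byte header carrying tag and length."""
    if len(payload) > MAX_PAYLOAD:
        raise ValueError("payload exceeds 24-bit length limit")
    # Assumed layout: tag in the top 8 bits, length in the low 24 bits.
    return struct.pack("<I", (tag << 24) | len(payload)) + payload


def unpack_envelope(buf: bytes) -> tuple[int, bytes, bytes]:
    """Return (tag, payload, remaining bytes) for one envelope in buf."""
    (header,) = struct.unpack_from("<I", buf, 0)
    tag, length = header >> 24, header & MAX_PAYLOAD
    payload = buf[4:4 + length]
    if len(payload) != length:
        raise ValueError("truncated envelope")
    return tag, payload, buf[4 + length:]
```

A reader would loop over unpack_envelope, passing payloads tagged TAG_DATA to
the protocol parser and handling everything else as an out-of-band message.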
.Pp
The only data not using this envelope is the initial handshake between
client and server.
.Ss File list
A central part of the protocol is the file list, which is generated by
the sender.
It consists of all files that must be sent to the receiver, either
explicitly as given or recursively generated.
.Pp
The file list itself consists of filenames and attributes (mode, time,
size, etc.).
Filenames must be relative to the destination root and not be absolute
or contain backtracking.
So if a file is given to the sender as
.Pa ../../foo/bar ,
it must be sent as
.Pa foo/bar .
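.Pp
One way a sender might clean names before transmission is sketched below.
The helper name is hypothetical, and rejecting backtracking that appears in the
middle of a path (rather than resolving it) is a choice of this sketch, not
something the protocol text prescribes.

```python
def clean_name(path: str) -> str:
    """Make a file-list name relative and free of backtracking."""
    # Dropping empty components also removes a leading "/" (absolute paths).
    parts = [p for p in path.split("/") if p not in ("", ".")]
    # Drop leading parent-directory references such as "../../".
    while parts and parts[0] == "..":
        parts.pop(0)
    if ".." in parts:
        raise ValueError("backtracking inside path: " + path)
    return "/".join(parts)
```

With this helper, clean_name("../../foo/bar") yields "foo/bar", matching the
example above.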
.Pp
The file list should be cleaned of inappropriate files prior to sending.
For example, if
.Fl l
is not specified, symbolic links may be omitted.
Directory entries without
.Fl r
may also be omitted.
Duplicates may be omitted.
.Pp
The receiver
.Em must not
assume that the file list is clean.
It should not omit inappropriate files from the file list (which would
affect the indexing), but may omit them during processing.
.Pp
Prior to being sent from sender to receiver, and upon being received, the
file list must be lexicographically sorted such as with
.Xr strcmp 3 .
Subsequent references to the file are by index in the sorted list.
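.Pp
The sorting and indexing rule can be illustrated with byte strings, whose
default ordering in Python is bytewise and so matches what
.Xr strcmp 3
would produce for names without embedded NUL bytes.
The function names are hypothetical.

```python
def sort_file_list(names: list[bytes]) -> list[bytes]:
    """Sort names bytewise, the same ordering strcmp(3) gives."""
    return sorted(names)


def index_of(sorted_names: list[bytes], name: bytes) -> int:
    """Return the index by which a file is referenced after sorting."""
    return sorted_names.index(name)
```

Both sides must sort identically, since a differing order would make every
later file index refer to the wrong entry.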
.Ss Client process
The client can operate in sender or receiver mode depending upon the
command-line source and destination.
.Pp
If the destination directory (sink) is remote, the client is in sender
mode: the client will push its data to the server.
If the source file is remote, it is in receiver mode: the server pushes
to the client.
If neither is remote, the client operates in sender mode.
These are all mutually exclusive.
.Pp
When the client starts, regardless of its mode, it first handshakes with
the server.
This exchange is
.Em not
multiplexed.
.Pp
.Bl -enum -compact
.It
send local version (integer)
.It
receive remote version (integer)
.It
receive random seed (integer)
.El
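.Pp
The three handshake steps above might look as follows over a connected
socket.
This is only a sketch: the function names are hypothetical and a plain
socket stands in for whatever transport (pipe or remote shell) is really used.

```python
import socket
import struct

PROTOCOL_VERSION = 27  # version this document is compatible with


def read_exact(sock: socket.socket, count: int) -> bytes:
    """Read exactly count bytes or raise EOFError."""
    buf = bytearray()
    while len(buf) < count:
        chunk = sock.recv(count - len(buf))
        if not chunk:
            raise EOFError("peer closed during handshake")
        buf += chunk
    return bytes(buf)


def client_handshake(sock: socket.socket) -> tuple[int, int]:
    """Send our version, then read the remote version and the seed."""
    sock.sendall(struct.pack("<i", PROTOCOL_VERSION))
    (remote_version,) = struct.unpack("<i", read_exact(sock, 4))
    (seed,) = struct.unpack("<i", read_exact(sock, 4))
    return remote_version, seed
```

The returned seed is the value later mixed into the slow and whole-file
checksums.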
.Pp
Following this, the client multiplexes when reading from the server.
Transmissions sent from client to server are not multiplexed.
It then enters the
.Sx Update exchange
protocol.
.Ss Server process
The server can operate in sender or receiver mode depending upon how the
client starts the server.
This may be directly from the parent process (when invoked for local
files) or indirectly via a remote shell.
.Pp
When in sender mode, the server pushes data to the client.
(This is equivalent to receiver mode for the client.)
In receiver mode, the opposite is true.
.Pp
When the server starts, regardless of the mode, it first handshakes with
the client.
This exchange is
.Em not
multiplexed.
.Pp
.Bl -enum -compact
.It
send local version (integer)
.It
receive remote version (integer)
.It
send random seed (integer)
.El
.Pp
Following this, the server multiplexes when writing to the client.
(Transmissions received from the client are not multiplexed.)
It then enters the
.Sx Update exchange
protocol.
.Ss Update exchange
When the client or server is in sender mode, it begins by conditionally
sending the exclusion list.
At this time, this is always empty.
.Pp
.Bl -enum -compact
.It
if
.Fl -delete
and the client, exclusion list zero (integer)
.El
.Pp
It then sends the
.Sx File list .
Prior to being sent, the file list should be lexicographically sorted.
.Pp
.Bl -enum -compact
.It
status byte (byte)
.It
inherited filename length (optional, byte)
.It
filename length (integer or byte)
.It
file (byte array)
.It
file length (long)
.It
file modification time (optional, time_t, integer)
.It
file mode (optional, mode_t, integer)
.It
if
.Fl o ,
the user id (integer)
.It
if
.Fl g ,
the group id (integer)
.It
if a special file and
.Fl D ,
the device
.Dq rdev
type (integer)
.It
if a symbolic link and
.Fl l ,
the link target's length (integer)
.It
if a symbolic link and
.Fl l ,
the link target (byte array)
.El
.Pp
The status byte may consist of the following bits and determines which
of the optional fields are transmitted.
.Pp
.Bl -tag -compact -width Ds
.It 0x01
A top-level directory.
(Only applies to directory files.)
If specified, the matching local directory is for deletions.
.It 0x02
Do not send the file mode: it is a repeat of the last file's mode.
.It 0x08
Like
.Li 0x02 ,
but for the user id.
.It 0x10
Like
.Li 0x02 ,
but for the group id.
.It 0x20
Inherit some of the prior file name.
Enables the inherited filename length transmission.
.It 0x40
Use full integer length for file name.
Otherwise, use only the byte length.
.It 0x80
Do not send the file modification time: it is a repeat of the last
file's.
.El
.Pp
If the status byte is zero, the file list has terminated.
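.Pp
To make the role of the status bits concrete, the sketch below maps a status
value to the optional fields that follow it and rebuilds a name from an
inherited prefix.
The flag names and helper functions are hypothetical; only the bit values
come from the table above.

```python
import enum


class Status(enum.IntFlag):
    """Status bits from the table above (names are illustrative)."""
    TOP_LEVEL_DIR = 0x01
    SAME_MODE = 0x02
    SAME_UID = 0x08
    SAME_GID = 0x10
    SAME_NAME_PREFIX = 0x20
    LONG_NAME = 0x40
    SAME_TIME = 0x80


def fields_present(status: int, preserve_uids: bool = False,
                   preserve_gids: bool = False) -> dict[str, bool]:
    """Report which optional fields follow a given status value."""
    flags = Status(status)
    return {
        "inherited_length": Status.SAME_NAME_PREFIX in flags,
        "long_name_length": Status.LONG_NAME in flags,
        "mtime": Status.SAME_TIME not in flags,
        "mode": Status.SAME_MODE not in flags,
        "uid": preserve_uids and Status.SAME_UID not in flags,
        "gid": preserve_gids and Status.SAME_GID not in flags,
    }


def rebuild_name(previous: bytes, inherited: int, suffix: bytes) -> bytes:
    """Join the inherited prefix of the prior name with the sent suffix."""
    return previous[:inherited] + suffix
```

For instance, after
.Pa dir/file1 ,
an entry with bit 0x20 set, an inherited length of 8 and the suffix
.Qq 2
names
.Pa dir/file2 .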
.Pp
If
.Fl o
has been specified, the sender sends the list of all users encountered
in the file list.
Identifier zero
.Pq Qq root
is never transmitted, as it would prematurely end the list.
This list may be incomplete or empty: the server is not obligated to
properly fill it in with all relevant users.
.Pp
.Bl -enum -compact
.It
user identifier or zero to indicate end of set (integer)
.It
non-zero length of user name (byte)
.It
user name (prior length)
.El
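.Pp
The identifier list can be sketched as an encoder and decoder pair.
The function names are hypothetical, and silently skipping identifier zero
in the encoder is one possible way to honour the rule above.

```python
import struct


def encode_id_list(ids: dict[int, bytes]) -> bytes:
    """Serialise (identifier, name) pairs, ending with a zero integer."""
    out = bytearray()
    for ident, name in ids.items():
        if ident == 0:
            continue  # zero would be read as the end-of-set marker
        if not 0 < len(name) <= 255:
            raise ValueError("name length must fit a non-zero byte")
        out += struct.pack("<i", ident) + bytes([len(name)]) + name
    out += struct.pack("<i", 0)  # end of set
    return bytes(out)


def decode_id_list(buf: bytes) -> tuple[dict[int, bytes], int]:
    """Return (identifier-to-name map, bytes consumed)."""
    ids, pos = {}, 0
    while True:
        (ident,) = struct.unpack_from("<i", buf, pos)
        pos += 4
        if ident == 0:
            return ids, pos
        length = buf[pos]
        pos += 1
        ids[ident] = buf[pos:pos + length]
        pos += length
```

An empty list is therefore just a single zero integer.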
.Pp
The same sequence is then sent for groups if
.Fl g
has been specified.
.Pp
The sender then sends any IO error values, which for
.Xr openrsync 1
is always zero.
.Pp
.Bl -enum -compact
.It
constant zero (integer)
.El
.Pp
The server sender then reads the exclusion list, which is always zero.
.Pp
.Bl -enum -compact
.It
if server, constant zero (integer)
.El
.Pp
Following that, the sender receives data regarding the receiver's copy
of the file list contents.
This data is not ordered in any way.
Each of these requests starts as follows:
.Pp
.Bl -enum -compact
.It
file index or -1 to signal a change of phase (integer)
.El
.Pp
The exchange starts in phase 1, then proceeds to phase 2, and phase 3
signals an end of transmission (no subsequent blocks).
If a phase change occurs, the sender must write back the -1 constant
integer value and increment its phase state.
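.Pp
The sender's phase bookkeeping can be sketched as a small state function.
The name and return shape are hypothetical; the sketch only captures the rule
that each -1 is echoed back and advances the phase, with phase 3 ending the
exchange.

```python
import struct

PHASE_CHANGE = -1


def handle_index(phase: int, index: int) -> tuple[int, bytes, bool]:
    """Return (new phase, bytes to write back, finished?)."""
    if index != PHASE_CHANGE:
        # A real file index: no phase change, nothing to acknowledge here.
        return phase, b"", False
    phase += 1
    # Echo the -1 marker back and stop once phase 3 is reached.
    return phase, struct.pack("<i", PHASE_CHANGE), phase >= 3
```

Starting from phase 1, two phase-change markers are needed before the
function reports that the transmission has finished.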
.Pp
Blocks are read as follows:
.Pp
.Bl -enum -compact
.It
block index (integer)
.El
.Pp
In
.Pq Fl n
mode, the sender may immediately write back the index (integer) to skip
the following.
.Pp
.Bl -enum -compact
.It
number of blocks (integer)
.It
block length in the file (integer)
.It
long checksum length (integer)
.It
terminal (remainder) block length (integer)
.El
.Pp
And for each block:
.Pp
.Bl -enum -compact
.It
short checksum (integer)
.It
long checksum (bytes of checksum length)
.El
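.Pp
Parsing the block metadata that follows the index might look like the sketch
below.
The class and function names are hypothetical, the leading index is assumed to
have been consumed already, and the short checksum is read as unsigned purely
for convenience (it occupies the same four bytes on the wire).

```python
import struct
from dataclasses import dataclass


@dataclass
class BlockSet:
    count: int            # number of blocks
    block_length: int     # block length in the file
    checksum_length: int  # long checksum length
    remainder: int        # terminal (remainder) block length
    blocks: list          # (short checksum, long checksum) pairs


def parse_block_set(buf: bytes) -> tuple[BlockSet, int]:
    """Return (block set, bytes consumed) for metadata at start of buf."""
    count, blen, clen, rem = struct.unpack_from("<iiii", buf, 0)
    pos = 16
    blocks = []
    for _ in range(count):
        (short,) = struct.unpack_from("<I", buf, pos)
        pos += 4
        blocks.append((short, buf[pos:pos + clen]))
        pos += clen
    return BlockSet(count, blen, clen, rem, blocks), pos
```

A block count of zero yields an empty block list, in which case only the
four header integers are consumed.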
.Pp
The client then compares the two files, block by block, and updates the
server with mismatches as follows.
.Pp
.Bl -enum -compact
.It
file index (integer)
.It
number of blocks (integer)
.It
block length (integer)
.It
long checksum length (integer)
.It
remainder block length (integer)
.El
.Pp
Then for each block:
.Pp
.Bl -enum -compact
.It
data chunk size (integer)
.It
data chunk (bytes)
.It
block index subsequent to chunk or zero for finished (integer)
.El
.Pp
Following this sequence, the sender sends the following:
.Pp
.Bl -enum -compact
.It
whole-file long checksum (16 bytes)
.El
.Pp
The sender then either handles the next queued file or, if the receiver
has written a phase change, the phase change step.
.Pp
If the sender is the server and
.Fl v
has been specified, the sender must send statistics.
.Pp
.Bl -enum -compact
.It
total bytes read (long)
.It
total bytes written (long)
.It
total size of files (long)
.El
.Pp
Finally, the sender must read a final constant-value integer.
.Pp
.Bl -enum -compact
.It
end-of-sequence -1 value (integer)
.El
.Pp
If in receiver mode, the inverse of the above (write instead of read, read
instead of write) is performed.
.Pp
The receiver begins by conditionally writing, then reading, the
exclusion list count, which is always zero.
.Pp
.Bl -enum -compact
.It
if client, send zero (integer)
.It
if receiver and
.Fl -delete ,
read zero (integer)
.El
.Pp
The receiver then proceeds with reading the
.Sx File list
as already defined.
Following the list, the receiver reads the IO error, which must be zero.
.Pp
.Bl -enum -compact
.It
constant zero (integer)
.El
.Pp
The receiver must then sort the file names lexicographically.
.Pp
If there are no files in the file list at this time, the receiver must
exit prior to sending per-file data.
It then proceeds with the file blocks.
.Pp
For file blocks, the receiver must look at each file that is not up to
date and send it to the server.
A file is up to date when it has the same file size and timestamp.
Symbolic links and directory entries are never sent to the server.
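.Pp
The receiver's decision of which files to request can be sketched as a
pure function over file attributes.
The type and function names are hypothetical, and treating a file that is
missing locally as out of date is an assumption of this sketch.

```python
from typing import NamedTuple, Optional


class Entry(NamedTuple):
    size: int
    mtime: int
    is_regular: bool  # False for symbolic links and directories


def needs_request(remote: Entry, local: Optional[Entry]) -> bool:
    """Decide whether the receiver should request this file."""
    if not remote.is_regular:
        return False  # links and directories are never requested
    if local is None:
        return True  # assumption: a missing local file is out of date
    # Up to date means the same size and the same timestamp.
    return (local.size, local.mtime) != (remote.size, remote.mtime)
```

Only the entries for which this returns true would have block metadata sent
for them.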
.Pp
After the second phase has completed and prior to writing the
end-of-data signal, the client receiver reads statistics.
This is only performed with
.Pq Fl v .
.Pp
.Bl -enum -compact
.It
total bytes read (long)
.It
total bytes written (long)
.It
total size of files (long)
.El
.Pp
Finally, the receiver must send the constant end-of-sequence marker.
.Pp
.Bl -enum -compact
.It
end-of-sequence -1 value (integer)
.El
.Ss Sender and receiver asynchrony
The sender and receiver need not work in lockstep.
The receiver may send file update requests as quickly as it parses them,
and respond to the sender's update notices on demand.
Similarly, the sender may read as many update requests as it can, and
service them in any order it wishes.
.Pp
The sender and receiver synchronise state only at the end of phase.
.Pp
The reference
.Xr rsync 1
takes advantage of this with a two-process receiver, one for sending
update requests (the generator) and another for receiving.
.Xr openrsync 1
uses an event-loop model instead.
.\" .Sh CONTEXT
.\" For section 9 functions only.
.\" .Sh RETURN VALUES
.\" For sections 2, 3, and 9 function return values only.
.\" .Sh ENVIRONMENT
.\" For sections 1, 6, 7, and 8 only.
.\" .Sh FILES
.\" .Sh EXIT STATUS
.\" For sections 1, 6, and 8 only.
.\" .Sh EXAMPLES
.\" .Sh DIAGNOSTICS
.\" For sections 1, 4, 6, 7, 8, and 9 printf/stderr messages only.
.\" .Sh ERRORS
.\" For sections 2, 3, 4, and 9 errno settings only.
.Sh SEE ALSO
.Xr openrsync 1 ,
.Xr rsync 1 ,
.Xr rsyncd 5
.\" .Sh STANDARDS
.\" .Sh HISTORY
.\" .Sh AUTHORS
.\" .Sh CAVEATS
.Sh BUGS
Time values are sent as 32-bit integers.
.Pp
When in server mode
.Em and
when communicating with a client with a newer protocol (>27), the phase
change integer (-1) acknowledgement must be sent twice by the sender.
This is probably a bug in the reference implementation.
526