1This file will describe the design, layouts, and file formats of a
2libsvn_fs_x repository.
3
Since FSX is still in a very early phase of its development, all sections
are either subject to major change or simply "TBD".
6
7Design
8------
9
10TBD.
11
FSX is similar to FSFS format 7 but uses a radically different on-disk
format.
13
14In FSFS, each committed revision is represented as an immutable file
15containing the new node-revisions, contents, and changed-path
16information for the revision, plus a second, changeable file
17containing the revision properties.
18
19To reduce the size of the on-disk representation, revision data gets
20packed, i.e. multiple revision files get combined into a single pack
21file of smaller total size.  The same strategy is applied to revprops.
22
23In-progress transactions are represented with a prototype rev file
24containing only the new text representations of files (appended to as
25changed file contents come in), along with a separate file for each
26node-revision, directory representation, or property representation
27which has been changed or added in the transaction.  During the final
28stage of the commit, these separate files are marshalled onto the end
29of the prototype rev file to form the immutable revision file.
30
31Layout of the FS directory
32--------------------------
33
34The layout of the FS directory (the "db" subdirectory of the
35repository) is:
36
37  revs/               Subdirectory containing revs
38    <shard>/          Shard directory, if sharding is in use (see below)
39      <revnum>        File containing rev <revnum>
40    <shard>.pack/     Pack directory, if the repo has been packed (see below)
41      pack            Pack file, if the repository has been packed (see below)
42      manifest        Pack manifest file, if a pack file exists (see below)
43  revprops/           Subdirectory containing rev-props
44    <shard>/          Shard directory, if sharding is in use (see below)
45      <revnum>        File containing rev-props for <revnum>
46    <shard>.pack/     Pack directory, if the repo has been packed (see below)
47      <rev>.<count>   Pack file, if the repository has been packed (see below)
48      manifest        Pack manifest file, if a pack file exists (see below)
49    revprops.db       SQLite database of the packed revision properties
50  transactions/       Subdirectory containing transactions
51    <txnid>.txn/      Directory containing transaction <txnid>
52  txn-protorevs/      Subdirectory containing transaction proto-revision files
53    <txnid>.rev       Proto-revision file for transaction <txnid>
54    <txnid>.rev-lock  Write lock for proto-rev file
55  txn-current         File containing the next transaction key
56  locks/              Subdirectory containing locks
57    <partial-digest>/ Subdirectory named for first 3 letters of an MD5 digest
58      <digest>        File containing locks/children for path with <digest>
59  current             File specifying current revision and next node/copy id
  fs-type             File identifying this filesystem as an FSX filesystem
61  write-lock          Empty file, locked to serialise writers
62  pack-lock           Empty file, locked to serialise 'svnadmin pack' (f. 7+)
63  txn-current-lock    Empty file, locked to serialise 'txn-current'
64  uuid                File containing the UUID of the repository
65  format              File containing the format number of this filesystem
66  fsx.conf            Configuration file
67  min-unpacked-rev    File containing the oldest revision not in a pack file
68  min-unpacked-revprop File containing the oldest revision of unpacked revprop
69  rep-cache.db        SQLite database mapping rep checksums to locations
70
71Files in the revprops directory are in the hash dump format used by
72svn_hash_write.
73
74The format of the "current" file is a single line of the form
75"<youngest-revision>\n" giving the youngest revision for the
76repository.
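
As an illustration only, the "current" file could be read as sketched
below.  The function name parse_current is hypothetical and not part of
libsvn_fs_x; the sketch assumes nothing beyond the single-line format
described above.

```python
def parse_current(text: str) -> int:
    """Return the youngest revision stored in the text of a "current" file.

    Hypothetical helper: it accepts exactly one line of ASCII decimal
    digits followed by a newline, as described above.
    """
    if not text.endswith("\n"):
        raise ValueError("'current' must end with a newline")
    line = text[:-1]
    if not (line.isascii() and line.isdigit()):
        raise ValueError("'current' must contain a single revision number")
    return int(line)
```

A reader would pass the whole file contents, e.g. parse_current("42\n").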
77
78The "write-lock" file is an empty file which is locked before the
79final stage of a commit and unlocked after the new "current" file has
80been moved into place to indicate that a new revision is present.  It
81is also locked during a revprop propchange while the revprop file is
82read in, mutated, and written out again.  Furthermore, it will be used
83to serialize the repository structure changes during 'svnadmin pack'
84(see also next section).  Note that readers are never blocked by any
85operation - writers must ensure that the filesystem is always in a
86consistent state.
87
88The "pack-lock" file is an empty file which is locked before an 'svnadmin
89pack' operation commences.  Thus, only one process may attempt to modify
90the repository structure at a time while other processes may still read
91and write (commit) to the repository during most of the pack procedure.
92It is only available with format 7 and newer repositories.  Older formats
93use the global write-lock instead which disables commits completely
94for the duration of the pack process.
95
96The "txn-current" file is a file with a single line of text that
97contains only a base-36 number.  The current value will be used in the
98next transaction name, along with the revision number the transaction
99is based on.  This sequence number ensures that transaction names are
100not reused, even if the transaction is aborted and a new transaction
based on the same revision is begun.  The only operation that FSX
102performs on this file is "get and increment"; the "txn-current-lock"
103file is locked during this operation.
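
The "get and increment" step can be sketched as follows.  This is a
minimal illustration, not the library's code: the function names are
hypothetical, lowercase base-36 digits are an assumption, and the
locking of "txn-current-lock" around the read and write is left to the
caller.

```python
import string
from typing import Tuple

# Assumption: base-36 digits are 0-9 followed by lowercase a-z.
DIGITS = string.digits + string.ascii_lowercase


def to_base36(n: int) -> str:
    """Format a non-negative integer as a base-36 string."""
    if n < 0:
        raise ValueError("negative numbers are not supported")
    if n == 0:
        return "0"
    out = []
    while n:
        n, remainder = divmod(n, 36)
        out.append(DIGITS[remainder])
    return "".join(reversed(out))


def get_and_increment(contents: str) -> Tuple[str, str]:
    """Return (key for the new transaction, new file contents).

    The caller is expected to hold the "txn-current-lock" while it
    reads the old contents and writes the new ones back.
    """
    current = contents.strip()
    value = int(current, 36)
    return current, to_base36(value + 1) + "\n"
```

For example, a file containing "z\n" yields the key "z" and is
rewritten as "10\n".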
104
105"fsx.conf" is a configuration file in the standard Subversion/Python
106config format.  It is automatically generated when you create a new
107repository; read the generated file for details on what it controls.
108
109When representation sharing is enabled, the filesystem tracks
110representation checksum and location mappings using a SQLite database in
111"rep-cache.db".  The database has a single table, which stores the sha1
112hash text as the primary key, mapped to the representation revision, offset,
113size and expanded size.  This file is only consulted during writes and never
114during reads.  Consequently, it is not required, and may be removed at an
arbitrary time, with the subsequent loss of rep-sharing capabilities for
116revisions written thereafter.
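
The single-table mapping described above might look like the following
sketch.  The table name, column names and helper functions are
assumptions made for illustration; consult the real schema before
relying on any of them.

```python
import sqlite3


def open_rep_cache(path=":memory:"):
    """Open (or create) a rep-cache style database.

    Table and column names here are hypothetical.
    """
    db = sqlite3.connect(path)
    db.execute(
        """CREATE TABLE IF NOT EXISTS rep_cache (
               hash          TEXT    NOT NULL PRIMARY KEY,
               revision      INTEGER NOT NULL,
               rep_offset    INTEGER NOT NULL,
               size          INTEGER NOT NULL,
               expanded_size INTEGER NOT NULL)"""
    )
    return db


def remember(db, sha1_hex, revision, rep_offset, size, expanded_size):
    """Record where a representation with the given SHA-1 is stored."""
    db.execute(
        "INSERT OR IGNORE INTO rep_cache VALUES (?, ?, ?, ?, ?)",
        (sha1_hex, revision, rep_offset, size, expanded_size),
    )
    db.commit()


def lookup(db, sha1_hex):
    """Return (revision, rep_offset, size, expanded_size) or None."""
    return db.execute(
        "SELECT revision, rep_offset, size, expanded_size "
        "FROM rep_cache WHERE hash = ?",
        (sha1_hex,),
    ).fetchone()
```

A writer would call lookup() before storing a new representation and
remember() after storing one; a miss (None) merely means the
representation is written again without sharing.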
117
118Filesystem formats
119------------------
120
121TBD.
122
123The "format" file defines what features are permitted within the
124filesystem, and indicates changes that are not backward-compatible.
125It serves the same purpose as the repository file of the same name.
126
127So far, there is only format 1.
128
129
130Node-revision IDs
131-----------------
132
133A node-rev ID consists of the following three fields:
134
135    node_revision_id ::= node_id '.' copy_id '.' txn_id
136
137At this level, the form of the ID is the same as for BDB - see the
138section called "ID's" in <../libsvn_fs_base/notes/structure>.
139
140In order to support efficient lookup of node-revisions by their IDs
141and to simplify the allocation of fresh node-IDs during a transaction,
142we treat the fields of a node-rev ID in new and interesting ways.
143
144Within a new transaction:
145
146  New node-revision IDs assigned within a transaction have a txn-id
147  field of the form "t<txnid>".
148
149  When a new node-id or copy-id is assigned in a transaction, the ID
150  used is a "_" followed by a base36 number unique to the transaction.
151
152Within a revision:
153
154  Within a revision file, node-revs have a txn-id field of the form
155  "r<rev>/<offset>", to support easy lookup. The <offset> is the (ASCII
156  decimal) number of bytes from the start of the revision file to the
157  start of the node-rev.
158
159  During the final phase of a commit, node-revision IDs are rewritten
160  to have repository-wide unique node-ID and copy-ID fields, and to have
161  "r<rev>/<offset>" txn-id fields.
162
  Uniqueness is achieved by changing a temporary
  id of "_<base36>" to "<base36>-<rev>".  Note that this means that the
165  originating revision of a line of history or a copy can be determined
166  by looking at the node ID.
167
168The temporary assignment of node-ID and copy-ID fields has
169implications for svn_fs_compare_ids and svn_fs_check_related.  The ID
170_1.0.t1 is not related to the ID _1.0.t2 even though they have the
171same node-ID, because temporary node-IDs are restricted in scope to
172the transactions they belong to.
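
The ID handling described in this section can be sketched as below.
All names are hypothetical, the sketch assumes that no field contains a
'.', and check_related only models the scoping rule for temporary
node-IDs stated above; it is not the svn_fs_check_related
implementation.

```python
from typing import NamedTuple


class NodeRevId(NamedTuple):
    node_id: str
    copy_id: str
    txn_id: str


def parse_id(text: str) -> NodeRevId:
    """Split "node_id.copy_id.txn_id" into its three fields."""
    parts = text.split(".")
    if len(parts) != 3:
        raise ValueError("expected three '.'-separated fields")
    return NodeRevId(*parts)


def unparse_id(nr: NodeRevId) -> str:
    return ".".join(nr)


def finalize_field(field: str, rev: int) -> str:
    """Turn a temporary "_<base36>" into "<base36>-<rev>"."""
    if field.startswith("_"):
        return f"{field[1:]}-{rev}"
    return field


def finalize_id(nr: NodeRevId, rev: int, offset: int) -> NodeRevId:
    """Rewrite an in-transaction ID as done in the final commit phase."""
    return NodeRevId(
        finalize_field(nr.node_id, rev),
        finalize_field(nr.copy_id, rev),
        f"r{rev}/{offset}",
    )


def check_related(a: NodeRevId, b: NodeRevId) -> bool:
    """Temporary node-IDs only compare equal within one transaction."""
    if a.node_id != b.node_id:
        return False
    if a.node_id.startswith("_"):
        return a.txn_id == b.txn_id
    return True
```

For instance, committing "_1.0.t1" as revision 7 at offset 1234 gives
"1-7.0.r7/1234".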
173
174Copy-IDs and copy roots
175-----------------------
176
177Copy-IDs are assigned in the same manner as they are in the BDB
178implementation:
179
180  * A node-rev resulting from a creation operation (with no copy
181    history) receives the copy-ID of its parent directory.
182
183  * A node-rev resulting from a copy operation receives a fresh
184    copy-ID, as one would expect.
185
186  * A node-rev resulting from a modification operation receives a
187    copy-ID depending on whether its predecessor derives from a
188    copy operation or whether it derives from a creation operation
189    with no intervening copies:
190
191      - If the predecessor does not derive from a copy, the new
192        node-rev receives the copy-ID of its parent directory.  If the
193        node-rev is being modified through its created-path, this will
194        be the same copy-ID as the predecessor node-rev has; however,
195        if the node-rev is being modified through a copied ancestor
196        directory (i.e. we are performing a "lazy copy"), this will be
197        a different copy-ID.
198
199      - If the predecessor derives from a copy and the node-rev is
200        being modified through its created-path, the new node-rev
201        receives the copy-ID of the predecessor.
202
203      - If the predecessor derives from a copy and the node-rev is not
204        being modified through its created path, the new node-rev
205        receives a fresh copy-ID.  This is called a "soft copy"
206        operation, as distinct from a "true copy" operation which was
207        actually requested through the svn_fs interface.  Soft copies
208        exist to ensure that the same <node-ID,copy-ID> pair is not
209        used twice within a transaction.
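
The rules above amount to a small decision procedure, sketched here
under stated assumptions: the function name, its parameters and the
operation labels "create", "copy" and "modify" are all hypothetical,
and fresh_copy_id stands for whatever allocates a new copy-ID within
the transaction.

```python
from typing import Callable


def assign_copy_id(
    operation: str,
    parent_copy_id: str,
    fresh_copy_id: Callable[[], str],
    predecessor_copy_id: str = "",
    derives_from_copy: bool = False,
    via_created_path: bool = True,
) -> str:
    """Pick the copy-ID for a new node-rev following the rules above."""
    if operation == "create":
        # No copy history: inherit from the parent directory.
        return parent_copy_id
    if operation == "copy":
        # A true copy always receives a fresh copy-ID.
        return fresh_copy_id()
    if operation != "modify":
        raise ValueError(f"unknown operation: {operation!r}")
    if not derives_from_copy:
        # Equals the predecessor's copy-ID unless this is a lazy copy.
        return parent_copy_id
    if via_created_path:
        return predecessor_copy_id
    # Soft copy: keeps <node-ID,copy-ID> pairs unique in the transaction.
    return fresh_copy_id()
```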
210
211Unlike the BDB implementation, we do not have a "copies" table.
212Instead, each node-revision record contains a "copyroot" field
213identifying the node-rev resulting from the true copy operation most
214proximal to the node-rev.  If the node-rev does not itself derive from
215a copy operation, then the copyroot field identifies the copy of an
216ancestor directory; if no ancestor directories derive from a copy
217operation, then the copyroot field identifies the root directory of
218rev 0.
219
220Revision file format
221--------------------
222
223TBD
224
225A revision file contains a concatenation of various kinds of data:
226
227  * Text and property representations
228  * Node-revisions
229  * The changed-path data
230
231That data is aggregated in compressed containers with a binary on-disk
232representation.
233
234Transaction layout
235------------------
236
237A transaction directory has the following layout:
238
239  props                      Transaction props
240  props-final                Final transaction props (optional)
241  next-ids                   Next temporary node-ID and copy-ID
242  changes                    Changed-path information so far
243  node.<nid>.<cid>           New node-rev data for node
244  node.<nid>.<cid>.props     Props for new node-rev, if changed
245  node.<nid>.<cid>.children  Directory contents for node-rev
246  <sha1>                     Text representation of that sha1
247
248  txn-protorevs/rev          Prototype rev file with new text reps
249  txn-protorevs/rev-lock     Lockfile for writing to the above
250
251The prototype rev file is used to store the text representations as
252they are received from the client.  To ensure that only one client is
253writing to the file at a given time, the "rev-lock" file is locked for
254the duration of each write.
255
256The three kinds of props files are all in hash dump format.  The "props"
257file will always be present.  The "node.<nid>.<cid>.props" file will
only be present if the node-rev properties have been changed.  The
"props-final" file only exists while converting the transaction into a
revision.
260
261The <sha1> files' content is that of text rep references:
262"<rev> <offset> <length> <size> <digest>"
263They will be written for text reps in the current transaction and be
264used to eliminate duplicate reps within that transaction.
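
A text rep reference of the form shown above could be parsed as in the
following sketch.  The RepRef type and parse_rep_ref function are
hypothetical, and treating the first four fields as decimal integers is
an assumption.

```python
from typing import NamedTuple


class RepRef(NamedTuple):
    rev: int
    offset: int
    length: int
    size: int
    digest: str


def parse_rep_ref(line: str) -> RepRef:
    """Parse "<rev> <offset> <length> <size> <digest>"."""
    fields = line.split()
    if len(fields) != 5:
        raise ValueError("expected five space-separated fields")
    rev, offset, length, size = (int(f) for f in fields[:4])
    return RepRef(rev, offset, length, size, fields[4])
```

Looking up the <sha1> file for new content and parsing it this way
would let a writer reuse an existing rep instead of storing a duplicate
within the same transaction.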
265
266The "next-ids" file contains a single line "<next-temp-node-id>
267<next-temp-copy-id>\n" giving the next temporary node-ID and copy-ID
268assignments (without the leading underscores).  The next node-ID is
269also used as a uniquifier for representations which may share the same
270underlying rep.
271
272The "children" file for a node-revision begins with a copy of the hash
273dump representation of the directory entries from the old node-rev (or
274a dump of the empty hash for new directories), and then an incremental
275hash dump entry for each change made to the directory.
276
277The "changes" file contains changed-path entries in the same form as
278the changed-path entries in a rev file, except that <id> and <action>
279may both be "reset" (in which case <text-mod> and <prop-mod> are both
280always "false") to indicate that all changes to a path should be
281considered undone.  Reset entries are only used during the final merge
282phase of a transaction.  Actions in the "changes" file always contain
283a node kind.
284
285The node-rev files have the same format as node-revs in a revision
286file, except that the "text" and "props" fields are augmented as
287follows:
288
289  * The "props" field may have the value "-1" if properties have
290    been changed and are contained in a "props" file within the
291    node-rev subdirectory.
292
293  * For directory node-revs, the "text" field may have the value
294    "-1" if entries have been changed and are contained in a
295    "contents" file in the node-rev subdirectory.
296
297  * For the directory node-rev representing the root of the
298    transaction, the "is-fresh-txn-root" field indicates that it has
299    not been made mutable yet (see Issue #2608).
300
301  * For file node-revs, the "text" field may have the value "-1
302    <offset> <length> <size> <digest>" if the text representation is
303    within the prototype rev file.
304
305  * The "copyroot" field may have the value "-1 <created-path>" if the
306    copy root of the node-rev is part of the transaction in process.
307
308
309Locks layout
310------------
311
312Locks in FSX are stored in serialized hash format in files whose
313names are MD5 digests of the FS path which the lock is associated
314with.  For the purposes of keeping directory inode usage down, these
315digest files live in subdirectories of the main lock directory whose
316names are the first 3 characters of the digest filename.
317
318Also stored in the digest file for a given FS path are pointers to
319other digest files which contain information associated with other FS
320paths that are beneath our path (an immediate child thereof, or a
321grandchild, or a great-grandchild, ...).
322
323To answer the question, "Does path FOO have a lock associated with
324it?", one need only generate the MD5 digest of FOO's
325absolute-in-the-FS path (say, 3b1b011fed614a263986b5c4869604e8), look
326for a file located like so:
327
328   /path/to/repos/locks/3b1/3b1b011fed614a263986b5c4869604e8
329
330And then see if that file contains lock information.
331
332To inquire about locks on children of the path FOO, you would
333reference the same path as above, but look for a list of children in
334that file (instead of lock information).  Children are listed as MD5
335digests, too, so you would simply iterate over those digests and
336consult the files they reference for lock information.
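
The location of a digest file can be computed as sketched below.  The
function name is hypothetical, and encoding the FS path as UTF-8 before
hashing is an assumption.

```python
import hashlib
import posixpath


def lock_digest_path(locks_dir: str, fs_path: str) -> str:
    """Return the digest file location for an absolute-in-the-FS path.

    The file lives in a subdirectory named after the first 3
    characters of the MD5 digest, as described above.
    """
    digest = hashlib.md5(fs_path.encode("utf-8")).hexdigest()
    return posixpath.join(locks_dir, digest[:3], digest)
```

Calling lock_digest_path("/path/to/repos/locks", "/trunk/FOO") returns
a path of the same shape as the example above; whether that file
exists and contains lock information answers the question for FOO.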
337