This file will describe the design, layouts, and file formats of a
libsvn_fs_x repository.

Since FSX is still in a very early phase of its development, all sections
are either subject to major change or simply "TBD".

Design
------

TBD.

FSX is similar to FSFS format 7 but uses a radically different on-disk
format.

In FSFS, each committed revision is represented as an immutable file
containing the new node-revisions, contents, and changed-path
information for the revision, plus a second, changeable file
containing the revision properties.

To reduce the size of the on-disk representation, revision data gets
packed, i.e. multiple revision files get combined into a single pack
file of smaller total size. The same strategy is applied to revprops.

In-progress transactions are represented with a prototype rev file
containing only the new text representations of files (appended to as
changed file contents come in), along with a separate file for each
node-revision, directory representation, or property representation
which has been changed or added in the transaction. During the final
stage of the commit, these separate files are marshalled onto the end
of the prototype rev file to form the immutable revision file.

Layout of the FS directory
--------------------------

The layout of the FS directory (the "db" subdirectory of the
repository) is:

  revs/               Subdirectory containing revs
    <shard>/          Shard directory, if sharding is in use (see below)
      <revnum>        File containing rev <revnum>
    <shard>.pack/     Pack directory, if the repo has been packed (see below)
      pack            Pack file, if the repository has been packed (see below)
      manifest        Pack manifest file, if a pack file exists (see below)
  revprops/           Subdirectory containing rev-props
    <shard>/          Shard directory, if sharding is in use (see below)
      <revnum>        File containing rev-props for <revnum>
    <shard>.pack/     Pack directory, if the repo has been packed (see below)
      <rev>.<count>   Pack file, if the repository has been packed (see below)
      manifest        Pack manifest file, if a pack file exists (see below)
    revprops.db       SQLite database of the packed revision properties
  transactions/       Subdirectory containing transactions
    <txnid>.txn/      Directory containing transaction <txnid>
  txn-protorevs/      Subdirectory containing transaction proto-revision files
    <txnid>.rev       Proto-revision file for transaction <txnid>
    <txnid>.rev-lock  Write lock for proto-rev file
  txn-current         File containing the next transaction key
  locks/              Subdirectory containing locks
    <partial-digest>/ Subdirectory named for first 3 letters of an MD5 digest
      <digest>        File containing locks/children for path with <digest>
  current             File specifying current revision and next node/copy id
  fs-type             File identifying this filesystem as an FSX filesystem
  write-lock          Empty file, locked to serialise writers
  pack-lock           Empty file, locked to serialise 'svnadmin pack' (f. 7+)
  txn-current-lock    Empty file, locked to serialise 'txn-current'
  uuid                File containing the UUID of the repository
  format              File containing the format number of this filesystem
  fsx.conf            Configuration file
  min-unpacked-rev    File containing the oldest revision not in a pack file
  min-unpacked-revprop File containing the oldest revision of unpacked revprop
  rep-cache.db        SQLite database mapping rep checksums to locations

Files in the revprops directory are in the hash dump format used by
svn_hash_write.

The format of the "current" file is a single line of the form
"<youngest-revision>\n" giving the youngest revision for the
repository.

The "write-lock" file is an empty file which is locked before the
final stage of a commit and unlocked after the new "current" file has
been moved into place to indicate that a new revision is present. It
is also locked during a revprop propchange while the revprop file is
read in, mutated, and written out again. Furthermore, it will be used
to serialize the repository structure changes during 'svnadmin pack'
(see also next section). Note that readers are never blocked by any
operation - writers must ensure that the filesystem is always in a
consistent state.

The "pack-lock" file is an empty file which is locked before an 'svnadmin
pack' operation commences. Thus, only one process may attempt to modify
the repository structure at a time while other processes may still read
and write (commit) to the repository during most of the pack procedure.
It is only available with format 7 and newer repositories. Older formats
use the global write-lock instead which disables commits completely
for the duration of the pack process.

The "txn-current" file is a file with a single line of text that
contains only a base-36 number. The current value will be used in the
next transaction name, along with the revision number the transaction
is based on.
This sequence number ensures that transaction names are
not reused, even if the transaction is aborted and a new transaction
based on the same revision is begun. The only operation that FSX
performs on this file is "get and increment"; the "txn-current-lock"
file is locked during this operation.

"fsx.conf" is a configuration file in the standard Subversion/Python
config format. It is automatically generated when you create a new
repository; read the generated file for details on what it controls.

When representation sharing is enabled, the filesystem tracks
representation checksum and location mappings using a SQLite database in
"rep-cache.db". The database has a single table, which stores the sha1
hash text as the primary key, mapped to the representation revision, offset,
size and expanded size. This file is only consulted during writes and never
during reads. Consequently, it is not required, and may be removed at an
arbitrary time, with the subsequent loss of rep-sharing capabilities for
revisions written thereafter.

Filesystem formats
------------------

TBD.

The "format" file defines what features are permitted within the
filesystem, and indicates changes that are not backward-compatible.
It serves the same purpose as the repository file of the same name.

So far, there is only format 1.


Node-revision IDs
-----------------

A node-rev ID consists of the following three fields:

    node_revision_id ::= node_id '.' copy_id '.' txn_id

At this level, the form of the ID is the same as for BDB - see the
section called "ID's" in <../libsvn_fs_base/notes/structure>.

In order to support efficient lookup of node-revisions by their IDs
and to simplify the allocation of fresh node-IDs during a transaction,
we treat the fields of a node-rev ID in new and interesting ways.

Within a new transaction:

  New node-revision IDs assigned within a transaction have a txn-id
  field of the form "t<txnid>".

  When a new node-id or copy-id is assigned in a transaction, the ID
  used is a "_" followed by a base36 number unique to the transaction.

Within a revision:

  Within a revision file, node-revs have a txn-id field of the form
  "r<rev>/<offset>", to support easy lookup. The <offset> is the (ASCII
  decimal) number of bytes from the start of the revision file to the
  start of the node-rev.

  During the final phase of a commit, node-revision IDs are rewritten
  to have repository-wide unique node-ID and copy-ID fields, and to have
  "r<rev>/<offset>" txn-id fields.

  This uniqueness is achieved by changing a temporary
  id of "_<base36>" to "<base36>-<rev>". Note that this means that the
  originating revision of a line of history or a copy can be determined
  by looking at the node ID.

The temporary assignment of node-ID and copy-ID fields has
implications for svn_fs_compare_ids and svn_fs_check_related. The ID
_1.0.t1 is not related to the ID _1.0.t2 even though they have the
same node-ID, because temporary node-IDs are restricted in scope to
the transactions they belong to.

Copy-IDs and copy roots
-----------------------

Copy-IDs are assigned in the same manner as they are in the BDB
implementation:

  * A node-rev resulting from a creation operation (with no copy
    history) receives the copy-ID of its parent directory.

  * A node-rev resulting from a copy operation receives a fresh
    copy-ID, as one would expect.

  * A node-rev resulting from a modification operation receives a
    copy-ID depending on whether its predecessor derives from a
    copy operation or whether it derives from a creation operation
    with no intervening copies:

    - If the predecessor does not derive from a copy, the new
      node-rev receives the copy-ID of its parent directory. If the
      node-rev is being modified through its created-path, this will
      be the same copy-ID as the predecessor node-rev has; however,
      if the node-rev is being modified through a copied ancestor
      directory (i.e. we are performing a "lazy copy"), this will be
      a different copy-ID.

    - If the predecessor derives from a copy and the node-rev is
      being modified through its created-path, the new node-rev
      receives the copy-ID of the predecessor.

    - If the predecessor derives from a copy and the node-rev is not
      being modified through its created path, the new node-rev
      receives a fresh copy-ID. This is called a "soft copy"
      operation, as distinct from a "true copy" operation which was
      actually requested through the svn_fs interface. Soft copies
      exist to ensure that the same <node-ID,copy-ID> pair is not
      used twice within a transaction.

Unlike the BDB implementation, we do not have a "copies" table.
Instead, each node-revision record contains a "copyroot" field
identifying the node-rev resulting from the true copy operation most
proximal to the node-rev. If the node-rev does not itself derive from
a copy operation, then the copyroot field identifies the copy of an
ancestor directory; if no ancestor directories derive from a copy
operation, then the copyroot field identifies the root directory of
rev 0.
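
The copy-ID assignment rules listed in this section can be read as a
small decision procedure. The following Python sketch only illustrates
those rules; it is not the FSX implementation. The function name, its
parameters, and the "fresh_ids" iterator (anything yielding copy-IDs
not yet used in the transaction) are all hypothetical.

```python
def choose_copy_id(operation, parent_copy_id, fresh_ids,
                   predecessor_copy_id=None,
                   predecessor_from_copy=False,
                   via_created_path=True):
    """Pick the copy-ID for a new node-rev (illustrative sketch only).

    operation is one of "create", "copy" or "modify".  fresh_ids is a
    hypothetical iterator yielding unused copy-IDs.
    """
    if operation == "create":
        # No copy history: inherit the parent directory's copy-ID.
        return parent_copy_id
    if operation == "copy":
        # A true copy always receives a fresh copy-ID.
        return next(fresh_ids)
    if operation == "modify":
        if not predecessor_from_copy:
            # Covers the "lazy copy" case as well: the parent's
            # copy-ID may differ from the predecessor's.
            return parent_copy_id
        if via_created_path:
            return predecessor_copy_id
        # Soft copy: predecessor derives from a copy, but the node is
        # reached through a different path than its created-path.
        return next(fresh_ids)
    raise ValueError("unknown operation: %r" % (operation,))
```

For example, modifying a node whose predecessor derives from a copy,
through a path other than its created-path, consumes a fresh copy-ID,
while the same modification through the created-path reuses the
predecessor's copy-ID.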

Revision file format
--------------------

TBD

A revision file contains a concatenation of various kinds of data:

  * Text and property representations
  * Node-revisions
  * The changed-path data

That data is aggregated in compressed containers with a binary on-disk
representation.

Transaction layout
------------------

A transaction directory has the following layout:

  props                      Transaction props
  props-final                Final transaction props (optional)
  next-ids                   Next temporary node-ID and copy-ID
  changes                    Changed-path information so far
  node.<nid>.<cid>           New node-rev data for node
  node.<nid>.<cid>.props     Props for new node-rev, if changed
  node.<nid>.<cid>.children  Directory contents for node-rev
  <sha1>                     Text representation of that sha1

  txn-protorevs/rev          Prototype rev file with new text reps
  txn-protorevs/rev-lock     Lockfile for writing to the above

The prototype rev file is used to store the text representations as
they are received from the client. To ensure that only one client is
writing to the file at a given time, the "rev-lock" file is locked for
the duration of each write.

The three kinds of props files are all in hash dump format. The "props"
file will always be present. The "node.<nid>.<cid>.props" file will
only be present if the node-rev properties have been changed. The
"props-final" only exists while converting the transaction into a revision.

The <sha1> files' content is that of text rep references:
"<rev> <offset> <length> <size> <digest>"
They will be written for text reps in the current transaction and be
used to eliminate duplicate reps within that transaction.

The "next-ids" file contains a single line "<next-temp-node-id>
<next-temp-copy-id>\n" giving the next temporary node-ID and copy-ID
assignments (without the leading underscores).
The next node-ID is
also used as a uniquifier for representations which may share the same
underlying rep.

The "children" file for a node-revision begins with a copy of the hash
dump representation of the directory entries from the old node-rev (or
a dump of the empty hash for new directories), and then an incremental
hash dump entry for each change made to the directory.

The "changes" file contains changed-path entries in the same form as
the changed-path entries in a rev file, except that <id> and <action>
may both be "reset" (in which case <text-mod> and <prop-mod> are both
always "false") to indicate that all changes to a path should be
considered undone. Reset entries are only used during the final merge
phase of a transaction. Actions in the "changes" file always contain
a node kind.

The node-rev files have the same format as node-revs in a revision
file, except that the "text" and "props" fields are augmented as
follows:

  * The "props" field may have the value "-1" if properties have
    been changed and are contained in a "props" file within the
    node-rev subdirectory.

  * For directory node-revs, the "text" field may have the value
    "-1" if entries have been changed and are contained in a
    "contents" file in the node-rev subdirectory.

  * For the directory node-rev representing the root of the
    transaction, the "is-fresh-txn-root" field indicates that it has
    not been made mutable yet (see Issue #2608).

  * For file node-revs, the "text" field may have the value "-1
    <offset> <length> <size> <digest>" if the text representation is
    within the prototype rev file.

  * The "copyroot" field may have the value "-1 <created-path>" if the
    copy root of the node-rev is part of the transaction in process.
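
As an illustration of the "next-ids" file described earlier in this
section, the following Python sketch parses its single line and hands
out the next temporary node-ID. It is a sketch under assumptions, not
the FSX code: the function names are hypothetical, the two counters
are assumed to be base-36 numbers using the digits 0-9 followed by
a-z, and the counter is assumed to advance by one per allocation.

```python
import string

# Assumption: base-36 digits are 0-9 followed by lowercase a-z.
BASE36_DIGITS = string.digits + string.ascii_lowercase


def to_base36(value):
    """Format a non-negative integer as a lowercase base-36 string."""
    if value < 0:
        raise ValueError("value must be non-negative")
    if value == 0:
        return "0"
    digits = []
    while value:
        value, remainder = divmod(value, 36)
        digits.append(BASE36_DIGITS[remainder])
    return "".join(reversed(digits))


def parse_next_ids(line):
    """Split a "next-ids" line into (node, copy) counters as integers."""
    node_id, copy_id = line.rstrip("\n").split(" ")
    return int(node_id, 36), int(copy_id, 36)


def allocate_node_id(line):
    """Return (temporary node-ID, updated "next-ids" line).

    The temporary ID carries the leading underscore that the file
    itself omits.
    """
    next_node, next_copy = parse_next_ids(line)
    temp_id = "_" + to_base36(next_node)
    new_line = "%s %s\n" % (to_base36(next_node + 1),
                            to_base36(next_copy))
    return temp_id, new_line
```

Under these assumptions, a "next-ids" line of "z 3\n" yields the
temporary node-ID "_z" and an updated line of "10 3\n".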


Locks layout
------------

Locks in FSX are stored in serialized hash format in files whose
names are MD5 digests of the FS path which the lock is associated
with. For the purposes of keeping directory inode usage down, these
digest files live in subdirectories of the main lock directory whose
names are the first 3 characters of the digest filename.

Also stored in the digest file for a given FS path are pointers to
other digest files which contain information associated with other FS
paths that are beneath our path (an immediate child thereof, or a
grandchild, or a great-grandchild, ...).

To answer the question, "Does path FOO have a lock associated with
it?", one need only generate the MD5 digest of FOO's
absolute-in-the-FS path (say, 3b1b011fed614a263986b5c4869604e8), look
for a file located like so:

   /path/to/repos/locks/3b1/3b1b011fed614a263986b5c4869604e8

And then see if that file contains lock information.

To inquire about locks on children of the path FOO, you would
reference the same path as above, but look for a list of children in
that file (instead of lock information). Children are listed as MD5
digests, too, so you would simply iterate over those digests and
consult the files they reference for lock information.
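
The lookup described above amounts to hashing the FS path and
splitting off a three-character prefix. The following Python sketch
shows that computation. The function name and the "locks_dir"
parameter are hypothetical, and it assumes the digest is taken over
the UTF-8 bytes of the absolute-in-the-FS path and written in
lowercase hexadecimal.

```python
import hashlib
import posixpath


def lock_digest_path(locks_dir, fs_path):
    """Return the digest file location for fs_path (illustrative).

    Assumption: the MD5 digest covers the UTF-8 encoding of the
    absolute-in-the-FS path.
    """
    digest = hashlib.md5(fs_path.encode("utf-8")).hexdigest()
    # The subdirectory is named after the first 3 characters of the
    # digest, the file after the full digest.
    return posixpath.join(locks_dir, digest[:3], digest)
```

A caller would then open the returned file, if it exists, and check
whether it holds lock information or a list of child digests.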