Additional Notes
----------------

Here are miscellaneous notes about topics that may not be covered in enough
detail in the usage section.

.. _chunker-params:

``--chunker-params``
~~~~~~~~~~~~~~~~~~~~

The chunker params influence how input files are cut into pieces (chunks)
which are then considered for deduplication. They also have a big impact on
resource usage (RAM and disk space) as the amount of resources needed is
(also) determined by the total number of chunks in the repository (see
:ref:`cache-memory-usage` for details).

``--chunker-params=10,23,16,4095`` results in a fine-grained deduplication
and creates a large number of chunks and thus uses a lot of resources to manage
them. This is good for relatively small data volumes and if the machine has a
good amount of free RAM and disk space.

``--chunker-params=19,23,21,4095`` (default) results in a coarse-grained
deduplication and creates a much smaller number of chunks and thus uses fewer
resources. This is good for relatively big data volumes and if the machine has
a relatively low amount of free RAM and disk space.

If you have already made some archives in a repository and you then change
chunker params, this of course impacts deduplication as the chunks will be
cut differently.

In the worst case (all files are big and were touched in between backups), this
will store all content into the repository again.

Usually, it is not that bad though:

- usually most files are not touched, so it will just re-use the old chunks
  it already has in the repo
- files smaller than the (both old and new) minimum chunksize result in only
  one chunk anyway, so the resulting chunks are the same and deduplication will apply

If you switch chunker params to save resources for an existing repo that
already has some backup archives, you will see an increasing effect over time,
when more and more files have been touched and stored again using the bigger
chunksize **and** all references to the smaller older chunks have been removed
(by deleting / pruning archives).

If you want to see an immediate big effect on resource usage, you should start
a new repository when changing chunker params.

For more details, see :ref:`chunker_details`.


``--noatime / --noctime``
~~~~~~~~~~~~~~~~~~~~~~~~~

You can use these ``borg create`` options to not store the respective timestamp
into the archive, in case you do not really need it.

Besides saving a little space for the timestamp that is not archived, it might
also affect metadata stream deduplication: if only this timestamp changes between
backups and is stored into the metadata stream, the metadata stream chunks
won't deduplicate just because of that.

``--nobsdflags``
~~~~~~~~~~~~~~~~

You can use this to not query and store (or not extract and set) bsdflags -
in case you don't need them or if they are broken somehow for your fs.

On Linux, dealing with the bsdflags needs some additional syscalls.
Especially when dealing with lots of small files, this causes a noticeable
overhead, so you can also use this option to speed up operations.

``--umask``
~~~~~~~~~~~

If you use ``--umask``, make sure that all repository-modifying borg commands
(create, delete, prune) that access the repository in question use the same
``--umask`` value.

If multiple machines access the same repository, this should hold true for all
of them.

``--read-special``
~~~~~~~~~~~~~~~~~~

The ``--read-special`` option is special - you do not want to use it for normal
full-filesystem backups, but rather after carefully picking some targets for it.

The option ``--read-special`` triggers special treatment for block and char
device files as well as FIFOs. Instead of storing them as such a device (or
FIFO), they will get opened, their content will be read and in the backup
archive they will show up like a regular file.

Symlinks will also get special treatment if (and only if) they point to such
a special file: instead of storing them as a symlink, the target special file
will get processed as described above.

One intended use case of this is backing up the contents of one or multiple
block devices, e.g. LVM snapshots, inactive LVs or disk partitions.

You need to be careful about what you include when using ``--read-special``,
e.g. if you include ``/dev/zero``, your backup will never terminate.

Restoring such files' content is currently only supported one at a time via
the ``--stdout`` option (and you have to redirect stdout to wherever it shall go,
maybe directly into an existing device file of your choice or indirectly via
``dd``).

To some extent, mounting a backup archive with the backups of special files
via ``borg mount`` and then loop-mounting the image files from inside the mount
point will work. If you plan to access a lot of data in there, it likely will
scale and perform better if you do not work via the FUSE mount.

Example
+++++++

Imagine you have made some snapshots of logical volumes (LVs) you want to back up.

.. note::

    For some scenarios, this is a good method to get "crash-like" consistency
    (I call it crash-like because it is the same as you would get if you just
    hit the reset button or your machine would abruptly and completely crash).
    This is better than no consistency at all and a good method for some use
    cases, but likely not good enough if you have databases running.

Then you create a backup archive of all these snapshots. The backup process will
see a "frozen" state of the logical volumes, while the processes working in the
original volumes continue changing the data stored there.

You also add the output of ``lvdisplay`` to your backup, so you can see the LV
sizes in case you ever need to recreate and restore them.

After the backup has completed, you remove the snapshots again.

::

    $ # create snapshots here
    $ lvdisplay > lvdisplay.txt
    $ borg create --read-special /path/to/repo::arch lvdisplay.txt /dev/vg0/*-snapshot
    $ # remove snapshots here

Now, let's see how to restore some LVs from such a backup.

::

    $ borg extract /path/to/repo::arch lvdisplay.txt
    $ # create empty LVs with correct sizes here (look into lvdisplay.txt).
    $ # we assume that you created an empty root and home LV and overwrite it now:
    $ borg extract --stdout /path/to/repo::arch dev/vg0/root-snapshot > /dev/vg0/root
    $ borg extract --stdout /path/to/repo::arch dev/vg0/home-snapshot > /dev/vg0/home


.. _append_only_mode:

Append-only mode
~~~~~~~~~~~~~~~~

A repository can be made "append-only", which means that Borg will never overwrite or
delete committed data (append-only refers to the segment files, but borg will also
refuse to delete the repository completely).
This is useful for scenarios where a
backup client machine backs up remotely to a backup server using ``borg serve``, since
a hacked client machine cannot delete backups on the server permanently.

To activate append-only mode, set ``append_only`` to 1 in the repository config:

::

    borg config /path/to/repo append_only 1

Note that you can go back-and-forth between normal and append-only operation with
``borg config``; it's not a "one way trip."

In append-only mode Borg will create a transaction log in the ``transactions`` file,
where each line is a transaction and a UTC timestamp.

In addition, ``borg serve`` can act as if a repository is in append-only mode with
its option ``--append-only``. This can be very useful for fine-tuning access control
in ``.ssh/authorized_keys``:

::

    command="borg serve --append-only ..." ssh-rsa <key used for not-always-trustable backup clients>
    command="borg serve ..." ssh-rsa <key used for backup management>

Running ``borg init`` via a ``borg serve --append-only`` server will *not* create
an append-only repository. Running ``borg init --append-only`` creates an append-only
repository regardless of server settings.

Example
+++++++

Suppose an attacker remotely deleted all backups, but your repository was in append-only
mode. A transaction log in this situation might look like this:

::

    transaction 1, UTC time 2016-03-31T15:53:27.383532
    transaction 5, UTC time 2016-03-31T15:53:52.588922
    transaction 11, UTC time 2016-03-31T15:54:23.887256
    transaction 12, UTC time 2016-03-31T15:55:54.022540
    transaction 13, UTC time 2016-03-31T15:55:55.472564

From your security logs you conclude the attacker gained access at 15:54:00 and all
the backups were deleted or replaced by compromised backups. From the log you know
that transactions 11 and later are compromised.
Note that the transaction ID is the
name of the *last* file in the transaction. For example, transaction 11 spans files 6
to 11.

In a real attack you'll likely want to keep a copy of the compromised repository
to analyze what the attacker tried to achieve. It's also a good idea to make this
copy just in case something goes wrong during the recovery. Since recovery is done by
deleting some files, a hard link copy (``cp -al``) is sufficient.

The first step to reset the repository to transaction 5, the last uncompromised transaction,
is to remove the ``hints.N``, ``index.N`` and ``integrity.N`` files in the repository (these
files are always expendable). In this example N is 13.

Then remove or move all segment files from the segment directories in ``data/`` starting
with file 6::

    rm data/**/{6..13}

That is all you need to do in the repository.

If you want to access this rolled-back repository from a client that already has
a cache for this repository, the cache will reflect a newer repository state
than what you actually have in the repository now, after the rollback.

Thus, you need to clear the cache::

    borg delete --cache-only repo

The cache will get rebuilt automatically. Depending on repo size and archive
count, it may take a while.

You will also need to remove ``~/.config/borg/security/REPOID/manifest-timestamp``.

Drawbacks
+++++++++

As data is only appended, and nothing removed, commands like ``prune`` or ``delete``
won't free disk space; they merely tag data as deleted in a new transaction.

Be aware that as soon as you write to the repo in non-append-only mode (e.g. prune,
delete or create archives from an admin machine), it will remove the deleted objects
permanently (including the ones that were already marked as deleted, but not removed,
in append-only mode).
Automated edits to the repository (such as a cron job running
``borg prune``) will render append-only mode moot if data is deleted.

Even if an archive appears to be available, it is possible an attacker could delete
just a few chunks from an archive and silently corrupt its data. While in append-only
mode, this is reversible, but ``borg check`` should be run before a writing/pruning
operation on an append-only repository to catch accidental or malicious corruption::

    # run without append-only mode
    borg check --verify-data repo

Aside from checking repository and archive integrity, you may also want to manually check
backups to ensure their content seems correct.

Further considerations
++++++++++++++++++++++

Append-only mode is not respected by tools other than Borg. ``rm`` still works on the
repository. Make sure that backup client machines can only access the repository via
``borg serve``.

Ensure that no remote access is possible if the repository is temporarily set to normal mode
for e.g. regular pruning.

Further protections can be implemented, but are outside of Borg's scope. For example,
file system snapshots or wrapping ``borg serve`` to set special permissions or ACLs on
new data files.

SSH batch mode
~~~~~~~~~~~~~~

When running Borg using an automated script, ``ssh`` might still ask for a password,
even if there is an SSH key for the target server. Use this to make scripts more robust::

    export BORG_RSH='ssh -oBatchMode=yes'
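
As a complementary sketch (not something Borg itself requires), batch mode can also
be enabled per host in the OpenSSH client configuration, so that it applies to
scripts that do not set ``BORG_RSH``. ``BatchMode`` is a standard OpenSSH client
option; the host alias ``backupserver`` is a hypothetical placeholder for whatever
name your repository URL uses::

    # ~/.ssh/config
    # "backupserver" is an assumed host alias, adjust it to your setup
    Host backupserver
        BatchMode yes

With either approach, ``ssh`` gives up instead of prompting when key authentication
does not succeed, so an unattended script fails with an error rather than hanging
at a password prompt.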