Storage Daemon Design
=====================

This chapter is a technical discussion of the Storage daemon services.
It is aimed not at end users but at developers and system administrators
who want or need to know more about the working details of **Bareos**.

This document is somewhat out of date.

SD Design Introduction
----------------------

The Bareos Storage daemon provides storage resources to a Bareos
installation. An individual Storage daemon is associated with a physical
permanent storage device (for example, a tape drive, CD writer, tape
changer or jukebox), and may employ auxiliary storage resources
(such as space on a hard disk file system) to increase performance
and/or optimize use of the permanent storage medium.

Any number of storage daemons may be run on a given machine, each
associated with an individual storage device connected to it, and Bareos
operations may employ storage daemons on any number of hosts connected
by a network, local or remote. The ability to employ remote storage
daemons (with appropriate security measures) permits automatic off-site
backup, possibly to publicly available backup repositories.

SD Development Outline
----------------------

In order to provide a high performance backup and restore solution that
scales to very large capacity devices and networks, the storage daemon
must be able to extract as much performance as possible from the storage
device and network with which it interacts. To accomplish this, storage
daemons will eventually have to sacrifice simplicity and painless
portability in favor of techniques which improve performance. My goal in
designing the storage daemon protocol and developing the initial
prototype storage daemon is to provide for these additions in the
future, while implementing an initial storage daemon which is very
simple and portable to almost any POSIX-like environment.
This original
storage daemon (and its evolved descendants) can serve as a portable
solution for non-demanding backup requirements (such as single servers
of modest size, individual machines, or small local networks), while
serving as the starting point for development of higher performance
configurable derivatives which use techniques such as POSIX threads,
shared memory, asynchronous I/O, buffering to high-speed intermediate
media, and support for tape changers and jukeboxes.

SD Connections and Sessions
---------------------------

A client connects to a storage server by initiating a conventional TCP
connection. The storage server accepts the connection unless its maximum
number of connections has been reached or the specified host is not
granted access to the storage server. Once a connection has been opened,
the client may make any number of Query requests, and/or initiate (if
permitted) one or more Append sessions (which transmit data to be
stored by the storage daemon) and/or Read sessions (which retrieve data
from the storage daemon).

Most requests and replies sent across the connection are simple ASCII
strings, with status replies prefixed by a four-digit status code for
easier parsing. Binary data appear in blocks stored to and retrieved
from the storage. Any request may result in a single-line status reply of
“3201 Notification pending”, which indicates the client must send a
“Query notification” request to retrieve one or more notifications
posted to it. Once the notifications have been returned, the client may
then resubmit the request which resulted in the 3201 status.

The following descriptions omit common error codes, yet to be defined,
which can occur from most or many requests due to events like media
errors, restarting of the storage daemon, etc.
These details will be
filled in, along with a comprehensive list of status codes and the
requests that can produce them, in an update to this document.

SD Append Requests
~~~~~~~~~~~~~~~~~~

append open session = <JobId> [ <Password> ]
   A data append session is opened with the Job ID given by *JobId*
   with client password (if required) given by *Password*. If the
   session is successfully opened, a status of 3000 OK is returned with
   a “ticket = \ *number*” reply used to identify subsequent messages
   in the session. If too many sessions are open, or a conflicting
   session exists (for example, a read in progress when simultaneous
   read and append sessions are not permitted), a status of
   “3502 Volume busy” is returned. If no volume is mounted, or the
   volume mounted cannot be appended to, a status of
   “3503 Volume not mounted” is returned.
append data = <ticket-number>
   If the append data is accepted, a status of 3000 OK data address =
   <IPaddress> port = <port> is returned, where the IPaddress and port
   specify the IP address and port number of the data channel. Error
   status codes are 3504 Invalid ticket number and
   3505 Session aborted, the latter of which indicates the entire
   append session has failed due to a daemon or media error.

   Once the File daemon has established the connection to the data
   channel opened by the Storage daemon, it will transfer a header
   packet followed by any number of data packets. The header packet is
   of the form:

   <file-index> <stream-id> <info>

   The details are specified in the section of this document.

\*append abort session = <ticket-number>
   The open append session with ticket *ticket-number* is aborted; any
   blocks not yet written to permanent media are discarded. Subsequent
   attempts to append data to the session will receive an error status
   of 3505 Session aborted.
append end session = <ticket-number>
   The open append session with ticket *ticket-number* is marked
   complete; no further blocks may be appended. The storage daemon will
   give priority to saving any buffered blocks from this session to
   permanent media as soon as possible.
append close session = <ticket-number>
   The append session with ticket *ticket-number* is closed. This
   message does not receive a 3000 OK reply until all of the contents of
   the session are stored on permanent media, at which time said reply
   is given, followed by a list of volumes, from first to last, which
   contain blocks from the session, along with the first and last file
   and block on each containing session data and the volume session key
   identifying data from that session, in lines with the following
   format:

   <Volume-id> <start-file> <start-block> <end-file> <end-block>
   <volume-session-id>

   where *Volume-id* is the volume label,
   *start-file* and *start-block* are the file and block containing the
   first data from that session on the volume, *end-file* and
   *end-block* are the file and block with the last data from the
   session on the volume and *volume-session-id* is the volume session
   ID for blocks from the session stored on that volume.

SD Read Requests
~~~~~~~~~~~~~~~~

Read open session = <JobId> <Volume-id> <start-file> <start-block> <end-file> <end-block> <volume-session-id> <password>
   where *Volume-id* is the volume label, *start-file* and
   *start-block* are the file and block containing the first data from
   that session on the volume, *end-file* and *end-block* are the file
   and block with the last data from the session on the volume and
   *volume-session-id* is the volume session ID for blocks from the
   session stored on that volume.

   If the session is successfully opened, a success status is returned
   together with a reply used to identify subsequent messages in the
   session. If too many sessions are open, or a conflicting session
   exists (for example, an append in progress when simultaneous read and
   append sessions are not permitted), a status of “3502 Volume busy”
   is returned. If no volume is mounted, or the volume mounted cannot
   be appended to, a status of “3503 Volume not mounted” is returned.
   If no block with the given volume session ID and the correct client
   ID number appears in the given first file and block for the volume,
   a status of “3505 Session not found” is returned.

Read data = <Ticket> <Block>
   The specified Block of data from the open read session with the
   specified Ticket number is returned, with a status of 3000 OK
   followed by a “Length = \ *size*\ ” line giving the length in bytes
   of the block data which immediately follows. Blocks must be
   retrieved in ascending order, but blocks may be skipped. If a block
   number greater than the largest stored on the volume is requested, a
   status of “3201 End of volume” is returned. If a block number
   greater than the largest in the file is requested, a status of
   “3401 End of file” is returned.
Read close session = <Ticket>
   The read session with the given Ticket number is closed. A read
   session may be closed at any time; you needn’t read all its blocks
   before closing it.


SD Data Structures
------------------

In the Storage daemon, there is a Device resource (i.e. from the conf
file) that describes each physical device. When the physical device is
used it is controlled by the DEVICE structure (defined in dev.h), and
typically referred to as dev in the C++ code. Anyone writing or reading a
physical device must ultimately get a lock on the DEVICE structure,
which controls the device.
However, multiple Jobs (each defined by a JCR structure in
src/jcr.h) can be writing to a physical DEVICE at the same time (of course
they are sequenced by locking the DEVICE structure). There are a lot of
job-dependent “device” variables that may be different for each Job, such
as spooling (one job may spool and another may not, and when a job is
spooling, it must have an I/O packet open; each job has its own record
and block structures, …), so there is a device control record or DCR
that is the primary way of interfacing to the physical device. The DCR
contains all the job-specific data as well as a pointer to the Device
resource (DEVRES structure) and the physical DEVICE structure.

Now if a job is writing to two devices (it could be writing two separate
streams to the same device), it must have two DCRs. Today, the code only
permits one. This won’t be hard to change, but it is new code.

Today: three jobs (threads) and two physical devices, where each job
writes to only one device:

::

    Job1 -> DCR1 -> DEVICE1
    Job2 -> DCR2 -> DEVICE1
    Job3 -> DCR3 -> DEVICE2

To be implemented: three jobs and three physical devices, but Job1 is
writing simultaneously to three devices:

::

    Job1 -> DCR1 -> DEVICE1
         -> DCR4 -> DEVICE2
         -> DCR5 -> DEVICE3
    Job2 -> DCR2 -> DEVICE1
    Job3 -> DCR3 -> DEVICE2

    Job = job control record
    DCR = Job control data for a specific device
    DEVICE = Device only control data
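
The Job, DCR and DEVICE relationships described in this section can be
sketched in C++ roughly as follows. This is a minimal illustration, not
the actual Bareos code: the types ``Device``, ``Dcr`` and ``Jcr`` and the
helpers ``Attach`` and ``WriteBlock`` are hypothetical, simplified
stand-ins for the DEVICE, DCR and JCR structures, and the sketch assumes
the "to be implemented" case in which one job may own several DCRs.

.. code-block:: cpp

   #include <cstddef>
   #include <memory>
   #include <mutex>
   #include <string>
   #include <utility>
   #include <vector>

   // Hypothetical stand-in for DEVICE: device-only control data.
   struct Device {
       std::string name;
       std::mutex lock;                 // must be held to read or write the device
       std::size_t blocks_written = 0;  // illustrative counter only
       explicit Device(std::string n) : name(std::move(n)) {}
   };

   struct Jcr;  // job control record, defined below

   // Hypothetical stand-in for a DCR: job control data for one specific device.
   struct Dcr {
       Jcr* jcr;       // owning job
       Device* dev;    // physical device this DCR talks to
       bool spooling;  // example of a job-dependent "device" variable
   };

   // Hypothetical stand-in for a JCR: one per job, owning its DCRs.
   struct Jcr {
       int job_id;
       std::vector<std::unique_ptr<Dcr>> dcrs;
       explicit Jcr(int id) : job_id(id) {}
   };

   // Create a DCR linking a job to a device and register it with the job.
   Dcr* Attach(Jcr& jcr, Device& dev, bool spooling) {
       jcr.dcrs.push_back(std::make_unique<Dcr>(Dcr{&jcr, &dev, spooling}));
       return jcr.dcrs.back().get();
   }

   // Jobs sharing a device are sequenced by taking the device lock.
   void WriteBlock(Dcr& dcr) {
       std::lock_guard<std::mutex> guard(dcr.dev->lock);
       ++dcr.dev->blocks_written;
   }

For example, attaching Job1 to both DEVICE1 and DEVICE2 gives that job
two DCRs, while writes from Job1 and Job2 to DEVICE1 are serialized by
that device's lock. Keeping the per-job state in the DCR rather than in
the device is what lets several jobs share one device without sharing
spooling or block state.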