.\" Copyright (c) 2004,2005 The DragonFly Project.  All rights reserved.
.\"
.\" This code is derived from software contributed to The DragonFly Project
.\" by Matthew Dillon <dillon@backplane.com>
.\"
.\"
.\" Redistribution and use in source and binary forms, with or without
.\" modification, are permitted provided that the following conditions
.\" are met:
.\"
.\" 1. Redistributions of source code must retain the above copyright
.\"    notice, this list of conditions and the following disclaimer.
.\" 2. Redistributions in binary form must reproduce the above copyright
.\"    notice, this list of conditions and the following disclaimer in
.\"    the documentation and/or other materials provided with the
.\"    distribution.
.\" 3. Neither the name of The DragonFly Project nor the names of its
.\"    contributors may be used to endorse or promote products derived
.\"    from this software without specific, prior written permission.
.\"
.\" THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
.\" ``AS IS'' AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
.\" LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS
.\" FOR A PARTICULAR PURPOSE ARE DISCLAIMED.  IN NO EVENT SHALL THE
.\" COPYRIGHT HOLDERS OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT,
.\" INCIDENTAL, SPECIAL, EXEMPLARY OR CONSEQUENTIAL DAMAGES (INCLUDING,
.\" BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES;
.\" LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED
.\" AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY,
.\" OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT
.\" OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
.\" SUCH DAMAGE.
.\"
.Dd February 8, 2009
.Dt JSCAN 8
.Os
.Sh NAME
.Nm jscan
.Nd journal file processing program
.Sh SYNOPSIS
.Nm
.Op Fl 2dfuvF
.Op Fl c Ar count[k,m,g,t]
.Op Fl D Ar directory
.Op Fl m Ar mirror_transid_file/none
.Op Fl o/O Ar output_transid_file/none
.Op Fl s Ar size[k,m,g,t]
.Op Fl w/W Ar journal_prefix
.Op Ar journal_prefix/file
.Sh DESCRIPTION
The
.Nm
utility scans a journal file or input stream for the purposes of debugging
dumps, restoration, undo, mirroring, and other journaling features.
.Bl -tag -width indent
.It Fl 2
Implement the full-duplex acknowledgement protocol on the input descriptor.
Note that shell pipes are full-duplex and can be used with this option.
.It Fl c Ar count
Specify the number of transaction records which should be scanned, then exit.
This option is typically used along with
.Fl m
to limit the amount of work that
.Nm
does, giving you the ability to incrementally run a mirror forwards or
backwards.  It is not usually used when piping in a live journal, but it
can be.
.It Fl d
Display the contents of the journaling file or stream in a human readable
format on stderr.  Note that stdout is used only for
.Fl o .
.It Fl f
.Nm
will sleep for 5 seconds and loop when it hits EOF on file or prefix
set input rather than exit.  This option is typically used when running
on an input file or prefix set which is live (being written to by
another
.Nm
instance).
.It Fl D Ar directory
Specify the base directory for the mirroring option.
.It Fl m Ar mirror_transid_file/none
Generate a mirror in the directory specified by
.Fl D
or, if not specified, the current directory.
The
.Ar mirror_transid_file
will be used to track the transaction id representing the current
synchronization point for the mirror.  The keyword
.Ar none
may be specified if no tracking file is desired.  However, if no tracking
file is specified it will not be possible to roll the mirror forwards or
backwards or restart the journaling stream being used to generate the mirror.
.Pp
It is important to note that journaling streams can contain meta-transactions
representing huge, multi-gigabyte operations.  If the journaling data is
not being recorded to regular files via
.Fl w/W
it is possible that
.Nm
could run itself out of memory trying to record the meta-transactions.
In addition, the mirror would not be restartable.  If the journaling data
is being recorded via
.Fl w/W
and a mirroring transaction id file is being kept, the mirror can be
restarted.
.Pp
While it is possible to run a journaling stream directly into a mirror,
it is more typical to file the journaling stream with
.Fl w
and catch the mirror up as a batch job with the journaling file set prefix
specified as the input every so often.  This way the system operator can
use other
.Nm
commands to, for example, run a mirror backwards and forwards in time.
.It Fl o/O Ar output_transid_file/none
Generate a journaling stream on stdout using the specified file to track
the transaction id to help with restarts.
The
.Fl o
option indicates a half-duplex output stream while the
.Fl O
option indicates a full-duplex (ACK protocol) output stream.
.Pp
This option is not really designed to output to regular files because it
does NOT necessarily weed out duplicate records.  When both the input
stream and output stream are full-duplex and
.Fl w/W
is not specified,
.Nm
acts as a stateless transceiver and the input stream is not acked until
an ack is received from the output stream.
.Pp
This option is most typically used in conjunction with
.Fl w/W .
In this case the ACK protocol is handled independently for the input side
and the output side uses the journaling data recorded by
.Fl w/W
as a buffer.
.Pp
In half-duplex output mode the output transaction id file is updated
after a raw transaction record has been successfully written to stdout.
In full-duplex output mode the file is only updated with ACK data returned
on the stdout descriptor.
.Pp
As with the
.Fl m
option, you can combine
.Fl o
in a journaling pipe with other options, but if you are trying to use it
as a buffer it may be better to have it separately pull its data off of
a journaling file set generated via
.Fl w .
.It Fl s Ar size
Change the size limit for rotating files created via
.Fl w .
The default is 100M.  Values are in bytes or may be suffixed with k,
m, g, or t.
If a raw transaction causes the file's size limit to be exceeded, a new file
will be created.  If a raw transaction is, in-whole, larger than the file's
size limit, the raw transaction will still be fully written to the file before
a new file is created.  Raw transactions are typically limited to the size
of the source system's memory FIFO.  This option is typically used to size
journaling files to fit onto the appropriate backup media or to provide
bite-sized chunks for other programs to ingest.
.Pp
When restarting a journal, a new sequence number will always be chosen for
the resumption of data recording.  No existing file will be appended to when
.Nm
is reinvoked.
.It Fl u
Will cause the journal to be scanned backwards (requires seekable media).
Transactions will be dumped in reverse order.  If mirroring, the UNDO
data will be executed.  If not specified, 1 hour's worth of data will be
undone.  Can only be used with a journaling file or journaling prefix
as the input.
.It Fl v
Increase verbosity on stderr.  This option is primarily used for debugging.
.It Fl w Ar prefix
The received journaling stream is recorded in journaling files named
.Ar <prefix>.<seq>
and the current transaction id is tracked in a file named
.Ar <prefix>.transid .
A journaling file is closed out and a new file with the next sequence
number is created once the file surpasses the size limit set with
.Fl s
(100MB by default).
.Pp
This option is robust across restarts.  The current transaction id
will be read and the input stream will be skipped until it is reached.
If the input is a journaling file or prefix set,
.Nm
will be able to quickly seek to the restart point.
.Pp
NOTE: If
you are generating a mirror with the same command via
.Fl m ,
and the journaling data input is a stream rather than a file or prefix
set, you must use
.Fl w/W
if you want the mirror to be restartable.  This is because while we can
pick up the transaction id where we left off, that raw transaction id may
have cut a larger meta-transaction in half and the mirroring code will
not be able to access the whole of the transaction unless it has a file
or prefix set to work with.
.It Fl W Ar prefix
Similar to
.Fl w
except that the journaling files created are strictly temporary and will
be deleted once they exceed the size limit AND the related meta-transactions
have been completed.
.Pp
If combined with
.Fl m ,
the meta-transactions are considered to be completed only when the mirror
finishes executing them.  It is possible for several sequence number files
to build if a particularly large meta-transaction is coming down the pipe.
.Pp
If combined with
.Fl o/O ,
the meta-transactions are considered to be completed when the data has
been successfully written out to the pipe in half duplex mode, or when
the ACK has been received in full-duplex mode.
.Pp
If both
.Fl m
and
.Fl w/W
are used, the journaling data files are only deleted when both actions
no longer need the data.
.It Fl F
Forces
.Nm
to
.Fn fsync
after updating a journaling file prior to acknowledging the
data or updating a transaction-id-tracking file.  If specified twice,
.Nm
will also
.Fn fsync
after updating the transaction-id-tracking file.
.It Ar journal_prefix/file
Specify the input to
.Nm .
This can be a journaling file set prefix
or it can be a plain file.  If no input file is specified, stdin is
assumed.  Note that when generating a mirror from a stdin stream, the
mirror will not be restartable unless
.Fl w/W
is also used.
.El
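.Pp
As a rough sketch of how the options above might be combined, a journaling
stream arriving on stdin could be recorded to a rotating file set, and a
mirror could later be caught up or rolled back from that file set as a
batch job.
The paths under
.Pa /backup
are hypothetical, and given the state of the implementation noted under
.Sx CAVEATS
these invocations may need adjusting:
.Bd -literal -offset indent
# Record stdin to /backup/journal.<seq>, rotating at roughly 1GB.
jscan -s 1g -w /backup/journal

# Catch the mirror in /backup/mirror up from the recorded file set.
jscan -m /backup/mirror.transid -D /backup/mirror /backup/journal

# Run the same mirror backwards over 100 transaction records.
jscan -u -c 100 -m /backup/mirror.transid -D /backup/mirror /backup/journal
.Ed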
.Sh OPERATIONAL NOTES
It is often important to be able to quickly stage journaled data through
a dedicated backup machine on a LAN.  There are several places where data
can be buffered and staged out.
.Pp
The machine generating the journal typically buffers several megabytes of
journal data in the kernel.  This local machine can pipe that data to
.Nm
or some other locally run program to add another buffering stage, or you
can directly attach a TCP connection to the kernel's journaling output.
.Pp
The LAN backup box typically buffers gigabytes worth of data by running
multiple jscans.  The
.Nm
on the receiving end of the TCP or pipe (for
example, via ssh) typically records the data via the
.Fl w
option, and then runs other
.Nm
programs from scripts or cron to take that data and copy it to your
off-site backup machine.  Other
.Nm
programs may use the same data
set to generate mirrors or other backup streams.
.Pp
It should be noted that if
.Fl w/W
is specified, both mirroring mode and output mode will internally
fork the program once the appropriate synchronization point has been reached,
effectively decoupling their operation, and read all of their data via
the journaling files written out by the master program.  In particular,
blockages in the mirroring and output code will not affect our ability
to buffer the journaling input data via
.Fl w/W .
If
.Fl w/W
is not specified then neither the mirroring nor the output mode will fork.
Under these conditions, if the input is a stream rather than a file,
.Nm
will be forced to buffer meta-transactions (for mirroring) entirely in
memory, which could present a serious problem since a single meta-transaction
can exceed a gigabyte (e.g. if someone were to do a single
.Fn write
system call writing a gigabyte all in one go).
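.Pp
The staging arrangement described above might look roughly like the
following on the LAN backup box.
The prefixes
.Pa /backup/journal
and
.Pa /offsite/journal ,
the tracking file
.Pa /backup/offsite.transid ,
and the host name
.Li offsite
are assumptions made for illustration only:
.Bd -literal -offset indent
# Receiving end of the TCP connection or pipe: record the incoming stream.
jscan -w /backup/journal

# From a script or cron: forward the recorded data off-site (half-duplex).
jscan -o /backup/offsite.transid /backup/journal |
    ssh offsite jscan -w /offsite/journal
.Ed
.Pp
Keeping the recorder and the forwarder as separate invocations means a
stalled off-site link only delays the second command, while the first
continues to buffer the journaling input in the file set.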
.Sh SEE ALSO
.Xr mountctl 8
.Sh CAVEATS
This utility is currently under construction and not all features have been
implemented yet.
In fact, most have not.
.Sh HISTORY
The
.Nm
utility first appeared in
.Dx 1.3 .
305