.\"
.\" Copyright (c) 1998, 1999 Kenneth D. Merry.
.\" All rights reserved.
.\"
.\" Redistribution and use in source and binary forms, with or without
.\" modification, are permitted provided that the following conditions
.\" are met:
.\" 1. Redistributions of source code must retain the above copyright
.\"    notice, this list of conditions and the following disclaimer.
.\" 2. Redistributions in binary form must reproduce the above copyright
.\"    notice, this list of conditions and the following disclaimer in the
.\"    documentation and/or other materials provided with the distribution.
.\" 3. The name of the author may not be used to endorse or promote products
.\"    derived from this software without specific prior written permission.
.\"
.\" THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND
.\" ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
.\" IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
.\" ARE DISCLAIMED.  IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE
.\" FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
.\" DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
.\" OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
.\" HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
.\" LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
.\" OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
.\" SUCH DAMAGE.
.\"
.\" $FreeBSD$
.\"
.Dd May 22, 1998
.Dt DEVSTAT 9
.Os
.Sh NAME
.Nm devstat ,
.Nm devstat_add_entry ,
.Nm devstat_end_transaction ,
.Nm devstat_end_transaction_bio ,
.Nm devstat_remove_entry ,
.Nm devstat_start_transaction
.Nd kernel interface for keeping device statistics
.Sh SYNOPSIS
.In sys/devicestat.h
.Ft void
.Fo devstat_add_entry
.Fa "struct devstat *ds"
.Fa "const char *dev_name"
.Fa "int unit_number"
.Fa "u_int32_t block_size"
.Fa "devstat_support_flags flags"
.Fa "devstat_type_flags device_type"
.Fa "devstat_priority priority"
.Fc
.Ft void
.Fn devstat_remove_entry "struct devstat *ds"
.Ft void
.Fn devstat_start_transaction "struct devstat *ds"
.Ft void
.Fo devstat_end_transaction
.Fa "struct devstat *ds"
.Fa "u_int32_t bytes"
.Fa "devstat_tag_type tag_type"
.Fa "devstat_trans_flags flags"
.Fc
.Ft void
.Fo devstat_end_transaction_bio
.Fa "struct devstat *ds"
.Fa "struct bio *bp"
.Fc
.Sh DESCRIPTION
The devstat subsystem is an interface for recording device
statistics, as its name implies.
The goal is to keep reasonably detailed
statistics while using as little CPU time as possible to record them.
Thus, no statistical calculations are performed in the kernel
portion of the
.Nm
code.
Instead, those calculations are left to userland programs.
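.Pp
As a rough illustration of the kind of calculation left to userland, the
following C sketch derives per-second rates from two snapshots of the raw
counters.
The
.Vt struct sample
type and the helper functions are hypothetical names invented for this
example; they are not part of
.Nm
or of
.Xr devstat 3 .

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical snapshot of two raw counters and the time it was taken. */
struct sample {
	uint64_t bytes_read;	/* cumulative bytes read */
	uint64_t num_reads;	/* cumulative read transactions */
	double	 uptime;	/* seconds since boot when sampled */
};

/* Per-second rate of a cumulative counter between two snapshots. */
static double
rate_per_second(uint64_t prev, uint64_t cur, double elapsed)
{
	assert(elapsed > 0.0);
	assert(cur >= prev);	/* the sketch does not handle counter wrap */
	return ((double)(cur - prev) / elapsed);
}

/* Bytes read per second between samples a (older) and b (newer). */
static double
read_bytes_per_second(const struct sample *a, const struct sample *b)
{
	return (rate_per_second(a->bytes_read, b->bytes_read,
	    b->uptime - a->uptime));
}

/* Read transactions per second between samples a (older) and b (newer). */
static double
reads_per_second(const struct sample *a, const struct sample *b)
{
	return (rate_per_second(a->num_reads, b->num_reads,
	    b->uptime - a->uptime));
}
```

The kernel only has to increment counters; the division and any averaging
happen in the program that reads them.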
.Pp
.Fn devstat_add_entry
registers a device with the
.Nm
subsystem.
The caller is expected to have already allocated
.Em and zeroed
the devstat structure before calling this function.
.Fn devstat_add_entry
takes the following arguments:
.Bl -tag -width device_type
.It ds
The
.Va devstat
structure, allocated and zeroed by the client.
.It dev_name
The device name, e.g.\& da, cd, sa.
.It unit_number
Device unit number.
.It block_size
Block size of the device, if supported.
If the device does not support a
block size, or if the block size is unknown at the time the device is added
to the
.Nm
list, it should be set to 0.
.It flags
Flags indicating operations supported or not supported by the device.
See below for details.
.It device_type
The device type.
This is broken into three sections: base device type
(e.g.\& direct access, CDROM, sequential access), interface type (IDE, SCSI
or other) and a pass-through flag to indicate pass-through devices.
See below for a complete list of types.
.It priority
The device priority.
The priority is used to determine how devices are
sorted within
.Nm devstat Ns 's
list of devices.
Devices are sorted first by priority (highest to lowest),
and then by attach order.
See below for a complete list of available
priorities.
.El
.Pp
.Fn devstat_remove_entry
removes a device from the
.Nm
subsystem.
It takes the devstat structure for the device in question as
an argument.
The
.Nm
generation number is incremented and the number of devices is decremented.
.Pp
.Fn devstat_start_transaction
registers the start of a transaction with the
.Nm
subsystem.
The busy count is incremented with each transaction start.
When a device goes from idle to busy, the system uptime is recorded in the
.Va start_time
field of the
.Va devstat
structure.
.Pp
.Fn devstat_end_transaction
registers the end of a transaction with the
.Nm
subsystem.
It takes four arguments:
.Bl -tag -width tag_type
.It ds
The
.Va devstat
structure for the device in question.
.It bytes
The number of bytes transferred in this transaction.
.It tag_type
Transaction tag type.
See below for tag types.
.It flags
Transaction flags indicating whether the transaction was a read, a write,
a free, or a transaction in which no data was transferred.
.El
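.Pp
The busy accounting described in this manual page can be modelled in
userland with a few lines of C.
This is a simplified sketch, not the kernel implementation: the
.Vt struct busy_model
type and the
.Fn model_start
and
.Fn model_end
functions are hypothetical, and times are plain seconds rather than
.Vt struct timeval
values.

```c
#include <assert.h>

/* Userland model of the busy-time fields; all times are in seconds. */
struct busy_model {
	int	busy_count;	/* outstanding transactions */
	double	start_time;	/* uptime when the device last left idle */
	double	last_comp_time;	/* uptime of the most recent completion */
	double	busy_time;	/* accumulated time with busy_count > 0 */
};

/* Model of a transaction start at uptime 'now'. */
static void
model_start(struct busy_model *m, double now)
{
	if (m->busy_count == 0)
		m->start_time = now;	/* idle to busy: remember when */
	m->busy_count++;
}

/* Model of a transaction end at uptime 'now'. */
static void
model_end(struct busy_model *m, double now)
{
	assert(m->busy_count > 0);	/* every end needs a matching start */
	m->last_comp_time = now;
	m->busy_count--;
	if (m->busy_count == 0)		/* busy to idle: add the interval */
		m->busy_time += m->last_comp_time - m->start_time;
}
```

With two overlapping transactions, the busy time is only credited once the
second one completes, which is why each start must be paired with exactly
one end.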
.Pp
.Fn devstat_end_transaction_bio
is a wrapper for
.Fn devstat_end_transaction
which pulls all the information it needs from a
.Va "struct bio"
that is ready for
.Fn biodone .
.Pp
The
.Va devstat
structure is composed of the following fields:
.Bl -tag -width dev_creation_time
.It dev_links
Each
.Va devstat
structure is placed in a linked list when it is registered.
The
.Va dev_links
field contains a pointer to the next entry in the list of
.Va devstat
structures.
.It device_number
The device number is a unique identifier for each device.
The device
number is incremented for each new device that is registered.
The device
number is currently only a 32-bit integer, but it could be enlarged if
someone has a system with more than four billion device arrival events.
.It device_name
The device name is a text string given by the registering driver to
identify itself, e.g.\&
.Dq da ,
.Dq cd ,
or
.Dq sa .
.It unit_number
The unit number identifies the particular instance of the peripheral driver
in question.
.It bytes_written
This is the number of bytes that have been written to the device.
This number is currently an unsigned 64-bit integer.
This should avoid the counter wrap that would occur very quickly on some
systems if 32-bit integers were used.
.It bytes_read
This is the number of bytes that have been read from the device.
.It bytes_freed
This is the number of bytes that have been freed/erased on the device.
.It num_reads
This is the number of reads from the device.
.It num_writes
This is the number of writes to the device.
.It num_frees
This is the number of free/erase operations on the device.
.It num_other
This is the number of transactions to the device which are neither reads nor
writes.
For instance,
.Tn SCSI
drivers often send a test unit ready command to
.Tn SCSI
devices.
The test unit ready command does not read or write any data.
It merely causes the device to return its status.
.It busy_count
This is the current number of outstanding transactions for the device.
This should never go below zero, and on an idle device it should be zero.
If either one of these conditions is not true, it indicates a problem in
the way
.Fn devstat_start_transaction
and
.Fn devstat_end_transaction
are being called in client code.
There should be one and only one
transaction start event and one transaction end event for each transaction.
.It block_size
This is the block size of the device, if the device has a block size.
.It tag_types
This is an array of counters to record the number of various tag types that
are sent to a device.
See below for a list of tag types.
.It dev_creation_time
This is the time, as reported by
.Fn getmicrotime ,
at which the device was registered.
.It busy_time
This is the amount of time that the device busy count has been greater than
zero.
This is only updated when the busy count returns to zero.
.It start_time
This is the time, as reported by
.Fn getmicrouptime ,
at which the device busy count went from zero to one.
.It last_comp_time
This is the time, as reported by
.Fn getmicrouptime ,
at which a transaction last completed.
It is used along with
.Va start_time
to calculate the device busy time.
.It flags
These flags indicate which statistics measurements are supported by a
particular device.
These flags are primarily intended to serve as an aid
to userland programs that decipher the statistics.
.It device_type
This is the device type.
It consists of three parts: the device type
(e.g.\& direct access, CDROM, sequential access, etc.), the interface (IDE,
SCSI or other) and whether or not the device in question is a pass-through
driver.
See below for a complete list of device types.
.It priority
This is the priority.
This is the first parameter used to determine where
to insert a device in the
.Nm
list.
The second parameter is attach order.
See below for a list of
available priorities.
.El
.Pp
Each device is given a device type.
Pass-through devices have the same
underlying device type and interface as the device they provide an
interface for, but they also have the pass-through flag set.
The base
device types are identical to the
.Tn SCSI
device type numbers, so with
.Tn SCSI
peripherals, the device type returned from an inquiry is usually ORed with
the
.Tn SCSI
interface type and the pass-through flag if appropriate.
The device type
flags are as follows:
.Bd -literal -offset indent
typedef enum {
	DEVSTAT_TYPE_DIRECT	= 0x000,
	DEVSTAT_TYPE_SEQUENTIAL	= 0x001,
	DEVSTAT_TYPE_PRINTER	= 0x002,
	DEVSTAT_TYPE_PROCESSOR	= 0x003,
	DEVSTAT_TYPE_WORM	= 0x004,
	DEVSTAT_TYPE_CDROM	= 0x005,
	DEVSTAT_TYPE_SCANNER	= 0x006,
	DEVSTAT_TYPE_OPTICAL	= 0x007,
	DEVSTAT_TYPE_CHANGER	= 0x008,
	DEVSTAT_TYPE_COMM	= 0x009,
	DEVSTAT_TYPE_ASC0	= 0x00a,
	DEVSTAT_TYPE_ASC1	= 0x00b,
	DEVSTAT_TYPE_STORARRAY	= 0x00c,
	DEVSTAT_TYPE_ENCLOSURE	= 0x00d,
	DEVSTAT_TYPE_FLOPPY	= 0x00e,
	DEVSTAT_TYPE_MASK	= 0x00f,
	DEVSTAT_TYPE_IF_SCSI	= 0x010,
	DEVSTAT_TYPE_IF_IDE	= 0x020,
	DEVSTAT_TYPE_IF_OTHER	= 0x030,
	DEVSTAT_TYPE_IF_MASK	= 0x0f0,
	DEVSTAT_TYPE_PASS	= 0x100
} devstat_type_flags;
.Ed
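.Pp
The three parts of a device type can be combined and taken apart with the
mask values listed above.
The following is a minimal userland sketch; the enum is abbreviated to the
values it uses, and the
.Fn type_make ,
.Fn type_base ,
.Fn type_interface
and
.Fn type_is_pass
helpers are hypothetical names, not functions provided by
.In sys/devicestat.h .

```c
#include <assert.h>

/* Subset of devstat_type_flags, copied from the list above. */
typedef enum {
	DEVSTAT_TYPE_DIRECT	= 0x000,
	DEVSTAT_TYPE_CDROM	= 0x005,
	DEVSTAT_TYPE_MASK	= 0x00f,
	DEVSTAT_TYPE_IF_SCSI	= 0x010,
	DEVSTAT_TYPE_IF_IDE	= 0x020,
	DEVSTAT_TYPE_IF_MASK	= 0x0f0,
	DEVSTAT_TYPE_PASS	= 0x100
} devstat_type_flags;

/* Combine a base type, an interface type and the pass-through flag. */
static int
type_make(int base, int iface, int pass)
{
	assert((base & ~DEVSTAT_TYPE_MASK) == 0);
	assert((iface & ~DEVSTAT_TYPE_IF_MASK) == 0);
	return (base | iface | (pass ? DEVSTAT_TYPE_PASS : 0));
}

/* Base device type, e.g. DEVSTAT_TYPE_CDROM. */
static int
type_base(int type)
{
	return (type & DEVSTAT_TYPE_MASK);
}

/* Interface type, e.g. DEVSTAT_TYPE_IF_SCSI. */
static int
type_interface(int type)
{
	return (type & DEVSTAT_TYPE_IF_MASK);
}

/* Non-zero if the pass-through flag is set. */
static int
type_is_pass(int type)
{
	return ((type & DEVSTAT_TYPE_PASS) != 0);
}
```

A pass-through device in front of a SCSI CD-ROM would therefore carry the
value 0x115, and a userland program can recover each part by masking.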
.Pp
Devices have a priority associated with them, which controls roughly where
they are placed in the
.Nm
list.
The priorities are as follows:
.Bd -literal -offset indent
typedef enum {
	DEVSTAT_PRIORITY_MIN	= 0x000,
	DEVSTAT_PRIORITY_OTHER	= 0x020,
	DEVSTAT_PRIORITY_PASS	= 0x030,
	DEVSTAT_PRIORITY_FD	= 0x040,
	DEVSTAT_PRIORITY_WFD	= 0x050,
	DEVSTAT_PRIORITY_TAPE	= 0x060,
	DEVSTAT_PRIORITY_CD	= 0x090,
	DEVSTAT_PRIORITY_DISK	= 0x110,
	DEVSTAT_PRIORITY_ARRAY	= 0x120,
	DEVSTAT_PRIORITY_MAX	= 0xfff
} devstat_priority;
.Ed
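.Pp
The resulting order, highest priority first and then attach order, can be
illustrated with a comparison function.
This sketch sorts an array with
.Xr qsort 3
purely for demonstration; the kernel instead picks the insertion point when
a device is added.
The
.Vt struct dev_entry
type and its
.Va attach_seq
field are hypothetical.

```c
#include <assert.h>
#include <stdlib.h>

/* Subset of devstat_priority, copied from the list above. */
typedef enum {
	DEVSTAT_PRIORITY_PASS	= 0x030,
	DEVSTAT_PRIORITY_CD	= 0x090,
	DEVSTAT_PRIORITY_DISK	= 0x110
} devstat_priority;

/* Hypothetical list entry used only by this example. */
struct dev_entry {
	const char	*name;
	int		 priority;	/* a devstat_priority value */
	unsigned	 attach_seq;	/* lower means attached earlier */
};

/* Order by priority, highest first, then by attach order. */
static int
dev_entry_cmp(const void *a, const void *b)
{
	const struct dev_entry *x = (const struct dev_entry *)a;
	const struct dev_entry *y = (const struct dev_entry *)b;

	if (x->priority != y->priority)
		return (x->priority > y->priority ? -1 : 1);
	if (x->attach_seq != y->attach_seq)
		return (x->attach_seq < y->attach_seq ? -1 : 1);
	return (0);
}

/* Sort n entries into devstat list order. */
static void
sort_devices(struct dev_entry *v, size_t n)
{
	assert(v != NULL || n == 0);
	qsort(v, n, sizeof(*v), dev_entry_cmp);
}
```

Because the attach sequence breaks ties, two disks keep their attach order
while both sort ahead of CD-ROM and pass-through devices.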
.Pp
Each device has flags associated with it that indicate which operations are
supported or not supported.
The
.Va devstat_support_flags
values are as follows:
.Bl -tag -width DEVSTAT_NO_ORDERED_TAGS
.It DEVSTAT_ALL_SUPPORTED
Every statistic type is supported by the device.
.It DEVSTAT_NO_BLOCKSIZE
This device does not have a block size.
.It DEVSTAT_NO_ORDERED_TAGS
This device does not support ordered tags.
.It DEVSTAT_BS_UNAVAILABLE
This device supports a block size, but it is currently unavailable.
This
flag is most often used with removable media drives.
.El
.Pp
Transactions to a device fall into one of four categories, which are
represented in the
.Va flags
passed into
.Fn devstat_end_transaction .
The transaction types are as follows:
.Bd -literal -offset indent
typedef enum {
	DEVSTAT_NO_DATA	= 0x00,
	DEVSTAT_READ	= 0x01,
	DEVSTAT_WRITE	= 0x02,
	DEVSTAT_FREE	= 0x03
} devstat_trans_flags;
.Ed
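.Pp
How a transaction type selects the counters to update can be sketched as
follows.
This is a userland model written for illustration, under the assumption that
a
.Dv DEVSTAT_NO_DATA
transaction is counted in
.Va num_other ;
the
.Vt struct counters
type and the
.Fn count_transaction
function are hypothetical and are not the kernel code.

```c
#include <assert.h>
#include <stdint.h>

/* Transaction types, copied from the list above. */
typedef enum {
	DEVSTAT_NO_DATA	= 0x00,
	DEVSTAT_READ	= 0x01,
	DEVSTAT_WRITE	= 0x02,
	DEVSTAT_FREE	= 0x03
} devstat_trans_flags;

/* Hypothetical container for the per-device counters. */
struct counters {
	uint64_t bytes_read, bytes_written, bytes_freed;
	uint64_t num_reads, num_writes, num_frees, num_other;
};

/* Account for one completed transaction of the given type. */
static void
count_transaction(struct counters *c, uint32_t bytes,
    devstat_trans_flags flags)
{
	switch (flags) {
	case DEVSTAT_READ:
		c->bytes_read += bytes;
		c->num_reads++;
		break;
	case DEVSTAT_WRITE:
		c->bytes_written += bytes;
		c->num_writes++;
		break;
	case DEVSTAT_FREE:
		c->bytes_freed += bytes;
		c->num_frees++;
		break;
	case DEVSTAT_NO_DATA:
		c->num_other++;		/* assumed: no byte counter moves */
		break;
	default:
		assert(0);		/* not a known transaction type */
	}
}
```

Only increments and additions are involved, in keeping with the goal of
spending as little CPU time as possible on bookkeeping.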
.Pp
There are four possible values for the
.Va tag_type
argument to
.Fn devstat_end_transaction :
.Bl -tag -width DEVSTAT_TAG_ORDERED
.It DEVSTAT_TAG_SIMPLE
The transaction had a simple tag.
.It DEVSTAT_TAG_HEAD
The transaction had a head of queue tag.
.It DEVSTAT_TAG_ORDERED
The transaction had an ordered tag.
.It DEVSTAT_TAG_NONE
The device does not support tags.
.El
.Pp
The tag type values correspond to the lower four bits of the
.Tn SCSI
tag definitions.
In CAM, for instance, the
.Va tag_action
from the CCB is ANDed with 0xf to determine the tag type to pass in to
.Fn devstat_end_transaction .
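.Pp
Keeping only the lower four bits of a value is a bitwise AND with 0xf.
The sketch below shows that mapping in userland C.
The
.Tn SCSI
queue tag message codes and the numeric
.Vt devstat_tag_type
values are not listed in this manual page and are stated here as
assumptions; the
.Fn tag_from_action
helper is hypothetical.

```c
#include <assert.h>

/* SCSI queue tag message codes (assumed; not listed in this page). */
#define	MSG_SIMPLE_Q_TAG	0x20
#define	MSG_HEAD_OF_Q_TAG	0x21
#define	MSG_ORDERED_Q_TAG	0x22

/* Tag type values assumed to equal the low four bits of the codes above. */
typedef enum {
	DEVSTAT_TAG_SIMPLE	= 0x00,
	DEVSTAT_TAG_HEAD	= 0x01,
	DEVSTAT_TAG_ORDERED	= 0x02,
	DEVSTAT_TAG_NONE	= 0x03
} devstat_tag_type;

/* Map a tag action to a tag type by masking off all but the low nibble. */
static devstat_tag_type
tag_from_action(unsigned int tag_action)
{
	unsigned int low = tag_action & 0xf;

	assert(low <= DEVSTAT_TAG_NONE);	/* only four values are defined */
	return ((devstat_tag_type)low);
}
```

ORing with 0xf would instead force all four low bits to one and discard the
distinction between tag types, so a mask is the operation that matches the
description.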
.Pp
The macro
.Dv DEVSTAT_VERSION
is defined in
.In sys/devicestat.h .
It is the current version of the
.Nm
subsystem, and it should be incremented each time a change is made that
would require recompilation of userland programs that access
.Nm
statistics.
Userland programs compare this version with the value of the
.Va kern.devstat.version
.Nm sysctl
variable to determine whether they are in sync with the kernel
.Nm
structures.
.Sh SEE ALSO
.Xr systat 1 ,
.Xr devstat 3 ,
.Xr iostat 8 ,
.Xr rpc.rstatd 8 ,
.Xr vmstat 8
.Sh HISTORY
The
.Nm
statistics system appeared in
.Fx 3.0 .
.Sh AUTHORS
.An Kenneth Merry Aq ken@FreeBSD.org
.Sh BUGS
There may be a need for
.Fn spl
protection around some of the
.Nm
list manipulation code to ensure, for example, that the list of devices
is not changed while someone is fetching the
.Va kern.devstat.all
.Nm sysctl
variable.
.Pp
It is impossible with the current
.Nm
architecture to accurately measure time per transaction.
The only feasible
way to accurately measure time per transaction would be to record a
timestamp for every transaction.
This measurement is probably not
worthwhile for most people as it would adversely affect the performance of
the system and cost space to store the timestamps for individual
transactions.