.\"	$NetBSD: sysctl.9,v 1.17 2010/05/16 05:18:35 jruoho Exp $
.\"
.\" Copyright (c) 2004 The NetBSD Foundation, Inc.
.\" All rights reserved.
.\"
.\" This code is derived from software contributed to The NetBSD Foundation
.\" by Andrew Brown.
.\"
.\" Redistribution and use in source and binary forms, with or without
.\" modification, are permitted provided that the following conditions
.\" are met:
.\" 1. Redistributions of source code must retain the above copyright
.\"    notice, this list of conditions and the following disclaimer.
.\" 2. Redistributions in binary form must reproduce the above copyright
.\"    notice, this list of conditions and the following disclaimer in the
.\"    documentation and/or other materials provided with the distribution.
.\"
.\" THIS SOFTWARE IS PROVIDED BY THE NETBSD FOUNDATION, INC. AND CONTRIBUTORS
.\" ``AS IS'' AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED
.\" TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR
.\" PURPOSE ARE DISCLAIMED.  IN NO EVENT SHALL THE FOUNDATION OR CONTRIBUTORS
.\" BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR
.\" CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF
.\" SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS
.\" INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN
.\" CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE)
.\" ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE
.\" POSSIBILITY OF SUCH DAMAGE.
.\"
.Dd May 16, 2010
.Dt SYSCTL 9
.Os
.Sh NAME
.Nm sysctl
.Nd system variable control interfaces
.Sh SYNOPSIS
.In sys/param.h
.In sys/sysctl.h
.Pp
Primary external interfaces:
.Ft void
.Fn sysctl_init void
.Ft int
.Fn sysctl_lock "struct lwp *l" "void *oldp" "size_t savelen"
.Ft int
.Fn sysctl_dispatch "const int *name" "u_int namelen" "void *oldp" \
"size_t *oldlenp" "const void *newp" "size_t newlen" "const int *oname" \
"struct lwp *l" "const struct sysctlnode *rnode"
.Ft void
.Fn sysctl_unlock "struct lwp *l"
.Ft int
.Fn sysctl_createv "struct sysctllog **log" "int cflags" \
"const struct sysctlnode **rnode" "const struct sysctlnode **cnode" \
"int flags" "int type" "const char *namep" "const char *desc" \
"sysctlfn func" "u_quad_t qv" "void *newp" "size_t newlen" ...
.Ft int
.Fn sysctl_destroyv "struct sysctlnode *rnode" ...
.Ft void
.Fn sysctl_free "struct sysctlnode *rnode"
.Ft void
.Fn sysctl_teardown "struct sysctllog **"
.Ft int
.Fn old_sysctl "int *name" "u_int namelen" "void *oldp" \
"size_t *oldlenp" "void *newp" "size_t newlen" "struct lwp *l"
.Pp
Core internal functions:
.Ft int
.Fn sysctl_locate "struct lwp *l" "const int *name" "u_int namelen" \
"const struct sysctlnode **rnode" "int *nip"
.Ft int
.Fn sysctl_lookup "const int *name" "u_int namelen" "void *oldp" \
"size_t *oldlenp" "const void *newp" "size_t newlen" "const int *oname" \
"struct lwp *l" "const struct sysctlnode *rnode"
.Ft int
.Fn sysctl_create "const int *name" "u_int namelen" "void *oldp" \
"size_t *oldlenp" "const void *newp" "size_t newlen" "const int *oname" \
"struct lwp *l" "const struct sysctlnode *rnode"
.Ft int
.Fn sysctl_destroy "const int *name" "u_int namelen" "void *oldp" \
"size_t *oldlenp" "const void *newp" "size_t newlen" "const int *oname" \
"struct lwp *l" "const struct sysctlnode *rnode"
.Ft int
.Fn sysctl_query "const int *name" "u_int namelen" "void *oldp" \
"size_t *oldlenp" "const void *newp" "size_t newlen" "const int *oname" \
"struct lwp *l" "const struct sysctlnode *rnode"
.Pp
Simple
.Dq helper
functions:
.Ft int
.Fn sysctl_needfunc "const int *name" "u_int namelen" "void *oldp" \
"size_t *oldlenp" "const void *newp" "size_t newlen" "const int *oname" \
"struct lwp *l" "const struct sysctlnode *rnode"
.Ft int
.Fn sysctl_notavail "const int *name" "u_int namelen" "void *oldp" \
"size_t *oldlenp" "const void *newp" "size_t newlen" "const int *oname" \
"struct lwp *l" "const struct sysctlnode *rnode"
.Ft int
.Fn sysctl_null "const int *name" "u_int namelen" "void *oldp" \
"size_t *oldlenp" "const void *newp" "size_t newlen" "const int *oname" \
"struct lwp *l" "const struct sysctlnode *rnode"
.Sh DESCRIPTION
The SYSCTL subsystem instruments a number of kernel tunables and other
data structures via a simple MIB-like interface, primarily for
consumption by userland programs, but also for use internally by the
kernel.
.Sh LOCKING
All operations on the SYSCTL tree must be protected by acquiring the
main SYSCTL lock.
The only functions that can be called when the lock is not held are
.Fn sysctl_lock ,
.Fn sysctl_createv ,
.Fn sysctl_destroyv ,
and
.Fn old_sysctl .
All other functions require the tree to be locked.
This is to prevent other users of the tree from moving nodes around
during an add operation, or from destroying nodes or subtrees that are
actively being used.
The lock is acquired by calling
.Fn sysctl_lock
with a pointer to the process's lwp
.Fa l
.Dv ( NULL
may be passed to all functions as the lwp pointer if no lwp is
appropriate, though any changes made via
.Fn sysctl_create ,
.Fn sysctl_destroy ,
.Fn sysctl_lookup ,
or by any helper function will be done with effective superuser
privileges).
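.Pp
As a minimal sketch of this locking discipline, assume a hypothetical
in-kernel caller named
.Fn example_query
that has no lwp of its own and whose buffers live in kernel space.
The function name is illustrative only and is not part of the SYSCTL
interface; the calls follow the prototypes given in the SYNOPSIS, and
.Fn sysctl_dispatch
is described under
.Sx LOOKUPS .
.Pp
.Bd -literal -offset indent -compact
/* Hypothetical caller; not part of the SYSCTL interface. */
static int
example_query(const int *name, u_int namelen,
    void *oldp, size_t *oldlenp)
{
	int error;

	/* No lwp and no user buffer to wire down. */
	error = sysctl_lock(NULL, NULL, 0);
	if (error)
		return (error);

	/*
	 * Read-only request: no new value is supplied.
	 * oname is the base of the request; a NULL rnode
	 * selects the main tree.
	 */
	error = sysctl_dispatch(name, namelen, oldp, oldlenp,
	    NULL, 0, name, NULL, NULL);

	sysctl_unlock(NULL);
	return (error);
}
.Ed
.Pp
Because the lwp pointer is
.Dv NULL ,
a request made this way runs with effective superuser privileges, so a
caller of this shape should only be used where that is acceptable.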
.Pp
The
.Fa oldp
and
.Fa savelen
arguments are a pointer to and the size of the memory region the
caller will be using to collect data from SYSCTL.
These may also be
.Dv NULL
and 0, respectively.
.Pp
The memory region will be locked via
.Fn uvm_vslock
if it is a region in userspace.
The address and size of the region are recorded so that when the
SYSCTL lock is to be released via
.Fn sysctl_unlock ,
only the lwp pointer
.Fa l
is required.
.Sh LOOKUPS
Once the lock has been acquired, it is typical to call
.Fn sysctl_dispatch
to handle the request.
.Fn sysctl_dispatch
will examine the contents of
.Fa name ,
an array of integers at least
.Fa namelen
long, which is to be located in kernel space, in order to determine
which function to call to handle the specific request.
.Pp
The following algorithm is used by
.Fn sysctl_dispatch
to determine the function to call:
.Pp
.Bl -bullet -offset indent
.It
Scan the tree using
.Fn sysctl_locate .
.It
If the node returned has a
.Dq helper
function, call it.
.It
If the requested node was found but has no function, call
.Fn sysctl_lookup .
.It
If the node was not found and
.Fa name
specifies one of
.Fn sysctl_query ,
.Fn sysctl_create ,
or
.Fn sysctl_destroy ,
call the appropriate function.
.It
If none of these options applies and no other error was yet recorded,
return
.Er EOPNOTSUPP .
.El
.Pp
The
.Fa oldp
and
.Fa oldlenp
arguments to
.Fn sysctl_dispatch ,
as with all the other core functions, describe an area into which the
current or requested value may be copied.
.Fa oldp
may or may not be a pointer into userspace (as dictated by whether
.Fa l
is
.Dv NULL
or not).
.Fa oldlenp
is a
.No non- Ns Dv NULL
pointer to a size_t.
.Fa newp
and
.Fa newlen
describe an area where the new value for the request may be found;
.Fa newp
may also be a pointer into userspace.
The
.Fa oname
argument is a
.No non- Ns Dv NULL
pointer to the base of the request currently
being processed.
By simple arithmetic on
.Fa name ,
.Fa namelen ,
and
.Fa oname ,
one can recover the entire original request and its
.Fa namelen
value, if needed.
The
.Fa rnode
value, as passed to
.Fn sysctl_dispatch ,
represents the root of the tree into which the current request is to
be dispatched.
If
.Dv NULL ,
the main tree will be used.
.Pp
The
.Fn sysctl_locate
function scans a tree for the node most specific to a request.
If the pointer referenced by
.Fa rnode
is not
.Dv NULL ,
the tree indicated is searched; otherwise the main tree
is used.
The address of the most relevant node will be returned via
.Fa rnode
and the number of MIB entries consumed will be returned via
.Fa nip ,
if it is not
.Dv NULL .
.Pp
The
.Fn sysctl_lookup
function takes the same arguments as
.Fn sysctl_dispatch
with the caveat that the value for
.Fa namelen
must be zero in order to indicate that the node referenced by the
.Fa rnode
argument is the one to which the lookup is being applied.
.Sh CREATION AND DESTRUCTION OF NODES
New nodes are created and destroyed by the
.Fn sysctl_create
and
.Fn sysctl_destroy
functions.
These functions take the same arguments as
.Fn sysctl_dispatch
with the additional requirement that the
.Fa namelen
argument must be 1 and the
.Fa name
argument must point to an integer valued either
.Dv CTL_CREATE
or
.Dv CTL_CREATESYM
when creating a new node, or
.Dv CTL_DESTROY
when destroying
a node.
.Pp
The
.Fa newp
and
.Fa newlen
arguments should point to a copy of the node to be created or
destroyed.
If the create or destroy operation was successful, a copy of the node
created or destroyed will be placed in the space indicated by
.Fa oldp
and
.Fa oldlenp .
If the create operation fails because of a conflict with an existing
node, a copy of that node will be returned instead.
.Pp
In order to facilitate the creation and destruction of nodes from a
given tree by kernel subsystems, the functions
.Fn sysctl_createv
and
.Fn sysctl_destroyv
are provided.
These functions take care of the overhead of filling in the contents
of the create or destroy request, dealing with locking, locating the
appropriate parent node, etc.
.Pp
The arguments to
.Fn sysctl_createv
are used to construct the new node.
If the
.Fa log
argument is not
.Dv NULL ,
a
.Em sysctllog
structure will be allocated and the pointer referenced
will be changed to address it.
The same log may be used for any number of nodes, provided they are
all inserted into the same tree.
This allows for a series of nodes to be created and later removed from
the tree in a single transaction (via
.Fn sysctl_teardown )
without the need for any record
keeping on the caller's part.
.Pp
The
.Fa cflags
argument is currently unused and must be zero.
The
.Fa rnode
argument must either be
.Dv NULL
or a valid pointer to a reference to the root of the tree into which
the new node must be placed.
If it is
.Dv NULL ,
the main tree will be used.
It is illegal for
.Fa rnode
to refer to a
.Dv NULL
pointer.
If the
.Fa cnode
argument is not
.Dv NULL ,
on return it will be adjusted to point to the address of the new node.
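.Pp
The interplay of the
.Fa log ,
.Fa rnode ,
and
.Fa cnode
arguments might be sketched as follows.
This is an illustration under stated assumptions rather than a
definitive recipe: the names
.Va example_log ,
.Va example_value ,
and
.Fn example_sysctl_attach
are hypothetical, the choice of the
.Dq hw
subtree as parent is arbitrary, and
.Dv CTL_HW
and
.Dv CTLFLAG_READWRITE
are taken from
.In sys/sysctl.h
and are not otherwise described in this page.
The remaining arguments are explained in the paragraphs that follow.
.Pp
.Bd -literal -offset indent -compact
/* Hypothetical subsystem state. */
static struct sysctllog *example_log;
static int example_value = 10;

static int
example_sysctl_attach(void)
{
	const struct sysctlnode *node = NULL;
	int error;

	/*
	 * Parent node below "hw" in the main tree (rnode is
	 * NULL).  CTL_CREATE requests a dynamic MIB number.
	 * The new node is returned through cnode.
	 */
	error = sysctl_createv(\*[Am]example_log, 0, NULL, \*[Am]node,
	    0, CTLTYPE_NODE, "example", "Example subtree",
	    NULL, 0, NULL, 0,
	    CTL_HW, CTL_CREATE, CTL_EOL);
	if (error)
		return (error);

	/*
	 * Integer leaf below the node just created; the path
	 * is now relative to rnode.
	 */
	error = sysctl_createv(\*[Am]example_log, 0, \*[Am]node, NULL,
	    CTLFLAG_READWRITE, CTLTYPE_INT, "value",
	    "Example tunable",
	    NULL, 0, \*[Am]example_value, 0,
	    CTL_CREATE, CTL_EOL);
	if (error)
		sysctl_teardown(\*[Am]example_log); /* undo parent */
	return (error);
}
.Ed
.Pp
Because both nodes are recorded in the same log, a matching detach
routine would only need to call
.Fn sysctl_teardown
on
.Va example_log .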
.Pp
The
.Fa flags
and
.Fa type
arguments are combined into the
.Fa sysctl_flags
field, and the current value for
.Dv SYSCTL_VERSION
is added in.
The following types are defined:
.Bl -tag -width ".Dv CTLTYPE_STRING " -offset indent
.It Dv CTLTYPE_NODE
A node intended to be a parent for other nodes.
.It Dv CTLTYPE_INT
A signed integer.
.It Dv CTLTYPE_STRING
A NUL-terminated string.
.It Dv CTLTYPE_QUAD
An unsigned 64-bit integer.
.It Dv CTLTYPE_STRUCT
A structure.
.It Dv CTLTYPE_BOOL
A boolean.
.El
.Pp
The
.Fa namep
argument is copied into the
.Fa sysctl_name
field and must be less than
.Dv SYSCTL_NAMELEN
characters in length.
The string indicated by
.Fa desc
will be copied if the
.Dv CTLFLAG_OWNDESC
flag is set, and will be used as the node's description.
.Pp
Two additional remarks:
.Bl -enum -offset indent
.It
The
.Dv CTLFLAG_PERMANENT
flag can only be set from SYSCTL setup routines (see
.Sx SETUP FUNCTIONS )
as called by
.Fn sysctl_init .
.It
If
.Fn sysctl_destroyv
attempts to delete a node that does not own its own description (and
is not marked as permanent), but the deletion fails, the description
will be copied and
.Fn sysctl_destroyv
will set the
.Dv CTLFLAG_OWNDESC
flag.
.El
.Pp
The
.Fa func
argument is the name of a
.Dq helper
function (see
.Sx HELPER FUNCTIONS AND MACROS ) .
If the
.Dv CTLFLAG_IMMEDIATE
flag is set, the
.Fa qv
argument will be interpreted as the initial value for the new
.Dq int
or
.Dq quad
node.
This flag does not apply to any other type of node.
The
.Fa newp
and
.Fa newlen
arguments describe the data external to SYSCTL that is to be
instrumented.
One of
.Fa func ,
.Fa qv
and the
.Dv CTLFLAG_IMMEDIATE
flag, or
.Fa newp
and
.Fa newlen
must be given for nodes that instrument data, otherwise an error is
returned.
.Pp
The remaining arguments are a list of integers specifying the path
through the MIB to the node being created.
The list must be terminated by the
.Dv CTL_EOL
value.
The penultimate value in the list may be
.Dv CTL_CREATE
if a dynamic MIB entry is to be made for this node.
.Fn sysctl_createv
specifically does not support
.Dv CTL_CREATESYM ,
since setup routines are
expected to be able to use the in-kernel
.Xr ksyms 4
interface to discover the location of the data to be instrumented.
If the node to be created matches a node that already exists, a return
code of 0 is given, indicating success.
.Pp
When using
.Fn sysctl_destroyv
to destroy a given node, the
.Fa rnode
argument, if not
.Dv NULL ,
is taken to be the root of the tree from which
the node is to be destroyed, otherwise the main tree is used.
The rest of the arguments are a list of integers specifying the path
through the MIB to the node being destroyed.
If the node being destroyed does not exist, a successful return code
is given.
Nodes marked with the
.Dv CTLFLAG_PERMANENT
flag cannot be destroyed.
.Sh HELPER FUNCTIONS AND MACROS
Helper functions are invoked with the same common argument set as
.Fn sysctl_dispatch
except that the
.Fa rnode
argument will never be
.Dv NULL .
It will be set to point to the node that corresponds most closely to
the current request.
Helpers are forbidden from modifying the node they are passed; they
should instead copy the structure if changes are required in order to
effect access control or other checks.
The prototype and body of a
.Dq helper
function that needs to ensure that a newly assigned
value is within a certain range (presuming external data) would look
like the following:
.Pp
.Bd -literal -offset indent -compact
static int sysctl_helper(SYSCTLFN_PROTO);

static int
sysctl_helper(SYSCTLFN_ARGS)
{
	struct sysctlnode node;
	int t, error;

	/* Work on a copy; the node passed in must not be modified. */
	node = *rnode;
	/* Seed the temporary so that reads return the current value. */
	t = *(int *)rnode-\*[Gt]sysctl_data;
	node.sysctl_data = \*[Am]t;
	error = sysctl_lookup(SYSCTLFN_CALL(\*[Am]node));
	/* Done on error, or if no new value was supplied. */
	if (error || newp == NULL)
		return (error);

	/* Reject new values outside the permitted range. */
	if (t \*[Lt] 0 || t \*[Gt] 20)
		return (EINVAL);

	/* Commit the validated value to the external data. */
	*(int *)rnode-\*[Gt]sysctl_data = t;
	return (0);
}
.Ed
.Pp
The use of the
.Dv SYSCTLFN_PROTO ,
.Dv SYSCTLFN_ARGS ,
and
.Dv SYSCTLFN_CALL
macros ensures that all arguments are passed properly.
The single argument to the
.Dv SYSCTLFN_CALL
macro is the pointer to the node being examined.
.Pp
Three basic helper functions are available for use.
.Fn sysctl_needfunc
will emit a warning to the system console whenever it is invoked and
provides a simplistic read-only interface to the given node.
.Fn sysctl_notavail
will forward
.Dq queries
to
.Fn sysctl_query
so that subtrees can be discovered, but will return
.Er EOPNOTSUPP
for any other condition.
.Fn sysctl_null
specifically ignores any arguments given, sets the value indicated by
.Fa oldlenp
to zero, and returns success.
.Sh SETUP FUNCTIONS
Though nodes can be added to the SYSCTL tree at any time, in order to
add nodes during the kernel bootstrap phase, a proper
.Dq setup
function must be used.
Setup functions are declared using the
.Dv SYSCTL_SETUP
macro, which takes the name of the function and a short string
description of the function as arguments.
.Po
See the
.Dv SYSCTL_DEBUG_SETUP
kernel configuration in
.Xr options 4 .
.Pc
The address of the function is added to a list of functions that
.Fn sysctl_init
traverses during initialization.
.Pp
Setup functions do not have to add nodes to the main tree, but can set
up their own trees for emulation or other purposes.
Emulations that require use of a main tree but with some nodes changed
to suit their own purposes can arrange to overlay a sparse private
tree onto their main tree by making the
.Fa e_sysctlovly
member of their struct emul definition point to the overlaid tree.
.Pp
Setup functions should take care to create all nodes from the root
down to the subtree they are creating, since the order in which setup
functions are called is arbitrary (it is determined only by the
ordering of the object files as passed to the linker when the kernel
is built).
.Sh MISCELLANEOUS FUNCTIONS
.Fn sysctl_init
is called early in the kernel bootstrap process.
It initializes the SYSCTL lock, calls all the registered setup
functions, and marks the tree as permanent.
.Pp
.Fn sysctl_free
will unconditionally delete any and all nodes below the given node.
Its intended use is for the deletion of entire trees, not subtrees.
If a subtree is to be removed,
.Fn sysctl_destroy
or
.Fn sysctl_destroyv
should be used to ensure that nodes not owned by the sub-system being
deactivated are not mistakenly destroyed.
The SYSCTL lock must be held when calling this function.
.Pp
.Fn sysctl_teardown
unwinds a
.Em sysctllog
and deletes the nodes in the reverse of the order in
which they were created.
.Pp
.Fn old_sysctl
provides an interface similar to the old SYSCTL implementation, with
the exception that access checks on a per-node basis are performed if
the
.Fa l
argument is
.No non- Ns Dv NULL .
If
.Fa l
is
.Dv NULL ,
the values for
.Fa newp
and
.Fa oldp
are interpreted as kernel addresses, and access is performed as for
the superuser.
.Sh NOTES
It is expected that nodes will be added to (or removed from) the tree
during the following stages of a machine's lifetime:
.Pp
.Bl -bullet -compact
.It
initialization -- when the kernel is booting
.It
autoconfiguration -- when devices are being probed at boot time
.It
.Dq plug and play
device attachment -- when a PC-Card, USB, or other device is plugged
in or attached
.It
module initialization -- when a module is being loaded
.It
.Dq run-time
-- when a process creates a node via the
.Xr sysctl 3
interface
.El
.Pp
Nodes marked with
.Dv CTLFLAG_PERMANENT
can only be added to a tree during the first or initialization phase,
and can never be removed.
The initialization phase terminates when the main tree's root is
marked with the
.Dv CTLFLAG_PERMANENT
flag.
Once the main tree is marked in this manner, no nodes can be added to
any tree that is marked with
.Dv CTLFLAG_READONLY
at its root, and no nodes can be added at all if the main tree's root
is so marked.
.Pp
Nodes added by device drivers, modules, and at device insertion time can
be added to (and removed from)
.Dq read-only
parent nodes.
.Pp
Nodes created by processes can only be added to
.Dq writable
parent nodes.
See
.Xr sysctl 3
for a description of the flags that may be used
when creating nodes.
.Sh SEE ALSO
.Xr sysctl 3
.Sh HISTORY
The dynamic SYSCTL implementation first appeared in
.Nx 2.0 .
.Sh AUTHORS
.An Andrew Brown
.Aq atatat@NetBSD.org
designed and implemented the dynamic SYSCTL implementation.