.\"
.\" Copyright (c) 2006, 2007
.\"     The DragonFly Project.  All rights reserved.
.\"
.\" Redistribution and use in source and binary forms, with or without
.\" modification, are permitted provided that the following conditions
.\" are met:
.\"
.\" 1. Redistributions of source code must retain the above copyright
.\"    notice, this list of conditions and the following disclaimer.
.\" 2. Redistributions in binary form must reproduce the above copyright
.\"    notice, this list of conditions and the following disclaimer in
.\"    the documentation and/or other materials provided with the
.\"    distribution.
.\" 3. Neither the name of The DragonFly Project nor the names of its
.\"    contributors may be used to endorse or promote products derived
.\"    from this software without specific, prior written permission.
.\"
.\" THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
.\" ``AS IS'' AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
.\" LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS
.\" FOR A PARTICULAR PURPOSE ARE DISCLAIMED.  IN NO EVENT SHALL THE
.\" COPYRIGHT HOLDERS OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT,
.\" INCIDENTAL, SPECIAL, EXEMPLARY OR CONSEQUENTIAL DAMAGES (INCLUDING,
.\" BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES;
.\" LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED
.\" AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY,
.\" OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT
.\" OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
.\" SUCH DAMAGE.
.\"
.Dd June 20, 2015
.Dt VKERNEL 7
.Os
.Sh NAME
.Nm vkernel ,
.Nm vcd ,
.Nm vkd ,
.Nm vke
.Nd virtual kernel architecture
.Sh SYNOPSIS
.Cd "platform vkernel64 # for 64 bit vkernels"
.Cd "device vcd"
.Cd "device vkd"
.Cd "device vke"
.Pp
.Pa /var/vkernel/boot/kernel/kernel
.Op Fl hdsUv
.Op Fl c Ar file
.Op Fl e Ar name Ns = Ns Li value : Ns Ar name Ns = Ns Li value : Ns ...
.Op Fl i Ar file
.Op Fl I Ar interface Ns Op Ar :address1 Ns Oo Ar :address2 Oc Ns Oo Ar /netmask Oc Ns Oo Ar =MAC Oc
.Op Fl l Ar cpulock
.Op Fl m Ar size
.Op Fl n Ar numcpus Ns Op Ar :lbits Ns Oo Ar :cbits Oc
.Op Fl p Ar pidfile
.Op Fl r Ar file Ns Op Ar :serno
.Op Fl R Ar file Ns Op Ar :serno
.Sh DESCRIPTION
The
.Nm
architecture allows for running
.Dx
kernels in userland.
.Pp
The following options are available:
.Bl -tag -width ".Fl m Ar size"
.It Fl c Ar file
Specify a read-only CD-ROM image
.Ar file
to be used by the kernel, with the first
.Fl c
option defining
.Li vcd0 ,
the second one
.Li vcd1 ,
and so on.
The first
.Fl r ,
.Fl R ,
or
.Fl c
option specified on the command line will be the boot disk.
The CD9660 filesystem is assumed when booting from this medium.
.It Fl d
Disable the hardware pagetable for
.Nm .
.It Fl e Ar name Ns = Ns Li value : Ns Ar name Ns = Ns Li value : Ns ...
Specify an environment to be used by the kernel.
This option can be specified more than once.
.It Fl h
Show a list of available options, each with a short description.
.It Fl i Ar file
Specify a memory image
.Ar file
to be used by the virtual kernel.
If no
.Fl i
option is given, the kernel will generate a name of the form
.Pa /var/vkernel/memimg.XXXXXX ,
with the trailing
.Ql X Ns s
being replaced by a sequential number, e.g.\&
.Pa memimg.000001 .
.It Fl I Ar interface Ns Op Ar :address1 Ns Oo Ar :address2 Oc Ns Oo Ar /netmask Oc Ns Oo Ar =MAC Oc
Create a virtual network device, with the first
.Fl I
option defining
.Li vke0 ,
the second one
.Li vke1 ,
and so on.
.Pp
The
.Ar interface
argument is the name of a
.Xr tap 4
device node or the path to a
.Xr vknetd 8
socket.
The
.Pa /dev/
path prefix does not have to be specified and will be automatically prepended
for a device node.
Specifying
.Cm auto
will pick the first unused
.Xr tap 4
device.
.Pp
The
.Ar address1
and
.Ar address2
arguments are the IP addresses of the
.Xr tap 4
and
.Nm vke
interfaces.
Optionally,
.Ar address1
may be of the form
.Li bridge Ns Em X
in which case the
.Xr tap 4
interface is added to the specified
.Xr bridge 4
interface.
The
.Nm vke
address is not assigned until the interface is brought up in the guest.
.Pp
The
.Ar netmask
argument applies to all interfaces for which an address is specified.
.Pp
The
.Ar MAC
argument is the MAC address of the
.Xr vke 4
interface.
If not specified, a pseudo-random one will be generated.
.Pp
When running multiple vkernels it is often more convenient to simply
connect to a
.Xr vknetd 8
socket and let vknetd deal with the tap and/or bridge.
An example of this would be
.Pa /var/run/vknet:0.0.0.0:10.2.0.2/16 .
.It Fl l Ar cpulock
Specify which, if any, real CPUs to lock virtual CPUs to.
.Ar cpulock
is one of
.Cm any ,
.Cm map Ns Op Ns , Ns Ar startCPU ,
or
.Ar CPU .
.Pp
.Cm any
does not map virtual CPUs to real CPUs.
This is the default.
.Pp
.Cm map Ns Op Ns , Ns Ar startCPU
maps each virtual CPU to a real CPU starting with real CPU 0 or
.Ar startCPU
if specified.
.Pp
.Ar CPU
locks all virtual CPUs to the real CPU specified by
.Ar CPU .
.It Fl m Ar size
Specify the amount of memory to be used by the kernel in bytes,
.Cm K
.Pq kilobytes ,
.Cm M
.Pq megabytes
or
.Cm G
.Pq gigabytes .
Lowercase versions of
.Cm K , M ,
and
.Cm G
are allowed.
.It Fl n Ar numcpus Ns Op Ar :lbits Ns Oo Ar :cbits Oc
.Ar numcpus
specifies the number of CPUs you wish to emulate.
Up to 16 CPUs are supported with 2 being the default unless otherwise
specified.
.Pp
.Ar lbits
specifies the number of bits within APICID(=CPUID) needed for representing
the logical ID.
It controls the number of threads per core (0 bits - 1 thread, 1 bit - 2 threads).
This parameter is optional (mandatory only if
.Ar cbits
is specified).
.Pp
.Ar cbits
specifies the number of bits within APICID(=CPUID) needed for representing
the core ID.
It controls the number of cores per package (0 bits - 1 core, 1 bit - 2 cores).
This parameter is optional.
.It Fl p Ar pidfile
Specify a pidfile in which to store the process ID.
Scripts can use this file to locate the vkernel pid for the purpose of
shutting down or killing it.
.Pp
The vkernel will hold a lock on the pidfile while running.
Scripts may test for the lock to determine if the pidfile is valid or
stale so as to avoid accidentally killing a random process.
A command such as
.Ql /usr/bin/lockf -ks -t 0 pidfile echo -n
may be used to test the lock.
A non-zero exit code indicates that the pidfile represents a running
vkernel.
.Pp
An error is issued and the vkernel exits if this file cannot be opened for
writing or if it is already locked by an active vkernel process.
.It Fl r Ar file Ns Op Ar :serno
Specify a R/W disk image
.Ar file
to be used by the kernel, with the first
.Fl r
option defining
.Li vkd0 ,
the second one
.Li vkd1 ,
and so on.
A serial number for the virtual disk can be specified in
.Ar serno .
.Pp
The first
.Fl r ,
.Fl R ,
or
.Fl c
option specified on the command line will be the boot disk.
.It Fl R Ar file Ns Op Ar :serno
Works like
.Fl r
but treats the disk image as copy-on-write.
This allows a private copy of the image to be modified but does not
modify the image file.
The image file will not be locked in this situation and multiple
vkernels can run off the same image file if desired.
.Pp
Since modifications are thrown away, any data you wish
to retain across invocations needs to be exported over
the network prior to shutdown.
This gives you the flexibility to mount the disk image
either read-only or read-write depending on what is
convenient.
However, keep in mind that when mounting a COW image
read-write, modifications will eat system memory and
swap space until the vkernel is shut down.
.It Fl s
Boot into single-user mode.
.It Fl U
Enable writing to kernel memory and module loading.
By default, those are disabled for security reasons.
.It Fl v
Turn on verbose booting.
.El
.Sh DEVICES
A number of virtual device drivers exist to supplement the virtual kernel.
.Ss Disk device
The
.Nm vkd
driver allows for up to 16
.Xr vn 4
based disk devices.
The root device will be
.Li vkd0
(see
.Sx EXAMPLES
for further information on how to prepare a root image).
.Ss CD-ROM device
The
.Nm vcd
driver allows for up to 16 virtual CD-ROM devices.
Basically this is a read-only
.Nm vkd
device with a block size of 2048.
.Ss Network interface
The
.Nm vke
driver supports up to 16 virtual network interfaces which are associated with
.Xr tap 4
devices on the host.
For each
.Nm vke
device, the per-interface read-only
.Xr sysctl 3
variable
.Va hw.vke Ns Em X Ns Va .tap_unit
holds the unit number of the associated
.Xr tap 4
device.
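.Pp
As a minimal sketch, the variable can be read with
.Xr sysctl 8 .
This assumes the command is run inside the virtual kernel and that a
.Li vke0
interface was configured there; substitute the unit number of interest.
.Bd -literal
# Assumption: run in the guest, which has a vke0 interface.
# Prints only the value, the unit number of the associated tap device.
sysctl -n hw.vke0.tap_unit
.Ed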
.Pp
By default, half of the total mbuf clusters available is distributed equally
among all the vke devices up to 256.
This can be overridden with the tunable
.Va hw.vke.max_ringsize .
Note that the value passed is rounded down to a power of two.
.Sh SIGNALS
The virtual kernel only enables
.Dv SIGQUIT
and
.Dv SIGTERM
while operating in regular console mode.
Sending
.Ql \&^\e
.Pq Dv SIGQUIT
to the virtual kernel causes the virtual kernel to enter its internal
.Xr ddb 4
debugger and re-enable all other terminal signals.
Sending
.Dv SIGTERM
to the virtual kernel triggers a clean shutdown by passing a
.Dv SIGUSR2
to the virtual kernel's
.Xr init 8
process.
.Sh DEBUGGING
It is possible to attach
.Xr gdb 1
directly to the virtual kernel's process.
It is recommended that you do a
.Ql handle SIGSEGV noprint
to ignore page faults processed by the virtual kernel itself and
.Ql handle SIGUSR1 noprint
to ignore signals used for simulating inter-processor interrupts.
.Sh PROFILING
To compile a vkernel with profiling support, the
.Va CONFIGARGS
variable needs to be used to pass
.Fl p
to
.Xr config 8 .
.Bd -literal
cd /usr/src
make -DNO_MODULES CONFIGARGS=-p buildkernel KERNCONF=VKERNEL64
.Ed
.Sh FILES
.Bl -tag -width ".It Pa /sys/config/VKERNEL64" -compact
.It Pa /dev/vcdX
.Nm vcd
device nodes
.It Pa /dev/vkdX
.Nm vkd
device nodes
.It Pa /sys/config/VKERNEL64
.Nm
configuration file, for
.Xr config 8 .
.El
.Sh CONFIGURATION FILES
Your virtual kernel is a complete
.Dx
system, but you might not want to run all the services a normal kernel runs.
Here is what a typical virtual kernel's
.Pa /etc/rc.conf
file looks like, with some additional possibilities commented out.
.Bd -literal
hostname="vkernel"
network_interfaces="lo0 vke0"
ifconfig_vke0="DHCP"
sendmail_enable="NO"
#syslog_enable="NO"
blanktime="NO"
.Ed
.Sh BOOT DRIVE SELECTION
You can override the default boot drive selection and filesystem
using a kernel environment variable.
Note that the filesystem selected must be compiled into the vkernel
and not loaded as a module.
You need to escape some quotes around the variable data
to avoid misinterpretation of the colon in the
.Fl e
option.
For example:
.Pp
.Fl e
vfs.root.mountfrom=\\"hammer:vkd0s1d\\"
.Sh DISKLESS OPERATION
To boot a
.Nm
from an NFS root, a number of tunables need to be set:
.Bl -tag -width indent
.It Va boot.netif.ip
IP address to be set in the vkernel interface.
.It Va boot.netif.netmask
Netmask for the IP to be set.
.It Va boot.netif.name
Network interface name inside the vkernel.
.It Va boot.nfsroot.server
Host running
.Xr nfsd 8 .
.It Va boot.nfsroot.path
Path on the host where the world and distribution
targets have been installed.
.El
.Pp
See an example on how to boot a diskless
.Nm
in the
.Sx EXAMPLES
section.
.Sh EXAMPLES
A couple of steps are necessary in order to prepare the system to build and
run a virtual kernel.
.Ss Setting up the filesystem
The
.Nm
architecture needs a number of files which reside in
.Pa /var/vkernel .
Since these files tend to get rather big and the
.Pa /var
partition is usually of limited size, we recommend creating the directory
in the
.Pa /home
partition with a link to it in
.Pa /var :
.Bd -literal
mkdir -p /home/var.vkernel/boot
ln -s /home/var.vkernel /var/vkernel
.Ed
.Pp
Next, a filesystem image to be used by the virtual kernel has to be
created and populated (assuming world has been built previously).
If the image is created on a UFS filesystem you might want to pre-zero it.
On a HAMMER filesystem you should just truncate-extend to the image size
as HAMMER does not re-use data blocks already present in the file.
.Bd -literal
vnconfig -c -S 2g -T vn0 /var/vkernel/rootimg.01
disklabel -r -w vn0s0 auto
disklabel -e vn0s0 # add `a' partition with fstype `4.2BSD'
newfs /dev/vn0s0a
mount /dev/vn0s0a /mnt
cd /usr/src
make installworld DESTDIR=/mnt
cd etc
make distribution DESTDIR=/mnt
echo '/dev/vkd0s0a / ufs rw 1 1' >/mnt/etc/fstab
echo 'proc /proc procfs rw 0 0' >>/mnt/etc/fstab
.Ed
.Pp
Edit
.Pa /mnt/etc/ttys
and replace the
.Li console
entry with the following line and turn off all other gettys.
.Bd -literal
console "/usr/libexec/getty Pc" cons25 on secure
.Ed
.Pp
Replace
.Li \&Pc
with
.Li al.Pc
if you would like to automatically log in as root.
.Pp
Then, unmount the disk.
.Bd -literal
umount /mnt
vnconfig -u vn0
.Ed
.Ss Compiling the virtual kernel
In order to compile a virtual kernel use the
.Li VKERNEL64
kernel configuration file residing in
.Pa /sys/config
(or a configuration file derived from it):
.Bd -literal
cd /usr/src
make -DNO_MODULES buildkernel KERNCONF=VKERNEL64
make -DNO_MODULES installkernel KERNCONF=VKERNEL64 DESTDIR=/var/vkernel
.Ed
.Ss Enabling virtual kernel operation
A special
.Xr sysctl 8 ,
.Va vm.vkernel_enable ,
must be set to enable
.Nm
operation:
.Bd -literal
sysctl vm.vkernel_enable=1
.Ed
.Ss Configuring the network on the host system
In order to access a network interface of the host system from the
.Nm ,
you must add the interface to a
.Xr bridge 4
device which will then be passed to the
.Fl I
option:
.Bd -literal
kldload if_bridge.ko
kldload if_tap.ko
ifconfig bridge0 create
ifconfig bridge0 addm re0 # assuming re0 is the host's interface
ifconfig bridge0 up
.Ed
.Ss Running the kernel
Finally,
the virtual kernel can be run:
.Bd -literal
cd /var/vkernel
\&./boot/kernel/kernel -m 64m -r rootimg.01 -I auto:bridge0
.Ed
.Pp
You can issue the
.Xr reboot 8 ,
.Xr halt 8 ,
or
.Xr shutdown 8
commands from inside a virtual kernel.
After doing a clean shutdown the
.Xr reboot 8
command will re-exec the virtual kernel binary while the other two will
cause the virtual kernel to exit.
.Ss Diskless operation
Booting a
.Nm
with a
.Xr vknetd 8
network configuration:
.Bd -literal
\&./boot/kernel/kernel -m 64m -i memimg.0000 -I /var/run/vknet \e
        -e boot.netif.ip=172.1.0.4 \e
        -e boot.netif.netmask=255.255.0.0 \e
        -e boot.netif.name=vke0 \e
        -e boot.nfsroot.server=172.1.0.1 \e
        -e boot.nfsroot.path=/home/vkernel/vkdiskless
.Ed
.Sh BUILDING THE WORLD UNDER A VKERNEL
The virtual kernel platform does not have all the header files expected
by a world build, so the easiest thing to do right now is to specify a
pc64 (in a 64-bit vkernel) target when building the world under a virtual
kernel, like this:
.Bd -literal
vkernel# make MACHINE_PLATFORM=pc64 buildworld
vkernel# make MACHINE_PLATFORM=pc64 installworld
.Ed
.Sh SEE ALSO
.Xr vknet 1 ,
.Xr bridge 4 ,
.Xr ifmedia 4 ,
.Xr tap 4 ,
.Xr vn 4 ,
.Xr sysctl.conf 5 ,
.Xr build 7 ,
.Xr config 8 ,
.Xr disklabel 8 ,
.Xr ifconfig 8 ,
.Xr vknetd 8 ,
.Xr vnconfig 8
.Rs
.%A Aggelos Economopoulos
.%D March 2007
.%T "A Peek at the DragonFly Virtual Kernel"
.Re
.Sh HISTORY
Virtual kernels were introduced in
.Dx 1.7 .
.Sh AUTHORS
.An -nosplit
.An Matt Dillon
thought up and implemented the
.Nm
architecture and wrote the
.Nm vkd
device driver.
.An Sepherosa Ziehau
wrote the
.Nm vke
device driver.
This manual page was written by
.An Sascha Wildner .