.\"
.\" Copyright (c) 2006, 2007
.\"	The DragonFly Project.  All rights reserved.
.\"
.\" Redistribution and use in source and binary forms, with or without
.\" modification, are permitted provided that the following conditions
.\" are met:
.\"
.\" 1. Redistributions of source code must retain the above copyright
.\"    notice, this list of conditions and the following disclaimer.
.\" 2. Redistributions in binary form must reproduce the above copyright
.\"    notice, this list of conditions and the following disclaimer in
.\"    the documentation and/or other materials provided with the
.\"    distribution.
.\" 3. Neither the name of The DragonFly Project nor the names of its
.\"    contributors may be used to endorse or promote products derived
.\"    from this software without specific, prior written permission.
.\"
.\" THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
.\" ``AS IS'' AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
.\" LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS
.\" FOR A PARTICULAR PURPOSE ARE DISCLAIMED.  IN NO EVENT SHALL THE
.\" COPYRIGHT HOLDERS OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT,
.\" INCIDENTAL, SPECIAL, EXEMPLARY OR CONSEQUENTIAL DAMAGES (INCLUDING,
.\" BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES;
.\" LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED
.\" AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY,
.\" OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT
.\" OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
.\" SUCH DAMAGE.
.\"
.Dd June 20, 2015
.Dt VKERNEL 7
.Os
.Sh NAME
.Nm vkernel ,
.Nm vcd ,
.Nm vkd ,
.Nm vke
.Nd virtual kernel architecture
.Sh SYNOPSIS
.Cd "platform vkernel64 # for 64 bit vkernels"
.Cd "device vcd"
.Cd "device vkd"
.Cd "device vke"
.Pp
.Pa /var/vkernel/boot/kernel/kernel
.Op Fl hdsUv
.Op Fl c Ar file
.Op Fl e Ar name Ns = Ns Li value : Ns Ar name Ns = Ns Li value : Ns ...
.Op Fl i Ar file
.Op Fl I Ar interface Ns Op Ar :address1 Ns Oo Ar :address2 Oc Ns Oo Ar /netmask Oc Ns Oo Ar =MAC Oc
.Op Fl l Ar cpulock
.Op Fl m Ar size
.Op Fl n Ar numcpus Ns Op Ar :lbits Ns Oo Ar :cbits Oc
.Op Fl p Ar pidfile
.Op Fl r Ar file Ns Op Ar :serno
.Op Fl R Ar file Ns Op Ar :serno
.Sh DESCRIPTION
The
.Nm
architecture allows for running
.Dx
kernels in userland.
.Pp
The following options are available:
.Bl -tag -width ".Fl m Ar size"
.It Fl c Ar file
Specify a read-only CD-ROM image
.Ar file
to be used by the kernel, with the first
.Fl c
option defining
.Li vcd0 ,
the second one
.Li vcd1 ,
and so on.
The first
.Fl r ,
.Fl R ,
or
.Fl c
option specified on the command line will be the boot disk.
The CD9660 filesystem is assumed when booting from this medium.
.It Fl d
Disables hardware pagetable for
.Nm .
.It Fl e Ar name Ns = Ns Li value : Ns Ar name Ns = Ns Li value : Ns ...
Specify an environment to be used by the kernel.
This option can be specified more than once.
.It Fl h
Shows a list of available options, each with a short description.
.It Fl i Ar file
Specify a memory image
.Ar file
to be used by the virtual kernel.
If no
.Fl i
option is given, the kernel will generate a name of the form
.Pa /var/vkernel/memimg.XXXXXX ,
with the trailing
.Ql X Ns s
being replaced by a sequential number, e.g.\&
.Pa memimg.000001 .
.It Fl I Ar interface Ns Op Ar :address1 Ns Oo Ar :address2 Oc Ns Oo Ar /netmask Oc Ns Oo Ar =MAC Oc
Create a virtual network device, with the first
.Fl I
option defining
.Li vke0 ,
the second one
.Li vke1 ,
and so on.
.Pp
The
.Ar interface
argument is the name of a
.Xr tap 4
device node or the path to a
.Xr vknetd 8
socket.
The
.Pa /dev/
path prefix does not have to be specified and will be automatically prepended
for a device node.
Specifying
.Cm auto
will pick the first unused
.Xr tap 4
device.
.Pp
The
.Ar address1
and
.Ar address2
arguments are the IP addresses of the
.Xr tap 4
and
.Nm vke
interfaces.
Optionally,
.Ar address1
may be of the form
.Li bridge Ns Em X
in which case the
.Xr tap 4
interface is added to the specified
.Xr bridge 4
interface.
The
.Nm vke
address is not assigned until the interface is brought up in the guest.
.Pp
The
.Ar netmask
argument applies to all interfaces for which an address is specified.
.Pp
The
.Ar MAC
argument is the MAC address of the
.Xr vke 4
interface.
If not specified, a pseudo-random one will be generated.
.Pp
When running multiple vkernels it is often more convenient to simply
connect to a
.Xr vknetd 8
socket and let vknetd deal with the tap and/or bridge.
An example of this would be
.Pa /var/run/vknet:0.0.0.0:10.2.0.2/16 .
.It Fl l Ar cpulock
Specify which, if any, real CPUs to lock virtual CPUs to.
.Ar cpulock
is one of
.Cm any ,
.Cm map Ns Op Ns , Ns Ar startCPU ,
or
.Ar CPU .
.Pp
.Cm any
does not map virtual CPUs to real CPUs.
This is the default.
.Pp
.Cm map Ns Op Ns , Ns Ar startCPU
maps each virtual CPU to a real CPU starting with real CPU 0 or
.Ar startCPU
if specified.
.Pp
.Ar CPU
locks all virtual CPUs to the real CPU specified by
.Ar CPU .
.It Fl m Ar size
Specify the amount of memory to be used by the kernel in bytes,
.Cm K
.Pq kilobytes ,
.Cm M
.Pq megabytes
or
.Cm G
.Pq gigabytes .
Lowercase versions of
.Cm K , M ,
and
.Cm G
are allowed.
.It Fl n Ar numcpus Ns Op Ar :lbits Ns Oo Ar :cbits Oc
.Ar numcpus
specifies the number of CPUs you wish to emulate.
Up to 16 CPUs are supported; the default is 2.
.Pp
.Ar lbits
specifies the number of bits within APICID(=CPUID) needed for representing
the logical ID.
It controls the number of threads per core (0 bits: 1 thread, 1 bit: 2 threads).
This parameter is optional, but must be given if
.Ar cbits
is specified.
.Pp
.Ar cbits
specifies the number of bits within APICID(=CPUID) needed for representing
the core ID.
It controls the number of cores per package (0 bits: 1 core, 1 bit: 2 cores).
This parameter is optional.
.It Fl p Ar pidfile
Specify a pidfile in which to store the process ID.
Scripts can use this file to locate the vkernel pid for the purpose of
shutting down or killing it.
.Pp
The vkernel will hold a lock on the pidfile while running.
Scripts may test for the lock to determine if the pidfile is valid or
stale so as to avoid accidentally killing a random process.
A command such as
.Ql /usr/bin/lockf -ks -t 0 pidfile echo -n
may be used to test the lock.
A non-zero exit code indicates that the pidfile represents a running
vkernel.
.Pp
An error is issued and the vkernel exits if this file cannot be opened for
writing or if it is already locked by an active vkernel process.
.It Fl r Ar file Ns Op Ar :serno
Specify a R/W disk image
.Ar file
to be used by the kernel, with the first
.Fl r
option defining
.Li vkd0 ,
the second one
.Li vkd1 ,
and so on.
A serial number for the virtual disk can be specified in
.Ar serno .
.Pp
The first
.Fl r ,
.Fl R ,
or
.Fl c
option specified on the command line will be the boot disk.
.It Fl R Ar file Ns Op Ar :serno
Works like
.Fl r
but treats the disk image as copy-on-write.
This allows a private copy of the image to be modified but does not
modify the image file.
The image file will not be locked in this situation and multiple
vkernels can run off the same image file if desired.
.Pp
Since modifications are thrown away, any data you wish
to retain across invocations needs to be exported over
the network prior to shutdown.
This gives you the flexibility to mount the disk image
either read-only or read-write depending on what is
convenient.
However, keep in mind that when mounting a COW image
read-write, modifications will eat system memory and
swap space until the vkernel is shut down.
.It Fl s
Boot into single-user mode.
.It Fl U
Enable writing to kernel memory and module loading.
By default, those are disabled for security reasons.
.It Fl v
Turn on verbose booting.
.El
.Sh DEVICES
A number of virtual device drivers exist to supplement the virtual kernel.
.Ss Disk device
The
.Nm vkd
driver allows for up to 16
.Xr vn 4
based disk devices.
The root device will be
.Li vkd0
(see
.Sx EXAMPLES
for further information on how to prepare a root image).
.Ss CD-ROM device
The
.Nm vcd
driver allows for up to 16 virtual CD-ROM devices.
Basically this is a read-only
.Nm vkd
device with a block size of 2048.
.Ss Network interface
The
.Nm vke
driver supports up to 16 virtual network interfaces which are associated with
.Xr tap 4
devices on the host.
For each
.Nm vke
device, the per-interface read-only
.Xr sysctl 3
variable
.Va hw.vke Ns Em X Ns Va .tap_unit
holds the unit number of the associated
.Xr tap 4
device.
.Pp
By default, half of the total mbuf clusters available is distributed equally
among all the vke devices up to 256.
This can be overridden with the tunable
.Va hw.vke.max_ringsize .
Note that the value passed is rounded down to a power of two.
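.Pp
As a minimal sketch, the associated
.Xr tap 4
unit can be read from inside the guest, and the ring size tunable can be
passed through the kernel environment with
.Fl e ,
in the same way as the other tunables described in this page.
The value 128 is only an illustrative assumption:
.Bd -literal
vkernel# sysctl hw.vke0.tap_unit
host# ./boot/kernel/kernel -m 64m -r rootimg.01 -I auto:bridge0 \e
	-e hw.vke.max_ringsize=128
.Ed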
.Sh SIGNALS
The virtual kernel only enables
.Dv SIGQUIT
and
.Dv SIGTERM
while operating in regular console mode.
Sending
.Ql \&^\e
.Pq Dv SIGQUIT
to the virtual kernel causes the virtual kernel to enter its internal
.Xr ddb 4
debugger and re-enable all other terminal signals.
Sending
.Dv SIGTERM
to the virtual kernel triggers a clean shutdown by passing a
.Dv SIGUSR2
to the virtual kernel's
.Xr init 8
process.
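.Pp
The following sketch combines the lock test described for the
.Fl p
option with a
.Dv SIGTERM
shutdown.
The pidfile path is a hypothetical example and must match the one given to
.Fl p :
.Bd -literal
pidfile=/var/run/vkernel.pid	# assumed path, as passed to -p
# lockf exits non-zero while a running vkernel holds the lock
if ! /usr/bin/lockf -ks -t 0 "$pidfile" echo -n; then
	kill -TERM "$(cat "$pidfile")"
fi
.Ed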
.Sh DEBUGGING
It is possible to attach
.Xr gdb 1
directly to the virtual kernel's process.
It is recommended that you do a
.Ql handle SIGSEGV noprint
to ignore page faults processed by the virtual kernel itself and
.Ql handle SIGUSR1 noprint
to ignore signals used for simulating inter-processor interrupts.
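.Pp
A session might look like the following sketch, which assumes that the
vkernel was started with
.Fl p Pa /var/run/vkernel.pid
(a hypothetical path):
.Bd -literal
gdb /var/vkernel/boot/kernel/kernel `cat /var/run/vkernel.pid`
(gdb) handle SIGSEGV noprint
(gdb) handle SIGUSR1 noprint
(gdb) continue
.Ed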
.Sh PROFILING
To compile a vkernel with profiling support, the
.Va CONFIGARGS
variable needs to be used to pass
.Fl p
to
.Xr config 8 .
.Bd -literal
cd /usr/src
make -DNO_MODULES CONFIGARGS=-p buildkernel KERNCONF=VKERNEL64
.Ed
.Sh FILES
.Bl -tag -width ".It Pa /sys/config/VKERNEL64" -compact
.It Pa /dev/vcdX
.Nm vcd
device nodes
.It Pa /dev/vkdX
.Nm vkd
device nodes
.It Pa /sys/config/VKERNEL64
.Nm
configuration file, for
.Xr config 8 .
.El
.Sh CONFIGURATION FILES
Your virtual kernel is a complete
.Dx
system, but you might not want to run all the services a normal kernel runs.
Here is what a typical virtual kernel's
.Pa /etc/rc.conf
file looks like, with some additional possibilities commented out.
.Bd -literal
hostname="vkernel"
network_interfaces="lo0 vke0"
ifconfig_vke0="DHCP"
sendmail_enable="NO"
#syslog_enable="NO"
blanktime="NO"
.Ed
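.Pp
If no DHCP server is reachable, a static address can be configured instead.
The address below is only an assumption, chosen to match the
.Xr vknetd 8
example given for the
.Fl I
option:
.Bd -literal
ifconfig_vke0="inet 10.2.0.2 netmask 255.255.0.0"
.Ed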
.Sh BOOT DRIVE SELECTION
You can override the default boot drive selection and filesystem
using a kernel environment variable.
Note that the filesystem selected must be compiled into the vkernel
and not loaded as a module.
You need to escape the quotes around the variable data
to avoid misinterpretation of the colon in the
.Fl e
option.
For example:
.Pp
.Fl e
vfs.root.mountfrom=\\"hammer:vkd0s1d\\"
.Sh DISKLESS OPERATION
To boot a
.Nm
from an NFS root, a number of tunables need to be set:
.Bl -tag -width indent
.It Va boot.netif.ip
IP address to be set in the vkernel interface.
.It Va boot.netif.netmask
Netmask for the IP to be set.
.It Va boot.netif.name
Network interface name inside the vkernel.
.It Va boot.nfsroot.server
Host running
.Xr nfsd 8 .
.It Va boot.nfsroot.path
Host path where a world and distribution
targets are properly installed.
.El
.Pp
For an example of how to boot a diskless
.Nm ,
see the
.Sx EXAMPLES
section.
.Sh EXAMPLES
A couple of steps are necessary in order to prepare the system to build and
run a virtual kernel.
.Ss Setting up the filesystem
The
.Nm
architecture needs a number of files which reside in
.Pa /var/vkernel .
Since these files tend to get rather big and the
.Pa /var
partition is usually of limited size, we recommend the directory to be
created in the
.Pa /home
partition with a link to it in
.Pa /var :
.Bd -literal
mkdir -p /home/var.vkernel/boot
ln -s /home/var.vkernel /var/vkernel
.Ed
.Pp
Next, a filesystem image to be used by the virtual kernel has to be
created and populated (assuming world has been built previously).
If the image is created on a UFS filesystem you might want to pre-zero it.
On a HAMMER filesystem you should just truncate-extend to the image size
as HAMMER does not re-use data blocks already present in the file.
.Bd -literal
vnconfig -c -S 2g -T vn0 /var/vkernel/rootimg.01
disklabel -r -w vn0s0 auto
disklabel -e vn0s0	# add `a' partition with fstype `4.2BSD'
newfs /dev/vn0s0a
mount /dev/vn0s0a /mnt
cd /usr/src
make installworld DESTDIR=/mnt
cd etc
make distribution DESTDIR=/mnt
echo '/dev/vkd0s0a	/	ufs	rw	1  1' >/mnt/etc/fstab
echo 'proc		/proc	procfs	rw	0  0' >>/mnt/etc/fstab
.Ed
.Pp
Edit
.Pa /mnt/etc/ttys
and replace the
.Li console
entry with the following line and turn off all other gettys.
.Bd -literal
console	"/usr/libexec/getty Pc"		cons25	on  secure
.Ed
.Pp
Replace
.Li \&Pc
with
.Li al.Pc
if you would like to automatically log in as root.
.Pp
Then, unmount the disk.
.Bd -literal
umount /mnt
vnconfig -u vn0
.Ed
.Ss Compiling the virtual kernel
In order to compile a virtual kernel use the
.Li VKERNEL64
kernel configuration file residing in
.Pa /sys/config
(or a configuration file derived from it):
.Bd -literal
cd /usr/src
make -DNO_MODULES buildkernel KERNCONF=VKERNEL64
make -DNO_MODULES installkernel KERNCONF=VKERNEL64 DESTDIR=/var/vkernel
.Ed
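.Pp
A derived configuration can be built the same way.
In this sketch
.Li MYVKERNEL
is a hypothetical name:
.Bd -literal
cd /sys/config
cp VKERNEL64 MYVKERNEL		# then edit MYVKERNEL as needed
cd /usr/src
make -DNO_MODULES buildkernel KERNCONF=MYVKERNEL
make -DNO_MODULES installkernel KERNCONF=MYVKERNEL DESTDIR=/var/vkernel
.Ed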
.Ss Enabling virtual kernel operation
A special
.Xr sysctl 8 ,
.Va vm.vkernel_enable ,
must be set to enable
.Nm
operation:
.Bd -literal
sysctl vm.vkernel_enable=1
.Ed
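.Pp
To keep the setting across reboots of the host, it can be added to
.Xr sysctl.conf 5 ,
for example:
.Bd -literal
echo 'vm.vkernel_enable=1' >>/etc/sysctl.conf
.Ed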
.Ss Configuring the network on the host system
In order to access a network interface of the host system from the
.Nm ,
you must add the interface to a
.Xr bridge 4
device which will then be passed to the
.Fl I
option:
.Bd -literal
kldload if_bridge.ko
kldload if_tap.ko
ifconfig bridge0 create
ifconfig bridge0 addm re0	# assuming re0 is the host's interface
ifconfig bridge0 up
.Ed
.Ss Running the kernel
Finally, the virtual kernel can be run:
.Bd -literal
cd /var/vkernel
\&./boot/kernel/kernel -m 64m -r rootimg.01 -I auto:bridge0
.Ed
.Pp
You can issue the
.Xr reboot 8 ,
.Xr halt 8 ,
or
.Xr shutdown 8
commands from inside a virtual kernel.
After doing a clean shutdown the
.Xr reboot 8
command will re-exec the virtual kernel binary while the other two will
cause the virtual kernel to exit.
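.Pp
The following variations on the command line are sketches only; the CPU
layout, the pidfile path and the addresses are illustrative assumptions:
.Bd -literal
# 4 virtual CPUs as 2 cores with 2 threads each, mapped to real
# CPUs starting at CPU 2
\&./boot/kernel/kernel -m 64m -r rootimg.01 -n 4:1:1 -l map,2
# copy-on-write root image, so several vkernels can share rootimg.01
\&./boot/kernel/kernel -m 64m -R rootimg.01 -I auto:bridge0 \e
	-p /var/run/vkernel1.pid
# networking through a vknetd(8) socket instead of tap(4) and bridge(4)
\&./boot/kernel/kernel -m 64m -r rootimg.01 \e
	-I /var/run/vknet:0.0.0.0:10.2.0.2/16
.Ed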
.Ss Diskless operation
Booting a
.Nm
with a
.Xr vknetd 8
network configuration:
.Bd -literal
\&./boot/kernel/kernel -m 64m -i memimg.0000 -I /var/run/vknet \e
	-e boot.netif.ip=172.1.0.4 \e
	-e boot.netif.netmask=255.255.0.0 \e
	-e boot.netif.name=vke0 \e
	-e boot.nfsroot.server=172.1.0.1 \e
	-e boot.nfsroot.path=/home/vkernel/vkdiskless
.Ed
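.Pp
The NFS root named in
.Va boot.nfsroot.path
can be populated on the host with the same targets used for the disk image.
The
.Xr exports 5
entry in this sketch is an assumption about the host setup and should be
checked against the local configuration:
.Bd -literal
cd /usr/src
make installworld DESTDIR=/home/vkernel/vkdiskless
cd etc
make distribution DESTDIR=/home/vkernel/vkdiskless
# hypothetical /etc/exports entry on the NFS server (172.1.0.1)
echo '/home/vkernel/vkdiskless -maproot=root -network 172.1.0.0 -mask 255.255.0.0' \e
	>>/etc/exports
.Ed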
.Sh BUILDING THE WORLD UNDER A VKERNEL
The virtual kernel platform does not have all the header files expected
by a world build, so the easiest thing to do right now is to specify a
pc64 (in a 64 bit vkernel) target when building the world under a virtual
kernel, like this:
.Bd -literal
vkernel# make MACHINE_PLATFORM=pc64 buildworld
vkernel# make MACHINE_PLATFORM=pc64 installworld
.Ed
.Sh SEE ALSO
.Xr vknet 1 ,
.Xr bridge 4 ,
.Xr ifmedia 4 ,
.Xr tap 4 ,
.Xr vn 4 ,
.Xr sysctl.conf 5 ,
.Xr build 7 ,
.Xr config 8 ,
.Xr disklabel 8 ,
.Xr ifconfig 8 ,
.Xr vknetd 8 ,
.Xr vnconfig 8
.Rs
.%A Aggelos Economopoulos
.%D March 2007
.%T "A Peek at the DragonFly Virtual Kernel"
.Re
.Sh HISTORY
Virtual kernels were introduced in
.Dx 1.7 .
.Sh AUTHORS
.An -nosplit
.An Matt Dillon
thought up and implemented the
.Nm
architecture and wrote the
.Nm vkd
device driver.
.An Sepherosa Ziehau
wrote the
.Nm vke
device driver.
This manual page was written by
.An Sascha Wildner .