.\" Copyright (c) 2017 Rick Macklem
.\"
.\" Redistribution and use in source and binary forms, with or without
.\" modification, are permitted provided that the following conditions
.\" are met:
.\" 1. Redistributions of source code must retain the above copyright
.\"    notice, this list of conditions and the following disclaimer.
.\" 2. Redistributions in binary form must reproduce the above copyright
.\"    notice, this list of conditions and the following disclaimer in the
.\"    documentation and/or other materials provided with the distribution.
.\"
.\" THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND
.\" ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
.\" IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
.\" ARE DISCLAIMED.  IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE
.\" FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
.\" DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
.\" OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
.\" HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
.\" LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
.\" OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
.\" SUCH DAMAGE.
.\"
.\" $FreeBSD$
.\"
.Dd July 10, 2018
.Dt PNFS 4
.Os
.Sh NAME
.Nm pNFS
.Nd NFS Version 4.1 Parallel NFS Protocol
.Sh DESCRIPTION
The NFSv4.1 client and server provide support for the
.Tn pNFS
specification; see
.%T "Network File System (NFS) Version 4 Minor Version 1 Protocol RFC 5661" .
A pNFS service separates Read/Write operations from all other NFSv4.1
operations, which are referred to as Metadata operations.
The Read/Write operations are performed directly on the Data Server (DS)
where the file's data resides, bypassing the NFS server.
All other file operations are performed on the NFS server, which is referred to
as a Metadata Server (MDS).
NFS clients that do not support
.Tn pNFS
perform Read/Write operations on the MDS, which acts as a proxy for the
appropriate DS(s).
.Pp
The NFSv4.1 protocol provides two pieces of information to pNFS aware
clients that allow them to perform Read/Write operations directly on
the DS.
.Pp
The first is DeviceInfo, which is static information defining the DS
server.
The critical piece of information in DeviceInfo for the layout types
supported by FreeBSD is the IP address that is used to perform RPCs on the DS.
It also indicates which version of NFS the DS supports, the I/O size and other
layout specific information.
In the DeviceInfo, there is a DeviceID which, for the FreeBSD server,
is unique to the DS configuration
and changes whenever the
.Xr nfsd 8
daemon is restarted or the server is rebooted.
.Pp
The second is the layout, which is per file and references the DeviceInfo
to use via the DeviceID.
It is for a byte range of a file and is either Read or Read/Write.
For the FreeBSD server, a layout covers all bytes of a file.
A layout may be recalled by the MDS using a LayoutRecall callback.
When a client returns a layout via the LayoutReturn operation it can
indicate that error(s) were encountered while doing I/O on the DS,
at least for certain layout types such as the Flexible File Layout.
.Pp
The FreeBSD client and server support two layout types.
.Pp
The File Layout is described in RFC 5661 and uses the NFSv4.1 protocol
to perform I/O on the DS.
It does not support client aware DS mirroring and, as such,
the FreeBSD server only provides File Layout support for non-mirrored
configurations.
.Pp
The Flexible File Layout allows the use of the NFSv3, NFSv4.0 or NFSv4.1
protocol to perform I/O on the DS and does support client aware mirroring.
As such, the FreeBSD server uses Flexible File Layout layouts for the
mirrored DS configurations.
The FreeBSD server supports the
.Dq tightly coupled
variant and all DSs use the
NFSv4.1 protocol for I/O operations.
Clients that support the Flexible File Layout will do writes and commits
to all DS mirrors in the mirror set.
.Pp
A FreeBSD pNFS service consists of a single MDS server plus one or more
DS servers, all of which are FreeBSD systems.
For a non-mirrored configuration, the FreeBSD server will issue File Layout
layouts by default.
However, that default can be set to the Flexible File Layout by setting the
.Xr sysctl 8
sysctl
.Dq vfs.nfsd.default_flexfile
to one.
Mirrored server configurations will only issue Flexible File Layouts.
.Tn pNFS
clients mount the MDS as they would a single NFS server.
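.Pp
As a minimal sketch, assuming a non-mirrored MDS on which Flexible File Layout
layouts are wanted by default, the sysctl named above could be set on the MDS
with
.Xr sysctl 8 .
The second command assumes the usual
.Pa /etc/sysctl.conf
mechanism for making the setting persist across reboots:
.Bd -literal -offset indent
sysctl vfs.nfsd.default_flexfile=1
echo 'vfs.nfsd.default_flexfile=1' >> /etc/sysctl.conf
.Ed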
.Pp
A FreeBSD
.Tn pNFS
client must be running the
.Xr nfscbd 8
daemon and use the mount options
.Dq nfsv4,minorversion=1,pnfs .
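.Pp
For illustration only, a client setup along these lines might look as follows.
The host name
.Dq mds.example.com ,
the exported path
.Dq /
and the mount point
.Dq /mnt
are placeholders, and the use of
.Xr service 8
to start
.Xr nfscbd 8
is an assumption about how the daemon is managed on the client:
.Bd -literal -offset indent
service nfscbd onestart
mount -t nfs -o nfsv4,minorversion=1,pnfs mds.example.com:/ /mnt
.Ed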
.Pp
When files are created, the MDS creates a file tree identical to what a
single NFS server creates, except that all the regular (VREG) files will
be empty.
As such, if you look at the exported tree directly
on the MDS server (not via an NFS mount), the files will all be of size zero.
Each of these files will also have two extended attributes in the system
attribute name space:
.Bd -literal -offset indent
pnfsd.dsfile - This extended attribute stores the information that the
    MDS needs to find the data file on the DS(s) for this file.
pnfsd.dsattr - This extended attribute stores the Size, AccessTime,
    ModifyTime and Change attributes for the file.
.Ed
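.Pp
As a hedged example, the zero-length files and their extended attributes could
be inspected locally on the MDS with
.Xr lsextattr 8
and
.Xr getextattr 8 .
The path
.Dq /export/file.txt
is a placeholder for a regular file in an exported tree, and reading the
system name space is assumed to require super-user privilege:
.Bd -literal -offset indent
ls -l /export/file.txt
lsextattr system /export/file.txt
getextattr -x system pnfsd.dsfile /export/file.txt
.Ed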
.Pp
For each regular (VREG) file, the MDS creates a data file on one
(or on N of them for the mirrored case, where N is the mirror_level)
of the DS(s) where the file's data will be stored.
The name of this file is the file handle of the file on the MDS,
in hexadecimal, at the time of file creation.
The data file will have the same file ownership, mode and NFSv4 ACL
(if ACLs are enabled for the file system) as the file on the MDS, so that
permission checking can be done on the DS.
This is referred to as
.Dq tightly coupled
for the Flexible File Layout.
.Pp
For
.Tn pNFS
aware clients, the service generates File Layout
or Flexible File Layout
layouts and associated DeviceInfo.
For non-pNFS aware NFS clients, the pNFS service appears just like a normal
NFS service.
For the non-pNFS aware client, the MDS will perform I/O operations on the
appropriate DS(s), acting as
a proxy for the non-pNFS aware client.
This is also true for NFSv3 and NFSv4.0 mounts, since these are always non-pNFS
aware.
.Pp
It is possible to assign a DS to an MDS exported file system so that it will
store data for files on the MDS exported file system.
If a DS is not assigned to an MDS exported file system, it will store data
for files on all exported file systems on the MDS.
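.Pp
A sketch of how this might be expressed, assuming the
.Fl p
option of
.Xr nfsd 8
(consult that manual page for the authoritative syntax).
The DS host names
.Dq nfsv4-data0
and
.Dq nfsv4-data1 ,
the DS mount points
.Dq /data0
and
.Dq /data1
on the MDS, and the MDS exported file system
.Dq /export1
are all placeholders:
.Bd -literal -offset indent
# DSs not assigned: each stores data for all MDS exported file systems
nfsd -u -t -p nfsv4-data0:/data0,nfsv4-data1:/data1
# Both DSs assigned to the MDS exported file system /export1
nfsd -u -t -p nfsv4-data0:/data0#/export1,nfsv4-data1:/data1#/export1
.Ed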
.Pp
If mirroring is enabled, the pNFS service will continue to function when
DS(s) have failed, so long as there is at least one DS still operational
that stores data for files on all of the MDS exported file systems.
After a disabled mirrored DS is repaired, it is possible to recover the DS
as a mirror while the pNFS service continues to function.
.Pp
See
.Bd -literal -offset indent
http://people.freebsd.org/~rmacklem/pnfs-planb-setup.txt
.Ed
.sp
for information on how to set up a FreeBSD pNFS service.
.Sh SEE ALSO
.Xr nfsv4 4 ,
.Xr exports 5 ,
.Xr fstab 5 ,
.Xr rc.conf 5 ,
.Xr nfscbd 8 ,
.Xr nfsd 8 ,
.Xr nfsuserd 8 ,
.Xr pnfsdscopymr 8 ,
.Xr pnfsdsfile 8 ,
.Xr pnfsdskill 8
.Sh BUGS
Linux kernel versions prior to 4.12 only support NFSv3 DSs in their client
and will do all I/O through the MDS.
For Linux 4.12 kernels, support for NFSv4.1 DSs was added, but I have seen
Linux client crashes when testing this client.
For Linux 4.17-rc2 kernels, I have not seen client crashes during testing,
but the client only supports the
.Dq loosely coupled
variant.
To make it work correctly when mounting the FreeBSD server, you must either
patch the Flexible File Layout client driver with a patch like:
.Bd -literal -offset indent
http://people.freebsd.org/~rmacklem/flexfile.patch
.Ed
.sp
or set the sysctl
.Dq vfs.nfsd.flexlinuxhack
to one so that it works around
the Linux client driver's limitations.
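.Pp
For example, the workaround could be enabled on the FreeBSD server with
.Xr sysctl 8 ;
this is only a sketch of setting the sysctl named above:
.Bd -literal -offset indent
sysctl vfs.nfsd.flexlinuxhack=1
.Ed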
.Pp
Since the MDS cannot be mirrored, it is a single point of failure just
as a non
.Tn pNFS
server is.