NFS Attribute Caching OS Problems and Amd
Last updated September 18, 2005

* Summary:

Some OSs don't seem to have a way to turn off the NFS attribute cache, which
breaks the Amd automounter so badly that we do not recommend using Amd on
such OSs for heavy use until this is fixed.


* Details:

Amd is a user-level NFSv2 server that manages automounts of all other file
systems. The kernel contacts Amd via RPCs, and Amd in turn performs the
actual mounts, and then responds back to the kernel's RPCs. Every kernel
caches attributes of files, in a cache called the Directory Name Lookup
Cache (DNLC), or a Directory Cache (dcache).

Amd manages its namespace at the user level, but the kernel caches names
itself. So the two must coordinate to ensure that both namespaces are in
sync. If the kernel uses a cached entry from the DNLC without consulting
Amd, users may see corruption of the automounter namespace (symlinks
pointing to the wrong places, ESTALE errors, and more). For example,
suppose Amd timed out an entry and removed the entry from Amd's namespace.
Amd has to tell the kernel to purge its corresponding DNLC entry too. The
way Amd often does that is by incrementing the last modification time
(mtime) of the parent directory. This is the most common method for kernels
to check if their DNLC entries are stale: if the parent directory mtime is
newer, the kernel will discard all cached entries for that directory, and
will re-issue lookups. Those lookups will result in
NFS_GETATTR/NFS_LOOKUP calls sent from the kernel down to Amd, and Amd can
then properly inform the kernel of the new state of automounted entries.

In order to ensure that Amd is "in charge" of its namespace without
interference from the kernel, Amd will try to turn off the NFS attribute
cache.
It does so by using the NFSMNT_NOAC flag, if it exists, or by
setting various "cache timeout" fields in struct nfs_args to 0 (acregmin,
acregmax, acdirmin, and acdirmax).

We released a major new version of am-utils, version 6.1, in June 2005.
Since then, a lot of people have experimented with Amd, in anticipation of
migrating from the very old am-utils 6.0 to the new 6.1. For a couple of
months after the release of 6.1, we received reports of problems with
Amd, especially under heavy use. Users reported getting ESTALE errors from
time to time, or seeing automounted entries whose symlinks don't point to
where they should. After much debugging, we traced it to a few places in
Amd where it wasn't updating the parent directory mtime as it should have;
in some places where Amd was indeed updating the mtime, it was using a
resolution of only 1 second, which was not fine enough under heavy load. We
fixed this problem and switched to using a microsecond resolution mtime.

After fixing this in Amd, we went on to verify that things work for other
OSs. When we got to test certain BSDs, we found out that they always cache
directory entries, and there is no way to turn it off completely.
Specifically, if we set the ac{reg,dir}{min,max} fields in struct nfs_args
all to zero, the kernel seems to cache the entries for a default number of
seconds (something like 5-30 seconds). On some OSs, setting these four
fields to 0 turns off the attribute cache, but not on some BSDs. We were
able to verify this using Amd and a script that exercises the interaction of
the kernel's attrcache and Amd. (If you're interested, the script can be
made available.)

We then experimented by setting the ac{reg,dir}{min,max} fields in struct
nfs_args all to 1, the smallest non-zero value we could.
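
To see why any non-zero cache timeout leaves a window in which the kernel
can hand out stale automount entries, here is a toy simulation in Python.
It is only a sketch: ToyAutomounter, ToyKernelCache, stale_lookups, the
abstract "ticks", and the example paths are all made up for illustration
and do not correspond to actual Amd or kernel code.

```python
class ToyAutomounter:
    """Hypothetical stand-in for Amd: holds the current target of one entry."""

    def __init__(self):
        self.target = None

    def lookup(self):
        return self.target


class ToyKernelCache:
    """Hypothetical kernel-side cache that trusts an entry for `timeout` ticks."""

    def __init__(self, server, timeout):
        self.server = server
        self.timeout = timeout
        self.cached = None  # (target, tick at which it was fetched)

    def lookup(self, now):
        if self.cached is not None and now - self.cached[1] < self.timeout:
            return self.cached[0]      # served from cache, server not asked
        target = self.server.lookup()  # no entry yet, or the entry expired
        self.cached = (target, now)
        return target


def stale_lookups(timeout, change_every=7, horizon=600):
    """Count lookups that return a target the server has already replaced."""
    server = ToyAutomounter()
    cache = ToyKernelCache(server, timeout)
    stale = 0
    for now in range(horizon):
        if now % change_every == 0:
            # The toy automounter moves the entry to a new (made-up) target.
            server.target = "/a/host%d/home" % (now // change_every)
        if cache.lookup(now) != server.target:
            stale += 1
    return stale


if __name__ == "__main__":
    for timeout in (0, 10, 300):
        print("timeout=%3d ticks: %d stale lookups"
              % (timeout, stale_lookups(timeout)))
```

In this model a timeout of 0 never returns a stale target, while a small
non-zero timeout returns fewer stale targets than a large one but still
returns some.
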
When we ran the
Amd exercising script, we found that the value of 1 reduced the race between
the DNLC and Amd, and the script took a little longer to run before it
detected an incoherency. That makes sense: the smaller the DNLC cache
interval is, the shorter the window of vulnerability is. (BTW, the man
pages on some OSs say that the ac{reg,dir}{min,max} fields use a 1 second
resolution, but experimentation indicated it was in 0.1 second units.)

Clearly, setting the ac{reg,dir}{min,max} fields to 0 is worse than setting
them to 1 on those OSs that don't have a way to turn off the attribute cache.
So the current workaround I've implemented in am-utils is to create a
configuration parameter called "broken_attrcache" which, if turned on, will
set these nfs_args fields to 1 instead of 0. I wish I didn't have to create
such ugly workaround features in Amd, but I've got no choice.

The near term solution is for every OS to support a true 'noac' flag, which
can be added fairly easily. This would make Amd work reliably.

The long term solution is to implement Autofs support for all OSs and to
support it in Amd. Currently, Amd supports autofs on Solaris and Linux;
FreeBSD is next. Still, we found that even with autofs support, many
sysadmins still prefer to use the good ol' non-autofs mode.


* Confirmed Status

This is the confirmed status of various OSs' vulnerability to this attribute
cache bug. We are slowly checking the status of other OSs. The status of
any OS not listed is unknown as of the date at the top of this file.

** Not Vulnerable (support a proper "noac" flag):

Sun Solaris 8 and 9 (10 probably works fine)
Linux: 2.6.11 kernel (2.4.latest probably works fine)
FreeBSD 5.4 and 6.0-SNAP001 (older versions probably work fine)
OpenBSD 3.7 (older versions probably work fine)

** Vulnerable (don't support a proper "noac" flag natively):

NetBSD 2.0.2 (older versions are also probably affected)

Note: NetBSD has promised to support a noac flag, hopefully after 2.1.0 is
released (maybe in 3.0 or 2.2). In the meantime, you can apply one of
these two kernel patches to support a 'noac' flag in NetBSD 2.x or 3.x:
    ftp://ftp.netbsd.org/pub/NetBSD/misc/christos/2x.nfs.noac.diff
    ftp://ftp.netbsd.org/pub/NetBSD/misc/christos/3x.nfs.noac.diff
After applying the patch and rebuilding your kernel, reboot with the new
kernel. Then copy the new nfs.h and nfsmount.h from /sys/nfs/ to
/usr/include/nfs/, and finally rebuild am-utils from scratch.

** Testing

When you build am-utils, a script named scripts/test-attrcache is built,
which can be used to test the NFS attribute cache behavior of the current
OS. You can run this script as root as follows:

    # make install
    # cd scripts
    # sh test-attrcache

If you run this script on an OS whose status is not listed above, please
report the result to us via Bugzilla or the am-utils mailing list
(see www.am-utils.org), so we can record it in this file.

Sincerely,
Erez.