Using NFS
© 2002 by metaconsultancy

A network file server is essential to any medium-scale operation. Users want to be able to work on the same files from any machine, without having to explicitly copy files back and forth. Having a file server also allows you to centralize backup and disk quota administration.

The oldest and by far the most widely deployed network file server for Unix is NFS, the Network File System, originally designed by Sun for SunOS. NFS volumes can be shared between essentially all Unix machines, and more recently also with Mac OS X machines.

Installation

Making your computer an NFS server or client is very easy. Hooks are already compiled into the Linux kernel, so you just need to obtain the programs that use them. A Debian NFS client needs

# apt-get install nfs-common portmap
while a Debian NFS server needs
# apt-get install nfs-kernel-server nfs-common portmap
Other distributions provide similar packages.
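On RPM-based distributions, for example, the client and server programs usually ship in the nfs-utils and portmap packages (exact names may vary); you can check whether they are already installed with:

# rpm -q nfs-utils portmap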
Basic Server Configuration

NFS exports from a server are controlled by the file /etc/exports. Each line begins with the absolute path of a directory to be exported, followed by a space-separated list of allowed clients.

/etc/exports
/home 192.0.34.2(rw,no_root_squash) www.example.com(ro)
/usr  192.0.34.0/24(ro,insecure)
A client can be specified either by name or IP address. Wildcards (*) are allowed in names, as are netmasks (e.g. /24) following IP addresses, but should usually be avoided for security reasons.
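For example, a hostname wildcard lets you export a directory read-only to every machine in a domain; the path and domain here are just placeholders:

/etc/exports
/pub  *.example.com(ro)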

A client specification may be followed by a set of options, in parentheses. It is important not to leave any space between the last character of the client specification and the opening parenthesis, since spaces are interpreted as client separators. (Yes, the format is rather unwieldy.) Common options include:

ro              rw              allow the client read-only or read-write access
(root_squash)   no_root_squash  map requests from the client's root user to the server's nobody user, or not
(secure)        insecure        require clients to mount from a privileged port, or not

Defaults are in parentheses.

When requesting a file, the client tells the server the uid number of the requesting user. It is therefore important that clients and servers share uids. If this is not possible, there are schemes to map client uids to server uids (man exports for more information), but it is almost always easier in the long run to adopt a scheme which allows clients and servers to share uids.
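One simple such scheme, sketched here with the all_squash, anonuid, and anongid options from man exports, maps every client user to a single server account; the path, network, and uid/gid values are placeholders:

/etc/exports
/shared  192.0.34.0/24(rw,all_squash,anonuid=1000,anongid=1000)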

Because the server trusts the client to correctly identify users, you should only export volumes to clients which you personally administer. To prevent a malicious user on the client from running a program that mounts your export and then sends the server incorrect uids, you should never use the insecure directive.

The only time it is safe to export to untrusted clients, or to export to trusted clients insecurely, is when the export is read-only and contains only information which you don't mind anyone reading (e.g. /usr volumes).

Finally, the RPC infrastructure over which NFS runs is known to be extremely exploitable. Therefore you should only run NFS behind a firewall which does not let RPC (port 111, either tcp or udp) flow across it.
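If the firewall is the Linux server itself, a minimal iptables sketch along these lines would drop incoming portmapper traffic; the external interface name eth0 is an assumption:

# iptables -A INPUT -i eth0 -p tcp --dport 111 -j DROP
# iptables -A INPUT -i eth0 -p udp --dport 111 -j DROP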

If you make changes to /etc/exports on a running NFS server, you can make these changes effective by issuing the command:

# exportfs -a
Basic Client Configuration

NFS volumes can be mounted by root directly from the command line. For example

# mount files.example.com:/home /mnt/nfs
mounts the /home directory from the machine files.example.com as the directory /mnt/nfs on the client. Of course, for this to work, the directory /mnt/nfs must exist on the client and the server must have been configured to allow the client to access the volume.
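For example, a complete round trip, assuming the server from the example above, might look like this:

# mkdir -p /mnt/nfs
# mount files.example.com:/home /mnt/nfs
# df -h /mnt/nfs
# umount /mnt/nfs

The df command should report the server's /home rather than a local disk.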

It is more usual for clients to mount NFS volumes automatically at boot-time. NFS volumes can be specified like any others in /etc/fstab.

/etc/fstab
192.0.34.1:/home  /home  nfs  rw,rsize=4096,wsize=4096,hard,intr,async,nodev,nosuid  0  0
192.0.34.2:/usr   /usr   nfs  ro,rsize=8192,hard,intr,nfsvers=3,tcp,noatime,nodev,async  0  0

There are two kinds of mount options to consider: those specific to NFS and those which apply to all mounts. Consider first those specific to NFS.

option            option          description
(hard)            soft            client blocks during I/O, or not
intr              (nointr)        client may interrupt a blocked operation, or not
(udp)             tcp             communicate using the UDP or TCP protocol
(nfsvers=2)       nfsvers=3       use NFS protocol version 2 or version 3
rsize=nnnn        wsize=nnnn      read and write block transfer sizes, in bytes
rw                ro              read-write or read-only access
sync              async           server blocks during I/O, or not
atime             noatime         record file access times, or not
dev               nodev           interpret device entries, or not
suid              nosuid          honor suid bits, or not

Defaults are in parentheses.
hard or soft?

Lore is so emphatic that soft mounts cause data corruption that I have never tried them. When you use hard, though, be sure to also use intr, so that clients can escape from a hung NFS server with a Ctrl-C.

udp or tcp?

I usually end up using udp because I use Linux servers, whose NFS-over-TCP code I consider too experimental for production use. But if you have BSD or Solaris servers, by all means use TCP, as long as your tests indicate that it does not have a substantial negative impact on performance.

v2 or v3?

NFS v2 and NFS v3 differ only in minor details. While v3 supports a non-blocking write operation which theoretically speeds up NFS, in practice I have not seen any discernible performance advantage of v3 over v2. Still, I use v3 when I can, since it supports files larger than 2 GB and block sizes larger than 8192 bytes.

rsize and wsize

See the section on performance tuning below for advice on choosing rsize and wsize.

NFS security is utterly atrocious. An NFS server trusts an NFS client to enforce file access permissions. Therefore it is very important that you be root on any box you export to, and that you never export with the insecure option, which would allow any user on the client box arbitrary access to all the exported files.

Performance Tuning

NFS does not need a fast processor or a lot of memory. I/O is the bottleneck, so fast disks and a fast network help. If you use IDE disks, use hdparm to tune them for optimal transfer rates. If you support multiple simultaneous users, consider paying for SCSI disks; SCSI can schedule multiple interleaved requests much more intelligently than IDE can.
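For example, with hdparm you might check raw buffered read throughput and then enable DMA and 32-bit I/O; the device name /dev/hda is an assumption, and you should test such settings carefully before relying on them:

# hdparm -t /dev/hda
# hdparm -d1 -c1 /dev/hda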

On the software side, by far the most effective step you can take is to optimize the NFS block size. NFS transfers data in chunks. If the chunks are too small, your computers spend more time processing chunk headers than moving bits. If the chunks are too large, your computers move more bits than they need to for a given set of data. To optimize the NFS block size, measure the transfer time for various block size values. Here is a measurement of the transfer time for a 256 MB file full of zeros.

# mount files.example.com:/home /mnt -o rw,wsize=1024
# time dd if=/dev/zero of=/mnt/test bs=16k count=16k
16384+0 records in
16384+0 records out

real	0m32.207s
user	0m0.000s
sys	0m0.990s
# umount /mnt
This corresponds to a throughput of about 63 Mbit/s (256 MB in 32 seconds, or roughly 8 MB/s).

Try writing with block sizes of 1024, 2048, 4096, and 8192 bytes (if you use NFS v3, you can try 16384 and 32768, too) and measure the time required for each. To get an idea of the uncertainty in your measurements, repeat each measurement several times. To defeat caching, be sure to unmount and remount between measurements. Measure read performance the same way, this time varying rsize:

# mount files.example.com:/home /mnt -o ro,rsize=1024
# time dd if=/mnt/test of=/dev/null bs=16k
16384+0 records in
16384+0 records out

real	0m26.772s
user	0m0.010s
sys	0m0.530s
# umount /mnt
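A small shell loop can automate the write sweep; this is a minimal sketch assuming the same server, mount point, and 256 MB test file as above, unmounting between runs to defeat caching:

#!/bin/sh
# Time a 256 MB NFS write for several wsize values.
for size in 1024 2048 4096 8192; do
    mount files.example.com:/home /mnt -o rw,wsize=$size
    echo "wsize=$size"
    time dd if=/dev/zero of=/mnt/test bs=16k count=16k
    umount /mnt
done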

Your optimal block sizes for both reading and writing will almost certainly exceed 1024 bytes. It may be that, like mine, your data do not indicate a clear optimum, but instead approach an asymptote as block size increases. In this case, you should pick the smallest block size which gets you close to the asymptote, rather than the largest available block size; anecdotal evidence indicates that block sizes which are too large can cause problems.

Once you have decided on an rsize and wsize, be sure to write them into your clients' /etc/fstab. You might also consider specifying the noatime option. In a long career as a sysadmin, I have never once needed to know the last access time of a file; on the other hand, not recording it will not buy you much performance either.

Automount Failover