[wplug] NFS mount problem

Michael Skowvron skowvron at verizon.net
Fri Jul 2 12:54:52 EDT 2004


Gentgeen wrote:

> Sorry for the long post, just hoping for a little insight -

I'll say the same about my long response. Maybe more details than 
you're looking for, but here it is anyway.

> I was able
> to find that 'linuxbox' was showing dropped packets, but no one else
> was.

How did you determine that linuxbox was showing dropped packets?
Does 'netstat -i' show a non-zero value for RX-DRP?
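
If you would rather watch the counter from a script, here is a rough 
Python sketch. It assumes a Linux system where /proc/net/dev is 
available and where the fourth receive column is the drop count, which 
should be the same number netstat reports as RX-DRP. The function 
names are my own, not part of any tool.

```python
def parse_rx_drops(proc_net_dev_text):
    """Return {interface: receive drop count} from /proc/net/dev-style text."""
    drops = {}
    for line in proc_net_dev_text.splitlines():
        if ":" not in line:
            continue  # the two header lines contain no colon
        name, stats = line.split(":", 1)
        fields = stats.split()
        # Assumed receive column order: bytes packets errs drop fifo frame ...
        if len(fields) >= 4:
            drops[name.strip()] = int(fields[3])
    return drops


def read_rx_drops(path="/proc/net/dev"):
    """Read the live counters (Linux only; the path is an assumption)."""
    with open(path) as f:
        return parse_rx_drops(f.read())
```

Calling read_rx_drops() a few minutes apart while NFS traffic is 
running shows whether the count for the interface is still climbing.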

>  After a google search, I found a cure --I changed the option in
> /etc/fstab from rsize=8192 to rsize=1024,wsize=1024.  That seems to have
> done the trick.
> 
> Now does anyone have an idea why the laptop can be rsize=8192, but the
> desktop has to be rsize=1024 ?  Should I change all my network mounts to
> rsize=1024?

What you have done by lowering the request size is make NFS more 
tolerant of lost packets. NFS is done in "transactions" where each 
"transaction" commits a certain quantity of data. The entire 
transaction is checksummed and a bad checksum (or missing transaction) 
forces a retransmission of the entire transaction.

The default transaction size (also known as the request size) for NFS 
version 2 is usually 4KBytes or 8KBytes and the default for NFS 
version 3 is usually 32KBytes or 48KBytes. The entire transaction must 
make it to the other host or the entire transaction gets 
retransmitted. Transferring an 8K request over ethernet, with its 
1500-byte packets, requires 6 packets. If any one is lost, all are 
resent.
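
The packet count can be checked with a little arithmetic. The sketch 
below assumes NFS over UDP, a standard 1500-byte ethernet MTU, a 
20-byte IP header and an 8-byte UDP header. The rpc_overhead parameter 
is a placeholder of mine for the RPC/NFS headers, which vary by 
request.

```python
import math

MTU = 1500        # assumed standard ethernet MTU, in bytes
IP_HEADER = 20    # IPv4 header without options
UDP_HEADER = 8    # UDP header, carried once per datagram


def fragments_needed(request_bytes, rpc_overhead=0):
    """Estimate how many ethernet packets one NFS request occupies."""
    datagram = request_bytes + rpc_overhead + UDP_HEADER
    per_fragment = MTU - IP_HEADER  # 1480 bytes of payload per packet
    return math.ceil(datagram / per_fragment)
```

Under these assumptions fragments_needed(8192) gives 6 and 
fragments_needed(1024) gives 1, which matches the numbers in this 
message. A 32K request comes out at 23 packets.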

Because of this, NFS is very sensitive to network problems. A 1% 
packet loss can lead to an 80-90% drop in performance. It can also cause 
reads and writes to hang if the networking code isn't robust enough to 
deal with all types of lost data. In addition, if the packet loss is 
due to network congestion, a large request size just makes the problem 
worse. Losing 1 packet can cause 6 or more packets to be resent, 
compounding the congestion.
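
To put a rough number on how the request size amplifies loss, here is 
a small sketch. It assumes each packet is lost independently with the 
same probability, which real networks only approximate, and it does 
not model the time the client spends waiting before it retransmits.

```python
def request_loss_probability(packet_loss, packets_per_request):
    """Chance that at least one packet of a request is lost.

    Assumes independent losses; a request fails if any packet is missing.
    """
    return 1.0 - (1.0 - packet_loss) ** packets_per_request
```

With 1% packet loss, request_loss_probability(0.01, 6) is about 0.059, 
so nearly 6% of 8K requests would need a full resend, against 1% of 
requests that fit in a single packet.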

When you lowered your request to 1K, the entire transaction fit into 
an ethernet packet. If the packet is dropped, only 1 packet needs to 
be resent. This makes for a much faster "recovery" by NFS.

You do not need to change all mounts to 1K. The NFS server uses 
whatever request size was negotiated with each client, so the setting 
applies per client. However, you should not be dropping any packets. 
The source of this problem should be located and corrected.

Since linuxbox and kingpin appear to be running the same kernel and 
the exact same ethernet card, we can probably rule out a driver 
problem. If you are seeing a non-zero value for RX-DRP, I can think of 
two reasons. Either the packet is coming in and not passing the 
checksum, or the process that the data is being sent to is not 
accepting the data quickly enough. The former is more likely than the 
latter.

Since linuxbox is running a faster cpu than kingpin, it is less likely 
that the system will get too busy to service network data in a timely 
manner. It is more likely that either the cable, the ethernet card, or 
the port on the hub/switch is bad.

Start by moving linuxbox to a different port on the hub/switch.
If that makes no difference, replace the ethernet cable or swap it 
with kingpin's to see if the problem follows the cable. Lastly, you 
can try swapping ethernet cards between linuxbox and kingpin to see if 
one of the cards is bad.

A good tool to use for network testing is nttcp.
    http://www.leo.org/~elmar/nttcp/
It's good for streaming large amounts of data between two systems to 
test performance and detect network problems.

I hope this information helps you to locate and resolve your network 
issue.

Michael




