[wplug] NFS mount problem
Michael Skowvron
skowvron at verizon.net
Fri Jul 2 12:54:52 EDT 2004
Gentgeen wrote:
> Sorry for the long post, just hoping for a little insight -
I'll say the same about my long response. Maybe more details than you're
looking for, but here it is anyway.
> I was able
> to find that 'linuxbox' was showing dropped packets, but no one else
> was.
How did you determine that linuxbox was showing dropped packets?
Does 'netstat -i' show a non-zero value for RX-DRP?
> After a google search, I found a cure --I changed the option in
> /etc/fstab from rsize=8192 to rsize=1024,wsize=1024. That seems to have
> done the trick.
>
> Now does anyone have an idea why the laptop can be rsize=8192, but the
> desktop has to be rsize=1024 ? Should I change all my network mounts to
> rsize=1024?
What you have done by lowering the request size is make NFS more
tolerant of lost packets. NFS is done in "transactions" where each
"transaction" commits a certain quantity of data. The entire
transaction is checksummed and a bad checksum (or missing transaction)
forces a retransmission of the entire transaction.
The default transaction (also known as the request size) for NFS
version 2 is usually 4KBytes or 8KBytes and the default for NFS
version 3 is usually 32KBytes or 48KBytes. The entire transaction must
make it to the other host or the entire transaction gets
retransmitted. Transferring the 8K request over ethernet requires 6
packets. If any one is lost, all are resent.
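To show where the "6 packets" figure comes from, here is a rough sketch
of the arithmetic. It assumes NFS over UDP on standard Ethernet (1500
byte MTU) with IPv4. The 128 bytes of RPC/NFS header overhead is my own
ballpark assumption, not a measured value.

```python
import math

ETHERNET_MTU = 1500   # bytes of IP data an Ethernet frame can carry
IP_HEADER = 20        # bytes, IPv4 header without options
UDP_HEADER = 8        # bytes
RPC_OVERHEAD = 128    # bytes, ASSUMED rough size of the RPC/NFS headers


def packets_per_request(request_size, mtu=ETHERNET_MTU):
    """Estimate how many IP fragments carry one NFS-over-UDP request."""
    # The whole request travels as a single UDP datagram.
    datagram = UDP_HEADER + RPC_OVERHEAD + request_size
    # Each fragment carries (mtu - IP header) bytes, rounded down to a
    # multiple of 8 as IP fragmentation requires.
    per_fragment = (mtu - IP_HEADER) // 8 * 8
    return math.ceil(datagram / per_fragment)


for size in (1024, 8192, 32768):
    print(size, "byte request ->", packets_per_request(size), "packet(s)")
```

Under these assumptions a 1K request fits in one packet and an 8K
request needs 6.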
Because of this, NFS is very sensitive to network problems. A 1%
packet loss can lead to an 80-90% drop in performance. It can also cause
reads and writes to hang if the networking code isn't robust enough to
deal with all types of lost data. In addition, if the packet loss is
due to network congestion, a large request size just makes the problem
worse. Losing 1 packet can cause 6 or more packets to be resent,
compounding the congestion.
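A small sketch of how packet loss is amplified by the request size. It
assumes each packet is lost independently with the same probability and
that a failed request is resent in full. It does not model the
retransmission timeout, which is likely a large part of the real
slowdown, so treat the numbers as illustrative only.

```python
def request_loss_probability(packet_loss, packets):
    """Chance that at least one packet of a request is lost."""
    return 1.0 - (1.0 - packet_loss) ** packets


def expected_packets_sent(packet_loss, packets):
    """Average packets put on the wire per successfully delivered request."""
    # Each attempt sends every packet; the expected number of attempts
    # is 1 / (probability that all packets arrive).
    return packets / (1.0 - packet_loss) ** packets


for packets in (1, 6):
    lost = request_loss_probability(0.01, packets)
    sent = expected_packets_sent(0.01, packets)
    print(f"{packets} packet(s): {lost:.2%} of requests fail, "
          f"{sent:.2f} packets sent per good request")
```

With 1% packet loss, a 6 packet request fails a little under 6% of the
time, while a 1 packet request fails only 1% of the time.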
When you lowered your request to 1K, the entire transaction fit into
an ethernet packet. If the packet is dropped, only 1 packet needs to
be resent. This makes for a much faster "recovery" by NFS.
You do not need to change all mounts to 1K. For each client, the NFS
server uses whatever request size was negotiated with that client.
However, you should not be dropping any packets. The source of
this problem should be located and corrected.
Since linuxbox and kingpin appear to be running the same kernel and
the exact same ethernet card, we can probably rule out a driver
problem. If you are seeing a non-zero value for RX-DRP, I can think of
two reasons. Either the packet is coming in and not passing the
checksum, or the process that the data is being sent to is not
accepting the data quickly enough. The former is more likely than the
latter.
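If you want to watch the drop counter over time, here is a minimal
sketch that pulls the receive "drop" column out of /proc/net/dev, which
as far as I know is where 'netstat -i' gets RX-DRP on Linux. The sample
text and its numbers are made up for illustration. On a real system you
would pass in the contents of /proc/net/dev instead.

```python
# Made-up sample in the layout of /proc/net/dev (two header lines,
# then one line per interface).
SAMPLE = """\
Inter-|   Receive                                                |  Transmit
 face |bytes    packets errs drop fifo frame compressed multicast|bytes    packets errs drop fifo colls carrier compressed
    lo:   10240     100    0    0    0     0          0         0    10240     100    0    0    0     0       0          0
  eth0:98765432  123456    0   42    0     0          0         0 12345678   65432    0    0    0     0       0          0
"""


def rx_drops(proc_net_dev_text):
    """Return {interface: receive drop count} from /proc/net/dev text."""
    drops = {}
    for line in proc_net_dev_text.splitlines()[2:]:  # skip the headers
        if ":" not in line:
            continue
        name, stats = line.split(":", 1)
        fields = stats.split()
        # Receive columns are: bytes packets errs drop ...
        drops[name.strip()] = int(fields[3])
    return drops


for iface, dropped in rx_drops(SAMPLE).items():
    print(iface, "RX drops:", dropped)
```

Running it twice a few minutes apart during an NFS copy shows whether
the count is still climbing.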
Since linuxbox is running a faster cpu than kingpin, it is less likely
that the system will get too busy to service network data in a timely
manner. It is more likely that either the cable, the ethernet card, or
the port on the hub/switch is bad.
Start by moving linuxbox to a different port on the hub/switch.
If that makes no difference, replace the ethernet cable or swap it
with kingpin's to see if the problem follows the cable. Lastly, you
can try swapping ethernet cards between linuxbox and kingpin to see if
one of the cards is bad.
A good tool to use for network testing is nttcp.
http://www.leo.org/~elmar/nttcp/
It's good for streaming large amounts of data between two systems to
test performance and detect network problems.
I hope this information helps you to locate and resolve your network
issue.
Michael