[wplug] Hardware RAID tuning

Wed Jun 15 09:48:41 EDT 2011

On Tue, Jun 14, 2011 at 2:00 PM, Matthew Zwier <mczwier at gmail.com> wrote:

> Hi all,
>
> I'm the systems administrator for a relatively small (~20 nodes, ~320
> cores) scientific computing cluster with relatively large (20 TB)
> storage needs.  We have a couple of RAID5 arrays on a Dell PERC/5E
> (aka LSI MegaRAID) controller, both running XFS filesystems, and
> while performance is generally acceptable, it appears I can't get a
> backup in under five days for our 11 TB array.  That leads me to a
> couple of questions:
>
> 1)  That translates to about a 40 MB/s sustained read, with frequent
> lengthy drops to around 2 MB/s (this using xfsdump into /dev/null, for
> the moment).  For those of you with more experience than I...is that
> typical performance for a filesystem dump?  Pleasantly fast?
> Unacceptably slow?
>
> 2)  Does anyone know of documentation about how to go about tuning an
> on-line hardware RAID array in Linux, specifically for file service?
> About all I can find are discussions about how to optimize MySQL
> performance, or tips on what parameters in /sys to tweak while piping
> zeros directly to /dev/sdb using dd, and the like.  I can't find any
> documentation on how various hardware/kernel/filesystem parameters
> interact.  The three-way optimization problem among RAID controller
> settings (i.e. read-ahead), disk- and controller-specific kernel
> settings (TCQ depth, read-ahead), and I/O-scheduler-specific settings
> (noop vs. deadline vs. cfq, queue size, etc) is just killing me.
>
> Matt Z.

I have a Dell system like this, but with smaller disks: 300 GB, 10K RPM.
I see write speeds of 300 MB/s across 6 drives.
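
For a sense of scale, here is a rough back-of-the-envelope sketch of how long
11 TB takes at a few of the rates mentioned in this thread. The function name
backup_hours is made up for illustration, and it assumes decimal units
(1 TB = 10^6 MB). Keep in mind my 300 MB/s figure is a write rate on a
different array, so treat it only as a loose reference point.

```shell
# Hours needed to move SIZE_TB terabytes at RATE_MBS megabytes per second.
# Assumes decimal units: 1 TB = 10^6 MB.
backup_hours() {
    awk -v tb="$1" -v mbs="$2" 'BEGIN { printf "%.1f\n", tb * 1e6 / mbs / 3600 }'
}

backup_hours 11 300   # the write rate I see on my array
backup_hours 11 40    # the sustained read rate you reported
backup_hours 11 2     # the rate during the drops
```

At a steady 40 MB/s the dump would finish in a little over three days, so a
five-day backup suggests the drops to 2 MB/s are pulling the average down
to something closer to 25 MB/s.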

You will need about 10 GB of free space to run this test.

Run:
    $ dd if=/dev/zero of=/mount/yourraidfs/zero.txt  bs=1M count=10000
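
Since your question is about read speed, it may also be worth reading the
same file back. This is only a sketch: read_back is a made-up helper name,
and the optional "direct" flag assumes GNU dd, where iflag=direct bypasses
the page cache so you measure the array rather than RAM.

```shell
# Read a file back and throw the data away; dd reports throughput on stderr.
# $1: file to read
# $2: optional GNU dd input flag, e.g. "direct" to bypass the page cache
read_back() {
    if [ -n "$2" ]; then
        dd if="$1" of=/dev/null bs=1M iflag="$2"
    else
        dd if="$1" of=/dev/null bs=1M
    fi
}

# Example (same path as the write test above), then clean up:
#   read_back /mount/yourraidfs/zero.txt direct
#   rm /mount/yourraidfs/zero.txt
```

A sequential read of one big file is close to a best case, so expect xfsdump
over many small files to come in below whatever this reports.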

Do you see any error messages in the logs?
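
One quick way to look, as a sketch: filter the kernel log for words that
often show up with disk or controller trouble. The helper name scan_log and
the pattern list are my own guesses, not a complete set, and on some systems
dmesg needs root.

```shell
# Filter log text on stdin for lines that often accompany disk trouble.
# The pattern list is a guess, not an exhaustive set.
scan_log() {
    grep -iE 'error|fail|timeout|degraded'
}

# Example:
#   dmesg | scan_log
```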

You must have a good backup of your data before the next test.  The extra load
from rebuilding can cause your array to fail, but it's better to have it
fail while you have a good backup.

Rebuild your array one disk at a time:

1)  Buy one new disk.
2)  Pull the first disk and replace it with the new one.
3)  Let the array rebuild the disk.  This will write data to every used
    sector, which is one way to prevent bitrot.
4)  Check the logs after each rebuild.
5)  Repeat, only after the rebuild has finished, until you have pulled
    every disk.
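
On the kernel settings you mentioned (scheduler, read-ahead, queue size, TCQ
depth): I can't point you to documentation on how they interact, but here is
a small sketch for recording the current values before you change anything.
The function name show_queue_settings is made up, and sdb is only an example
device name. The second argument exists so the function can be pointed at a
different sysfs root; normally you would leave it off.

```shell
# Print the block-layer settings the kernel exposes for one disk.
# $1: device name (sdb is only an example)
# $2: sysfs root, defaults to /sys
show_queue_settings() {
    dev="$1"
    sysroot="${2:-/sys}"
    for f in queue/scheduler queue/read_ahead_kb queue/nr_requests device/queue_depth; do
        path="$sysroot/block/$dev/$f"
        # Skip entries this kernel or driver does not provide.
        if [ -r "$path" ]; then
            printf '%s: %s\n' "$f" "$(cat "$path")"
        fi
    done
}

# Example:
#   show_queue_settings sdb
```

Writing a new value into one of these files as root changes the setting until
the next reboot, so it helps to save this output first and change one thing
at a time between test runs.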