Not sure if this applies to your hardware, but in the past I have had Supermicro boards autodetect some 3ware cards as 133 MHz PCI-X when their real speed is 100 MHz. This can lead to ever-increasing latency and performance issues before the card just fails.
On Jun 27, 2011 9:27 PM, "Lisa Kachold" <lisakachold@obnosis.com> wrote:
> Hi Hans:
>
> On Mon, Jun 27, 2011 at 5:07 PM, der.hans <PLUGd@lufthans.com> wrote:
>> moin moin,
>>
>> I've got a machine experiencing a lot of IO wait.
>>
>> We had power at a datacenter go down last week. Since then IO wait has
>> been over 35%. At first we thought it was due to 3ware RAID verify taking
>> place due to the crash. That took a few days, then the weekly verify
>> started. We stopped that and IO wait stayed high. 8 disks in a RAID 10.
>>
>> Load avg is also very high, presumably due to the IO wait.
>>
>> smartctl short tests didn't turn up any issues.
>>
>> We're not swapping at all.
>>
>> Disk read and write are fairly low.
>>
>> Network traffic is down as is the total number of processes and the number
>> of running processes. No evidence of network errors on the box or at the
>> switch.
>>
>> Not much going on in the logs. We've stopped several reporting processes
>> in order to reduce disk access.
>>
>> On the positive side, entropy has been staying high :).
>>
>> IO wait is not explicitly disk? It could be network, serial, USB, etc.?
>>
>> How do I determine what resource is causing the IO wait? Is there a way to
>> track to a specific process?
>>
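One way to approach the per-process question is to look for tasks in uninterruptible sleep (state D), since those are the ones waiting on IO when the sample is taken. What follows is a minimal sketch, not a definitive tool: it assumes a Linux /proc where each /proc/<pid>/stat line has the form "pid (comm) state ...", and the function names are made up for illustration. It only looks at each process's main thread; threads under /proc/<pid>/task are not scanned.

```python
import os


def parse_stat_line(line):
    """Split one /proc/<pid>/stat line into (pid, comm, state)."""
    # comm is wrapped in parentheses and may itself contain spaces or ')',
    # so cut on the first '(' and the last ')' instead of on whitespace.
    lparen = line.index("(")
    rparen = line.rindex(")")
    pid = int(line[:lparen])
    comm = line[lparen + 1:rparen]
    state = line[rparen + 1:].split()[0]
    return pid, comm, state


def blocked_tasks(proc_root="/proc"):
    """Return (pid, comm) for every process whose state is D right now."""
    blocked = []
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue
        try:
            with open(os.path.join(proc_root, entry, "stat")) as fh:
                pid, comm, state = parse_stat_line(fh.read())
        except (OSError, ValueError):
            continue  # process exited or the line was not parseable
        if state == "D":
            blocked.append((pid, comm))
    return blocked


if __name__ == "__main__":
    for pid, comm in blocked_tasks():
        print(pid, comm)
```

A single run is only a snapshot. Running it repeatedly and counting which names keep showing up gives a rough idea of who is stuck waiting.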
>> vmstat, iostat, top and lots of other tools have been great at showing
>> that there's overall IO wait (I've been able to show that almost all
>> processors have high wait; one was only at 5%), but I haven't yet
>> determined what and how.
>
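The per-CPU wait figures mentioned above can be reproduced directly from /proc/stat, which helps when comparing against what top or iostat report. This is a sketch under assumptions: it expects the "cpuN" lines to list user, nice, system, idle, iowait and so on in that order (kernels older than 2.6 do not have the iowait column, so those lines are skipped), and the function names are hypothetical.

```python
import time


def read_cpu_times(text):
    """Map 'cpu', 'cpu0', ... to their counters from /proc/stat text."""
    times = {}
    for line in text.splitlines():
        fields = line.split()
        if fields and fields[0].startswith("cpu"):
            times[fields[0]] = [int(v) for v in fields[1:]]
    return times


def iowait_percent(before, after):
    """Percent of elapsed CPU time spent in iowait between two samples."""
    result = {}
    for cpu, new in after.items():
        old = before.get(cpu)
        if old is None or len(new) < 5 or len(old) < 5:
            continue  # CPU appeared late, or no iowait column on this kernel
        deltas = [n - o for n, o in zip(new, old)]
        # Sum user..steal only; guest time is already counted inside user.
        total = sum(deltas[:8])
        # Index 4 is iowait (user, nice, system, idle, iowait, ...).
        result[cpu] = 100.0 * deltas[4] / total if total > 0 else 0.0
    return result


if __name__ == "__main__":
    with open("/proc/stat") as fh:
        first = read_cpu_times(fh.read())
    time.sleep(5)  # sampling interval is arbitrary
    with open("/proc/stat") as fh:
        second = read_cpu_times(fh.read())
    for cpu, pct in sorted(iowait_percent(first, second).items()):
        print("%-6s %5.1f%% iowait" % (cpu, pct))
```

This only says how much wait each CPU accounted for, not which device or process caused it.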
> What version is your 3ware firmware? That's fairly important, you realize?
>
>> The server is running CentOS in case that matters.
>
> Please see this link related to a known kernel bug in the RHEL kernel for
> 3ware products:
>
> https://bugzilla.redhat.com/show_bug.cgi?id=121434
> It also discusses troubleshooting commands to verify, some kernel proc
> tuning and resolutions that worked for some.
>
> I don't see your kernel or distro version listed. CentOS on
> a 2.4 kernel? CentOS 5.6?
>
> There are many suggestions that will give you a place to start:
>
> For instance, try reducing the queue depth of the 3Ware driver:
>
> can_queue from 254 to 30
> command_per_lun from 254 to 4
>
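Before changing any queue depth it is worth recording what the system currently reports. The sketch below only reads values; it assumes the kernel exposes can_queue and cmd_per_lun under /sys/class/scsi_host/hostN/ and a per-disk queue_depth under /sys/block/sdX/device/, which may not hold for every kernel or driver. Note that the attribute is spelled cmd_per_lun here rather than command_per_lun as in the mail above, and that the function names are made up. Whether the 3ware driver lets you change these at runtime is not something this checks.

```python
import glob
import os


def read_int(path):
    """Return the integer stored in a sysfs-style file, or None if unavailable."""
    try:
        with open(path) as fh:
            return int(fh.read().strip())
    except (OSError, ValueError):
        return None


def scsi_queue_settings(sys_root="/sys"):
    """Collect per-host and per-disk queue settings (read-only)."""
    hosts = {}
    pattern = os.path.join(sys_root, "class", "scsi_host", "host*")
    for host_dir in sorted(glob.glob(pattern)):
        hosts[os.path.basename(host_dir)] = {
            "can_queue": read_int(os.path.join(host_dir, "can_queue")),
            "cmd_per_lun": read_int(os.path.join(host_dir, "cmd_per_lun")),
        }
    devices = {}
    for dev_dir in sorted(glob.glob(os.path.join(sys_root, "block", "sd*"))):
        depth_path = os.path.join(dev_dir, "device", "queue_depth")
        devices[os.path.basename(dev_dir)] = read_int(depth_path)
    return hosts, devices


if __name__ == "__main__":
    hosts, devices = scsi_queue_settings()
    for name, values in hosts.items():
        print(name, values)
    for name, depth in devices.items():
        print(name, "queue_depth =", depth)
```

Keeping a copy of this output from before and after any change makes it easier to tell whether a tuning step actually took effect.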
> There is a good deal of material in this post that will give you some
> ideas on how to do high performance kernel tuning and troubleshooting.
>
> But first, I would search using your firmware version and kernel
> version/distro to get all the known issues in preparation for
> UPGRADING. You certainly can't expect CURRENT performance without
> kernel sources?
>> ciao,
>>
>> der.hans
>> --
>> # http://www.LuftHans.com/ http://www.LuftHans.com/Classes/
>> # Hope has two beautiful daughters: Anger and Courage. Anger at the way
>> # things are, and Courage to struggle to create things as they should be.
>> # -- St. Augustine
>> ---------------------------------------------------
>> PLUG-discuss mailing list - PLUG-discuss@lists.plug.phoenix.az.us
>> To subscribe, unsubscribe, or to change your mail settings:
>> http://lists.PLUG.phoenix.az.us/mailman/listinfo/plug-discuss
>>
>
>
>
> --
> (602) 791-8002 Android
> (623) 239-3392 Skype
> (623) 688-3392 Google Voice
>
> HomeSmartInternational.com