Hi Ben.
What drives do you have attached to the Adaptec card? We had some
problems with WD 2TB green power drives and Adaptec controllers and
ended up updating the drive firmware. I know CERN have recently had
some problems with their Adaptec controllers too.
James.
On 7 March 2011 11:37, Ben Waugh wrote:
> Hi Storage Experts,
>
> In the absence of any well-known procedure for burning in or stress testing
> file servers, I thought I would try a naive approach and see what happened.
> Now I have problems but don't know how they have arisen or whether I am
> simply making unreasonable demands on the system.
>
> My naive test procedure involves simply copying a lot of bytes from
> /dev/zero onto multiple filesystems on our new RAID servers. So basically I
> create one 60 TB partition on each RAID, make it into an LVM physical
> volume, create a volume group on top of that, and then divide it into six
> or so logical volumes, creating an XFS filesystem on each. Then I start
> writing to these in parallel as follows:
> dd if=/dev/zero of=/mnt/data/temp1/testfile bs=1M &
> dd if=/dev/zero of=/mnt/data/temp2/testfile bs=1M &
> etc.
>
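For anyone wanting to reproduce a layout like the one described above, here is a rough sketch of the steps. It is not Ben's actual procedure: the partition name, volume group name, volume size and the helper names run and make_test_volumes are all made up for illustration. Only the /mnt/data/tempN mount points follow the dd commands in the post. By default it just prints the commands rather than running them.

```shell
#!/bin/sh
# Hypothetical sketch: LVM physical volume -> volume group -> N logical
# volumes, each with an XFS filesystem mounted under /mnt/data/tempN.
# All names and sizes passed in are assumptions, not from the original post.

# With DRY_RUN=1 (the default) commands are only printed, not executed.
run() {
    if [ "${DRY_RUN:-1}" = 1 ]; then
        echo "$*"
    else
        "$@"
    fi
}

# usage: make_test_volumes PARTITION VG_NAME COUNT SIZE
make_test_volumes() {
    part=$1
    vg=$2
    count=$3
    size=$4
    run pvcreate "$part"            # turn the partition into an LVM physical volume
    run vgcreate "$vg" "$part"      # build a volume group on top of it
    i=1
    while [ "$i" -le "$count" ]; do
        run lvcreate -L "$size" -n "temp$i" "$vg"   # carve out one logical volume
        run mkfs.xfs "/dev/$vg/temp$i"              # put an XFS filesystem on it
        run mkdir -p "/mnt/data/temp$i"
        run mount "/dev/$vg/temp$i" "/mnt/data/temp$i"
        i=$((i + 1))
    done
}
```

Something like "make_test_volumes /dev/sdb1 vg_test 6 10T" prints the commands for six 10 TiB volumes; setting DRY_RUN=0 would actually run them, which needs root and destroys whatever is on the partition.
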
> This does not make any allowance for possible file-size limits, but I would
> have hoped at least for a graceful exit with a helpful error message.
> Instead, one of the servers has stopped writing to the disks and displays an
> impressive variety of errors in /var/log/messages, starting with:
>
> Mar 7 08:23:58 nfs2 kernel: aacraid: Host adapter abort request (0,0,1,0)
> Mar 7 08:23:58 nfs2 kernel: aacraid: Host adapter abort request (0,0,1,0)
> Mar 7 08:24:56 nfs2 last message repeated 188 times
> Mar 7 08:24:56 nfs2 kernel: aacraid: Host adapter reset request. SCSI hang ?
> Mar 7 08:24:56 nfs2 kernel: sd 0:0:1:0: SCSI error: return code = 0x08000002
> Mar 7 08:24:56 nfs2 kernel: sdb: Current: sense key: Hardware Error
> Mar 7 08:24:56 nfs2 kernel: Add. Sense: Internal target failure
> Mar 7 08:24:56 nfs2 kernel:
> Mar 7 08:24:56 nfs2 kernel: end_request: I/O error, dev sdb, sector 53707122737
> Mar 7 08:24:56 nfs2 kernel: I/O error in filesystem ("dm-6") meta-data dev dm-6 block 0x28001a68f ("xlog_iodone") error 5 buf count 2048
>
> This is a SuperMicro server, running SL5, with an Adaptec RAID controller.
>
> Any suggestions? My inclination is to try reconfiguring the RAID from
> scratch and designing a test procedure that limits file sizes to, say, 1 TB,
> but if this is indicative of a real underlying problem then maybe someone
> here can say so. One of the messages does say "Hardware Error" but how
> conclusive is this?
>
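On the idea of limiting file sizes: one possible way to do it, assuming GNU dd, is to cap each write with count= and check dd's exit status, so that a failure is reported rather than silently hanging in the background. The function name bounded_write and the dd.log file are my own invention; treat this as a sketch, not a tested burn-in procedure.

```shell
#!/bin/sh
# Sketch of a size-bounded write test (assumes GNU dd).
# bounded_write and dd.log are illustrative names, not from the original post.

# usage: bounded_write DIRECTORY SIZE_IN_MIB
# Writes SIZE_IN_MIB mebibytes of zeros to DIRECTORY/testfile.
bounded_write() {
    dir=$1
    size_mb=$2
    # count= caps the file size; conv=fsync makes dd flush the file to
    # disk before exiting, so late I/O errors show up in the exit status.
    if dd if=/dev/zero of="$dir/testfile" bs=1M count="$size_mb" conv=fsync 2>"$dir/dd.log"; then
        echo "OK   $dir"
    else
        echo "FAIL $dir (see $dir/dd.log)"
        return 1
    fi
}
```

To write roughly 1 TiB (1048576 MiB) to each filesystem in parallel, something like this should do:

```shell
for d in /mnt/data/temp1 /mnt/data/temp2; do
    bounded_write "$d" 1048576 &
done
wait
```

This will not tell you whether the controller or the drives are at fault, but it does remove file-size limits as a variable.
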
> Cheers,
> Ben
>
> --
> Dr Ben Waugh Tel. +44 (0)20 7679 7223
> Dept of Physics and Astronomy Internal: 37223
> University College London
> London WC1E 6BT
>
--
http://jamesthorne.net
http://photoze.net
http://twitter.com/bfjiant