We have (and RHUL did have) a specific issue with a batch of disks in our C6100s, all bought at the same time. RHUL replaced all of them; we've so far replaced 50% of ours. I've updated the firmware on the remaining disks and had to do it via the USB stick method. We've had few issues since the update.
For all other firmware updates, it's entirely possible to do them in Linux using the downloads from the Dell support web page.
In addition, in the past I've used the firmware updates from the repo
http://linux.dell.com/repo/hardware/
although for the R510 you will have to use version 7.3.0, as later versions do not contain the RX10 firmware, and it will not be as new as that available from the support web page.
dan
* Dr Daniel Traynor, Grid cluster system manager
* Tel +44(0)20 7882 6560, Particle Physics,QMUL
________________________________________
From: Testbed Support for GridPP member institutes <[log in to unmask]> on behalf of Winnie Lacesso <[log in to unmask]>
Sent: 13 February 2014 11:54
To: [log in to unmask]
Subject: Slightly OT: Does anyone have a flakey Dell R510?(host LCG VMs)
Greetings all,
In 2011 Bristol bought for a LCG VM-hosting box:
Dell R510, Intel Xeon E5620 (4 x 2.4GHz), 24GB RAM, 9 x 300GB SAS 15K
Has PERC H700 hardware RAID controller that makes those 9 disks = 2TB /sda
in a RAID6; hosts site-bdii VM, APEL, 2 x CREAM-CE, squid, etc.
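The 2TB figure is consistent with RAID6 arithmetic: two disks' worth of space goes to parity, so 9 x 300GB leaves roughly 7 x 300GB usable. A minimal sketch of that calculation follows; the function name is hypothetical, and it ignores controller metadata and GB-versus-GiB rounding.

```python
def raid6_usable_gb(n_disks: int, disk_gb: int) -> int:
    """Approximate usable capacity of a RAID6 set, in the same units as disk_gb.

    RAID6 stores two disks' worth of parity, so usable space is (n - 2) disks.
    Assumption: all disks are the same size and overhead is ignored.
    """
    if n_disks < 4:
        # RAID6 normally needs at least four member disks.
        raise ValueError("RAID6 normally requires at least 4 disks")
    return (n_disks - 2) * disk_gb


# The box described above: 9 x 300GB SAS disks in one RAID6 set.
print(raid6_usable_gb(9, 300))  # 2100
```

That is about 2.1TB in decimal units, which the operating system will report as slightly under 2TB once binary units and overhead are accounted for.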
Does anyone else have a box like this? Has anyone had any disk errors
on it, requiring warranty disk replaced (supplied by Dell)?
How many disk errors so far?
Starting May 2013 disk 0:0:3 on Bristol's R510 logged major errors, & was
replaced in Aug 2013 (hotswap hardware RAID) under warranty from Dell.
(This was when I found out that logwatch - which I do read once a week -
*ignores* the Dell Server Administrator error messages about bad disk or
other error, logged in /var/log/messages - which I don't (didn't) look at
much. I've since added to logwatch so it reports those errors.)
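As an illustration of the kind of check described above, here is a minimal Python sketch that pulls Server Administrator entries out of a syslog-style file. It is not the logwatch change itself, which the message does not show. The match string "Server Administrator" and the function names are assumptions; adjust the pattern to whatever your own entries in /var/log/messages actually contain.

```python
import re
from typing import Iterable, List

# Assumption: the relevant entries contain the literal text
# "Server Administrator" somewhere on the line.
OMSA_PATTERN = re.compile(r"Server Administrator", re.IGNORECASE)


def omsa_lines(lines: Iterable[str]) -> List[str]:
    """Return the lines that look like Server Administrator messages."""
    return [line.rstrip("\n") for line in lines if OMSA_PATTERN.search(line)]


def scan_file(path: str = "/var/log/messages") -> List[str]:
    """Read a log file and return the matching lines (hypothetical helper)."""
    # errors="replace" avoids crashing on stray non-UTF-8 bytes in the log.
    with open(path, errors="replace") as fh:
        return omsa_lines(fh)
```

Calling scan_file() from a daily cron job and mailing any non-empty result would give a crude alert that does not depend on logwatch's filters.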
In Oct 2013 disk 0:0:2 logged a few errors, then more in Dec. We got a
replacement from Dell in Dec & replaced it just before Christmas.
(The 9 disks are 0:0:0 to 0:0:8)
Then starting mid-Jan 2014 disk 0:0:4 logged a few errors, & has
continued to log errors with slowly increasing frequency.
This time Dell is suggesting that there may be some other problem than
just a disk (since the "bad disk/errors" seems to be getting a bit
strangely frequent). They say we need to shut the server down, create
some microsoft-boot-able-usb-thing, boot the server from that, & update
the firmware on all the drives.
(I'm not inclined to do this... if we have to all I can say is, it better
not wreck ANYTHING on the vm-hosting box!)
Has anyone else got a Dell R510 that has had similar issues & has either
had this advice from Dell, or done it*? If so, was the outcome good?
Bristol PP is in the market for another server, & the above experience
makes me want a DNUK, not a Dell....
* or even "done anything like it"? (with a good outcome)
Grateful for advice
Winnie Lacesso / 55% HPC Storage Admin, 20% Particle Physics, 25% SysOps
HH Wills Physics Laboratory, Tyndall Avenue, Bristol, BS8 1TL, UK
University of Bristol