Apparently RAID arrays don't like it when you kill power to the machine. We've been doing remodeling in the house, and I've been turning breakers on and off. I forgot which breaker the servers were on and accidentally turned it off a few times.
One of the drives really did fail at some point. I accept this; drives fail all the time, and that is exactly why I have a RAID 5 array. No big deal, I thought, I'll just send it in for an RMA. But shortly after, another drive looked like it had failed. I saw this in /proc/mdstat:
...[U__U]
meaning only two of the four drives were still up. Subsequent attempts at rebooting the machine left the RAID volume inaccessible entirely. Other clues that pointed to a drive failure:
From /var/log/messages:
Dec 29 19:29:06 onyx kernel: Buffer I/O error on device md0, logical block 0
Dec 29 19:29:06 onyx kernel: lost page write due to I/O error on md0
Dec 29 19:29:06 onyx kernel: EXT2-fs error (device md0): ext2_readdir: bad page in #2
There was also some relevant output in dmesg, which I found by typing
dmesg | less
(I didn't write it down, though, and dmesg only holds messages since the last boot, which successfully brought up the array with 3/4 drives.)
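Lesson learned: dump the kernel log to a file before rebooting, so the evidence survives. Something along these lines works; the filename here is just an example.
# save the kernel ring buffer before a reboot wipes it
dmesg > /root/dmesg-raid-failure.txt
# syslog usually keeps the older kernel messages around too
grep md0 /var/log/messages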
I was convinced I hadn't actually lost 2/4 drives at the same time, and set out to figure out a way to bring it back.
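A good way to sanity-check that is to ask each member partition what it thinks happened. mdadm --examine reads the md superblock straight off the disk; the device names below match my setup, and the grep is just a quick way to compare the per-drive event counters.
# dump the md superblock from each member partition
mdadm --examine /dev/sd[abcd]1
# the event counters show how far out of date a member is;
# a small gap usually means the drive is fine, it just got kicked out
mdadm --examine /dev/sd[abcd]1 | grep -i events
If a drive only shows a slightly lower event count than the others, the data on it is almost certainly intact, which is exactly what the forced assemble below takes advantage of.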
After several hours of digging through forums and reading the mdadm documentation, I was able to get the array back up and running on 3/4 drives.
I created the configuration file /etc/mdadm.conf:
DEVICE /dev/sd[abcd]1
ARRAY /dev/md0 devices=/dev/sda1,/dev/sdb1,/dev/sdc1,/dev/sdd1
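I typed those two lines by hand, but mdadm can also generate the ARRAY line by scanning the superblocks itself. Something like this should produce an equivalent entry; worth eyeballing the output before appending it.
# print an ARRAY line built from whatever md superblocks mdadm finds
mdadm --examine --scan
# once it looks right, append it to the config
mdadm --examine --scan >> /etc/mdadm.conf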
Then ran:
/sbin/mdadm --assemble -f /dev/md0
mdadm: forcing event count in /dev/sdc1(2) from 1077319 upto 1077330
mdadm: clearing FAULTY flag for device 1 in /dev/md0 for /dev/sdc1
mdadm: /dev/md0 has been started with 3 drives (out of 4).
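Once the replacement drive arrives, the rebuild should just be a matter of partitioning it like the others and adding it back into the array. Something like this, assuming the new drive shows up as /dev/sdb and the disks use ordinary MBR partition tables:
# copy the partition layout from a known-good drive to the new one
sfdisk -d /dev/sda | sfdisk /dev/sdb
# add the new partition to the array; md starts rebuilding onto it automatically
mdadm /dev/md0 --add /dev/sdb1
# watch the resync progress
cat /proc/mdstat
mdadm --detail /dev/md0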
Now I just have to RMA this drive very quickly before another drive actually does fail.
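And so the next dead drive doesn't go unnoticed for days, mdadm can watch the array and send mail when something drops out. A minimal setup, assuming local mail delivery works and root@localhost is where I actually read mail:
# run the monitor as a daemon and mail an alert when a device fails
mdadm --monitor --scan --daemonise --mail root@localhost
# or put "MAILADDR root" in /etc/mdadm.conf and let the distro's
# init script start the monitor instead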