I spent about an hour bashing my head against a wall with this one.
I just put four 400 GB drives into my Linux server. I didn't feel like getting a hardware RAID card, so I went with the Linux software RAID solution.
I run the command to create the array:
mdadm -v --create /dev/md1 --chunk=128 --level=5 --raid-devices=4 /dev/hda1 /dev/hdb1 /dev/hdc1 /dev/hdh1
And then it starts to build the array. I check the status with
watch -n 1 cat /proc/mdstat
The output of /proc/mdstat shows all the drives in the array and which ones have failed. Like so:
md1 : active raid5 hdh1[4] hdc1[2] hdb1[1] hda1[0]
1172126208 blocks level 5, 128k chunk, algorithm 2 [4/4] [UUU_]
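The part to look at is the last bracketed field: each U is a slot that is up, and each _ is a slot that is missing or not yet in sync. Here is a rough sketch of how that could be checked from a script. The function name count_down_slots is something I made up for illustration, and it assumes the status field is the last bracketed item on the line, as in the output above.

```shell
#!/usr/bin/env bash
# Hypothetical helper: count the '_' slots in an mdstat status line.
# Assumes the status field (e.g. [UUU_]) is the last bracketed item on the line.
count_down_slots() {
    local line="$1"
    local status="${line##*\[}"   # drop everything up to the last '['
    status="${status%%]*}"        # drop the closing ']' and anything after it
    local down="${status//[^_]/}" # keep only the underscores
    echo "${#down}"
}

count_down_slots "1172126208 blocks level 5, 128k chunk, algorithm 2 [4/4] [UUU_]"
```

On a live system, something like count_down_slots "$(grep -A1 '^md1' /proc/mdstat | tail -n 1)" should feed it the right line, assuming the array is md1 and the layout matches what I pasted above.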
In Webmin, I see that the array is created, but one of the drives has failed. At first I thought I had a bad drive. But it turns out I can get any of the four drives to fail depending on the order that I specify them. Whichever drive is last in the list is the one that fails.
Finally, after an hour and a half of trying various combinations of creating arrays with 2, 3, and 4 disks, RAID 0 and 1, a regular file system, dinner, and several unsuccessful Google searches, I stumbled upon an email from Neil Brown, the creator of the mdadm program, which explains everything.
Here is the answer:
. . . 4/ Assume that the parity blocks are all correct, but that one drive is missing (i.e. the array is degraded). This is repaired by reconstructing what should have been on the missing drive, onto a spare. This involves reading all the 'good' drives in parallel, calculating them missing block (whether data or parity) and writing it to the 'spare' drive. The 'spare' will be written to a few (10s or 100s of) blocks behind the blocks being read off the 'good' drives, but each drive will run completely sequentially and so at top speed. On a new array where most of the parity blocks are probably bad, '4' is clearly the best option. 'mdadm' makes sure this happens by creating a raid5 array not with N good drives, but with N-1 good drives and one spare. ...
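To make the "calculating the missing block" part concrete: RAID 5 parity is a plain XOR across the blocks in a stripe, so any one missing block is the XOR of all the others. Below is a toy sketch using single bytes in place of real blocks. The values and the function name xor_blocks are made up for illustration and have nothing to do with how the kernel actually lays out data.

```shell
#!/usr/bin/env bash
# Toy illustration of XOR parity. Each "block" is one byte here;
# a real array does the same thing over whole chunks.
xor_blocks() {
    local acc=0 b
    for b in "$@"; do
        acc=$(( acc ^ b ))
    done
    echo "$acc"
}

# Three data blocks on the 'good' drives (made-up values).
d0=0x3C; d1=0xA5; d2=0x0F

# Writing the stripe: parity is the XOR of the data blocks.
parity=$(xor_blocks "$d0" "$d1" "$d2")

# Pretend the drive holding d1 is the missing one and rebuild it
# from the remaining data blocks plus parity.
rebuilt=$(xor_blocks "$d0" "$d2" "$parity")

printf 'parity=0x%02X rebuilt=0x%02X\n' "$parity" "$rebuilt"
```

The rebuilt value comes out equal to d1. It works the same way whichever block is missing, data or parity, which is why the rebuild only has to read the good drives and write the spare.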
So after all that, it was working correctly the whole time, and if I had just let it rebuild, it would have gone back to normal.
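If you would rather watch the rebuild than guess, /proc/mdstat shows a progress line while it runs. The sketch below pulls out the percentage. It assumes the progress line contains text of the form "recovery = NN.N%", which is what I would expect from the kernel but did not save from my own run, and rebuild_percent is a name I made up.

```shell
#!/usr/bin/env bash
# Hypothetical helper: read mdstat text on stdin and print the recovery
# percentage, if any. Assumes a progress line containing "recovery = NN.N%".
rebuild_percent() {
    sed -n 's/.*recovery *= *\([0-9.]*\)%.*/\1/p'
}

# Only try the live file if it exists (it will not on non-Linux systems).
if [ -r /proc/mdstat ]; then
    rebuild_percent < /proc/mdstat
fi
```

It prints nothing when no recovery is in progress, which is what you should see once the spare has been fully rebuilt and the status reads [UUUU].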
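But I had also come across a different website which mentioned the --force option to mdadm. With --force, mdadm skips the N-1-drives-plus-spare trick and creates the array with all the drives active. I tried that, and lo and behold, the array came up with all four drives in place!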
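For reference, this is roughly the create command with --force added. The sketch only prints the command instead of running it, since actually creating an array needs root and wipes whatever is on those partitions. The wrapper name force_create_cmd is made up, and the chunk size and RAID level are just the ones I used above.

```shell
#!/usr/bin/env bash
# Dry-run sketch: print the mdadm create command with --force added.
# First argument is the md device, the rest are the member partitions.
force_create_cmd() {
    local md="$1"; shift
    # After the shift, $# is the number of member partitions.
    echo "mdadm -v --create $md --force --chunk=128 --level=5 --raid-devices=$# $*"
}

force_create_cmd /dev/md1 /dev/hda1 /dev/hdb1 /dev/hdc1 /dev/hdh1
```

Copy the printed line and run it as root only once you are sure the device names are right.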
This guy had the same problem, which is where I got the link to Neil Brown's email.