Monitoring Linux Software RAID
Introduction
RAID is an acronym for ‘Redundant Array of Independent Disks’. It is essentially a virtual device created from physical drives or partitions. Linux supports both software and hardware RAID, so you can use RAID without needing a hardware RAID controller. Most RAID levels tolerate some degree of drive failure, which makes them useful for protecting important data.
Linux supports the following software RAID levels: RAID 0 (no redundancy), RAID 1, RAID 4, RAID 5, RAID 6 and RAID 10. The RAID levels are explained in detail here: https://www.enterprisestorageforum.com/management/raid-levels-explained/. However, the most popular levels for redundancy are RAID 1 (mirroring) and RAID 10 (striping and mirroring).
Checking RAID configuration and status
All essential information about RAID devices is stored in the ‘/etc/mdadm.conf’ file, which looks similar to the following:
[root@server ~]# cat /etc/mdadm.conf
# mdadm.conf written out by anaconda
MAILADDR root
AUTO +imsm +1.x -all
ARRAY /dev/md/boot level=raid1 num-devices=2 UUID=f8ce33fe:afd0b1b5:7aedbcf9:13e967d3
ARRAY /dev/md/root level=raid1 num-devices=2 UUID=a7866d4d:ec7a94a9:dc0e1c5a:76848e6b
ARRAY /dev/md/swap level=raid1 num-devices=2 UUID=faca3cc3:7da4398b:5b42d265:3d14b0e2
Note: If you are using a Debian-based operating system, the mdadm.conf file is located at /etc/mdadm/mdadm.conf.
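If this file is missing or out of date, mdadm can generate the ARRAY lines for you by scanning the currently running arrays. A minimal sketch, assuming your config lives at /etc/mdadm.conf:

[root@server ~]# mdadm --detail --scan >> /etc/mdadm.conf

Review the appended lines afterwards to make sure you have not created duplicate ARRAY entries.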
To check the health status of your RAID arrays, you can simply run the following command:
[root@server ~]# cat /proc/mdstat
Personalities : [raid1]
md125 : active raid1 sdb1[1] sda2[0]
      15623168 blocks super 1.2 [2/2] [UU]

md126 : active raid1 sdb2[1] sda3[0]
      975872 blocks super 1.2 [2/2] [UU]
      bitmap: 0/1 pages [0KB], 65536KB chunk

md127 : active raid1 sdb3[1] sda5[0]
      471641088 blocks super 1.2 [2/2] [UU]
      bitmap: 2/4 pages [8KB], 65536KB chunk

unused devices: <none>
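In this output, [UU] means both members of the mirror are up; a failed or missing member is shown as an underscore (for example [_U]). If you want to script a quick check on top of this, here is a minimal sketch suitable for cron (the script name and messages are just illustrative):

#!/bin/sh
# check-mdstat.sh - warn if any md array is degraded.
# A degraded array shows an underscore inside the [..] status field of /proc/mdstat.
if grep -q '\[.*_.*\]' /proc/mdstat; then
    echo "WARNING: a RAID array is degraded:"
    grep -B 1 '\[.*_.*\]' /proc/mdstat
    exit 1
fi
echo "All RAID arrays are healthy."

When run from cron, the output (and the non-zero exit code) can be used to trigger a notification.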
If you need more details about a specific RAID device, just run the following command, replacing /dev/md126 with the name of the device you want to check:
[root@server ~]# mdadm --detail /dev/md126
/dev/md126:
           Version : 1.2
     Creation Time : Tue Mar 23 01:37:13 2021
        Raid Level : raid1
        Array Size : 975872 (953.00 MiB 999.29 MB)
     Used Dev Size : 975872 (953.00 MiB 999.29 MB)
      Raid Devices : 2
     Total Devices : 2
       Persistence : Superblock is persistent

     Intent Bitmap : Internal

       Update Time : Tue Mar 23 15:24:09 2021
             State : clean
    Active Devices : 2
   Working Devices : 2
    Failed Devices : 0
     Spare Devices : 0

Consistency Policy : bitmap

              Name : localhost:boot
              UUID : f8ce33be:afd0b1b5:7aedbcf9:13e967d3
            Events : 72

    Number   Major   Minor   RaidDevice State
       0       8        3        0      active sync   /dev/sda3
       1       8       18        1      active sync   /dev/sdb2
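To inspect an individual member disk rather than the whole array, mdadm also provides --examine, which reads the RAID superblock directly from a component device and prints its metadata (UUID, device role, state and so on). For example, for one of the members listed above:

[root@server ~]# mdadm --examine /dev/sdb2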
Setting up Email Alerts for RAID Monitoring
Setting up email alerts is very simple and very useful at the same time: if something goes wrong with your RAID setup, you will receive an email. At Fraction Servers we do not monitor customers' RAID arrays, and we recommend that any customer using software RAID set up monitoring of their RAID arrays.
To set this up, simply edit the /etc/mdadm.conf file (or /etc/mdadm/mdadm.conf on Debian) and add the following line:
MAILADDR you@yourdomain.com
replacing you@yourdomain.com with your own email address.
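Besides MAILADDR, mdadm.conf also supports a PROGRAM line, which runs a command of your choice whenever an event is detected; mdadm passes the event name, the md device and, where relevant, the affected component device as arguments. The script path below is just an example:

PROGRAM /usr/local/sbin/raid-event-handler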
Then save the file and restart mdadm by executing the command:
/etc/init.d/mdadm restart
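On modern systemd-based distributions the init script may not exist; there the monitoring daemon is usually managed as a systemd unit (the unit name can vary by distribution, so check with systemctl list-units if unsure):

[root@server ~]# systemctl restart mdmonitor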
Now, if something goes wrong with your RAID setup, you will receive an email alert similar to this one:
From: mdadm monitoring <root@server.example.com>
To: you@yourdomain.com
Subject: DegradedArray event on /dev/md1:server.example.com

This is an automatically generated mail message from mdadm
running on server.example.com

A DegradedArray event had been detected on md device /dev/md1.

Faithfully yours, etc.

P.S. The /proc/mdstat file currently contains the following:

Personalities : [raid0] [raid1]
md1 : active raid1 sda2[2] sdb2[1]
      487853760 blocks [2/1] [_U]
      [>....................]  recovery =  4.3% (21448384/487853760) finish=114.3min speed=67983K/sec

md0 : active raid1 sda1[0] sdb1[1]
      530048 blocks [2/2] [UU]

unused devices: <none>
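Rather than waiting for a real failure, you can confirm that mail delivery works by asking mdadm to generate a test alert for every array it finds and then exit:

[root@server ~]# mdadm --monitor --scan --test --oneshot

You should receive one TestMessage email per array at the address configured in MAILADDR.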