HowtoForge

How To Configure Software RAID To Send An Email When Something's Wrong With RAID

Version 1.0
Author: Falko Timme

This short guide explains how to configure Linux software RAID (mdadm) to send you an email when something goes wrong with an array, for example when a hard drive fails. I've tested this on Debian Etch, but it should apply to all other distributions with minor adjustments to paths, etc.

I do not issue any guarantee that this will work for you!

Open your mdadm.conf file (on Debian it's /etc/mdadm/mdadm.conf)...

vi /etc/mdadm/mdadm.conf

... and add a MAILADDR line (with your email address) to the file, e.g. like this:

DEVICE /dev/sda* /dev/sdb*
ARRAY /dev/md0 level=raid1 num-devices=2 UUID=c8a78e3a:e335c0f0:997be224:f02c088a
ARRAY /dev/md1 level=raid1 num-devices=2 UUID=fd9f3b6b:4fc9cf4f:09db592d:480d34fe
MAILADDR you@yourdomain.com
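If you also want to control the sender address of the alert mails, mdadm supports an optional MAILFROM line in mdadm.conf (this is an addition to the example above, not part of it; check man mdadm.conf to make sure your mdadm version supports it, and note that the address shown here is just a placeholder):

```
MAILFROM raid-monitor@yourdomain.com
```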

Then restart mdadm:

/etc/init.d/mdadm restart
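To check that alert mails actually get delivered, you can ask mdadm's monitor to generate a test alert for each array instead of waiting for a real failure. The --test flag makes mdadm send a TestMessage alert on startup for every array it finds; run this as root, and note that a working local mail setup is required for the mail to arrive:

```
# Send a one-off TestMessage alert for every array in mdadm.conf, then exit.
mdadm --monitor --scan --oneshot --test
```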

That's it. Now whenever there's something wrong with your RAID setup, you will receive an email, for example as follows:

From: mdadm monitoring <root@server1.example.com>
To: you@yourdomain.com
Subject: DegradedArray event on /dev/md1:server1.example.com


This is an automatically generated mail message from mdadm
running on server1.example.com

A DegradedArray event had been detected on md device /dev/md1.

Faithfully yours, etc.

P.S. The /proc/mdstat file currently contains the following:

Personalities : [raid0] [raid1]
md1 : active raid1 sda2[2] sdb2[1]
      487853760 blocks [2/1] [_U]
      [>....................]  recovery =  4.3% (21448384/487853760) finish=114.3min speed=67983K/sec

md0 : active raid1 sda1[0] sdb1[1]
      530048 blocks [2/2] [UU]

unused devices: <none>

or like this:

From: mdadm monitoring <root@server1.example.com>
To: you@yourdomain.com
Subject: FailSpare event on /dev/md1:server1.example.com


This is an automatically generated mail message from mdadm
running on server1.example.com

A FailSpare event had been detected on md device /dev/md1.

It could be related to component device /dev/sda2.

Faithfully yours, etc.

P.S. The /proc/mdstat file currently contains the following:

Personalities : [raid0] [raid1]
md1 : active raid1 sda2[2](F) sdb2[1]
      487853760 blocks [2/1] [_U]
      [===================>.]  recovery = 99.9% (487851840/487853760) finish=0.0min speed=61037K/sec

md0 : active raid1 sda1[0] sdb1[1]
      530048 blocks [2/2] [UU]

unused devices: <none>
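The [2/1] [_U] field in these dumps is the quickest way to read an array's health by hand: each position stands for one member disk, with U meaning up and _ meaning missing or failed. As a rough illustration (run here against the sample output above, not live data), degraded arrays can be picked out of /proc/mdstat like this:

```shell
# Sketch: find degraded arrays in (a sample of) /proc/mdstat.
# On a live system you would read /proc/mdstat directly instead.
mdstat_sample='md1 : active raid1 sda2[2] sdb2[1]
      487853760 blocks [2/1] [_U]

md0 : active raid1 sda1[0] sdb1[1]
      530048 blocks [2/2] [UU]'

echo "$mdstat_sample" | awk '
  /^md/               { dev = $1 }                   # remember current array name
  /\[[U_]+\]$/ && /_/ { print dev " is degraded" }   # "_" marks a missing member
'
# prints: md1 is degraded
```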