HowtoForge

How To Configure Software RAID To Send An Email When Something's Wrong With RAID

Version 1.0
Author: Falko Timme

This short guide explains how to configure Linux software RAID (mdadm) to send you an email when something goes wrong with an array, for example when a hard drive fails. I've tested this on Debian Etch, but it should apply to all other distributions with minor adjustments to paths, etc.

I do not issue any guarantee that this will work for you!

Open your mdadm.conf file (on Debian it's /etc/mdadm/mdadm.conf)...

vi /etc/mdadm/mdadm.conf

... and add a MAILADDR line (with your email address) to the file, e.g. like this:

DEVICE /dev/sda* /dev/sdb*
ARRAY /dev/md0 level=raid1 num-devices=2 UUID=c8a78e3a:e335c0f0:997be224:f02c088a
ARRAY /dev/md1 level=raid1 num-devices=2 UUID=fd9f3b6b:4fc9cf4f:09db592d:480d34fe
MAILADDR you@yourdomain.com
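If you also want to control the sender address of the alert mails, mdadm supports an optional MAILFROM line in mdadm.conf (this is an addition to the example above, not part of it; check man mdadm.conf to make sure your mdadm version supports it, and note that the address shown here is just a placeholder):

```
MAILFROM raid-monitor@yourdomain.com
```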

Then restart mdadm:

/etc/init.d/mdadm restart
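To check that alert mails actually get delivered, you can ask mdadm's monitor to generate a test alert for each array instead of waiting for a real failure. The --test flag makes mdadm send a TestMessage alert on startup for every array it finds; run this as root, and note that a working local mail setup is required for the mail to arrive:

```
# Send a one-off TestMessage alert for every array in mdadm.conf, then exit.
mdadm --monitor --scan --oneshot --test
```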

That's it. Now whenever there's something wrong with your RAID setup, you will receive an email, for example as follows:

From: mdadm monitoring <root@server1.example.com>
To: you@yourdomain.com
Subject: DegradedArray event on /dev/md1:server1.example.com


This is an automatically generated mail message from mdadm
running on server1.example.com

A DegradedArray event had been detected on md device /dev/md1.

Faithfully yours, etc.

P.S. The /proc/mdstat file currently contains the following:

Personalities : [raid0] [raid1]
md1 : active raid1 sda2[2] sdb2[1]
      487853760 blocks [2/1] [_U]
      [>....................]  recovery =  4.3% (21448384/487853760) finish=114.3min speed=67983K/sec

md0 : active raid1 sda1[0] sdb1[1]
      530048 blocks [2/2] [UU]

unused devices: <none>

or like this:

From: mdadm monitoring <root@server1.example.com>
To: you@yourdomain.com
Subject: FailSpare event on /dev/md1:server1.example.com


This is an automatically generated mail message from mdadm
running on server1.example.com

A FailSpare event had been detected on md device /dev/md1.

It could be related to component device /dev/sda2.

Faithfully yours, etc.

P.S. The /proc/mdstat file currently contains the following:

Personalities : [raid0] [raid1]
md1 : active raid1 sda2[2](F) sdb2[1]
      487853760 blocks [2/1] [_U]
      [===================>.]  recovery = 99.9% (487851840/487853760) finish=0.0min speed=61037K/sec

md0 : active raid1 sda1[0] sdb1[1]
      530048 blocks [2/2] [UU]

unused devices: <none>
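The [2/1] [_U] field in these dumps is the quickest way to read an array's health by hand: each position stands for one member disk, with U meaning up and _ meaning missing or failed. As a rough illustration (run here against the sample output above, not live data), degraded arrays can be picked out of /proc/mdstat like this:

```shell
# Sketch: find degraded arrays in (a sample of) /proc/mdstat.
# On a live system you would read /proc/mdstat directly instead.
mdstat_sample='md1 : active raid1 sda2[2] sdb2[1]
      487853760 blocks [2/1] [_U]

md0 : active raid1 sda1[0] sdb1[1]
      530048 blocks [2/2] [UU]'

echo "$mdstat_sample" | awk '
  /^md/               { dev = $1 }                   # remember current array name
  /\[[U_]+\]$/ && /_/ { print dev " is degraded" }   # "_" marks a missing member
'
# prints: md1 is degraded
```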