creating a distributed dovecot mail spool

Discussion in 'Developers' Forum' started by ispcomm, Dec 12, 2013.

  1. ispcomm

    ispcomm Member

    I am faced with an ever growing mail spool and increasing number of mailboxes.

    A move from courier to dovecot helped the situation, as dovecot uses more efficient indexes, but I'm bound to outgrow my current shared storage.

    So, as shared-all does not work any more, I need to move to a shared-nothing backend storage, unless I want to shell out big $$$ for a custom solution, which I don't.

    The obvious move would be to glusterfs based storage (perhaps easiest). However I'm wondering if glusterfs deployed on 3-4 nodes would be sufficient.

    There's also ceph, which is inherently more resource-hungry.

    And there's hadoop and hdfs, which might be usable with dovecot. I have not found much information about a hadoop-native dovecot plugin, but perhaps it's usable via HBase.

    I'd just like to exchange some ideas with the smart guys using ispconfig in bigger installations.

    What do you use?

  2. till

    till Super Moderator Staff Member ISPConfig Developer

    Do you mean mail spool or maildir storage location? My understanding of the term mail spool is the postfix mail spool in the /var/spool/postfix/* directories. I think the postfix mail spool should not be a problem, as you don't have to use shared storage for it: emails are there only for a short time during delivery (except for mails that are stuck in the queue, of course). Regarding maildir storage, I know a company that has used glusterfs with 2 nodes, but I don't know if they had any performance problems with that.
  3. till

    till Super Moderator Staff Member ISPConfig Developer

    There is one other dovecot plugin that came to my mind. Dovecot has a plugin to store older emails in a different location. This is transparent for the user, so he won't notice it. So maybe you can implement a solution that uses a second storage location where you store, e.g., all emails that are older than 1 year.
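
    Before setting up a second storage location, it can help to know how much mail would actually move there. The following is a minimal sketch, not the way dovecot itself selects messages: it assumes a plain Maildir layout (cur/ and new/ subdirectories) and uses each file's modification time as a stand-in for message age. The function name find_old_messages and its parameters are hypothetical.

```python
import os
import tempfile
import time
from pathlib import Path

SECONDS_PER_DAY = 86400


def find_old_messages(maildir, max_age_days=365, now=None):
    """Return (paths, total_bytes) for message files older than max_age_days.

    ASSUMPTION: file mtime approximates message age; a real mail server
    may use the received or saved date instead.
    """
    now = time.time() if now is None else now
    cutoff = now - max_age_days * SECONDS_PER_DAY
    old_paths = []
    total_bytes = 0
    # A Maildir keeps delivered messages in cur/ and new/; tmp/ is skipped.
    for sub in ("cur", "new"):
        folder = Path(maildir) / sub
        if not folder.is_dir():
            continue
        for entry in folder.iterdir():
            if not entry.is_file():
                continue
            info = entry.stat()
            if info.st_mtime < cutoff:
                old_paths.append(entry)
                total_bytes += info.st_size
    return sorted(old_paths), total_bytes


# Small self-contained demo on a throwaway Maildir.
with tempfile.TemporaryDirectory() as demo_dir:
    cur = Path(demo_dir) / "cur"
    cur.mkdir()
    old_msg = cur / "old-message"
    old_msg.write_bytes(b"x" * 2048)
    two_years_ago = time.time() - 2 * 365 * SECONDS_PER_DAY
    os.utime(old_msg, (two_years_ago, two_years_ago))
    (cur / "fresh-message").write_bytes(b"y" * 1024)

    paths, size = find_old_messages(demo_dir)
    print(f"{len(paths)} message(s), {size} bytes older than one year")
```

    Run against a copy of a few real mailboxes, the byte total gives a rough idea of how large the second storage location would need to be.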
  4. ispcomm

    ispcomm Member

    Well... I use spool in a generic sense (system pool) as in the old days of slackware where the mail "spool" was contained in /var/spool/mail.

    However, I do mean dovecot dir space.

    May I ask how many accounts and how much total data storage this company is hosting on glusterfs? I was having issues with courier-imap on an installation with 5000 accounts and two terabytes of total storage, shared over nfs and backed by zfs/solaris.

    The problems come with indexing and search operations.

    This would be the ideal hadoop environment, where you could spawn a mapreduce process to search the space on a single node while the others would be free to serve other clients. But it would require writing a new dovecot storage backend, which is not economically feasible (for me at least).

  5. till

    till Super Moderator Staff Member ISPConfig Developer

    I don't know their exact numbers. I just did admin work for them sometimes on their systems, and they have a few thousand accounts as well.

    When it comes to search operations, dovecot supports external search servers like solr:

    Current dovecot versions also have different storage formats. Maybe a different format like mdbox is less resource-hungry, as you don't have that many small files then.
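
    To get a feel for the "many small files" point, here is a rough back-of-the-envelope sketch. The 5000 accounts and two terabytes come from the thread; the average message size and the size at which a multi-message file is rotated are assumptions picked for illustration, not measured values or dovecot defaults. The estimate ignores index files, compression and per-folder overhead.

```python
def ceil_div(a, b):
    """Integer ceiling division."""
    return -(-a // b)


def estimate_file_counts(total_bytes, accounts, avg_message_bytes, rotate_bytes):
    """Estimate storage file counts for two layouts.

    Returns (one_file_per_message, packed_files), where the second value
    assumes each account's mail is packed into files of rotate_bytes.
    """
    per_account = total_bytes // accounts
    messages_per_account = ceil_div(per_account, avg_message_bytes)
    packed_per_account = ceil_div(per_account, rotate_bytes)
    return accounts * messages_per_account, accounts * packed_per_account


TOTAL_BYTES = 2 * 10**12        # two terabytes (from the thread)
ACCOUNTS = 5000                 # from the thread
AVG_MESSAGE_BYTES = 100 * 10**3 # ASSUMPTION: 100 kB average message
ROTATE_BYTES = 10 * 10**6       # ASSUMPTION: 10 MB per packed file

per_message, packed = estimate_file_counts(
    TOTAL_BYTES, ACCOUNTS, AVG_MESSAGE_BYTES, ROTATE_BYTES
)
print(f"one file per message: {per_message:,} files")
print(f"packed storage files: {packed:,} files")
```

    Under these assumptions the file count differs by about two orders of magnitude, which matters on network filesystems where per-file metadata operations are expensive. The real ratio depends on the actual message size distribution.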
  6. ispcomm

    ispcomm Member

    Thank you for the pointers.

    My first move was from courier to dovecot, keeping the maildir format (converted from courier to dovecot uidls only).

    As you say, glusterfs might be a good match for mdbox backend. I'll need to test this.

    I also have a new requirement that was raised a few days ago: the colo facility where my servers are located is unfortunately receiving DDoS attacks that exceed its capacity. I have a very short window for moving out before I start losing customers.

    I need to plan for a smooth migration to another facility, with live operation. My guess is that the dovecot replication service might come in handy, and that I can both migrate out of the current colo and switch to a replicated glusterfs + mdbox based backend using the async replication of dovecot.

    What a mouthful. Might work though.
