Move Dovecot index files to a different path to reduce I/O on /var/vmail

Discussion in 'Tips/Tricks/Mods' started by Janninger, Mar 30, 2017.

  1. Janninger

    Janninger New Member

    Over the last few months I have worked a lot on improving our mail cluster. We use several Debian 8 based HP servers with Dovecot, Postfix and ISPConfig 3.1.2. The /var/vmail directory is shared across the cluster via GlusterFS; as I write this we are on Gluster version 3.9.1. In general this works fine and we have had really good experiences with GlusterFS. It's robust and the self-healing works well - only speed is a little drawback.

    But under heavy load we discovered more and more problems with Dovecot's index files, which by default are stored inside the maildir. We saw a growing number of messages like these:
    "Error: Corrupted transaction log file ...."
    "Error: Log synchronization error at seq=5,offset=17212 for ...."
    "Index ..... : Lost log for seq=67 offset=33312"
    and so on. We had so many errors that the problems became user-visible. For example, users could not move emails between IMAP folders; the emails simply "jumped back". Sometimes users even got internal server error messages.

    To me it seems that the file locking for all these Dovecot index files does not work properly across GlusterFS, although I set the corresponding switches in the Dovecot config file /etc/dovecot/conf.d/10-mail.conf:
    mail_temp_dir = /var/tmp
    mail_fsync = always
    mmap_disable = yes
    fsync_disable = no
    mail_nfs_storage = yes
    mail_nfs_index = yes
    lock_method = fcntl

    I spent a lot of time searching several boards but found no usable solution, so I started experimenting with moving the index files somewhere else. I decided to use /var/vmail-index. As root I executed:
    mkdir /var/vmail-index
    chown vmail:vmail /var/vmail-index
    (Remark: /var/vmail-index is on a different file system and on different hard drives than /var/vmail; in my case this lowered the I/O load and the IMAP access times.)
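If you want to give the index directory its own disks as well, a hypothetical /etc/fstab entry could look like the sketch below (the device name and file system are assumptions; adjust them to your hardware):

```shell
# /etc/fstab - hypothetical entry: a dedicated ext4 volume for the
# Dovecot indexes, mounted at /var/vmail-index (adjust the device)
/dev/sdb1  /var/vmail-index  ext4  defaults,noatime  0  2
```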

    Now edit /etc/dovecot/dovecot-sql.conf and comment out this line:
    # user_query = SELECT email as user, maildir as home, CONCAT( maildir_format, ':', maildir, '/', IF(maildir_format='maildir','Maildir',maildir_format)) as mail, uid, gid, CONCAT('*:storage=', quota, 'B') AS quota_rule, CONCAT(maildir, '/.sieve') as sieve FROM mail_user WHERE (login = '%u' OR email = '%u') AND `disable%Ls` = 'n' AND server_id = '1'
    and replace it by:
    user_query = SELECT email as user, maildir as home, CONCAT( maildir_format, ':', maildir, '/', IF(maildir_format='maildir','Maildir',maildir_format),':INDEX=/var/vmail-index/%d/%n') as mail, uid, gid, CONCAT('*:storage=', quota, 'B') AS quota_rule, CONCAT(maildir, '/.sieve') as sieve FROM mail_user WHERE (login = '%u' OR email = '%u') AND `disable%Ls` = 'n' AND server_id = '1'

    Restart dovecot:
    /etc/init.d/dovecot force-reload

    From now on Dovecot will place the index files in a directory structure like this: /var/vmail-index/<domain>/<user> (following the %d/%n variables in the query above).

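For illustration, the %d and %n variables in the INDEX= path expand to the domain and the local part of the login; the user alice@example.com below is a made-up example:

```shell
# Mimic Dovecot's %d (domain) and %n (local part) expansion for a
# hypothetical user, to show where the index files end up
user="alice@example.com"
n="${user%@*}"   # local part: alice
d="${user#*@}"   # domain: example.com
echo "/var/vmail-index/$d/$n"
# prints /var/vmail-index/example.com/alice
```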
    If you enable this on a busy server, the I/O load will increase for a while as Dovecot has to create a lot of directories. But once it is done, IMAP access gets really, really fast! Since then we have seen none of the index errors described above.
    I wondered whether there might be problems if each Dovecot uses its own index and users get moved to another server by our load balancer, but it works fine.

    The drawback of this procedure is that you have to take care of the index directory yourself, because deleting mailboxes or domains in ISPConfig will not delete the corresponding folders in /var/vmail-index. So I created a little bash script which gets executed every night on each machine:


    #!/bin/bash
    # Remove directories in INDEXDIR whose mailbox no longer exists in MAILDIR.

    # Verbose? 0 -> off, 1 -> little, 2 -> a lot
    verbose=1

    # Really delete? Leave empty for a dry run.
    echtbetrieb=1

    # Maildir-Path
    MAILDIR=/var/vmail

    # Index-Path
    INDEXDIR=/var/vmail-index

    test ! -d $MAILDIR && echo "MAILDIR $MAILDIR is no directory, exit." && exit 1
    test ! -d $INDEXDIR && echo "INDEXDIR $INDEXDIR is no directory, exit." && exit 1

    test $verbose -gt 0 && echo "Clearing index dir $INDEXDIR"
    test $echtbetrieb || echo "Testing - no deletion"
    cd $INDEXDIR
    for domain in $(ls); do
        if [ -d $INDEXDIR/$domain ]; then
            if [ -d $MAILDIR/$domain ]; then
                test $verbose -gt 1 && echo "$MAILDIR/$domain still exists"
                for user in $(ls $INDEXDIR/$domain); do
                    if [ -d $MAILDIR/$domain/$user ]; then
                        test $verbose -gt 1 && echo "$MAILDIR/$domain/$user still exists"
                    else
                        test $verbose -gt 0 && echo "$MAILDIR/$domain/$user doesn't exist any more, gets deleted"
                        test $echtbetrieb && rm -rf $INDEXDIR/$domain/$user
                    fi
                done
            else
                test $verbose -gt 0 && echo "$MAILDIR/$domain doesn't exist any more, gets deleted"
                test $echtbetrieb && rm -rf $INDEXDIR/$domain
            fi
        fi
        # sleep 1
    done
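To run the cleanup nightly, a cron entry along these lines would do (the script path and time below are assumptions):

```shell
# /etc/cron.d/vmail-index-cleanup - hypothetical entry, runs the
# cleanup script at 03:30 every night on each node
30 3 * * * root /usr/local/sbin/clean-vmail-index.sh >/dev/null 2>&1
```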


    So - maybe this is a little bit helpful for somebody. I'm not sure whether this is "ISPConfig-update-safe"; I'll know after the next update.
    Last edited: Apr 1, 2017
  2. till

    till Super Moderator Staff Member ISPConfig Developer

    Thank you for providing the details on scaling ISPConfig on your setup.

    I recommend writing a small ISPConfig plugin which binds itself to the mailbox delete event to remove the index files, instead of a bash script that has to go through all directories.

    Regarding update-safeness: just take care that you store the modified template for the dovecot-sql.conf file in /usr/local/ispconfig/server/conf-custom/install/ to make the config changes update-safe.
    Janninger likes this.
  3. florian030

    florian030 ISPConfig Developer ISPConfig Developer

    Did you try to make your changes in /etc/dovecot.conf instead of /etc/dovecot/conf.d/10-mail.conf? A server with ISPConfig does not read conf.d/*.
    Janninger likes this.
  4. Janninger

    Janninger New Member

    Oh - you're absolutely right, I must have been blind ... I fixed that. I gave it a try and turned one node back to the original dovecot-sql.conf. The result was frustrating: the locking now works with the settings in /etc/dovecot.conf, but accessing the mailboxes became really slow. The load average rose to 10, while it stays around 0.5-2 with my configuration and the index files on separate, fast hard drives.
    So moving the index files off the shared file system turned out to be an enormous speed improvement in our setup. We located /var on a SAS RAID 1, so access is really fast.

    Sorry for my stupid question: do I simply put a copy of my modified dovecot-sql.conf in that directory - is that enough?

    Yes, this will be a project within the next weeks. The shell script was just a quick first solution.
  5. florian030

    florian030 ISPConfig Developer ISPConfig Developer

    You can try different mount options for your GlusterFS. Is there any need for such a setup? I would use replication for two nodes, and Dovecot director if you need more servers.
  6. Janninger

    Janninger New Member

    I tried several mount options for GlusterFS in the past, but that didn't bring any serious progress. Anyway, we have had GlusterFS running for years and it works fine. We use a cluster of Fortigate firewalls as load balancer; they use quite intelligent load-balancing algorithms.
  7. florian030

    florian030 ISPConfig Developer ISPConfig Developer

  8. Janninger

    Janninger New Member

    Again: we use an external hardware-based load balancer, so there is no need for the director. Our external hardware-based solution takes load off the mail servers.
    That depends on the number of concurrent accesses. During peak times we have >1500 concurrent IMAP accesses per node, so moving files that are accessed very often off the shared file system is definitely a speedup. Shared GlusterFS or NFS volumes will never be faster than local SAS or SSD storage.
    Last edited: Mar 31, 2017
  9. till

    till Super Moderator Staff Member ISPConfig Developer

    Better use the template of that file that you can find in install/tpl/ of the ISPConfig tar.gz, add your modifications to that template, and put it into conf-custom.
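As a sketch of that procedure (the extracted directory name and the Debian template name below are assumptions - check the install/tpl/ directory of your ISPConfig version):

```shell
# Extract the ISPConfig tarball and copy the Dovecot SQL template
# into conf-custom; then apply the :INDEX=... change to the copy
tar xzf ISPConfig-3.1.2.tar.gz
cp ispconfig3_install/install/tpl/debian_dovecot-sql.conf.master \
   /usr/local/ispconfig/server/conf-custom/install/
```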
    Janninger likes this.
  10. lavdnone

    lavdnone New Member

    How would you do full-text search now that the indexes are separate per node? There are not many options apart from external FTS.
    FTS with Solr can't share the same Solr index now, so you can't have one server do the indexing while the others use it.
    - I wonder if it would work to just rewrite the Dovecot index on the non-FTS-indexing nodes with one of the current indexers.
    - I wonder what would happen with something like an Elasticsearch cluster here - it would probably give wrong results and index each email as many times as you have nodes.

    How large does your Dovecot index grow per 1 GB of emails? I see around 1% of the total.

    If you use the GlusterFS FUSE client to connect to the local brick, making sure it reads from the local replica yields noticeable performance gains: set cluster.nufa on, cluster.choose-local on, and maybe even mount with "xlator-option=*replicate*.read-subvolume-index=X".
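For reference, those options would be applied roughly like this (the volume name "vmail" and the replica index 0 are made-up examples; pick the index of the node's local brick):

```shell
# Volume-level options (run once on any Gluster node)
gluster volume set vmail cluster.nufa on
gluster volume set vmail cluster.choose-local on

# Mount-time option pinning reads to one replica of the volume
mount -t glusterfs \
  -o xlator-option=*replicate*.read-subvolume-index=0 \
  localhost:/vmail /var/vmail
```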

    Also, I have:
    mail_fsync = never
    #mmap_disable = no
    mail_nfs_storage = yes
    #mail_nfs_index = no

    # but mail_fsync = optimized for LMTP deliveries from Postfix:
    protocol lmtp {
      mail_plugins = quota sieve acl zlib listescape #mail_crypt
      auth_socket_path = /usr/local/var/run/dovecot/auth-master
      mail_fsync = optimized
    }
    Last edited: Jan 15, 2018
  11. lavdnone

    lavdnone New Member

    Please disregard - full-text search with Solr works fine with the indexes outside the synced file system that holds the email. Still, the notes above might be useful for someone.
  12. lavdnone

    lavdnone New Member

    I had a typo in the server name in the cron commit, and the read-only Solr nodes were not picking up the uncommitted part on reload.
  13. lavdnone

    lavdnone New Member
