Token segmentation in SpamAssassin

Discussion in 'Server Operation' started by nguyenvuhoang, Nov 13, 2012.

    I'm researching Vietnamese anti-spam filtering, with the goal of improving SpamAssassin.

    I think SpamAssassin has a Bayesian filter that detects spam email based on tokens. As far as I understand, tokens are segmented by whitespace. This is suitable for English but not for Vietnamese, where a single word often consists of several syllables separated by spaces. So I want to change SpamAssassin's token segmentation to suit Vietnamese, but I don't know where the token segmentation code is located in the SpamAssassin source.
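
    To illustrate the problem, here is a small Python sketch. It is not SpamAssassin code (SpamAssassin is written in Perl); it only assumes that the filter splits text on whitespace. The tiny lexicon and the names SAMPLE_LEXICON, whitespace_tokens and segment_longest_match are made up for this example, and greedy longest-match is just one simple way to segment Vietnamese, not necessarily the best one.

```python
# Illustration only: NOT SpamAssassin code.
# SAMPLE_LEXICON and both function names are hypothetical, chosen for this sketch.

# A toy lexicon of multi-syllable Vietnamese words (a real one would be far larger).
SAMPLE_LEXICON = {"sinh viên", "việt nam"}


def whitespace_tokens(text):
    """Split on whitespace, which is how I assume the Bayesian filter tokenizes."""
    return text.lower().split()


def segment_longest_match(text, lexicon, max_syllables=4):
    """Greedy longest-match: join adjacent syllables that form a lexicon entry."""
    syllables = text.lower().split()
    tokens = []
    i = 0
    while i < len(syllables):
        # Try the longest candidate first, then shrink down to a single syllable.
        for n in range(min(max_syllables, len(syllables) - i), 0, -1):
            candidate = " ".join(syllables[i:i + n])
            if n == 1 or candidate in lexicon:
                tokens.append(candidate)
                i += n
                break
    return tokens


sentence = "Sinh viên Việt Nam học tiếng Việt"
print(whitespace_tokens(sentence))
print(segment_longest_match(sentence, SAMPLE_LEXICON))
```

    With whitespace splitting, "sinh viên" (student) and "Việt Nam" each become two separate tokens, so the filter never sees them as single words. With the lexicon-based segmentation they stay together as one token each.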

    I hope someone can point me to it.
    Thanks so much!

