Last Updated: February 25, 2016
· creaktive

Extracting a wordlist from the spellchecker

Sometimes /usr/share/dict/words is just not enough: you need more words, or words in another language. Fortunately, you can always expand the dic/aff file pairs used by the Firefox/Thunderbird/OpenOffice/LibreOffice spellcheckers into plain wordlists. The poorly documented unmunch utility from the hunspell package does the trick:

$ pwd
$ ls
en-US.aff  en-US.dic
$ wc en-US.dic 
   57438   57438  624100 en-US.dic
$ unmunch en-US.dic en-US.aff 2>/dev/null | sort -u | wc
  136734  136734 1302152
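
The jump from 57438 lines to 136734 words comes from the affix rules: each .dic entry is a stem, optionally followed by "/" and a set of flags that the .aff file turns into prefixed and suffixed forms. To see the bare stems on their own, something like the following may help. It is only a rough sketch: dic_stems is a made-up helper name, and it assumes the usual layout where the first line of the .dic file is the entry count and flags follow a plain "/" (entries with escaped slashes or extra morphological fields are not handled).

```shell
# Hypothetical helper (not part of hunspell): print the bare stems of a .dic file.
# Assumption: line 1 is the entry count, and affix flags follow a "/".
dic_stems() {
    # Skip the count line, keep only the text before the first "/", deduplicate.
    tail -n +2 "$1" | cut -d/ -f1 | sort -u
}

# Usage, with the file from the session above:
#   dic_stems en-US.dic | wc -l
```

Comparing that count with the unmunch output gives a feel for how many words the affix rules contribute.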

You can find dic/aff pairs for almost any language on the OpenOffice.org 2.x Dictionaries page.
