7 Commits

Author SHA1 Message Date
Rob Speer
af5f65b328 start a new multilingual wordlist called 'stems'
So far, this wordlist is only in Dutch.
2015-03-31 15:59:30 -04:00
Rob Speer
3507d8b630 Fix Dutch lists
- Use surface forms consistently, not stems
- Count all instances of words on Wikipedia, not one per article
2015-03-12 16:00:03 -04:00
Rob Speer
377336bcdc new Dutch data, bump version to 0.6
2015-03-03 15:54:45 -05:00
Rob Speer
ffdaa82b11 add surface forms from Twitter 2014 data
2015-02-17 15:06:11 -05:00
Rob Speer
6ab72201cd add twitter-stems-2014 wordlist data
2015-02-11 13:29:32 -05:00
Rob Speer
90772e33fb try to match the wordlist metanl actually uses
2013-10-31 15:13:22 -04:00
Rob Speer
26c0d7dd28 Add wordfreq_data files.
Now the build process is repeatable from scratch, even if something goes
wrong with the download server.
2013-10-31 13:39:02 -04:00