Commit Graph

3 Commits

| Author | SHA1 | Message | Date |
|---|---|---|---|
| Joshua Chin | b510e4144d | removes combining marks from arabic words instead of treating them as punctuation (Former-commit-id: cebca52ea3) | 2015-06-25 12:36:41 -04:00 |
| Joshua Chin | 0a30164358 | added non_punct to MANIFEST.in and moved it into data (Former-commit-id: b198f4b0c2) | 2015-06-24 17:30:01 -04:00 |
| Rob Speer | 1c65cb9f14 | add new data files from wordfreq_builder (Former-commit-id: 35aec061de) | 2015-05-11 18:45:47 -04:00 |