Commit Graph

4 Commits

Author SHA1 Message Date
Elia Robyn Lake
bf05b1b1dc estimate the freq distribution of numbers 2022-03-10 18:33:42 -05:00
Robyn Speer
7a32b56c1c Round frequencies to 3 significant digits 2018-06-18 15:21:33 -04:00
Robyn Speer
b3c42be331 port remaining tests to pytest 2018-06-01 16:40:51 -04:00
Robyn Speer
0a2bfb2710 Tokenization in Korean, plus abjad languages (#38)
* Remove marks from more languages

* Add Korean tokenization, and include MeCab files in data

* add a Hebrew tokenization test

* fix terminology in docstrings about abjad scripts

* combine Japanese and Korean tokenization into the same function


Former-commit-id: fec6eddcc3
2016-07-15 15:10:25 -04:00