wordfreq/tests
Robyn Speer 9dac967ca3 Tokenize by graphemes, not codepoints (#50)
* Tokenize by graphemes, not codepoints

* Add more documentation to TOKEN_RE

* Remove extra line break

* Update docstring - Brahmic scripts are no longer an exception

* approve using version 2017.07.28 of regex
2017-08-08 11:35:28 -04:00
test_chinese.py             Use langcodes when tokenizing again (it no longer connects to a DB)       2017-04-27 15:09:59 -04:00
test_french_and_related.py  Use langcodes when tokenizing again (it no longer connects to a DB)       2017-04-27 15:09:59 -04:00
test_japanese.py            Revert a small syntax change introduced by a circular series of changes.  2015-09-24 13:24:11 -04:00
test_korean.py              Tokenization in Korean, plus abjad languages (#38)                        2016-07-15 15:10:25 -04:00
test_serbian.py             Use langcodes when tokenizing again (it no longer connects to a DB)       2017-04-27 15:09:59 -04:00
test.py                     Tokenize by graphemes, not codepoints (#50)                               2017-08-08 11:35:28 -04:00