Commit Graph

  • ca9cf7d90f update the CHANGELOG for MeCab fix staging-20180629 code-review-20180629 Rob Speer 2018-06-26 11:31:03 -0400
  • 0149e9ec7f Merge pull request #59 from LuminosoInsight/korean-install-fixes Lance Nathan 2018-06-26 11:08:06 -0400
  • 3961a28973 Merge pull request #59 from LuminosoInsight/korean-install-fixes Lance Nathan 2018-06-26 11:08:06 -0400
  • 79caa526c3 Merge pull request #58 from LuminosoInsight/significant-figures Lance Nathan 2018-06-25 18:53:39 -0400
  • a619ba6457 Merge pull request #58 from LuminosoInsight/significant-figures Lance Nathan 2018-06-25 18:53:39 -0400
  • 830157d8e4 Fix instructions and search path for mecab-ko-dic Robyn Speer 2018-06-21 15:53:16 -0400
  • 676686fda1 Fix instructions and search path for mecab-ko-dic #59 Rob Speer 2018-06-21 15:53:16 -0400
  • fdf064b234 doctest the README Robyn Speer 2018-06-18 17:11:42 -0400
  • 5e05c942ac doctest the README #58 Rob Speer 2018-06-18 17:11:42 -0400
  • c6552f923f update README and CHANGELOG Robyn Speer 2018-06-18 15:15:07 -0400
  • 1dc763c9c5 update README and CHANGELOG Rob Speer 2018-06-18 15:15:07 -0400
  • 7a32b56c1c Round frequencies to 3 significant digits Robyn Speer 2018-06-15 15:42:54 -0400
  • c3b32b3c4a Round frequencies to 3 significant digits Rob Speer 2018-06-15 15:42:54 -0400
  • a95b360563 Merge pull request #57 from LuminosoInsight/version2.1 Lance Nathan 2018-06-18 12:06:47 -0400
  • 0911e90ba0 Merge pull request #57 from LuminosoInsight/version2.1 Lance Nathan 2018-06-18 12:06:47 -0400
  • 39a1308770 update table in README: Dutch has 5 sources Robyn Speer 2018-06-18 11:43:52 -0400
  • 2b85a1cef2 update table in README: Dutch has 5 sources #57 Rob Speer 2018-06-18 11:43:52 -0400
  • 0280f82496 fix typo in previous changelog entry Robyn Speer 2018-06-18 10:52:28 -0400
  • 52aae3459d fix typo in previous changelog entry Rob Speer 2018-06-18 10:52:28 -0400
  • 42efcfc1ad relax the test that assumed the Chinese list has few ASCII words Robyn Speer 2018-06-15 16:29:15 -0400
  • 2f6b87c86b relax the test that assumed the Chinese list has few ASCII words Rob Speer 2018-06-15 16:29:15 -0400
  • ad0f046f47 fixes to tests, including that 'test.py' wasn't found by pytest Robyn Speer 2018-06-15 15:48:41 -0400
  • 57f676f4a6 fixes to tests, including that 'test.py' wasn't found by pytest Rob Speer 2018-06-15 15:48:41 -0400
  • a975bcedae update tests to include new languages Robyn Speer 2018-06-12 17:55:44 -0400
  • 93e3e03c60 update tests to include new languages Rob Speer 2018-06-12 17:55:44 -0400
  • 4b7e3d9655 bump version to 2.1; add test requirement for pytest Robyn Speer 2018-06-12 17:48:24 -0400
  • 93ddc192d8 bump version to 2.1; add test requirement for pytest Rob Speer 2018-06-12 17:48:24 -0400
  • 3259c4a375 Merge remote-tracking branch 'origin/pytest' into version2.1 Robyn Speer 2018-06-12 17:46:48 -0400
  • ff4f7bf3f6 Merge remote-tracking branch 'origin/pytest' into version2.1 Rob Speer 2018-06-12 17:46:48 -0400
  • d5f7335d90 New data import from exquisite-corpus Robyn Speer 2018-06-12 17:22:43 -0400
  • db43e0e25c New data import from exquisite-corpus Rob Speer 2018-06-12 17:22:43 -0400
  • b3c42be331 port remaining tests to pytest Robyn Speer 2018-06-01 16:40:51 -0400
  • 96a01b9685 port remaining tests to pytest pytest Rob Speer 2018-06-01 16:40:51 -0400
  • 75b4d62084 port test.py and test_chinese.py to pytest Robyn Speer 2018-06-01 16:33:06 -0400
  • 863d5be522 port test.py and test_chinese.py to pytest Rob Speer 2018-06-01 16:33:06 -0400
  • 6235d88869 Use data from fixed XC build - mostly changes Chinese Robyn Speer 2018-05-30 13:09:20 -0400
  • 8fcae9978e Use data from fixed XC build - mostly changes Chinese Rob Speer 2018-05-30 13:09:20 -0400
  • 5762508e7c commit new data files (Italian changed for some reason) Robyn Speer 2018-05-29 17:36:48 -0400
  • 90b5246a48 commit new data files (Italian changed for some reason) Rob Speer 2018-05-29 17:36:48 -0400
  • e4cb9a23b6 update data to include xc's processing of ParaCrawl Robyn Speer 2018-05-25 16:12:35 -0400
  • cd434b2219 update data to include xc's processing of ParaCrawl Rob Speer 2018-05-25 16:12:35 -0400
  • 8907423147 Packaging updates for the new PyPI Robyn Speer 2018-05-01 17:16:53 -0400
  • aa91e1f291 Packaging updates for the new PyPI staging-20180518 code-review-20180518 code-review-20180507 Rob Speer 2018-05-01 17:16:53 -0400
  • 316670a234 Merge pull request #56 from LuminosoInsight/japanese-edge-cases Lance Nathan 2018-05-01 14:57:45 -0400
  • 968bc3a85a Merge pull request #56 from LuminosoInsight/japanese-edge-cases Lance Nathan 2018-05-01 14:57:45 -0400
  • e0da20b0c4 update CHANGELOG for 2.0.1 Robyn Speer 2018-05-01 14:47:55 -0400
  • 0a95d96b20 update CHANGELOG for 2.0.1 #56 Rob Speer 2018-05-01 14:47:55 -0400
  • 666f7e51fa Handle Japanese edge cases in simple_tokenize Robyn Speer 2018-04-26 15:53:07 -0400
  • 3ec92a8952 Handle Japanese edge cases in simple_tokenize Rob Speer 2018-04-26 15:53:07 -0400
  • 18f176dbf6 Merge pull request #55 from LuminosoInsight/version2 Lance Nathan 2018-03-15 14:26:49 -0400
  • e3a1b470d9 Merge pull request #55 from LuminosoInsight/version2 staging-20180323 code-review-20180323 Lance Nathan 2018-03-15 14:26:49 -0400
  • d9bc4af8cd update the changelog Robyn Speer 2018-03-14 17:56:29 -0400
  • a759f38540 update the changelog #55 Rob Speer 2018-03-14 17:56:29 -0400
  • b2663272a7 remove LAUGHTER_WORDS, which is now unused Robyn Speer 2018-03-14 17:33:35 -0400
  • 6f1a9aaff1 remove LAUGHTER_WORDS, which is now unused Rob Speer 2018-03-14 17:33:35 -0400
  • 65811d587e More explicit error message for a missing wordlist Robyn Speer 2018-03-14 15:10:27 -0400
  • 1a761199cd More explicit error message for a missing wordlist Rob Speer 2018-03-14 15:10:27 -0400
  • 2ecf31ee81 Actually use min_score in _language_in_list Robyn Speer 2018-03-14 15:08:52 -0400
  • b2bdc8a854 Actually use min_score in _language_in_list Rob Speer 2018-03-14 15:08:52 -0400
  • c57032d5cb code review fixes to wordfreq.tokens Robyn Speer 2018-03-14 15:07:45 -0400
  • bb2096ae04 code review fixes to wordfreq.tokens Rob Speer 2018-03-14 15:07:45 -0400
  • de81a23b9d code review fixes to __init__ Robyn Speer 2018-03-14 15:04:59 -0400
  • 430fb01e53 code review fixes to __init__ Rob Speer 2018-03-14 15:04:59 -0400
  • 8656688b0b fix mention of dependencies in README Robyn Speer 2018-03-14 15:01:08 -0400
  • a6bb267f89 fix mention of dependencies in README Rob Speer 2018-03-14 15:01:08 -0400
  • d68d4baad2 Subtle changes to CJK frequencies Robyn Speer 2018-03-14 11:18:24 -0400
  • bac3dcb620 Subtle changes to CJK frequencies Rob Speer 2018-03-14 11:18:24 -0400
  • 0cb36aa74f cache the language info (avoids 10x slowdown) Robyn Speer 2018-03-09 14:54:03 -0500
  • e64f409c55 cache the language info (avoids 10x slowdown) Rob Speer 2018-03-09 14:54:03 -0500
  • b162de353d avoid log spam: only warn about an unsupported language once Robyn Speer 2018-03-09 11:50:15 -0500
  • 11e758672e avoid log spam: only warn about an unsupported language once Rob Speer 2018-03-09 11:50:15 -0500
  • c5f64a5de8 update the README Robyn Speer 2018-03-08 18:16:15 -0500
  • 49a603ea63 update the README Rob Speer 2018-03-08 18:16:15 -0500
  • d8e3669a73 wordlist updates from new exquisite-corpus Robyn Speer 2018-03-08 18:16:00 -0500
  • 92784d1768 wordlist updates from new exquisite-corpus Rob Speer 2018-03-08 18:16:00 -0500
  • 53dc0bbb1a Test that we can leave the wordlist unspecified and get 'large' freqs Robyn Speer 2018-03-08 18:09:57 -0500
  • 1594ba3ad6 Test that we can leave the wordlist unspecified and get 'large' freqs Rob Speer 2018-03-08 18:09:57 -0500
  • 8e3dff3c1c Traditional Chinese should be preserved through tokenization Robyn Speer 2018-03-08 18:08:55 -0500
  • 47dac3b0b8 Traditional Chinese should be preserved through tokenization Rob Speer 2018-03-08 18:08:55 -0500
  • 45064a292f reorganize wordlists into 'small', 'large', and 'best' Robyn Speer 2018-03-08 17:52:44 -0500
  • 5a5acec9ff reorganize wordlists into 'small', 'large', and 'best' Rob Speer 2018-03-08 17:52:44 -0500
  • fe85b4e124 fix az-Latn transliteration, and test Robyn Speer 2018-03-08 16:47:36 -0500
  • 67e4475763 fix az-Latn transliteration, and test Rob Speer 2018-03-08 16:47:36 -0500
  • a4d9614e39 setup: update version number and dependencies Robyn Speer 2018-03-08 16:26:24 -0500
  • a42cf312ef setup: update version number and dependencies Rob Speer 2018-03-08 16:26:24 -0500
  • 5ab5d2ea55 Separate preprocessing from tokenization Robyn Speer 2018-03-08 16:25:45 -0500
  • 45b9bcdbcb Separate preprocessing from tokenization Rob Speer 2018-03-08 16:25:45 -0500
  • 72646f16a1 minor fixes to README Robyn Speer 2018-02-28 16:14:29 -0500
  • 846606d892 minor fixes to README staging-20180309 code-review-20180309 Rob Speer 2018-02-28 16:14:29 -0500
  • cd7bfc4060 Merge pull request #54 from LuminosoInsight/fix-deps Robyn Speer 2018-02-28 12:46:46 -0800
  • ad677e12fd Merge pull request #54 from LuminosoInsight/fix-deps Rob Speer 2018-02-28 12:46:46 -0800
  • 208559ae1e bump version to 1.7.0, belatedly Robyn Speer 2018-02-28 15:15:47 -0500
  • aadb19c9a3 bump version to 1.7.0, belatedly #54 Rob Speer 2018-02-28 15:15:47 -0500
  • 98cb47c774 update msgpack-python dependency to msgpack Robyn Speer 2018-02-28 15:14:51 -0500
  • db56528fb6 update msgpack-python dependency to msgpack Rob Speer 2018-02-28 15:14:51 -0500
  • 4d2ddc940a Merge 1d5d64c811 into 843ed92223 #53 Matan Shenhav 2018-02-24 22:34:47 +0000
  • 1d5d64c811 Updated setup.py (msgpack-python -> msgpack dependency, bumped version to 1.7.0, made regex dependency more forgiving in version number. #53 Matan Shenhav 2018-02-24 21:36:45 +0000
  • ec9c94be92 update citation to v1.7 Robyn Speer 2017-09-27 13:36:30 -0400
  • 843ed92223 update citation to v1.7 staging-20171006 code-review-20171006 Rob Speer 2017-09-27 13:36:30 -0400
  • 95a13ab4ce Merge pull request #51 from LuminosoInsight/version1.7 Andrew Lin 2017-09-08 17:02:05 -0400