Rob Speer
1793c1bb2e
Merge branch 'master' into chinese-external-wordlist
Conflicts:
wordfreq/chinese.py
2015-09-28 14:34:59 -04:00
Rob Speer
44b0c4f9ba
Fix documentation and clean up, based on Sep 25 code review
2015-09-28 12:58:46 -04:00
Rob Speer
b460eef444
describe optional dependencies better in the README
2015-09-24 17:54:52 -04:00
Rob Speer
5b918e7bb0
fix README conflict
2015-09-22 14:23:55 -04:00
Rob Speer
3cb3061e06
Merge branch 'greek-and-turkish' into chinese-and-more
Conflicts:
README.md
wordfreq_builder/wordfreq_builder/ninja.py
2015-09-10 15:27:33 -04:00
Rob Speer
5c8c36f4e3
Lower the frequency of phrases with inferred token boundaries
2015-09-10 14:16:22 -04:00
Rob Speer
354555514f
fixes based on code review notes
2015-09-09 13:10:18 -04:00
Rob Speer
6502f15e9b
fix SUBTLEX citations
2015-09-08 17:45:25 -04:00
Rob Speer
d9c44d5fcc
take out OpenSubtitles for Chinese
2015-09-08 17:25:05 -04:00
Rob Speer
d576e3294b
update the README for Chinese
2015-09-05 03:42:54 -04:00
Rob Speer
7906a671ea
WIP: Traditional Chinese
2015-09-04 18:52:37 -04:00
Rob Speer
3c3371a9ff
add Polish and Swedish to README
2015-09-04 17:10:40 -04:00
Rob Speer
8196643509
add more citations
2015-09-04 15:57:40 -04:00
Rob Speer
77c60c29b0
Use SUBTLEX for German, but OpenSubtitles for Greek
In German and Greek, SUBTLEX and Hermit Dave turn out to have been
working from the same source data. I looked at the quality of how they
processed the data, and chose SUBTLEX for German, and Dave's wordlist
for Greek.
2015-09-04 15:52:21 -04:00
Rob Speer
81bbe663fb
update README with additional SUBTLEX support
2015-09-04 13:23:33 -04:00
Rob Speer
d9a1c34d00
expand list of sources and supported languages
2015-09-04 01:03:36 -04:00
Rob Speer
d94428d454
support Turkish and more Greek; document more
2015-09-04 00:57:04 -04:00
Rob Speer
e6a2886a66
add SUBTLEX to the readme
2015-09-03 18:56:56 -04:00
Rob Speer
00a2812907
fix heading
2015-08-28 17:49:38 -04:00
Rob Speer
93f44683c5
fix list formatting
2015-08-28 17:49:07 -04:00
Rob Speer
2370287539
improve README with function documentation and examples
2015-08-28 17:45:50 -04:00
Rob Speer
573dd1ec79
update the README
2015-08-25 17:44:34 -04:00
Joshua Chin
b0a9a2980f
no use for use
2015-07-17 14:46:40 -04:00
Andrew Lin
9f8464c2d1
Document the version of Unicode used to build the regexes.
2015-07-08 18:48:33 -04:00
Rob Speer
0f4ca80026
add installation instructions to the readme
2015-05-28 14:02:12 -04:00
Rob Speer
611a6a35de
update Japanese data; test Japanese and token combining
2015-05-28 14:01:56 -04:00