31 Commits

Author SHA1 Message Date
Rob Speer
dcb77a552b fix to README: we're only using Reddit in English 2016-05-11 15:38:29 -04:00
Rob Speer
697842b3f9 fix table showing marginal Korean support 2016-03-30 15:11:13 -04:00
Rob Speer
ed32b278cc make an example clearer with wordlist='large' 2016-03-30 15:08:32 -04:00
Rob Speer
a10c1d7ac0 update wordlists for new builder settings 2016-03-28 12:26:47 -04:00
Rob Speer
d79ee37da9 Add and document large wordlists 2016-01-22 16:23:43 -05:00
Rob Speer
1793c1bb2e Merge branch 'master' into chinese-external-wordlist
Conflicts:
	wordfreq/chinese.py
2015-09-28 14:34:59 -04:00
Rob Speer
44b0c4f9ba Fix documentation and clean up, based on Sep 25 code review 2015-09-28 12:58:46 -04:00
Rob Speer
b460eef444 describe optional dependencies better in the README 2015-09-24 17:54:52 -04:00
Rob Speer
5b918e7bb0 fix README conflict 2015-09-22 14:23:55 -04:00
Rob Speer
3cb3061e06 Merge branch 'greek-and-turkish' into chinese-and-more
Conflicts:
	README.md
	wordfreq_builder/wordfreq_builder/ninja.py
2015-09-10 15:27:33 -04:00
Rob Speer
5c8c36f4e3 Lower the frequency of phrases with inferred token boundaries 2015-09-10 14:16:22 -04:00
Rob Speer
354555514f fixes based on code review notes 2015-09-09 13:10:18 -04:00
Rob Speer
6502f15e9b fix SUBTLEX citations 2015-09-08 17:45:25 -04:00
Rob Speer
d9c44d5fcc take out OpenSubtitles for Chinese 2015-09-08 17:25:05 -04:00
Rob Speer
d576e3294b update the README for Chinese 2015-09-05 03:42:54 -04:00
Rob Speer
7906a671ea WIP: Traditional Chinese 2015-09-04 18:52:37 -04:00
Rob Speer
3c3371a9ff add Polish and Swedish to README 2015-09-04 17:10:40 -04:00
Rob Speer
8196643509 add more citations 2015-09-04 15:57:40 -04:00
Rob Speer
77c60c29b0 Use SUBTLEX for German, but OpenSubtitles for Greek
In German and Greek, SUBTLEX and Hermit Dave turn out to have been
working from the same source data. I looked at the quality of how they
processed the data, and chose SUBTLEX for German, and Dave's wordlist
for Greek.
2015-09-04 15:52:21 -04:00
Rob Speer
81bbe663fb update README with additional SUBTLEX support 2015-09-04 13:23:33 -04:00
Rob Speer
d9a1c34d00 expand list of sources and supported languages 2015-09-04 01:03:36 -04:00
Rob Speer
d94428d454 support Turkish and more Greek; document more 2015-09-04 00:57:04 -04:00
Rob Speer
e6a2886a66 add SUBTLEX to the readme 2015-09-03 18:56:56 -04:00
Rob Speer
00a2812907 fix heading 2015-08-28 17:49:38 -04:00
Rob Speer
93f44683c5 fix list formatting 2015-08-28 17:49:07 -04:00
Rob Speer
2370287539 improve README with function documentation and examples 2015-08-28 17:45:50 -04:00
Rob Speer
573dd1ec79 update the README 2015-08-25 17:44:34 -04:00
Joshua Chin
b0a9a2980f no use for use 2015-07-17 14:46:40 -04:00
Andrew Lin
9f8464c2d1 Document the version of Unicode used to build the regexes. 2015-07-08 18:48:33 -04:00
Rob Speer
0f4ca80026 add installation instructions to the readme 2015-05-28 14:02:12 -04:00
Rob Speer
611a6a35de update Japanese data; test Japanese and token combining 2015-05-28 14:01:56 -04:00