Commit Graph

74 Commits

Author SHA1 Message Date
Joshua Chin
21c809416d changed default to minimum for word_frequency
Former-commit-id: 9aa773aa2b
2015-07-07 15:03:26 -04:00
Joshua Chin
9c741bb341 updated tests
Former-commit-id: ca66a5f883
2015-07-07 14:13:28 -04:00
Robyn Speer
9615b9f843 test and document new twitter wordlists
Former-commit-id: 14cb408100
2015-07-01 17:53:38 -04:00
Robyn Speer
a9b9b2f080 update data using new build
Former-commit-id: f9a9ee7a82
2015-07-01 11:18:39 -04:00
Robyn Speer
4997d776b9 case-fold instead of just lowercasing tokens
Former-commit-id: 638467f600
2015-06-30 15:14:02 -04:00
Joshua Chin
fbd15947bb revert changes to test_not_really_random
Former-commit-id: bbf7b9de34
2015-06-30 11:29:14 -04:00
Joshua Chin
9b02abb5ea changed english test to take random ascii words
Former-commit-id: a49b66880e
2015-06-29 11:05:01 -04:00
Joshua Chin
d10109bb38 changed japanese test because the most common japanese ascii word keeps changing
Former-commit-id: 5ed03b006c
2015-06-29 11:04:19 -04:00
Joshua Chin
fa89956df3 Japanese people do not 'lol', they 'w'
Former-commit-id: 17f11ebd26
2015-06-29 11:01:13 -04:00
Joshua Chin
a0b7211451 updated tests for emoji splitting
Former-commit-id: 3bcb3e84a1
2015-06-25 11:25:51 -04:00
Robyn Speer
f3958d63ae Switch to a more precise centibel scale.
Former-commit-id: 7862a4d2b6
2015-06-22 17:36:30 -04:00
Joshua Chin
4706a38c7a updated test because the new tokenizer removes URLs
Former-commit-id: 35f472fcf9
2015-06-18 11:38:28 -04:00
Robyn Speer
860e929bf8 update Japanese data; test Japanese and token combining
Former-commit-id: 611a6a35de
2015-05-28 14:01:56 -04:00
Robyn Speer
4a865bfaec remove old tests
Former-commit-id: 410912d8f0
2015-05-21 20:36:09 -04:00
Robyn Speer
26517c1b86 tests for new wordfreq with full coverage
Former-commit-id: df863a5169
2015-05-21 20:34:17 -04:00
Robyn Speer
a06c3fc648 A different plan for the top-level word_frequency function.
When, before, I was importing wordfreq.query at the top level, this
created a dependency loop when installing wordfreq.

The new top-level __init__.py provides just a `word_frequency` function,
which imports the real function as needed and calls it. This should
avoid the dependency loop, at the cost of making
`wordfreq.word_frequency` slightly less efficient than
`wordfreq.query.word_frequency`.
Former-commit-id: 44ccf40742
2014-02-24 18:03:31 -05:00
Andrew Lin
181e8e08fa Remove the tests for metanl_word_frequency too. Doh.
Former-commit-id: 68d262791c
2013-11-11 13:21:25 -05:00
Robyn Speer
5f7c7e032c Clear wordlists before inserting them; yell at Python 2
Former-commit-id: 823b3828cd
2013-11-01 19:29:37 -04:00
Robyn Speer
5168da105a make the tests less picky about numerical exactness
Former-commit-id: 2b2bd943d2
2013-10-31 15:43:19 -04:00
Robyn Speer
773f6b9843 The metanl scale is not what I thought it was.
Former-commit-id: 0d2fb21726
2013-10-31 14:38:01 -04:00
Robyn Speer
101e767ad9 When strings are inconsistent between py2 and 3, don't test them on py2.
2013-10-31 13:11:13 -04:00
Robyn Speer
ea5de7cb2a Revise the build test to compare lengths of wordlists.
The test currently fails on Python 3, for some strange reason.
2013-10-30 13:22:56 -04:00
Robyn Speer
68f7b25cf7 Change default values to offsets.
2013-10-29 18:06:47 -04:00
Robyn Speer
8a48e57749 now this package has tests
2013-10-29 17:21:55 -04:00
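
Commit a06c3fc648 describes a top-level `word_frequency` wrapper that imports the real function only when it is called, so that importing the package does not trigger a dependency loop. The following is a minimal sketch of that pattern, not the code from the commit. The module name `fake_wordfreq_query`, the three-argument signature, and the toy frequency table are all assumptions made so the example runs without wordfreq installed; in the commit's description, the deferred import targets `wordfreq.query`.

```python
import importlib
import sys
import types

# Hypothetical stand-in for the submodule that must not be imported at
# package import time (the commit message names wordfreq.query).
# It is registered by hand so this sketch runs on its own.
_fake_query = types.ModuleType("fake_wordfreq_query")


def _real_word_frequency(word, lang, default=0.0):
    # Toy data, invented for illustration only.
    toy_frequencies = {("the", "en"): 0.05}
    return toy_frequencies.get((word, lang), default)


_fake_query.word_frequency = _real_word_frequency
sys.modules["fake_wordfreq_query"] = _fake_query


def word_frequency(word, lang, default=0.0):
    """Top-level wrapper: resolve the real function at call time."""
    # The import runs inside the function body, so importing the module
    # that defines this wrapper never pulls in the submodule.
    query = importlib.import_module("fake_wordfreq_query")
    return query.word_frequency(word, lang, default)
```

Calling `word_frequency("the", "en")` goes through the wrapper to the stand-in function. Each call repeats the module lookup, which matches the small efficiency cost the commit message mentions for `wordfreq.word_frequency` compared with `wordfreq.query.word_frequency`.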