Rob Speer
6ab72201cd
add twitter-stems-2014 wordlist data
2015-02-11 13:29:32 -05:00
Rob Speer
fcd6044c2d
add utility for combining wordlists
2015-02-11 11:45:10 -05:00
Rob Speer
d3374a9fe1
command-line entry points
2015-02-10 12:28:29 -05:00
Rob Speer
693c35476f
Initial commit
2015-02-04 20:19:36 -05:00
Rob Speer
bf0071fd8b
Allow multithreaded SQLite on Python 3
2014-10-02 18:10:09 -04:00
Rob Speer
6d90cef415
construct the download path correctly, even on Windows
2014-09-08 10:56:48 -04:00
Rob Speer
c55a701885
remove unused global
2014-09-02 14:29:31 -04:00
Rob Speer
5dee417302
cleanups to building and uploading, from code review
2014-08-18 14:14:01 -04:00
Rob Speer
cb7b2b76e6
Add license text for the whole package
2014-06-02 16:37:32 -04:00
Rob Speer
44ccf40742
A different plan for the top-level word_frequency function.
When, before, I was importing wordfreq.query at the top level, this
created a dependency loop when installing wordfreq.
The new top-level __init__.py provides just a `word_frequency` function,
which imports the real function as needed and calls it. This should
avoid the dependency loop, at the cost of making
`wordfreq.word_frequency` slightly less efficient than
`wordfreq.query.word_frequency`.
2014-02-24 18:03:31 -05:00
Rob Speer
3702a7c8d0
version 0.4: minor code changes, debugged database
- The database is built under Python 3.3.2, so it should correctly
  implement Python 3's Unicode tricks, including special handling
  of Greek lowercase letters. (Version 0.3 was supposed to do this
  as well, but apparently, it didn't.)
- `word_frequency` and `iter_wordlist` can be imported from the
  top level.
- The new function `random_words` supplies a string made from
  random words that are sufficiently high in rank order.
2014-02-24 16:29:06 -05:00
Rob Speer
3447ae732e
Sometimes you need some random words.
2014-01-06 15:51:10 -05:00
Andrew Lin
68d262791c
Remove the tests for metanl_word_frequency too. Doh.
2013-11-11 13:21:25 -05:00
Rob Speer
63bebe6ad3
Merge pull request #1 from LuminosoInsight/remove_metanl_wf
Remove metanl_word_frequency(), which we no longer need.
2013-11-11 10:13:25 -08:00
Rob Speer
56f2c606f1
data is now hosted on wordfreq.services.luminoso.com
2013-11-07 14:43:15 -05:00
Andrew Lin
76a7267670
Remove metanl_word_frequency(), which we no longer need.
2013-11-04 16:51:25 -05:00
Rob Speer
823b3828cd
Clear wordlists before inserting them; yell at Python 2
2013-11-01 19:29:37 -04:00
Rob Speer
5c8ba34492
Revert "code review and pep8 fixes"
This reverts commit b4b8ba8be7.
Conflicts:
wordfreq/transfer.py
2013-11-01 17:33:39 -04:00
Rob Speer
90e042f196
Merge branch 'master' of github.com:LuminosoInsight/wordfreq
Conflicts:
wordfreq/transfer.py
2013-11-01 17:05:59 -04:00
Rob Speer
b4b8ba8be7
code review and pep8 fixes
2013-11-01 17:05:12 -04:00
Lance Nathan
ea29469643
Two small stylistic tweaks
2013-10-31 16:00:48 -04:00
Rob Speer
2b2bd943d2
make the tests less picky about numerical exactness
2013-10-31 15:43:19 -04:00
Rob Speer
90772e33fb
try to match the wordlist metanl actually uses
2013-10-31 15:13:22 -04:00
Rob Speer
0d2fb21726
The metanl scale is not what I thought it was.
2013-10-31 14:38:01 -04:00
Rob Speer
e931062b5a
Don't download the DB if the right version is already there
2013-10-31 14:12:04 -04:00
Rob Speer
c1564908f2
try being really nonspecific about functools32 versions
2013-10-31 14:06:06 -04:00
Rob Speer
2542cf9e35
be less specific about the functools32 version
2013-10-31 14:02:40 -04:00
Rob Speer
26c0d7dd28
Add wordfreq_data files.
Now the build process is repeatable from scratch, even if something goes
wrong with the download server.
2013-10-31 13:39:02 -04:00
Rob Speer
2cf812a64e
When strings are inconsistent between py2 and 3, don't test them on py2.
2013-10-31 13:11:13 -04:00
Rob Speer
10115f3965
add util.py, which provides standardize_word
2013-10-30 18:14:43 -04:00
Rob Speer
2f7572e3fc
and of course this changes the metanl constant
2013-10-30 18:14:34 -04:00
Rob Speer
8ef11fd33c
Turns out we need to change the metanl constant after normalizing words.
2013-10-30 16:58:10 -04:00
Rob Speer
40102a3f63
Normalize words when storing them or looking them up.
2013-10-30 14:59:57 -04:00
Rob Speer
3063b3915a
Revise the build test to compare lengths of wordlists.
The test currently fails on Python 3, for some strange reason.
2013-10-30 13:22:56 -04:00
Lance
ce07c881c5
Another Py3 change, this one for functools32
2013-10-30 12:06:41 -04:00
Lance
357cbb531e
Py3 tweak to urllib import
2013-10-30 11:57:50 -04:00
Rob Speer
be183b2564
Change default values to offsets.
2013-10-29 18:06:47 -04:00
Rob Speer
2907f7f077
now this package has tests
2013-10-29 17:21:55 -04:00
Rob Speer
ca5b3e2f5d
Implement the data uploady downloady stuff in setup.
2013-10-29 16:44:13 -04:00
Rob Speer
793893e738
Deal with database connections more consistently
2013-10-29 16:43:58 -04:00
Rob Speer
c475415f74
Add a couple of useful statistics about wordlists
2013-10-29 16:42:38 -04:00
Rob Speer
79d6c56fed
add query.iter_wordlist, to visit all words in a list
2013-10-29 12:44:16 -04:00
Rob Speer
bc00bb3a8b
prepare to write custom commands in setup.py
2013-10-29 12:43:41 -04:00
Rob Speer
e2510d802d
revise config.py, clarify some of query.py
2013-10-29 12:18:38 -04:00
Rob Speer
89ab7204fd
better default parameters and better log messages in building
2013-10-29 12:04:17 -04:00
Rob Speer
709ca6be66
Initial version.
Noticeably missing: data files or any way to get them.
2013-10-28 19:26:44 -04:00