Commit Graph

18 Commits

SHA1 Message Date
c25844a335 adding separate database class 2019-06-27 12:37:23 +02:00
fa8a5e55f8 Merge branch 'sqlite' 2019-06-27 11:45:20 +02:00
c2c2ce7ff8 making sorted words sorted a bit more non-randomly. 2019-06-27 11:44:02 +02:00
8b06c4ec38 Skipping already used available words, stupid refactoring bug 2019-06-27 00:57:46 +02:00
11706b6f81 word stats on sqlite now, not yet really working. 2019-06-27 00:37:47 +02:00
1256a4de40 Fixing loading bad gz files and progress showing 2019-06-26 13:06:43 +02:00
049f5ca3dc Adding new N* msds 2019-06-26 12:47:02 +02:00
cfdb36b894 Adding ability to load gz files. 2019-06-17 20:41:11 +02:00
d2f6f8dac8 adding new Nw msd 2019-06-17 20:39:07 +02:00
70b05e8637 New progress bar 2019-06-17 17:30:51 +02:00
3552f14b81 Loader to its own module 2019-06-17 15:38:55 +02:00
51cf3e7064 Improving debugging output 2019-06-16 01:32:31 +02:00
dc285ce265 Saving memory in word-stats 2019-06-16 01:31:40 +02:00
37acabc076 able to load pickled structures 2019-06-16 01:31:14 +02:00
f0109771aa chunk size now handled in file-sentence-generator 2019-06-16 00:59:44 +02:00
0d8aeb2282 load_files now returns a generator of sentences, not a generator of the whole file. This makes it much slower, but more adaptable for huge files. 2019-06-15 22:30:43 +02:00
a8183cf507 word stats now collected more memory-efficiently 2019-06-15 22:20:20 +02:00
90dbbca5d5 HUGE refactor, creating lots of modules, no code changes though! 2019-06-15 18:55:35 +02:00