Commit Graph

25 Commits

SHA1 Message Date
f330a37764 Improved representations speed + Fixed bug in representations 2020-07-22 11:16:28 +02:00
4c84873ff5 Fixing for run.sh and adding run.sh 2020-07-20 17:36:44 +02:00
eb86a6bb1c Added collocation_sentence_map_dest 2020-07-20 10:51:09 +02:00
9a9d344510 Created new column "Joint_representative_form_variable" + Fixed collocation structures + Fixed bug with wrong lemma_fallback msds 2020-07-16 20:53:59 +02:00
de3e52c57c Changed output document to reflect most frequent word order 2020-07-10 13:43:52 +02:00
777791ad1e Added s/z, k/h + fixed bug 90 + connecting with sloleks on lemma_fallback 2020-07-08 19:23:56 +02:00
4124036474 match_num now loaded from database and --keep-db deprecated in favour of --new-db (harder for me to fu*k up) 2019-09-09 15:29:15 +02:00
046aef031f adding timeinfo 2019-08-21 11:13:23 +02:00
2018745d52 files loaded now in database 2019-08-21 11:12:38 +02:00
d497749c78 better database committing 2019-08-21 11:08:08 +02:00
f9bfac6430 If no output, then just commit stuff to database and exit. 2019-07-03 13:10:55 +02:00
ea92b44d71 Removing parallel stuff 2019-07-03 13:06:59 +02:00
d771137dc7 removing pickled structures 2019-07-03 13:05:52 +02:00
a07d14011d simplifying progress, because I will remove the parallel stuff 2019-07-03 13:05:31 +02:00
b5e281bdf4 adding indexes for speed and set_representations via database 2019-06-27 17:16:27 +02:00
188763c06a Incorporating database also in MatchStore 2019-06-27 16:51:58 +02:00
c25844a335 adding separate database class 2019-06-27 12:37:23 +02:00
1256a4de40 Fixing loading bad gz files and progress showing 2019-06-26 13:06:43 +02:00
70b05e8637 New progress bar 2019-06-17 17:30:51 +02:00
3552f14b81 Loader to its own module 2019-06-17 15:38:55 +02:00
51cf3e7064 Improving debugging output 2019-06-16 01:32:31 +02:00
37acabc076 able to load pickled structures 2019-06-16 01:31:14 +02:00
f0109771aa chunk size now handled in file-sentence-generator 2019-06-16 00:59:44 +02:00
0d8aeb2282 load_files now returns a generator of sentences, not a generator of the whole file. This makes it much slower, but more adaptable for huge files. 2019-06-15 22:30:43 +02:00
90dbbca5d5 HUGE refactor, creating lots of modules, no code changes though! 2019-06-15 18:55:35 +02:00