Commit Graph

180 Commits

Author SHA1 Message Date
6dd97838b4 Added fix for when two restrictions are satisfied with the same word. 2020-10-19 15:40:43 +02:00
8c87d07b8a Scripts adapted to changes of new structures.xml format 2020-10-14 14:50:35 +02:00
09c4277ebe Modified error signal + Fixed no_stat 2020-10-09 20:13:37 +02:00
06435aa3a2 Added options for "modra" 2020-10-09 15:18:52 +02:00
1ea454f63c Added fix for punctuations 2020-10-08 18:31:50 +02:00
d5668c8b68 Moved wani.py + Added ignore of .zstd files for valency 2020-10-01 16:20:52 +02:00
412d0c0f62 Changing file structure 2020-09-17 14:17:40 +02:00
c19c95ad97 Renaming src to luscenje struktur 2020-09-17 14:02:56 +02:00
5bff3e370f Added setup.py 2020-09-17 13:09:20 +02:00
01b08667d2 Added some functions for compatibility with valency, fixed readme and fixed some minor bugs. 2020-09-10 15:06:09 +02:00
1b0e6a27eb Modified readme.md + Removed obligatory sloleks_db + Added frequency_limit and sorted parameters in recalculate_statistics.py 2020-09-02 10:53:45 +02:00
41952738ed Added support for valency 2020-09-01 13:35:22 +02:00
e38ff4c7b0 Added limit to minimum frequency = 10 + Ordered by frequency 2020-08-21 15:05:30 +02:00
edea80e6e0 Added script for file extension 2020-08-20 16:13:22 +02:00
e8fdbfdb6a Merge branch 'master' of https://gitea.cjvt.si/ozbolt/luscenje_struktur 2020-07-24 10:07:22 +02:00
49a8d5123e Quick fix for missing dispersions 2020-07-24 10:06:54 +02:00
8cf9083421 Removing results 2020-07-24 10:00:12 +02:00
23b062cc1b Adding issue992 fixes 2020-07-24 09:59:07 +02:00
f330a37764 Improved representations speed + Fixed bug in representations 2020-07-22 11:16:28 +02:00
4c84873ff5 Fixing for run.sh and adding run.sh 2020-07-20 17:36:44 +02:00
14951e8422 Added multi file reading 2020-07-20 15:52:01 +02:00
eb86a6bb1c Added collocation_sentence_map_dest 2020-07-20 10:51:09 +02:00
9a9d344510 Created new column "Joint_representative_form_variable" + Fixed collocation structures + Fixed bug with wrong lemma_fallback msds 2020-07-16 20:53:59 +02:00
de3e52c57c Changed output document to reflect most frequent word order 2020-07-10 13:43:52 +02:00
777791ad1e Added s/z, k/h + fixed bug 90 + connecting with sloleks on lemma_fallback 2020-07-08 19:23:56 +02:00
ozbolt ec113f9cd2 Merge branch 'sql-join-test' of ozbolt/luscenje_struktur into master 2020-03-02 19:12:37 +00:00
9e8cd2a2ec Issue #1000 2020-03-02 19:13:19 +01:00
1d4c0238a6 fixing how min_freq is used and more verbose writer 2019-11-06 02:39:26 +01:00
8fee3f8a8e Testing delayed insertions of representations 2019-09-11 08:58:02 +02:00
6bb3586051 Attempt at speed optimization with sql-join 2019-09-10 16:22:43 +02:00
4124036474 match_num now loaded from database and --keep-db deprecated in favour of --new-db (harder for me to fu*k up) 2019-09-09 15:29:15 +02:00
07242f74c8 Also remember representations step. 2019-09-06 14:55:36 +02:00
33528f1495 step_done now implemented in database.py 2019-08-21 12:57:42 +02:00
3ea62ed242 dispersions now loaded into database and stored/loaded. 2019-08-21 12:49:03 +02:00
dedc031696 Step recorded: generate_renders 2019-08-21 12:16:10 +02:00
046aef031f adding timeinfo 2019-08-21 11:13:23 +02:00
2018745d52 files loaded now in database 2019-08-21 11:12:38 +02:00
8cca761b91 min frequency now part of writer 2019-08-21 11:11:06 +02:00
3f1c154705 can now load csv files 2019-08-21 11:09:47 +02:00
d497749c78 better database committing 2019-08-21 11:08:08 +02:00
b25e3de76b adding total keyword to progress and total time spent 2019-07-03 14:54:23 +02:00
771547b7e4 progress for dispersions 2019-07-03 14:53:51 +02:00
f9bfac6430 If no output, then just commit stuff to database and exit. 2019-07-03 13:10:55 +02:00
ec02242f47 num-words now part of database 2019-07-03 13:08:32 +02:00
ea92b44d71 Removing parallel stuff 2019-07-03 13:06:59 +02:00
d771137dc7 removing pickled structures 2019-07-03 13:05:52 +02:00
a07d14011d simplifying progress, because I will remove the parallel stuff 2019-07-03 13:05:31 +02:00
577983427e Better error reporting in parsing syntactic structures 2019-07-01 17:22:30 +02:00
48795c6227 common msd now calculated per colocation id and not for whole corpus 2019-07-01 17:22:01 +02:00
2f789e6550 last agreement now confirms some matches even if not all matches are ok 2019-07-01 17:20:27 +02:00