luscenje_struktur

Author	SHA1	Message	Date
lkrsnik	eb86a6bb1c	Added collocation_sentence_map_dest	2020-07-20 10:51:09 +02:00
lkrsnik	9a9d344510	Created new column "Joint_representative_form_variable" + Fixed collocation structures + Fixed bug with wrong lemma_fallback msds	2020-07-16 20:53:59 +02:00
lkrsnik	de3e52c57c	Changed output document to reflect most frequent word order	2020-07-10 13:43:52 +02:00
lkrsnik	777791ad1e	Added s/z, k/h + fixed bug 90 + connecting with sloleks on lemma_fallback	2020-07-08 19:23:56 +02:00
ozbolt	ec113f9cd2	Merge branch 'sql-join-test' of ozbolt/luscenje_struktur into master OK	2020-03-02 19:12:37 +00:00
ozbolt	9e8cd2a2ec	Issue #1000	2020-03-02 19:13:19 +01:00
ozbolt	1d4c0238a6	fixing how min_freq is used and more verbose writer	2019-11-06 02:39:26 +01:00
ozbolt	8fee3f8a8e	Testing delayed insertions of representations	2019-09-11 08:58:02 +02:00
ozbolt	6bb3586051	Attempt at speed optimization with sql-join	2019-09-10 16:22:43 +02:00
ozbolt	4124036474	match_num now loaded from database and --keep-db deprecated in favour of --new-db (harder for me to fu*k up)	2019-09-09 15:29:15 +02:00
ozbolt	07242f74c8	Also remember representations step.	2019-09-06 14:55:36 +02:00
ozbolt	33528f1495	step_done now implemented in database.py	2019-08-21 12:57:42 +02:00
ozbolt	3ea62ed242	dispersions now loaded into database and stored/loaded.	2019-08-21 12:49:03 +02:00
ozbolt	dedc031696	Step recorded: generate_renders	2019-08-21 12:16:10 +02:00
ozbolt	046aef031f	adding timeinfo	2019-08-21 11:13:23 +02:00
ozbolt	2018745d52	files loaded now in database	2019-08-21 11:12:38 +02:00
ozbolt	8cca761b91	min frequency now part of writer	2019-08-21 11:11:06 +02:00
ozbolt	3f1c154705	can now load csv files	2019-08-21 11:09:47 +02:00
ozbolt	d497749c78	better database committing	2019-08-21 11:08:08 +02:00
ozbolt	b25e3de76b	adding total keyword to progress and total time spent	2019-07-03 14:54:23 +02:00
ozbolt	771547b7e4	progress for dispersions	2019-07-03 14:53:51 +02:00
ozbolt	f9bfac6430	If no output, then just commit stuff to database and exit.	2019-07-03 13:10:55 +02:00
ozbolt	ec02242f47	num-words now part of database	2019-07-03 13:08:32 +02:00
ozbolt	ea92b44d71	Removing parallel stuff	2019-07-03 13:06:59 +02:00
ozbolt	d771137dc7	removing pickled structures	2019-07-03 13:05:52 +02:00
ozbolt	a07d14011d	simplifying progress, because I will remove the parallel stuff	2019-07-03 13:05:31 +02:00
ozbolt	577983427e	Better error reporting in parsing syntactic structures	2019-07-01 17:22:30 +02:00
ozbolt	48795c6227	common msd now calculated per collocation id and not for whole corpus	2019-07-01 17:22:01 +02:00
ozbolt	2f789e6550	last agreement now confirms some matches even if not all matches are ok	2019-07-01 17:20:27 +02:00
ozbolt	1401b82324	Adding msd to out formatter	2019-07-01 17:18:25 +02:00
ozbolt	47340fe80c	common msd now based on (lemma,msd0) not only lemma #757-127	2019-06-28 22:00:38 +02:00
ozbolt	8c20295adf	Adding dispersions to sqlite, finished moving to it.	2019-06-27 22:04:33 +02:00
ozbolt	b5e281bdf4	adding indexes for speed and set_representations via database	2019-06-27 17:16:27 +02:00
ozbolt	188763c06a	Incorporating database also in MatchStore	2019-06-27 16:51:58 +02:00
ozbolt	c25844a335	adding separate database class	2019-06-27 12:37:23 +02:00
ozbolt	fa8a5e55f8	Merge branch 'sqlite'	2019-06-27 11:45:20 +02:00
ozbolt	c2c2ce7ff8	making sorted words sorted a bit more non-randomly.	2019-06-27 11:44:02 +02:00
ozbolt	8b06c4ec38	Skipping already used available words, stupid refactoring bug	2019-06-27 00:57:46 +02:00
ozbolt	11706b6f81	word stats on sqlite now, not yet really working.	2019-06-27 00:37:47 +02:00
ozbolt	1256a4de40	Fixing loading bad gz files and progress showing	2019-06-26 13:06:43 +02:00
ozbolt	049f5ca3dc	Adding new N* msds	2019-06-26 12:47:02 +02:00
ozbolt	cfdb36b894	Adding ability to load gz files.	2019-06-17 20:41:11 +02:00
ozbolt	d2f6f8dac8	adding new Nw msd	2019-06-17 20:39:07 +02:00
ozbolt	70b05e8637	New progress bar	2019-06-17 17:30:51 +02:00
ozbolt	3552f14b81	Loader to its own module	2019-06-17 15:38:55 +02:00
ozbolt	51cf3e7064	Improving debugging output	2019-06-16 01:32:31 +02:00
ozbolt	dc285ce265	Saving memory in word-stats	2019-06-16 01:31:40 +02:00
ozbolt	37acabc076	able to load pickled structures	2019-06-16 01:31:14 +02:00
ozbolt	f0109771aa	chunk size now handled in file-sentence-generator	2019-06-16 00:59:44 +02:00
ozbolt	0d8aeb2282	load_files now returns a generator of sentences, not a generator of the whole file. This makes it much slower, but more adaptable for huge files.	2019-06-15 22:30:43 +02:00

159 Commits