|
4c2b5f2b13
|
Updating for lemma representation of word_form. Also cleaning code, adding tqdm,...
|
2019-05-24 18:15:21 +02:00 |
|
|
3c669c7901
|
looking for agreements from the whole corpus
|
2019-05-23 08:13:29 +02:00 |
|
|
e99ba59908
|
lemma/msd representations now global! Need to also use for agreements
|
2019-05-22 11:55:51 +02:00 |
|
|
d14efff709
|
Intermediate UGLY CODE commit. Working more on representations
|
2019-05-22 11:22:07 +02:00 |
|
|
dce55d04a3
|
Does not yet work, agreements in representation
|
2019-05-20 18:14:11 +02:00 |
|
|
5bd0b4a064
|
correct representation when rep_failed
|
2019-05-17 20:45:39 +02:00 |
|
|
111512a901
|
no more structureselection enum
|
2019-05-17 20:45:10 +02:00 |
|
|
d2f1e95a8f
|
continued work on representation, almost there...
|
2019-05-16 01:53:38 +02:00 |
|
|
84a184c44d
|
I think this is the way to set representations, all info is available
... just have to actually use it
|
2019-05-13 10:48:21 +02:00 |
|
|
6eefd9c9f6
|
redid representation storage (as prev commit: to make it easier to use)
find_next does not collect representations, no separate
class to parse representation features,
|
2019-05-13 09:52:29 +02:00 |
|
|
19067e4135
|
Moving matches into colocation ids, now easier for representation
|
2019-05-13 08:35:55 +02:00 |
|
|
87712128be
|
joint representation form
|
2019-05-13 00:26:00 +02:00 |
|
|
401698409e
|
Implementing new output formats, all and normal, no more lemma_only and stuff
Still need to implement representation in normal form.
|
2019-05-12 23:00:38 +02:00 |
|
|
b4b93022fe
|
Updating for new representations, for now only parsing
|
2019-05-12 22:13:22 +02:00 |
|
|
de6c73980e
|
adding min-frequency option
|
2019-02-19 15:04:44 +01:00 |
|
|
93d7af3aea
|
Reversed order sorting
|
2019-02-19 13:56:32 +01:00 |
|
|
1c9ac7c867
|
Adding sorting
|
2019-02-19 11:29:40 +01:00 |
|
|
8107a9f647
|
Adding parallel execution using subprocesses
|
2019-02-17 16:01:03 +01:00 |
|
|
dec173ae33
|
Restructuring: words are now parsed right after loading one file, not after loading all of them. Should be easily parallelizable now
|
2019-02-14 14:33:15 +01:00 |
|
|
2f2bb91d0f
|
Supporting different xml:id variations
|
2019-02-12 17:38:32 +01:00 |
|
|
31483c79ff
|
count-files for more verbose output added
|
2019-02-12 12:19:21 +01:00 |
|
|
2d373ab477
|
Adding changeable pc tag (when it is c and not pc)
|
2019-02-12 12:08:30 +01:00 |
|
|
c1e85255c7
|
msd of <pc> now always N
|
2019-02-12 11:59:51 +01:00 |
|
|
40db51adf1
|
msd translate now optional
|
2019-02-12 11:58:04 +01:00 |
|
|
f89212f7c9
|
Parsing files as they come instead of parsing all at once.
Thus removed temporary load/save stuff
|
2019-02-12 11:41:35 +01:00 |
|
|
25f3918170
|
Loading/Saving to temporary file
|
2019-02-09 13:40:57 +01:00 |
|
|
518fe5e113
|
Multiple input files support
|
2019-02-09 13:25:26 +01:00 |
|
|
b4e73e2d60
|
Implemented multiple output option
|
2019-02-07 10:19:36 +01:00 |
|
|
8b47e2b317
|
lemma_only bug fixed and skip-check-id instead of check-id (opt out).
|
2019-02-06 15:46:02 +01:00 |
|
|
5f7b5f969c
|
Check root ids is now skipped by default.
|
2019-02-06 15:33:33 +01:00 |
|
|
27a60c439b
|
Working: using all new stuff
|
2019-02-06 15:29:37 +01:00 |
|
|
916269e710
|
NW: Writer class implemented
|
2019-02-06 15:29:19 +01:00 |
|
|
1298a45d0f
|
NW: ColocationIds class implemented
|
2019-02-06 15:29:03 +01:00 |
|
|
5b75d6e4fa
|
Using argparse
|
2019-02-06 15:28:39 +01:00 |
|
|
3dc69158b9
|
NW: switching print for logging
|
2019-02-06 15:26:09 +01:00 |
|
|
f8103990a8
|
Link order added.
|
2019-02-04 11:01:30 +01:00 |
|
|
4dc87ce953
|
Handling empty
|
2019-01-28 09:39:57 +01:00 |
|
|
a9b6681576
|
removed restriction on number of rules
|
2019-01-28 08:50:32 +01:00 |
|
|
6d574f674f
|
removing pickle stuff for faster loading...
|
2019-01-25 18:44:41 +01:00 |
|
|
6a221ae8fe
|
Fixes for msd length matching and pc matching
Also some cleanup and fix output formatting
|
2019-01-25 11:58:40 +01:00 |
|
|
cddeb9c4e4
|
accommodating for #773
|
2019-01-19 22:42:51 +01:00 |
|
|
106db9394e
|
Removing getchildren() and adding root_words (don't know why yet, will remove if I don't remember)
|
2019-01-08 21:17:35 +01:00 |
|
|
aeb2770966
|
files input as argv
|
2019-01-08 21:13:36 +01:00 |
|
|
36d4a217f7
|
Catching modra links from root
|
2019-01-08 19:37:28 +01:00 |
|
|
06d4217b0b
|
Moving from linkedlist of component to tree structure.
|
2018-10-30 13:33:08 +01:00 |
|
|
319800e0ca
|
Links with | now parsed
|
2018-10-29 12:43:07 +01:00 |
|
|
52e6fc92c6
|
Two fixes, "10-1"-like structures and restriction_or
|
2018-10-29 12:16:42 +01:00 |
|
|
74a1e4834b
|
First commit
|
2018-10-29 11:29:51 +01:00 |
|