Ozbolt Menegatti
0249ef1523
Correct order for wordform any/msd rendering
(most frequent first)
5 years ago
Ozbolt Menegatti
119b85568f
actually not showing components without representation
5 years ago
Ozbolt Menegatti
7d1bfbf73e
wordform all only lowercase
5 years ago
Ozbolt Menegatti
ad7ba8c0b2
removing debugging/dead code
5 years ago
Ozbolt Menegatti
09bd4f55ef
mor->more typo
5 years ago
Ozbolt Menegatti
bfd4d4a747
Refactoring representations. Now muuuuch nicer code, not yet working though :)
Added: multiple representations per component id
5 years ago
Ozbolt Menegatti
307007218d
Work to fix #757-104 and #757-89
for word_form all, now removing duplicates
for word_form msd, now word_forms from the collocation, not from whole corpus
determining more specific msd for agreements, so that it gets a better match when using backup-lemma representation
for agreements, now ordered by the collocation's own number of occurrences, not global
removed a bit of debug code
5 years ago
Ozbolt Menegatti
4c2b5f2b13
Updating for lemma representation of word_form. Also cleaning code, adding tqdm,...
5 years ago
Ozbolt Menegatti
3c669c7901
looking for agreements from the whole corpus
5 years ago
Ozbolt Menegatti
e99ba59908
lemma/msd representations now global! Need to also use for agreements
5 years ago
Ozbolt Menegatti
d14efff709
Intermediate UGLY CODE commit. Working more on representations
5 years ago
Ozbolt Menegatti
dce55d04a3
Does not yet work, agreements in representation
5 years ago
Ozbolt Menegatti
5bd0b4a064
correct representation when rep_failed
5 years ago
Ozbolt Menegatti
111512a901
no more structureselection enum
5 years ago
Ozbolt Menegatti
d2f1e95a8f
continued work on representation, almost there...
5 years ago
Ozbolt Menegatti
84a184c44d
I think this is the way to set representations, all info is available
... just have to actually use it
5 years ago
Ozbolt Menegatti
6eefd9c9f6
redid representation storage (as prev commit: to make it easier to use)
find_next does not collect representations, no separate
class to parse representation features,
5 years ago
Ozbolt Menegatti
19067e4135
Moving matches into colocation ids, now easier for representation
5 years ago
Ozbolt Menegatti
87712128be
joint representation form
5 years ago
Ozbolt Menegatti
401698409e
Implementing new output formats, all and normal, no more lemma_only and stuff
Still need to implement representation in normal form.
5 years ago
Ozbolt Menegatti
b4b93022fe
Updating for new representations, for now only parsing
5 years ago
Ozbolt Menegatti
de6c73980e
adding min-frequency option
5 years ago
Ozbolt Menegatti
93d7af3aea
Reversed order sorting
5 years ago
Ozbolt Menegatti
1c9ac7c867
Adding sorting
5 years ago
Ozbolt Menegatti
8107a9f647
Adding parallel execution using subprocesses
5 years ago
Ozbolt Menegatti
dec173ae33
Restructuring, now words are parsed right after loading one file, not after loading all of them. Should be easily parallelizable now
5 years ago
Ozbolt Menegatti
2f2bb91d0f
Supporting different xml:id variations
5 years ago
Ozbolt Menegatti
31483c79ff
count-files for more verbose output added
5 years ago
Ozbolt Menegatti
2d373ab477
Adding changeable pc tag (when it is c and not pc)
5 years ago
Ozbolt Menegatti
c1e85255c7
msd of <pc> now always N
5 years ago
Ozbolt Menegatti
40db51adf1
msd translate now optional
5 years ago
Ozbolt Menegatti
f89212f7c9
Parsing files as they come instead of parsing all at once.
Thus removed temporary load/save stuff
5 years ago
Ozbolt Menegatti
25f3918170
Loading/Saving to temporary file
5 years ago
Ozbolt Menegatti
518fe5e113
Multiple input files support
5 years ago
Ozbolt Menegatti
b4e73e2d60
Implemented multiple output option
5 years ago
Ozbolt Menegatti
8b47e2b317
lemma_only bug fixed and skip-check-id instead of check-id (opt out).
5 years ago
Ozbolt Menegatti
5f7b5f969c
Check root ids is now skipped by default.
5 years ago
Ozbolt Menegatti
27a60c439b
Working: using all new stuff
5 years ago
Ozbolt Menegatti
916269e710
NW: Writer class implemented
5 years ago
Ozbolt Menegatti
1298a45d0f
NW: ColocationIds class implemented
5 years ago
Ozbolt Menegatti
5b75d6e4fa
Using argparse
5 years ago
Ozbolt Menegatti
3dc69158b9
NW: switching print for logging
5 years ago
Ozbolt Menegatti
f8103990a8
Link order added.
5 years ago
Ozbolt Menegatti
4dc87ce953
Handling empty
5 years ago
Ozbolt Menegatti
a9b6681576
removed restriction on number of rules
5 years ago
Ozbolt Menegatti
6d574f674f
removing pickle stuff for faster loading...
5 years ago
Ozbolt Menegatti
6a221ae8fe
Fixes for msd length matching and pc matching
Also some cleanup and fix output formatting
5 years ago
Ozbolt Menegatti
cddeb9c4e4
accommodating for #773
5 years ago
Ozbolt Menegatti
106db9394e
Removing getchildren() and adding root_words (don't know why yet, will remove if I don't remember)
5 years ago
Ozbolt Menegatti
aeb2770966
files input as argv
5 years ago
Ozbolt Menegatti
36d4a217f7
Catching modra links from root
5 years ago
Ozbolt Menegatti
06d4217b0b
Moving from linkedlist of component to tree structure.
6 years ago
Ozbolt Menegatti
319800e0ca
Links with | now parsed
6 years ago
Ozbolt Menegatti
52e6fc92c6
Two fixes, "10-1"-like structures and restriction_or
6 years ago
Ozbolt Menegatti
74a1e4834b
First commit
6 years ago