865351b3f6
Turns out previous commit was OK. Proceeding with stats work
Ozbolt Menegatti
2019-06-09 23:00:19 +0200
c6440162b8
NOT WORKING in-between commit
Ozbolt Menegatti
2019-06-09 22:25:58 +0200
dff9643edf
Simplifying main writing stuff
Ozbolt Menegatti
2019-06-09 13:36:31 +0200
89f35f5259
handling writers for when we don't need outputs (no --all for example)
Ozbolt Menegatti
2019-06-09 13:36:07 +0200
5929004c44
now using new formatters, simplifies the code nicely
Ozbolt Menegatti
2019-06-09 13:35:19 +0200
111b088c6c
defining formatter for --output
Ozbolt Menegatti
2019-06-09 13:33:03 +0200
2a437b1703
Defining writer for --all
Ozbolt Menegatti
2019-06-09 13:27:24 +0200
96e61d2f64
Defining Formatter parent class for out/all/stats output files
Ozbolt Menegatti
2019-06-09 13:27:04 +0200
2387bd7cb7
Stats flag
Ozbolt Menegatti
2019-06-09 10:20:29 +0200
6a9ee516a3
EMPTY COMMIT - fixing some pylint warnings
Ozbolt Menegatti
2019-06-09 10:13:46 +0200
9117734b91
EMPTY COMMIT - assert statement vs function call
Ozbolt Menegatti
2019-06-08 15:43:53 +0200
46e169095c
EMPTY COMMIT - removing too long lines
Ozbolt Menegatti
2019-06-08 11:54:47 +0200
797060f619
EMPTY COMMIT - removing trailing whitespace
Ozbolt Menegatti
2019-06-08 11:42:57 +0200
3a22cd91c3
determining jppb (for 2 word statistics)
Ozbolt Menegatti
2019-06-08 11:31:52 +0200
30a5e80569
determine polnopomenska-beseda components in structure (for now only type='main')
Ozbolt Menegatti
2019-06-08 11:27:51 +0200
9ae7e1e9f6
Determine distinct matches for one colocation id.
Ozbolt Menegatti
2019-06-08 11:25:55 +0200
2773a8b9e9
Getters for number of lemmas and number of all words
Ozbolt Menegatti
2019-06-08 11:25:00 +0200
2167e4b6fe
Restrictions now always a list, removes/simplifies a bit of code
Ozbolt Menegatti
2019-06-08 11:23:50 +0200
d83d619dc0
removing old __str__ and __repr__ debugging code
Ozbolt Menegatti
2019-06-08 11:19:40 +0200
b2baedca52
determining dispersions
Ozbolt Menegatti
2019-06-08 11:18:49 +0200
57c0ff6f85
Removing prints from slimmer
Ozbolt Menegatti
2019-06-08 10:20:53 +0200
3263125898
Also need to check msd for agreements in the whole corpus.
Ozbolt Menegatti
2019-06-03 15:09:22 +0200
44d532808d
tqdm now optional
Ozbolt Menegatti
2019-06-03 09:47:36 +0200
ed27e549b7
Adding slimming script
Ozbolt Menegatti
2019-06-03 09:37:48 +0200
08c8050f3f
Removing old logging.debug calls, makes matching stuff much faster :)
Ozbolt Menegatti
2019-06-02 14:03:29 +0200
2c8a9f0ed0
Whitespace fixes
Ozbolt Menegatti
2019-06-02 13:51:32 +0200
460a55cb6c
Improving representation speed ~5%
Ozbolt Menegatti
2019-06-02 13:50:53 +0200
5f226d0cd4
fixing matching of agreements with msd
Ozbolt Menegatti
2019-06-02 12:53:16 +0200
5b9859af3e
Removing dead code
Ozbolt Menegatti
2019-06-02 12:50:43 +0200
44f0a6762e
Improving speed of matching ~40%
Ozbolt Menegatti
2019-06-02 12:50:04 +0200
fe4c95939f
Removing deprecated commented out code.
Ozbolt Menegatti
2019-06-01 10:40:44 +0200
ed83b2b9c4
implementing multiple agreements to one cid.
Ozbolt Menegatti
2019-06-01 10:36:28 +0200
0249ef1523
Correct order for wordform any/msd rendering
Ozbolt Menegatti
2019-06-01 10:35:51 +0200
119b85568f
actually not showing components without representation
Ozbolt Menegatti
2019-06-01 10:35:23 +0200
7d1bfbf73e
wordform all only lowercase
Ozbolt Menegatti
2019-06-01 10:33:02 +0200
ad7ba8c0b2
removing debugging/dead code
Ozbolt Menegatti
2019-06-01 10:31:29 +0200
09bd4f55ef
mor->more typo
Ozbolt Menegatti
2019-06-01 10:30:07 +0200
bfd4d4a747
Refactoring representations. Now muuuuch nicer code, not yet working though :)
Ozbolt Menegatti
2019-05-30 11:34:31 +0200
307007218d
Work to fix #757-104 and #757-89
Ozbolt Menegatti
2019-05-29 20:22:22 +0200
4c2b5f2b13
Updating for lemma representation of word_form. Also cleaning code, adding tqdm,...
Ozbolt Menegatti
2019-05-24 18:15:21 +0200
3c669c7901
looking for agreements from the whole corpus
Ozbolt Menegatti
2019-05-23 08:13:29 +0200
e99ba59908
lemma/msd representations now global! Need to also use for agreements
Ozbolt Menegatti
2019-05-22 11:55:51 +0200
d14efff709
Intermediate UGLY CODE commit. Working more on representations
Ozbolt Menegatti
2019-05-22 11:22:07 +0200
dce55d04a3
Does not yet work, agreements in representation
Ozbolt Menegatti
2019-05-20 18:14:11 +0200
5bd0b4a064
correct representation when rep_failed
Ozbolt Menegatti
2019-05-17 20:45:39 +0200
111512a901
no more structureselection enum
Ozbolt Menegatti
2019-05-17 20:45:10 +0200
d2f1e95a8f
continued work on representation, almost there...
Ozbolt Menegatti
2019-05-16 01:53:38 +0200
84a184c44d
I think this is the way to set representations, all info is available
Ozbolt Menegatti
2019-05-13 10:48:21 +0200
6eefd9c9f6
redid representation storage (as prev commit: to make it easier to use)
Ozbolt Menegatti
2019-05-13 09:52:29 +0200
19067e4135
Moving matches into colocation ids, now easier for representation
Ozbolt Menegatti
2019-05-13 08:35:55 +0200
87712128be
joint representation form
Ozbolt Menegatti
2019-05-13 00:26:00 +0200
401698409e
Implementing new output formats, all and normal, no more lemma_only and stuff
Ozbolt Menegatti
2019-05-12 23:00:38 +0200
b4b93022fe
Updating for new representations, for now only parsing
Ozbolt Menegatti
2019-05-12 22:13:22 +0200
de6c73980e
adding min-frequency option
Ozbolt Menegatti
2019-02-19 14:57:48 +0100
93d7af3aea
Reversed order sorting
Ozbolt Menegatti
2019-02-19 13:56:32 +0100
1c9ac7c867
Adding sorting
Ozbolt Menegatti
2019-02-19 11:29:40 +0100
8107a9f647
Adding parallel execution using subprocesses
Ozbolt Menegatti
2019-02-17 15:55:17 +0100
dec173ae33
Restructuring: words are now parsed right after loading each file, not after loading all of them. Should be easily parallelizable now
Ozbolt Menegatti
2019-02-14 14:33:15 +0100
f3fe981614
Adding few more lines to msd_translate
Ozbolt Menegatti
2019-02-14 14:30:12 +0100
658d8698f4
Adding two new lines into msd translate
Ozbolt Menegatti
2019-02-12 17:38:53 +0100
2f2bb91d0f
Supporting different xml:id variations
Ozbolt Menegatti
2019-02-12 17:38:32 +0100
31483c79ff
count-files for more verbose output added
Ozbolt Menegatti
2019-02-12 12:19:21 +0100
2d373ab477
Adding changeable pc tag (when it is c and not pc)
Ozbolt Menegatti
2019-02-12 12:08:30 +0100
c1e85255c7
msd of <pc> now always N
Ozbolt Menegatti
2019-02-12 11:59:51 +0100
40db51adf1
msd translate now optional
Ozbolt Menegatti
2019-02-12 11:58:04 +0100
f89212f7c9
Parsing files as they come instead of parsing all at once.
Ozbolt Menegatti
2019-02-12 11:41:35 +0100
25f3918170
Loading/Saving to temporary file
Ozbolt Menegatti
2019-02-09 13:40:57 +0100
518fe5e113
Multiple input files support
Ozbolt Menegatti
2019-02-09 13:25:26 +0100
b4e73e2d60
Implemented multiple output option
Ozbolt Menegatti
2019-02-07 10:19:36 +0100
8b47e2b317
lemma_only bug fixed and skip-check-id instead of check-id (opt out).
Ozbolt Menegatti
2019-02-06 15:46:02 +0100
5f7b5f969c
Check root ids is now skipped by default.
Ozbolt Menegatti
2019-02-06 15:33:33 +0100
27a60c439b
Working: using all new stuff
Ozbolt Menegatti
2019-02-06 15:29:37 +0100
916269e710
NW: Writer class implemented
Ozbolt Menegatti
2019-02-06 15:29:19 +0100
1298a45d0f
NW: ColocationIds class implemented
Ozbolt Menegatti
2019-02-06 15:29:03 +0100
5b75d6e4fa
Using argparse
Ozbolt Menegatti
2019-02-06 15:28:39 +0100
3dc69158b9
NW: switching print for logging
Ozbolt Menegatti
2019-02-06 15:26:09 +0100
f8103990a8
Link order added.
Ozbolt Menegatti
2019-02-04 11:01:30 +0100
4dc87ce953
Handling empty
Ozbolt Menegatti
2019-01-28 09:39:57 +0100
a9b6681576
removed restriction on number of rules
Ozbolt Menegatti
2019-01-28 08:50:32 +0100
bf433a3a19
Merge branch 'master' of gitea.cjvt.si:ozbolt/luscenje_struktur
Ozbolt Menegatti
2019-01-25 18:47:27 +0100
6d574f674f
removing pickle stuff for faster loading...
Ozbolt Menegatti
2019-01-25 18:44:41 +0100
40b6a07839
Add README
ozbolt
2019-01-25 11:03:28 +0000
6a221ae8fe
Fixes for msd length matching and pc matching
Ozbolt Menegatti
2019-01-25 11:58:40 +0100
cddeb9c4e4
accommodating for #773
Ozbolt Menegatti
2019-01-19 22:42:51 +0100
106db9394e
Removing getchildren() and adding root_words (don't know why yet, will remove if I don't remember)
Ozbolt Menegatti
2019-01-08 21:17:15 +0100
aeb2770966
files input as argv
Ozbolt Menegatti
2019-01-08 21:13:36 +0100
36d4a217f7
Catching modra links from root
Ozbolt Menegatti
2019-01-08 19:37:28 +0100
06d4217b0b
Moving from linkedlist of component to tree structure.
Ozbolt Menegatti
2018-10-30 13:33:08 +0100
319800e0ca
Links with | now parsed
Ozbolt Menegatti
2018-10-29 12:43:07 +0100
52e6fc92c6
Two fixes, "10-1"-like structures and restriction_or
Ozbolt Menegatti
2018-10-29 12:16:42 +0100
74a1e4834b
First commit
Ozbolt Menegatti
2018-10-29 11:29:51 +0100
4604ac1878
Just gitignore
Ozbolt Menegatti
2018-10-29 11:29:32 +0100