Commit Graph

  • 865351b3f6 Turns out previous commit was OK. Proceeding with stats work ozbolt 2019-06-09 23:00:19 +02:00
  • c6440162b8 NOT WORKING inbetween commit ozbolt 2019-06-09 22:25:58 +02:00
  • dff9643edf Simplifying main writing stuff ozbolt 2019-06-09 13:36:31 +02:00
  • 89f35f5259 handling writers for when we dont need outputs (no --all for example) ozbolt 2019-06-09 13:36:07 +02:00
  • 5929004c44 now using new formatters, simplifies the code nicely ozbolt 2019-06-09 13:35:19 +02:00
  • 111b088c6c defining formatter for --output ozbolt 2019-06-09 13:33:03 +02:00
  • 2a437b1703 Defining writer for --all ozbolt 2019-06-09 13:27:24 +02:00
  • 96e61d2f64 Defining Formatter parent class for out/all/stats output files ozbolt 2019-06-09 13:27:04 +02:00
  • 2387bd7cb7 Stats flag ozbolt 2019-06-09 10:20:29 +02:00
  • 6a9ee516a3 EMPTY COMMIT - fixing some pylint warnings ozbolt 2019-06-09 10:13:46 +02:00
  • 9117734b91 EMPTY COMMIT - assert statement vs function call ozbolt 2019-06-08 15:43:53 +02:00
  • 46e169095c EMPTY COMMIT - removing too long lines ozbolt 2019-06-08 11:54:47 +02:00
  • 797060f619 EMPTY COMMIT - removing trailing whitespace ozbolt 2019-06-08 11:42:57 +02:00
  • 3a22cd91c3 determining jppb (for 2 word statistics) ozbolt 2019-06-08 11:31:52 +02:00
  • 30a5e80569 determine polnopomenska-beseda components in structure (for now only type='main') ozbolt 2019-06-08 11:27:51 +02:00
  • 9ae7e1e9f6 Determine distinct matches for one colocation id. ozbolt 2019-06-08 11:25:55 +02:00
  • 2773a8b9e9 Getters for number of lemmas and number of all words ozbolt 2019-06-08 11:25:00 +02:00
  • 2167e4b6fe Restrictions now always a list, removes/simplifies a bit of code ozbolt 2019-06-08 11:23:50 +02:00
  • d83d619dc0 removing old __str__ and __repr__ debugging code ozbolt 2019-06-08 11:19:40 +02:00
  • b2baedca52 determining dispersions ozbolt 2019-06-08 11:18:49 +02:00
  • 57c0ff6f85 Removing prints from slimmer ozbolt 2019-06-08 10:20:53 +02:00
  • 3263125898 Also need to check msd for agreements in the whole corpus. ozbolt 2019-06-03 15:09:22 +02:00
  • 44d532808d tqdm now optional ozbolt 2019-06-03 09:47:36 +02:00
  • ed27e549b7 Adding slimming script ozbolt 2019-06-03 09:37:48 +02:00
  • 08c8050f3f Removing old logging.debug calls, makes matching stuff much faster :) ozbolt 2019-06-02 14:03:29 +02:00
  • 2c8a9f0ed0 Whitespace fixes ozbolt 2019-06-02 13:51:32 +02:00
  • 460a55cb6c Improving representation speed ~5% ozbolt 2019-06-02 13:50:53 +02:00
  • 5f226d0cd4 fixing matching of agreements with msd ozbolt 2019-06-02 12:53:16 +02:00
  • 5b9859af3e Removing dead code ozbolt 2019-06-02 12:50:43 +02:00
  • 44f0a6762e Improving speed of matching ~40% ozbolt 2019-06-02 12:50:04 +02:00
  • fe4c95939f Removing deprecated commented out code. ozbolt 2019-06-01 10:40:44 +02:00
  • ed83b2b9c4 implementing multiple agreements to one cid. ozbolt 2019-06-01 10:36:28 +02:00
  • 0249ef1523 Correct order for wordform any/msd rendering ozbolt 2019-06-01 10:35:51 +02:00
  • 119b85568f actually not showing components without representation ozbolt 2019-06-01 10:35:23 +02:00
  • 7d1bfbf73e wordform all only lowercase ozbolt 2019-06-01 10:33:02 +02:00
  • ad7ba8c0b2 removing debugging/dead code ozbolt 2019-06-01 10:31:29 +02:00
  • 09bd4f55ef mor->more typo ozbolt 2019-06-01 10:30:07 +02:00
  • bfd4d4a747 Refactoring representations. Now muuuuch nicer code, not yet working though :) ozbolt 2019-05-30 11:34:31 +02:00
  • 307007218d Work to fix #757-104 and #757-89 ozbolt 2019-05-29 20:22:22 +02:00
  • 4c2b5f2b13 Updating for lemma representation of word_form. Also cleaning code, adding tqdm,... ozbolt 2019-05-24 18:15:21 +02:00
  • 3c669c7901 looking for agreements from the whole corpus ozbolt 2019-05-23 08:13:29 +02:00
  • e99ba59908 lemma/msd representations now global! Need to also use for agreements ozbolt 2019-05-22 11:55:51 +02:00
  • d14efff709 Intermediate UGLY CODE commit. Working more on representations ozbolt 2019-05-22 11:22:07 +02:00
  • dce55d04a3 Does not yet work, agreements in representation ozbolt 2019-05-20 18:14:11 +02:00
  • 5bd0b4a064 correct representation when rep_failed ozbolt 2019-05-17 20:45:39 +02:00
  • 111512a901 no more structureselection enum ozbolt 2019-05-17 20:45:10 +02:00
  • d2f1e95a8f continued work on representation, almost there... ozbolt 2019-05-16 01:53:38 +02:00
  • 84a184c44d I think this is the way to set representations, all info is available ozbolt 2019-05-13 10:48:21 +02:00
  • 6eefd9c9f6 redid representation storage, (as prev commit: to make it easier to use) ozbolt 2019-05-13 09:52:29 +02:00
  • 19067e4135 Moving matches into colocation ids, now easier for representation ozbolt 2019-05-13 08:35:55 +02:00
  • 87712128be joint representation form ozbolt 2019-05-13 00:26:00 +02:00
  • 401698409e Implementing new output formats, all and normal, no more lemma_only and stuff ozbolt 2019-05-12 23:00:38 +02:00
  • b4b93022fe Updating for new representations, for now only parsing ozbolt 2019-05-12 22:13:22 +02:00
  • de6c73980e adding min-frequency option ozbolt 2019-02-19 14:57:48 +01:00
  • 93d7af3aea Reversed order sorting ozbolt 2019-02-19 13:56:32 +01:00
  • 1c9ac7c867 Adding sorting ozbolt 2019-02-19 11:29:40 +01:00
  • 8107a9f647 Adding parallel execution using subprocesses ozbolt 2019-02-17 15:55:17 +01:00
  • dec173ae33 Restructuring, now words are parsed right after loading one file, not after loading all of them. Should be easily parallelizable now ozbolt 2019-02-14 14:33:15 +01:00
  • f3fe981614 Adding few more lines to msd_translate ozbolt 2019-02-14 14:30:12 +01:00
  • 658d8698f4 Adding two new lines into msd translate ozbolt 2019-02-12 17:38:53 +01:00
  • 2f2bb91d0f Supporting different xml:id variations ozbolt 2019-02-12 17:38:32 +01:00
  • 31483c79ff count-files for more verbose output added ozbolt 2019-02-12 12:19:21 +01:00
  • 2d373ab477 Adding changeable pc tag (when it is c and not pc) ozbolt 2019-02-12 12:08:30 +01:00
  • c1e85255c7 msd of <pc> now always N ozbolt 2019-02-12 11:59:51 +01:00
  • 40db51adf1 msd translate now optional ozbolt 2019-02-12 11:58:04 +01:00
  • f89212f7c9 Parsing files as they come instead of parsing all at once. ozbolt 2019-02-12 11:41:35 +01:00
  • 25f3918170 Loading/Saving to temporary file ozbolt 2019-02-09 13:40:57 +01:00
  • 518fe5e113 Multiple input files support ozbolt 2019-02-09 13:25:26 +01:00
  • b4e73e2d60 Implemented multiple output option ozbolt 2019-02-07 10:19:36 +01:00
  • 8b47e2b317 lemma_only bug fixed and skip-check-id instead of check-id (opt out). ozbolt 2019-02-06 15:46:02 +01:00
  • 5f7b5f969c Check root ids is now skipped by default. ozbolt 2019-02-06 15:33:33 +01:00
  • 27a60c439b Working: using all new stuff ozbolt 2019-02-06 15:29:37 +01:00
  • 916269e710 NW: Writer class implemented ozbolt 2019-02-06 15:29:19 +01:00
  • 1298a45d0f NW: ColocationIds class implemented ozbolt 2019-02-06 15:29:03 +01:00
  • 5b75d6e4fa Using argparse ozbolt 2019-02-06 15:28:39 +01:00
  • 3dc69158b9 NW: switching print for logging ozbolt 2019-02-06 15:26:09 +01:00
  • f8103990a8 Link order added. ozbolt 2019-02-04 11:01:30 +01:00
  • 4dc87ce953 Handling empty ozbolt 2019-01-28 09:39:57 +01:00
  • a9b6681576 removed restriction on number of rules ozbolt 2019-01-28 08:50:32 +01:00
  • bf433a3a19 Merge branch 'master' of gitea.cjvt.si:ozbolt/luscenje_struktur ozbolt 2019-01-25 18:47:27 +01:00
  • 6d574f674f removing pickle stuff for faster loading... ozbolt 2019-01-25 18:44:41 +01:00
  • 40b6a07839 Add README ozbolt 2019-01-25 11:03:28 +00:00
  • 6a221ae8fe Fixes for msd length matching and pc matching ozbolt 2019-01-25 11:58:40 +01:00
  • cddeb9c4e4 accommodating for #773 ozbolt 2019-01-19 22:42:51 +01:00
  • 106db9394e Removing getchildren() and adding root_words (don't know why yet, will remove if I dont remember) ozbolt 2019-01-08 21:17:15 +01:00
  • aeb2770966 files input as argv ozbolt 2019-01-08 21:13:36 +01:00
  • 36d4a217f7 Catching modra links from root ozbolt 2019-01-08 19:37:28 +01:00
  • 06d4217b0b Moving from linkedlist of component to tree structure. ozbolt 2018-10-30 13:33:08 +01:00
  • 319800e0ca Links with | now parsed ozbolt 2018-10-29 12:43:07 +01:00
  • 52e6fc92c6 Two fixes, "10-1"-like structures and restriction_or ozbolt 2018-10-29 12:16:42 +01:00
  • 74a1e4834b First commit ozbolt 2018-10-29 11:29:51 +01:00
  • 4604ac1878 Just gitignore ozbolt 2018-10-29 11:29:32 +01:00