Luka
|
d5668c8b68
|
Moved wani.py + Added ignore of .zstd files for valency
|
4 years ago |
Ozbolt Menegatti
|
90dbbca5d5
|
HUGE refactor, creating lots of modules, no code changes though!
|
5 years ago |
Ozbolt Menegatti
|
43c6c9151b
|
Simplifying and also improving the speed (less regex comparisons!)
|
5 years ago |
Ozbolt Menegatti
|
c0939fbbd4
|
fixed performance bug for representations
No more creating millions of namedtuple classes. Works about 15x faster
|
5 years ago |
Ozbolt Menegatti
|
3be4118dc0
|
Refactoring lexis/morphology matchers, now "picklable".
|
5 years ago |
Ozbolt Menegatti
|
ad0f9b0956
|
Fixing logdice all stat (and mini refactoring)
|
5 years ago |
Ozbolt Menegatti
|
d30f8c1980
|
Dynamically calculated max num components
|
5 years ago |
Ozbolt Menegatti
|
c0a22a4ef3
|
float formatting for stats
|
5 years ago |
Ozbolt Menegatti
|
bf0ed35e00
|
removing old unused commented out code
|
5 years ago |
Ozbolt Menegatti
|
68c22d4e27
|
deprecating output to stdout
|
5 years ago |
Ozbolt Menegatti
|
b819d9953f
|
using new formatters via --out and --out-no-stat
|
5 years ago |
Ozbolt Menegatti
|
432dc87a5f
|
new outformatter, old is now outnostatformatter
|
5 years ago |
Ozbolt Menegatti
|
cb53a9c7b3
|
moving delta_p12/21 to the end of stats formatter
|
5 years ago |
Ozbolt Menegatti
|
9ccbd02603
|
Implementing the rest of stats. Maybe ok?
|
5 years ago |
Ozbolt Menegatti
|
d7f97ba9b3
|
implementing but commenting out distinct_2w_forms
|
5 years ago |
Ozbolt Menegatti
|
ca0d6f0f55
|
num_words now proper dict
|
5 years ago |
Ozbolt Menegatti
|
865351b3f6
|
Turns out previous commit was OK. Proceeding with stats work
|
5 years ago |
Ozbolt Menegatti
|
c6440162b8
|
NOT WORKING inbetween commit
|
5 years ago |
Ozbolt Menegatti
|
dff9643edf
|
Simplifying main writing stuff
|
5 years ago |
Ozbolt Menegatti
|
89f35f5259
|
handling writers for when we don't need outputs (no --all for example)
|
5 years ago |
Ozbolt Menegatti
|
5929004c44
|
now using new formatters, simplifies the code nicely
|
5 years ago |
Ozbolt Menegatti
|
111b088c6c
|
defining formatter for --output
|
5 years ago |
Ozbolt Menegatti
|
2a437b1703
|
Defining writer for --all
|
5 years ago |
Ozbolt Menegatti
|
96e61d2f64
|
Defining Formatter parent class for out/all/stats output files
|
5 years ago |
Ozbolt Menegatti
|
2387bd7cb7
|
Stats flag
|
5 years ago |
Ozbolt Menegatti
|
6a9ee516a3
|
EMPTY COMMIT - fixing some pylint warnings
|
5 years ago |
Ozbolt Menegatti
|
9117734b91
|
EMPTY COMMIT - assert statement vs function call
and one if statement simplified and unused variable
|
5 years ago |
Ozbolt Menegatti
|
46e169095c
|
EMPTY COMMIT - removing too long lines
|
5 years ago |
Ozbolt Menegatti
|
797060f619
|
EMPTY COMMIT - removing trailing whitespace
|
5 years ago |
Ozbolt Menegatti
|
3a22cd91c3
|
determining jppb (for 2 word statistics)
|
5 years ago |
Ozbolt Menegatti
|
30a5e80569
|
determine content-word (polnopomenska beseda) components in structure (for now only type='main')
|
5 years ago |
Ozbolt Menegatti
|
9ae7e1e9f6
|
Determine distinct matches for one collocation id.
|
5 years ago |
Ozbolt Menegatti
|
2773a8b9e9
|
Getters for number of lemmas and number of all words
|
5 years ago |
Ozbolt Menegatti
|
2167e4b6fe
|
Restrictions now always a list, removes/simplifies a bit of code
|
5 years ago |
Ozbolt Menegatti
|
d83d619dc0
|
removing old __str__ and __repr__ debugging code
|
5 years ago |
Ozbolt Menegatti
|
b2baedca52
|
determining dispersions
|
5 years ago |
Ozbolt Menegatti
|
3263125898
|
Also need to check msd for agreements in the whole corpus.
|
5 years ago |
Ozbolt Menegatti
|
44d532808d
|
tqdm now optional
|
5 years ago |
Ozbolt Menegatti
|
08c8050f3f
|
Removing old logging.debug calls, makes matching stuff much faster :)
|
5 years ago |
Ozbolt Menegatti
|
2c8a9f0ed0
|
Whitespace fixes
|
5 years ago |
Ozbolt Menegatti
|
460a55cb6c
|
Improving representation speed ~5%
|
5 years ago |
Ozbolt Menegatti
|
5f226d0cd4
|
fixing matching of agreements with msd
|
5 years ago |
Ozbolt Menegatti
|
5b9859af3e
|
Removing dead code
|
5 years ago |
Ozbolt Menegatti
|
44f0a6762e
|
Improving speed of matching ~40%
|
5 years ago |
Ozbolt Menegatti
|
fe4c95939f
|
Removing deprecated commented out code.
|
5 years ago |
Ozbolt Menegatti
|
ed83b2b9c4
|
implementing multiple agreements to one cid.
|
5 years ago |
Ozbolt Menegatti
|
0249ef1523
|
Correct order for wordform any/msd rendering
(most frequent first)
|
5 years ago |
Ozbolt Menegatti
|
119b85568f
|
actually not showing components without representation
|
5 years ago |
Ozbolt Menegatti
|
7d1bfbf73e
|
wordform all only lowercase
|
5 years ago |
Ozbolt Menegatti
|
ad7ba8c0b2
|
removing debugging/dead code
|
5 years ago |