|  | dc285ce265 | Saving memory in word-stats | 2019-06-16 01:31:40 +02:00 |  | 
			
				
					|  | 37acabc076 | able to load pickled structures | 2019-06-16 01:31:14 +02:00 |  | 
			
				
					|  | f0109771aa | chunk size now handled in file-sentence-generator | 2019-06-16 00:59:44 +02:00 |  | 
			
				
					|  | 0d8aeb2282 | load_files now returns a generator of sentences, not a generator of the whole file. This makes it much slower, but more adaptable for huge files. | 2019-06-15 22:30:43 +02:00 |  | 
			
				
					|  | a8183cf507 | word stats now collected more memory-efficiently | 2019-06-15 22:20:20 +02:00 |  | 
			
				
					|  | 90dbbca5d5 | HUGE refactor, creating lots of modules, no code changes though! | 2019-06-15 18:55:35 +02:00 |  | 
			
				
					|  | 43c6c9151b | Simplifying and also improving the speed (fewer regex comparisons!) | 2019-06-15 13:10:23 +02:00 |  | 
			
				
					|  | 09bdd0fe3f | Adding gitignore | 2019-06-15 12:53:16 +02:00 |  | 
			
				
					|  | c0939fbbd4 | fixed performance bug for representations. No more creating millions of namedtuple classes. Works about 15x faster | 2019-06-11 10:26:10 +02:00 |  | 
			
				
					|  | 3be4118dc0 | Refactoring lexis/morphology matchers, now "picklable". | 2019-06-11 10:02:24 +02:00 |  | 
			
				
					|  | ad0f9b0956 | Fixing logdice all stat (and mini refactoring) | 2019-06-11 09:22:25 +02:00 |  | 
			
				
					|  | d30f8c1980 | Dynamically calculated max num components | 2019-06-10 14:05:40 +02:00 |  | 
			
				
					|  | c0a22a4ef3 | float formatting for stats | 2019-06-10 11:05:46 +02:00 |  | 
			
				
					|  | bf0ed35e00 | removing old unused commented out code | 2019-06-10 10:54:01 +02:00 |  | 
			
				
					|  | 68c22d4e27 | deprecating output to stdout | 2019-06-10 10:52:00 +02:00 |  | 
			
				
					|  | b819d9953f | using new formatters via --out and --out-no-stat | 2019-06-10 10:50:51 +02:00 |  | 
			
				
					|  | 432dc87a5f | new outformatter, old is now outnostatformatter | 2019-06-10 10:49:53 +02:00 |  | 
			
				
					|  | cb53a9c7b3 | moving delta_p12/21 to the end of stats formatter | 2019-06-10 10:25:42 +02:00 |  | 
			
				
					|  | 9ccbd02603 | Implementing the rest of stats. Maybe ok? | 2019-06-10 00:25:36 +02:00 |  | 
			
				
					|  | d7f97ba9b3 | implementing but commenting out distinct_2w_forms | 2019-06-10 00:25:14 +02:00 |  | 
			
				
					|  | ca0d6f0f55 | num_words now proper dict | 2019-06-10 00:24:47 +02:00 |  | 
			
				
					|  | 865351b3f6 | Turns out previous commit was OK. Proceeding with stats work | 2019-06-09 23:00:19 +02:00 |  | 
			
				
					|  | c6440162b8 | NOT WORKING inbetween commit | 2019-06-09 22:25:58 +02:00 |  | 
			
				
					|  | dff9643edf | Simplifying main writing stuff | 2019-06-09 13:36:31 +02:00 |  | 
			
				
					|  | 89f35f5259 | handling writers for when we don't need outputs (no --all for example) | 2019-06-09 13:36:07 +02:00 |  | 
			
				
					|  | 5929004c44 | now using new formatters, simplifies the code nicely | 2019-06-09 13:35:34 +02:00 |  | 
			
				
					|  | 111b088c6c | defining formatter for --output | 2019-06-09 13:33:03 +02:00 |  | 
			
				
					|  | 2a437b1703 | Defining writer for --all | 2019-06-09 13:32:10 +02:00 |  | 
			
				
					|  | 96e61d2f64 | Defining Formatter parent class for out/all/stats output files | 2019-06-09 13:27:04 +02:00 |  | 
			
				
					|  | 2387bd7cb7 | Stats flag | 2019-06-09 10:20:29 +02:00 |  | 
			
				
					|  | 6a9ee516a3 | EMPTY COMMIT - fixing some pylint warnings | 2019-06-09 10:13:46 +02:00 |  | 
			
				
					|  | 9117734b91 | EMPTY COMMIT - assert statement vs function call and one if statement simplified and unused variable | 2019-06-08 15:43:53 +02:00 |  | 
			
				
					|  | 46e169095c | EMPTY COMMIT - removing too long lines | 2019-06-08 11:54:47 +02:00 |  | 
			
				
					|  | 797060f619 | EMPTY COMMIT - removing trailing whitespace | 2019-06-08 11:42:57 +02:00 |  | 
			
				
					|  | 3a22cd91c3 | determining jppb (for 2 word statistics) | 2019-06-08 11:31:52 +02:00 |  | 
			
				
					|  | 30a5e80569 | determine polnopomenska-beseda components in structure (for now only type='main') | 2019-06-08 11:27:51 +02:00 |  | 
			
				
					|  | 9ae7e1e9f6 | Determine distinct matches for one collocation id. | 2019-06-08 11:25:55 +02:00 |  | 
			
				
					|  | 2773a8b9e9 | Getters for number of lemmas and number of all words | 2019-06-08 11:25:00 +02:00 |  | 
			
				
					|  | 2167e4b6fe | Restrictions now always a list, removes/simplifies a bit of code | 2019-06-08 11:23:50 +02:00 |  | 
			
				
					|  | d83d619dc0 | removing old __str__ and __repr__ debugging code | 2019-06-08 11:19:40 +02:00 |  | 
			
				
					|  | b2baedca52 | determining dispersions | 2019-06-08 11:18:49 +02:00 |  | 
			
				
					|  | 57c0ff6f85 | Removing prints from slimmer | 2019-06-08 10:20:53 +02:00 |  | 
			
				
					|  | 3263125898 | Also need to check msd for agreements in the whole corpus. | 2019-06-03 15:09:22 +02:00 |  | 
			
				
					|  | 44d532808d | tqdm now optional | 2019-06-03 09:47:36 +02:00 |  | 
			
				
					|  | ed27e549b7 | Adding slimming script | 2019-06-03 09:37:48 +02:00 |  | 
			
				
					|  | 08c8050f3f | Removing old logging.debug calls, makes matching stuff much faster :) | 2019-06-02 14:03:29 +02:00 |  | 
			
				
					|  | 2c8a9f0ed0 | Whitespace fixes | 2019-06-02 13:51:32 +02:00 |  | 
			
				
					|  | 460a55cb6c | Improving representation speed ~5% | 2019-06-02 13:50:53 +02:00 |  | 
			
				
					|  | 5f226d0cd4 | fixing matching of agreements with msd | 2019-06-02 12:53:16 +02:00 |  | 
			
				
					|  | 5b9859af3e | Removing dead code | 2019-06-02 12:50:43 +02:00 |  |