cjvt-srl-tagging/tools/parser
2022-03-14 11:01:53 +01:00
msd added msd-not-found exception 2019-02-28 21:49:49 +01:00
__init__.py parser.py can read kres and/or ssj500k 2019-02-03 22:54:26 +01:00
bench_parser.py parser.py can read kres and/or ssj500k 2019-02-03 22:54:26 +01:00
msds_with_predicate.txt connl2009 output for kres 2019-02-13 08:49:37 +01:00
parser.py Adapted code to ssj500k and added its branch 2022-03-14 11:01:53 +01:00
README.md finished parse + tag toolchain -> TODO: tagger error 2019-02-18 08:49:04 +01:00

msdmap.py

Helps convert between English and Slovenian MSD (morphosyntactic description) tags.
The values are hardcoded from the online documentation (HTML tables).
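
A minimal sketch of what such a lookup could look like, not the actual contents of msdmap.py. The names `EN_TO_SL`, `SL_TO_EN`, `translate_msd` and `MsdNotFound` are hypothetical, and the two table entries are illustrative only; the real table should be taken from the MSD documentation.

```python
class MsdNotFound(KeyError):
    """Raised when a tag has no entry in the mapping table (hypothetical name)."""


# Illustrative entries only; verify against the official MSD tables.
EN_TO_SL = {
    "Ncmsn": "Somei",    # noun, common, masculine, singular, nominative
    "Vmpr3s": "Ggnste",  # verb, main, progressive, present, 3rd person, singular
}

# Reverse table, built once so both directions stay in sync.
SL_TO_EN = {sl: en for en, sl in EN_TO_SL.items()}


def translate_msd(tag, to="sl"):
    """Translate an MSD tag to Slovenian (to="sl") or English (to="en")."""
    table = EN_TO_SL if to == "sl" else SL_TO_EN
    try:
        return table[tag]
    except KeyError:
        raise MsdNotFound(tag) from None
```

Raising a dedicated exception instead of returning a default makes unmapped tags visible early, before they end up in the CoNLL output.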

Tagging

Go to ./srl-29... and run ./scripts/{learn...,parse...}.
Change the paths in the scripts before running them.

ERR

Running the parse step currently fails with this error:

Executing: java -cp srl.jar:lib/liblinear-1.51-with-deps.jar:lib/anna.jar -Xmx2g se.lth.cs.srl.Parse ger ./../../data/kres_example_out/F0006347.xml.parsed.tsv ./srl-ger.model  -nopi ger-eval.out
Loading pipeline from ./srl-ger.model
Writing corpus to ger-eval.out...
Opening reader for ./../../data/kres_example_out/F0006347.xml.parsed.tsv...
Exception in thread "main" java.lang.IndexOutOfBoundsException: Index: 33, Size: 32
	at java.util.ArrayList.rangeCheck(ArrayList.java:657)
	at java.util.ArrayList.get(ArrayList.java:433)
	at se.lth.cs.srl.corpus.Sentence.buildDependencyTree(Sentence.java:61)
	at se.lth.cs.srl.corpus.Sentence.newSRLOnlySentence(Sentence.java:182)
	at se.lth.cs.srl.io.SRLOnlyCoNLL09Reader.readNextSentence(SRLOnlyCoNLL09Reader.java:23)
	at se.lth.cs.srl.io.AbstractCoNLL09Reader.open(AbstractCoNLL09Reader.java:43)
	at se.lth.cs.srl.io.AbstractCoNLL09Reader.<init>(AbstractCoNLL09Reader.java:26)
	at se.lth.cs.srl.io.SRLOnlyCoNLL09Reader.<init>(SRLOnlyCoNLL09Reader.java:11)
	at se.lth.cs.srl.Parse.main(Parse.java:36)
root@9f69d66a0d39:/cjvt-srl-tagging/tools/srl-20131216# 
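
The trace fails in buildDependencyTree with index 33 on a list of size 32, which would be consistent with a token whose head points past the end of its sentence. That cause is an assumption, not a confirmed diagnosis. Under that assumption, a small check of the input .tsv could look like the sketch below. It assumes tab-separated CoNLL-2009 rows with HEAD and PHEAD in the 9th and 10th columns and blank lines between sentences; the function names are hypothetical.

```python
HEAD_COL, PHEAD_COL = 8, 9  # 0-based positions of HEAD and PHEAD in CoNLL-2009


def read_sentences(lines):
    """Yield sentences as lists of column lists; blank lines separate sentences."""
    sentence = []
    for line in lines:
        line = line.rstrip("\n")
        if line.strip():
            sentence.append(line.split("\t"))
        elif sentence:
            yield sentence
            sentence = []
    if sentence:
        yield sentence


def check_sentence(rows):
    """Return a list of problems (strings) found in one sentence."""
    problems = []
    n = len(rows)
    for position, cols in enumerate(rows, start=1):
        # Token IDs are expected to run 1..n within each sentence.
        if cols[0] != str(position):
            problems.append(f"token {position}: ID is {cols[0]!r}")
        for name, idx in (("HEAD", HEAD_COL), ("PHEAD", PHEAD_COL)):
            value = cols[idx] if idx < len(cols) else ""
            if value == "_":
                continue  # underscore placeholders are skipped, not validated
            # A head must be 0 (root) or the ID of a token in this sentence.
            if not value.isdigit() or not 0 <= int(value) <= n:
                problems.append(
                    f"token {position}: {name} is {value!r}, expected 0..{n}"
                )
    return problems


def check_file(path):
    """Return all problems in a file, each prefixed with its sentence number."""
    found = []
    with open(path, encoding="utf-8") as handle:
        for number, rows in enumerate(read_sentences(handle), start=1):
            for problem in check_sentence(rows):
                found.append(f"sentence {number}: {problem}")
    return found
```

Calling `check_file` on the file named in the log and printing the returned lines would show whether any sentence has IDs that do not restart at 1 or a head index outside its sentence. An empty result would point to a different cause.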

Sources

[1] CoNLL-2009 data format: https://nlpado.de/~sebastian/pub/papers/conll09_hajic.pdf