forked from kristjan/cjvt-srl-tagging
cjvt-srl-tagging
We'll be using mate-tools to perform SRL on Kres.
workspace
The tools require Java.
Go to ./dockerfiles/python-java/ and run make.
You should get a Docker environment with this repo mounted.
mate-tools
Check out ./tools/srl-20131216/README.md.
Scripts
List all XML tags that occur after the <body> tag:
cat F0006347.xml.parsed.xml | grep -A 999999999999 -e '<body>' | grep -o -e '<[^" "]*' | sort | uniq
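The same check can be sketched in Python, which may be handy inside the container when the output needs further processing. This is a minimal sketch, not part of the repo: the function name tags_after_body is hypothetical, and unlike grep -A it starts at the <body> tag itself rather than at the beginning of the line that contains it.

```python
import re


def tags_after_body(text):
    """Return the sorted, de-duplicated tag openers found from the first <body> onward."""
    start = text.find("<body>")
    if start == -1:
        return []
    # Mirror the grep pattern '<[^" "]*': a '<' followed by any run of characters
    # that are not a double quote or a space. The newline is excluded as well,
    # because grep matches line by line.
    return sorted(set(re.findall(r'<[^" \n]*', text[start:])))
```

Tags with attributes show up truncated at the first space (for example <w), exactly as in the shell pipeline.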
Tools
- Parser for reading both SSJ500k 2.1 TEI xml and Kres F....xml.parsed.xml files, found in ./tools/parser/parser.py.
- fillpred_model for creating a yes/no model for predicting the predicate (based on ssj500k data).
Usage
$ cd ./dockerfiles/python-java
$ make
# you should be inside a container now
$ cd ./cjvt-srl-tagging
$ make
If you want to run it on a server overnight, you might want to use nohup, so you can close the ssh connection without killing the process.
$ nohup make > tagging.log &
Makefile
The Makefile follows certain steps:
- Create a fillpred model.
- Parse .xml files and create .tsv files.
- Run mate-tools srl-tagger on the created .tsv files.
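To make the intermediate .tsv step more concrete, here is a minimal sketch of how one sentence could be serialized in the CoNLL-2009 column layout described in source [4]. The column names come from that format. The token dictionaries, the function names, and the choice of which columns get filled are assumptions for illustration and may differ from what ./tools/parser/parser.py actually writes.

```python
# CoNLL-2009 columns up to PRED; the per-predicate APRED columns would follow.
CONLL09_COLUMNS = [
    "ID", "FORM", "LEMMA", "PLEMMA", "POS", "PPOS", "FEAT", "PFEAT",
    "HEAD", "PHEAD", "DEPREL", "PDEPREL", "FILLPRED", "PRED",
]


def to_conll09_row(token):
    """Serialize one token (a dict, hypothetical shape) as a tab-separated row.

    Columns missing from the dict are written as '_', the format's empty value.
    """
    return "\t".join(str(token.get(col, "_")) for col in CONLL09_COLUMNS)


def sentence_to_tsv(tokens):
    """Serialize a sentence: one row per token, closed by an empty line."""
    return "\n".join(to_conll09_row(t) for t in tokens) + "\n\n"
```

In this layout the FILLPRED column holds the yes/no decision (Y or _) about whether a token is a predicate, which is presumably where the output of the fillpred model ends up.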
Sources
- [1] (mate-tools) https://code.google.com/archive/p/mate-tools/
- [2] (benchmarking) https://github.com/clarinsi/bilateral-srl
- [3] (conll 2008 paper) http://www.aclweb.org/anthology/W08-2121.pdf
- [4] (format CoNLL 2009) https://wiki.ufal.ms.mff.cuni.cz/format-conll