srl tagging pipeline (output in .tsv)

2019-02-24 22:23:32 +01:00
parent 9939bf0f55
commit b79721f6a7
25 changed files with 10104 additions and 4255 deletions

@@ -3,7 +3,8 @@ We'll be using mate-tools to perform SRL on Kres.
## workspace
The tools require Java.
See `./dockerfiles/python-java/README.md` for environment preparation.
Go to `./dockerfiles/python-java/` and run `make`.
This should start a Docker container with this repo mounted.
## mate-tools
Check out `./tools/srl-20131216/README.md`.
@@ -14,15 +15,23 @@ Check all possible xml tags (that occur after the <body> tag.
## Tools
* Parser for reading both `SSJ500k 2.1 TEI xml` and `Kres F....xml.parsed.xml` files, found in `./tools/parser/parser.py`.
* `fillpred_model` for creating a yes/no model for predicting the predicate (based on ssj500k data).
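
To illustrate what a yes/no predicate model can look like, here is a minimal sketch of a majority-vote baseline keyed on the POS tag. This is not the repo's `fillpred_model`: the function names `train_fillpred` and `fillpred`, the toy tags, and the `Y`/`_` output convention (borrowed from the CoNLL-2009 FILLPRED column) are all assumptions made for the example.

```python
from collections import Counter, defaultdict


def train_fillpred(tokens):
    """Learn which POS tags mark a predicate more often than not.

    tokens: iterable of (pos_tag, is_predicate) pairs from annotated data.
    Returns the set of POS tags with more positive than negative examples.
    """
    counts = defaultdict(Counter)
    for pos, is_pred in tokens:
        counts[pos][bool(is_pred)] += 1
    return {pos for pos, c in counts.items() if c[True] > c[False]}


def fillpred(model, pos):
    """Return 'Y' if a token with this POS tag is treated as a predicate, else '_'."""
    return "Y" if pos in model else "_"


# Toy training data; the tags and labels are made up for illustration.
training = [("V", True), ("V", True), ("V", False), ("N", False), ("A", False)]
model = train_fillpred(training)
print(fillpred(model, "V"), fillpred(model, "N"))
```

A real model would likely use richer features (lemma, morphology, dependency relation); the baseline only shows the yes/no decision shape.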
## Usage
```bash
$ cd ./dockerfiles/python-java
$ make
# you should be inside a container now
$ make <option>
$ cd ./cjvt-srl-tagging
$ make
```
## Makefile
The Makefile runs the following steps in order:
1. Create a fillpred model.
2. Parse `.xml` files and create `.tsv` files.
3. Run *mate-tools srl-tagger* on the created `.tsv` files.
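
Step 2 can be sketched as follows, assuming the `.tsv` files use a CoNLL-2009-style column layout (a common input format for SRL taggers, but not confirmed by this README). The function name `sentence_to_tsv`, the token dictionary keys, and the example tags and dependency labels are hypothetical.

```python
def sentence_to_tsv(tokens):
    """Serialize one sentence as tab-separated rows, one token per line.

    Assumed column order (CoNLL-2009 style):
    ID, FORM, LEMMA, PLEMMA, POS, PPOS, FEAT, PFEAT,
    HEAD, PHEAD, DEPREL, PDEPREL, FILLPRED, PRED
    """
    lines = []
    for i, t in enumerate(tokens, start=1):
        row = [
            str(i), t["form"], t["lemma"], t["lemma"],
            t["pos"], t["pos"], "_", "_",
            str(t["head"]), str(t["head"]), t["deprel"], t["deprel"],
            t.get("fillpred", "_"), "_",
        ]
        lines.append("\t".join(row))
    # A blank line terminates the sentence.
    return "\n".join(lines) + "\n\n"


# Made-up example sentence; tags and dependency labels are illustrative only.
sentence = [
    {"form": "Janez", "lemma": "Janez", "pos": "N", "head": 2, "deprel": "Sb"},
    {"form": "spi", "lemma": "spati", "pos": "V", "head": 0, "deprel": "Root",
     "fillpred": "Y"},
    {"form": ".", "lemma": ".", "pos": "Z", "head": 2, "deprel": "Punc"},
]
tsv = sentence_to_tsv(sentence)
print(tsv, end="")
```

The gold and predicted columns are filled with the same values here because the sketch has only one annotation per token; the PRED column is left as `_` for the tagger to fill in.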
## Sources
* [1] (mate-tools) https://code.google.com/archive/p/mate-tools/