A pipeline that combines scripts and resources from other repositories to parse strings and assign them to standard CJVT structures, creating new structures where necessary.
resources IssueID #1487: added schema validation for structures and dictionary 2020-11-04 18:05:12 +01:00
scripts IssueID #1487: split structure pipeline into two 2020-11-04 19:02:15 +01:00
tmp IssueID #1487: added basic script versions and directory structure 2020-09-17 09:29:53 +02:00
.gitignore IssueID #1487: added schema validation for structures and dictionary 2020-11-04 18:05:12 +01:00
README IssueID # 1538: incorpated script for converting tei to dictionary 2020-09-30 23:08:14 +02:00
requirements.txt IssueID #1487: added basic script versions and directory structure 2020-09-17 09:29:53 +02:00

Pipeline for parsing a file of arbitrary Slovene strings and assigning
a structure_id to each string, first creating the structure if necessary.

Example usage:

$ cd scripts
$ ./setup.sh
$ echo "velika miza" > ../tmp/strings.txt
$ echo "kdo ne more mimo česa" >> ../tmp/strings.txt
$ echo "pazi, avto!" >> ../tmp/strings.txt
$ echo "počitnice" >> ../tmp/strings.txt
$ source ../venv/bin/activate
$ python pipeline.py ../tmp/strings.txt ../tmp/dictionary.xml
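
The input file from the example above can also be prepared from Python, which
makes the UTF-8 encoding of characters such as "č" explicit instead of relying
on the shell locale. This is only a sketch: write_strings_file and
pipeline_command are hypothetical helper names, not part of this repository,
and it assumes that pipeline.py reads one string per line, as the echo
commands suggest.

```python
import tempfile
from pathlib import Path


def write_strings_file(strings, path):
    """Write one string per line, UTF-8 encoded, and return the path.

    Hypothetical helper; assumes the pipeline expects one string per line.
    """
    path = Path(path)
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text("".join(s + "\n" for s in strings), encoding="utf-8")
    return path


def pipeline_command(input_path, output_path):
    """Build the argument list for the pipeline.py call shown above."""
    return ["python", "pipeline.py", str(input_path), str(output_path)]


if __name__ == "__main__":
    examples = ["velika miza", "kdo ne more mimo česa", "pazi, avto!", "počitnice"]
    with tempfile.TemporaryDirectory() as tmp:
        strings_file = write_strings_file(examples, Path(tmp) / "strings.txt")
        # Show the file contents and the command that would be run.
        print(strings_file.read_text(encoding="utf-8"), end="")
        print(pipeline_command(strings_file, Path(tmp) / "dictionary.xml"))
```

The returned argument list could be passed to subprocess.run from within the
scripts directory, with the virtual environment activated as in the example.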