Pipeline that combines scripts and resources from other repositories to parse strings and assign them to standard CJVT structures, creating new structures where necessary.
Cyprian Laskowski 743ee6ebae
IssueID #1487: fixed parameter names
4 years ago
resources         IssueID #1487: added schema validation for structures and dictionary   4 years ago
scripts           IssueID #1487: fixed parameter names                                    4 years ago
tmp               IssueID #1487: added basic script versions and directory structure      4 years ago
.gitignore        IssueID #1487: added schema validation for structures and dictionary   4 years ago
README            IssueID # 1538: incorporated script for converting tei to dictionary    4 years ago
requirements.txt  IssueID #1487: added basic script versions and directory structure      4 years ago

README

Pipeline for parsing a file of arbitrary Slovene strings and assigning
a structure_id to each string (creating the structure first, if necessary).

Example usage:

$ cd scripts
$ ./setup.sh
$ echo "velika miza" > ../tmp/strings.txt
$ echo "kdo ne more mimo česa" >> ../tmp/strings.txt
$ echo "pazi, avto!" >> ../tmp/strings.txt
$ echo "počitnice" >> ../tmp/strings.txt
$ source ../venv/bin/activate
$ python pipeline.py ../tmp/strings.txt ../tmp/dictionary.xml
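
The echo commands above build an input file with one string per line. As a
minimal sketch, the same file could be prepared from Python, assuming the
pipeline reads one UTF-8 string per line. The helper names write_strings_file
and build_pipeline_command are hypothetical and not part of this repository,
and the sketch writes to a temporary directory rather than ../tmp so that it
runs anywhere.

```python
import sys
import tempfile
from pathlib import Path


def write_strings_file(strings, path):
    """Write one string per line, UTF-8 encoded (hypothetical helper)."""
    path = Path(path)
    path.write_text("".join(s + "\n" for s in strings), encoding="utf-8")
    return path


def build_pipeline_command(input_path, output_path, python=sys.executable):
    """Build the argument list for the pipeline.py call shown above (hypothetical helper)."""
    return [python, "pipeline.py", str(input_path), str(output_path)]


strings = ["velika miza", "kdo ne more mimo česa", "pazi, avto!", "počitnice"]

# Assumption: a temporary directory stands in for ../tmp here.
tmp_dir = Path(tempfile.mkdtemp())
strings_file = write_strings_file(strings, tmp_dir / "strings.txt")
command = build_pipeline_command(strings_file, tmp_dir / "dictionary.xml")
print(command)

# To actually run the pipeline, call this from the scripts directory
# with the virtual environment active:
#   import subprocess
#   subprocess.run(command, check=True)
```

Writing the file with an explicit UTF-8 encoding avoids depending on the
platform default, which matters for strings such as "česa" and "počitnice".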