IssueID #1538: incorporated script for converting TEI to dictionary

2020-09-30 23:08:14 +02:00
parent 4e7d6d50c1
commit 5e7ac5d832
4 changed files with 20 additions and 9 deletions

README (7 lines changed)

@@ -1,5 +1,5 @@
-Pipeline for assigning (first creating, if necessary) superbaza
-structure_ids to a file of arbitrary Slovene strings, line by line.
+Pipeline for parsing a file of arbitrary Slovene string and assigning
+(first creating, if necessary) structure_ids for each string.
 Example usage:
@@ -8,5 +8,6 @@ $ ./setup.sh
 $ echo "velika miza" > ../tmp/strings.txt
 $ echo "kdo ne more mimo česa" >> ../tmp/strings.txt
 $ echo "pazi, avto!" >> ../tmp/strings.txt
+$ echo "počitnice" >> ../tmp/strings.txt
 $ source ../venv/bin/activate
-$ python pipeline.py ../tmp/strings.txt ../tmp/output.xml
+$ python pipeline.py ../tmp/strings.txt ../tmp/dictionary.xml