Compare commits

...

16 Commits
v1.0 ... master

49 changed files with 13096 additions and 1263 deletions

4
.gitignore vendored Normal file → Executable file
@@ -3,3 +3,7 @@ data
__pycache__/
venv/
src/__pycache__/
data_sample/KOST/results_small
data_sample/processing.annotation
data_sample/processing.tokenization
*.csv

53
README.md Normal file
@@ -0,0 +1,53 @@
# KOST
## Instructions
- Create a new virtual environment (e.g. using `virtualenv`).
- Run `pip install -r requirements.txt` to install the necessary libraries.
- In a Python console, download the classla models (used for annotation and part of tokenization):
```python
import classla
classla.download(lang='sl', type='standard_jos')
```
- Extract or create the metadata in the `data_sample/KOST` folder.
- Run the `svala2tei.py` script.
### Example
```bash
python svala2tei.py --svala_folder data_sample/KOST/svala_small --raw_text data_sample/KOST/raw_small --results_folder data_sample/KOST/results_small --texts_metadata data_sample/KOST/texts_metadata5.csv --authors_metadata data_sample/KOST/authors_metadata5.csv --teachers_metadata data_sample/KOST/teachers_metadata.csv --translations data_sample/KOST/translations.csv --tokenization_interprocessing data_sample/processing.tokenization --annotation_interprocessing data_sample/processing.annotation --overwrite_tokenization --overwrite_annotation
```
## Parameter descriptions
### --svala_folder
Path to directory with `*.svala` files.
### --results_folder
Path to the folder where the results are written.
### --raw_text
Path to the directory that contains the raw texts.
### --texts_metadata
Path to the metadata CSV file that contains information about the texts.
### --authors_metadata
Path to the metadata CSV file that contains information about the authors.
### --teachers_metadata
Path to the metadata CSV file that contains information about the teachers.
### --translations
Path to the mapping file that translates column names in the metadata files.
### --tokenization_interprocessing
Path to the file where tokenized, partially processed data is stored, so that processing can resume without rerunning the whole script.
### --overwrite_tokenization
Flag that forces the script to redo tokenization and overwrite the interprocessing file.
### --annotation_interprocessing
Path to the file where annotated, partially processed data is stored, so that processing can resume without rerunning the whole script.
### --overwrite_annotation
Flag that forces the script to redo annotation and overwrite the interprocessing file.
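The options described above could be declared and used roughly as in the following sketch. It is not the actual `svala2tei.py` code: `build_parser` and `load_or_compute` are hypothetical names, the use of `pickle` for the interprocessing files is an assumption, and only the option names come from this README.

```python
import argparse
import os
import pickle


def build_parser():
    """Declare the options listed in this README (defaults are assumptions)."""
    parser = argparse.ArgumentParser(description="Sketch of the svala2tei.py CLI")
    # Options that take a path.
    for name in (
        "--svala_folder", "--results_folder", "--raw_text",
        "--texts_metadata", "--authors_metadata", "--teachers_metadata",
        "--translations", "--tokenization_interprocessing",
        "--annotation_interprocessing",
    ):
        parser.add_argument(name)
    # Boolean switches: present means "redo this step".
    parser.add_argument("--overwrite_tokenization", action="store_true")
    parser.add_argument("--overwrite_annotation", action="store_true")
    return parser


def load_or_compute(path, overwrite, compute):
    """Reuse the stored intermediate result unless overwrite is requested.

    The on-disk format (pickle) is an assumption made for this sketch.
    """
    if not overwrite and os.path.exists(path):
        with open(path, "rb") as handle:
            return pickle.load(handle)
    result = compute()
    with open(path, "wb") as handle:
        pickle.dump(result, handle)
    return result
```

With this shape, a call such as `load_or_compute(args.tokenization_interprocessing, args.overwrite_tokenization, tokenize)` would skip the expensive step on later runs, where `tokenize` stands for whatever function produces the tokenized data.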

@@ -0,0 +1 @@
Moje najlepše potovanje je bilo pre 9 letih. Potoval sem s mamom v Opatijo. Tam sem šel kod strica in sem kod njega prenočeval. Tamo sem bil 10 dni. Tamo sem vse jedel. Najlepši del potovanja je bil moj rojstni dan (10 let). Mama in stric sta napravila parti in kupila veliko poklonov. S tim so me iznenadil in sem bil zelo srečen. 10 dan sm se vozili z gliserjem po morju in mi je bilo zelo všeč. Tako posebno je bilo zaradi mojega rojstnega dana. Stric je imal prijatelja [XImeX], ki je imal restavracijo. [XImeX] me je rad imal in jaz sem mu peval pjesme.

@@ -0,0 +1 @@
Pero Antič je najpomembnejši človek v zgodovini sporta v Makedoniji. To je moje mnenje. Pero Antič je makedonski reprezentant. On je prvi makedonski košarkar ki je igral v NBA. Igral je za Atlanta Hawks in je tekmoval 3 leta. Tudi je igral za Crveno Zvezdo, Fenerbahče, Olimpijakos. Ko je tekmoval za Olimpijakos dvakrat je osvojil Euroligo. Pero Antič je bil med najboljšimi na Evropskem turniru v košarki in Makedonija je končala na 4 mestu. V NBA svoj največi uspeh je imal kot je prišel do finale Istočne konferencije, ampak Atlanta je izgubilo to serijo. Pero Antič je skoraj bil izbran za presednik makedonsko košarkarsko federacijo.

@@ -0,0 +1,2 @@
Ko sem imela štirinajst let sem šla v Španijo z svojo plesno skupino in to je še vedno moje najlepše potovanje. Prenočevali smo v lepom hotelu ampak ne morem se spomniti kako je ime mesta. Hrana mi je bila zelo všeč in spomnim se sladoleda, ki sem tam jedla. Vsak dan smo šli na plažo in nisem skrbila za svoje probleme.
Spoznali smo veliko novih oseb iz drugih držav, ampak to potovanje je bilo posebno, ker sem bila z svoji prijatelji, s kateri se ne vidim več. Po povratku doma nam se pokvaril avtobus v Italiji in smo na eni bencinski črpalki dolgo čakali da novi avtobus pride po nas. Bilo je hladno in bili smo lačni in utrujeni, ampak smo vsi bili skupaj. Ko smo čakali na bus, sem bila jezna ker sem mislila da je to bil slab konec mojih počitnic, ampak zdaj mi se zdi kot ena super zgodba.

@@ -0,0 +1 @@
Zdaj živim v Ljubljani. Moj naslov je [XNaslovX]. Študiram na [XFakultetaX]. Študiram [XStudijskaSmerX] in sem 1 letnik 2 stopnja. V Ljubljani sem prišla v Septembru. Na koncu Septembra. Slovenija mi je všeč. Tudi Ljubjana. Dežala je zelo slikovita. Ima čudovito naravo. Všeč so mi centar mesta, stari grad. Vse je zeleno, hribe, reki. Nije mi všeč vreme. Pogosto dždi ali je maglevito. Preveč ima promet po cesti. Hrana je brez okusa. Kolegi so mi Slovenci, samo jaz sem iz Makedonije. Oni so zelo dobri. Pomagajo mi. Slovenci so gostoljubivi. Niso visoki preveč Slovenci, Slovenke so nizke.

@@ -0,0 +1,4 @@
Želim začeti s kratko zgodovino televizije in pogledati kako se ona je vstopila v naše domove. Korenine televizije segajo v 19 . stoletje, izumljen pa je bil v tako imenovani 5 . informacijski revoluciji. Leta 1884 v glavo Paula Nipkowa pride ideja mahanske televizije prenos slike na daljave. V tem času je bilo samo možno prenašati zvok in podatke na daljavo (radio, telegraf, telefon itn.). Kasneje Boris Rosing preide iz mehanskega na elektronski sistem televizije s pomočjo elektronskega teleskopa. Še vedno to izgleda kot velika škatla, ampak se poveča zalon. Leta 1923 se pojavil Zworykinov ionosklop. Prvi prenos gibljive slike na daljavo pa je bil leta 1925 . Vseeno takrat še ni vslopila televezija v domove. V naljevanju so bili veliko eksperimentov in poskusov za izboljšavo tako prenosa slike na daljavo kot samega aparata (televizorja). Zlata doba televizije se začne v 1950-ih. Taktrat se ustvarja milijonsko televizijsko občinstvo v 1950 . in 1960 . letih. To pomeni, da že imamo vpogled v javnost po celotnem svetu. Televizija množično vstopi v domove in postane medij za zabavo, ljudje vse bolj pogosto preživljajo prosti čas pred zaslonom. Danes že imamo barvno televizijo, ker prej je bila črno-bela. Imamo celo pametno ali t . i . hibridno televizijo, ker gra za konvergenco med računalnikom in televizijo.
Dandanes skoraj vsak ima doma televizijo. Mislim, da če nimaš televizora, lahko, bom rekla, ustvariš televizor na računalniku, tablici, celo na telefonu lahko gledaš različne oddaje. Sodobna družba je odvisna od digitalnih medije in njihovih prenosnikov. Ne moramo si zamisliti življenja brez telefona, on je postal podaljšek našega telesa. Zasledila sem tak trend, da starši, ko želijo pomiriti otroka vklopijo televizijo, na primer risanke. Pogosto tudi med vsakem obrokom. Potem pa otrok ima navado, da ne more jesti brez nekaj vklopljenega, brez risanke na televizije, tablici, telefonu ipd . Tako televizor oz. digitalni prenosnik postane neke vrste varuška za otroka, ki poskrbi za to, da bi otrok sedel v miru in recimo ne jokal in imel kaj početi.
Imam dve nečakinje, ena je na dva leta starejša od druge. Eva je starejša, sedaj že je 5 let stara in njena mama je vedno vklopila risanke med obrokom, potem pa dala igrati s tablico tekom dneva. S časom je zasledila, da otrok ne more jesti brez vklopljene risanke in zelo veliko časa igra videoigre in ne želi iti ven. Ko se je pojavil drugi otrok, poskusila odstraniti ta problem in dati manj igrati na tablici in manj gledati televezijo. Upam, da njej je uspelo, ker sedaj je tako važno preživljati več časa zunaj in se aktivno gibati.
Moje mnenje je, da starši morajo sami manj gledati televizijo in manj vklopiti jo otrokam. Naj sami starši več časa preživljajo in se igrajo z svojimi otroki, ker za nas je tako pomembna pozornost starših. Seveda, ko so straši zelo utrujeni lahko izkoriščajo možnost varuške-televizije.

@@ -0,0 +1,2 @@
Srečno Novo Leto in vessel Božič.
Torej, začnimo z najpomembnejšim: v Ukrajini Novo Leto je bolj pomemben praznik, za Bozič. In Božič mi imamo 7 januara, ne pa 25 decembera. Na Novo Leto vse imajo praznično mizo. Glavna jed je običajno francoska solata. Običajno praznujejo z družino. Ob 12 pogosto ljudi grejo na ulico in streljajo ognjemet. Potem otroki najdejo darila pod »Božnično« jelko, ampak mi imamo »Novo-letno« jelko. Mislim, da včasih ljudje dajejo darila prijatiljam. Jaz ne maram zimo, vendar Novo leto za me je vsaj nekaj veselja. Kot sem bil otrokom, je bilo zanimivo in zabavno prejemati darila. Zdaj to je zelo težko izbrati darila.

@@ -0,0 +1,2 @@
Zelo rada imam slovanske jezike, saj je tudi moj materni jezik, ukrajinščina, slovanski jezik. Ko sem bila v šoli, veliko slišala o Sloveniji: majhni, prijetni in neverjetno lepi deželi. Tako zaradi moje ljubezni do slovanskih jezikov in Slovenije, slovenščino študiram že tri leta. Všeč mi je praksa prevajanja. Na univerzi pogosto prevajam iz slovenščine v ukrajinščino. V prihodnosti želim delati kot slovensko-ukrajinska prevajalka.
Na 58 . seminarju slovenskega jezika, literature in kulture hotela bi izvedeti več o slovenski kulturi in izboljšati svojo slovenščino, predvsem izgovorjavo.

@@ -0,0 +1 @@
Že dolgo časa študiram slovenščino, ampak mi je še vedno težko. Vsak dan malo prevajam, uporavljam slovarje in slovnice, ampak premalo vadim jezik v živo. Zato seminar je tako pomembno za mene: moram slišati slovensko, plavati v jeziku za nekaj časa. Predvsem me zanima slovensko literaturo, rada bi pa tudi lahko govorila bolj naravno in lažje, da še nisem dosegla. Literarna dela ki prevajam, vedno jih delim s študenti na Univerzi v Buenos Airesu, in jih naučim vsega, ki sem se naučila o slovenski književnosti. Za študente tema je zelo zanimivo, ker v Argentini se ne ve veliko o kulturi srednje-vzhodne Europe. Vedno me prosijo za nova in nova literarna dela za branje, in se trudim nenehno obnavljati. Tudi intervjuvim slovenske pisatelje in filozofe ko grem na seminar, ker potrebujem teoretično gradivo, potem jo objavim na univerzi.

@@ -0,0 +1,983 @@
{
"source": [
{"id": "s0", "text": "Moje "},
{"id": "s1", "text": "najlepše "},
{"id": "s2", "text": "potovanje "},
{"id": "s3", "text": "je "},
{"id": "s4", "text": "bilo "},
{"id": "s5", "text": "pre "},
{"id": "s6", "text": "9 "},
{"id": "s300", "text": "letih "},
{"id": "s301", "text": ". "},
{"id": "s8", "text": "Potoval "},
{"id": "s9", "text": "sem "},
{"id": "s10", "text": "s "},
{"id": "s11", "text": "mamom "},
{"id": "s12", "text": "v "},
{"id": "s302", "text": "Opatijo "},
{"id": "s303", "text": ". "},
{"id": "s14", "text": "Tam "},
{"id": "s15", "text": "sem "},
{"id": "s16", "text": "šel "},
{"id": "s17", "text": "kod "},
{"id": "s18", "text": "strica "},
{"id": "s19", "text": "in "},
{"id": "s20", "text": "sem "},
{"id": "s21", "text": "kod "},
{"id": "s22", "text": "njega "},
{"id": "s304", "text": "prenočeval "},
{"id": "s305", "text": ". "},
{"id": "s24", "text": "Tamo "},
{"id": "s25", "text": "sem "},
{"id": "s26", "text": "bil "},
{"id": "s27", "text": "10 "},
{"id": "s306", "text": "dni "},
{"id": "s307", "text": ". "},
{"id": "s29", "text": "Tamo "},
{"id": "s30", "text": "sem "},
{"id": "s31", "text": "vse "},
{"id": "s309", "text": "jedel "},
{"id": "s311", "text": ". "},
{"id": "s33", "text": "Najlepši "},
{"id": "s34", "text": "del "},
{"id": "s35", "text": "potovanja "},
{"id": "s36", "text": "je "},
{"id": "s37", "text": "bil "},
{"id": "s38", "text": "moj "},
{"id": "s39", "text": "rojstni "},
{"id": "s40", "text": "dan "},
{"id": "s312", "text": "( "},
{"id": "s313", "text": "10 "},
{"id": "s314", "text": "let "},
{"id": "s316", "text": ") "},
{"id": "s317", "text": ". "},
{"id": "s43", "text": "Mama "},
{"id": "s44", "text": "in "},
{"id": "s45", "text": "stric "},
{"id": "s46", "text": "sta "},
{"id": "s47", "text": "napravila "},
{"id": "s48", "text": "parti "},
{"id": "s49", "text": "in "},
{"id": "s50", "text": "kupila "},
{"id": "s51", "text": "veliko "},
{"id": "s318", "text": "poklonov "},
{"id": "s319", "text": ". "},
{"id": "s53", "text": "S "},
{"id": "s54", "text": "tim "},
{"id": "s55", "text": "so "},
{"id": "s56", "text": "me "},
{"id": "s57", "text": "iznenadil "},
{"id": "s58", "text": "in "},
{"id": "s59", "text": "sem "},
{"id": "s60", "text": "bil "},
{"id": "s61", "text": "zelo "},
{"id": "s320", "text": "srečen "},
{"id": "s321", "text": ". "},
{"id": "s63", "text": "10 "},
{"id": "s64", "text": "dan "},
{"id": "s65", "text": "sm "},
{"id": "s66", "text": "se "},
{"id": "s67", "text": "vozili "},
{"id": "s68", "text": "z "},
{"id": "s69", "text": "gliserjem "},
{"id": "s70", "text": "po "},
{"id": "s71", "text": "morju "},
{"id": "s72", "text": "in "},
{"id": "s73", "text": "mi "},
{"id": "s74", "text": "je "},
{"id": "s75", "text": "bilo "},
{"id": "s76", "text": "zelo "},
{"id": "s322", "text": "všeč "},
{"id": "s323", "text": ". "},
{"id": "s78", "text": "Tako "},
{"id": "s79", "text": "posebno "},
{"id": "s80", "text": "je "},
{"id": "s81", "text": "bilo "},
{"id": "s82", "text": "zaradi "},
{"id": "s83", "text": "mojega "},
{"id": "s84", "text": "rojstnega "},
{"id": "s324", "text": "dana "},
{"id": "s325", "text": ". "},
{"id": "s86", "text": "Stric "},
{"id": "s87", "text": "je "},
{"id": "s88", "text": "imal "},
{"id": "s89", "text": "prijatelja "},
{"id": "s326", "text": "[XImeX] "},
{"id": "s327", "text": ", "},
{"id": "s91", "text": "ki "},
{"id": "s92", "text": "je "},
{"id": "s93", "text": "imal "},
{"id": "s328", "text": "restavracijo "},
{"id": "s329", "text": ". "},
{"id": "s95", "text": "[XImeX] "},
{"id": "s96", "text": "me "},
{"id": "s97", "text": "je "},
{"id": "s98", "text": "rad "},
{"id": "s99", "text": "imal "},
{"id": "s100", "text": "in "},
{"id": "s101", "text": "jaz "},
{"id": "s102", "text": "sem "},
{"id": "s103", "text": "mu "},
{"id": "s104", "text": "peval "},
{"id": "s330", "text": "pjesme "},
{"id": "s331", "text": ". "}
],
"target": [
{"id": "t106", "text": "Moje "},
{"id": "t107", "text": "najlepše "},
{"id": "t108", "text": "potovanje "},
{"id": "t109", "text": "je "},
{"id": "t110", "text": "bilo "},
{"id": "t367", "text": "pred "},
{"id": "t112", "text": "9 "},
{"id": "t368", "text": "leti "},
{"id": "t333", "text": ". "},
{"id": "t114", "text": "Potoval "},
{"id": "t115", "text": "sem "},
{"id": "t214", "text": "z "},
{"id": "t215", "text": "mamo "},
{"id": "t118", "text": "v "},
{"id": "t334", "text": "Opatijo "},
{"id": "t335", "text": ". "},
{"id": "t120", "text": "Tam "},
{"id": "t121", "text": "sem "},
{"id": "t122", "text": "šel "},
{"id": "t217", "text": "k "},
{"id": "t219", "text": "stricu "},
{"id": "t293", "text": "in "},
{"id": "t372", "text": "sem "},
{"id": "t373", "text": "pri "},
{"id": "t229", "text": "njem "},
{"id": "t336", "text": "prenočeval "},
{"id": "t337", "text": ". "},
{"id": "t230", "text": "Tam "},
{"id": "t131", "text": "sem "},
{"id": "t132", "text": "bil "},
{"id": "t133", "text": "10 "},
{"id": "t338", "text": "dni "},
{"id": "t339", "text": ". "},
{"id": "t231", "text": "Tam "},
{"id": "t136", "text": "sem "},
{"id": "t137", "text": "vse "},
{"id": "t340", "text": "jedel "},
{"id": "t341", "text": ". "},
{"id": "t139", "text": "Najlepši "},
{"id": "t140", "text": "del "},
{"id": "t141", "text": "potovanja "},
{"id": "t142", "text": "je "},
{"id": "t143", "text": "bil "},
{"id": "t144", "text": "moj "},
{"id": "t145", "text": "rojstni "},
{"id": "t146", "text": "dan "},
{"id": "t342", "text": "( "},
{"id": "t343", "text": "10 "},
{"id": "t344", "text": "let "},
{"id": "t346", "text": ") "},
{"id": "t347", "text": ". "},
{"id": "t149", "text": "Mama "},
{"id": "t150", "text": "in "},
{"id": "t151", "text": "stric "},
{"id": "t152", "text": "sta "},
{"id": "t153", "text": "napravila "},
{"id": "t154", "text": "parti "},
{"id": "t155", "text": "in "},
{"id": "t156", "text": "kupila "},
{"id": "t157", "text": "veliko "},
{"id": "t348", "text": "daril "},
{"id": "t349", "text": ". "},
{"id": "t159", "text": "S "},
{"id": "t244", "text": "tem "},
{"id": "t297", "text": "sta "},
{"id": "t162", "text": "me "},
{"id": "t384", "text": "presenetila "},
{"id": "t164", "text": "in "},
{"id": "t388", "text": "sem "},
{"id": "t389", "text": "bil "},
{"id": "t390", "text": "zelo "},
{"id": "t350", "text": "srečen "},
{"id": "t351", "text": ". "},
{"id": "t352", "text": "10 "},
{"id": "t353", "text": ". "},
{"id": "t170", "text": "dan "},
{"id": "t255", "text": "sem "},
{"id": "t172", "text": "se "},
{"id": "t391", "text": "vozil "},
{"id": "t174", "text": "z "},
{"id": "t175", "text": "gliserjem "},
{"id": "t176", "text": "po "},
{"id": "t177", "text": "morju "},
{"id": "t178", "text": "in "},
{"id": "t392", "text": "mi "},
{"id": "t265", "text": "je "},
{"id": "t398", "text": "bilo "},
{"id": "t395", "text": "zelo "},
{"id": "t354", "text": "všeč "},
{"id": "t355", "text": ". "},
{"id": "t184", "text": "Tako "},
{"id": "t185", "text": "posebno "},
{"id": "t186", "text": "je "},
{"id": "t187", "text": "bilo "},
{"id": "t188", "text": "zaradi "},
{"id": "t189", "text": "mojega "},
{"id": "t190", "text": "rojstnega "},
{"id": "t356", "text": "dne "},
{"id": "t357", "text": ". "},
{"id": "t192", "text": "Stric "},
{"id": "t193", "text": "je "},
{"id": "t271", "text": "imel "},
{"id": "t195", "text": "prijatelja "},
{"id": "t358", "text": "[XImeX] "},
{"id": "t359", "text": ", "},
{"id": "t197", "text": "ki "},
{"id": "t198", "text": "je "},
{"id": "t273", "text": "imel "},
{"id": "t360", "text": "restavracijo "},
{"id": "t361", "text": ". "},
{"id": "t201", "text": "[XImeX] "},
{"id": "t202", "text": "me "},
{"id": "t203", "text": "je "},
{"id": "t279", "text": "imel "},
{"id": "t284", "text": "rad "},
{"id": "t282", "text": "in "},
{"id": "t207", "text": "jaz "},
{"id": "t208", "text": "sem "},
{"id": "t209", "text": "mu "},
{"id": "t286", "text": "pel "},
{"id": "t362", "text": "pesmi "},
{"id": "t363", "text": ". "}
],
"edges": {
"e-s10-t214": {
"id": "e-s10-t214",
"ids": ["s10", "t214"],
"labels": ["Z-CRK"],
"manual": true
},
"e-s21-t373": {
"id": "e-s21-t373",
"ids": ["s21", "t373"],
"labels": ["B-PRED"],
"manual": true
},
"e-s98-t284": {
"id": "e-s98-t284",
"ids": ["s98", "t284"],
"labels": ["S-BR"],
"manual": true
},
"e-s0-t106": {
"id": "e-s0-t106",
"ids": ["s0", "t106"],
"labels": [],
"manual": false
},
"e-s1-t107": {
"id": "e-s1-t107",
"ids": ["s1", "t107"],
"labels": [],
"manual": false
},
"e-s2-t108": {
"id": "e-s2-t108",
"ids": ["s2", "t108"],
"labels": [],
"manual": false
},
"e-s3-t109": {
"id": "e-s3-t109",
"ids": ["s3", "t109"],
"labels": [],
"manual": false
},
"e-s4-t110": {
"id": "e-s4-t110",
"ids": ["s4", "t110"],
"labels": [],
"manual": false
},
"e-s5-t367": {
"id": "e-s5-t367",
"ids": ["s5", "t367"],
"labels": ["B-PRED"],
"manual": false
},
"e-s6-t112": {
"id": "e-s6-t112",
"ids": ["s6", "t112"],
"labels": [],
"manual": false
},
"e-s300-t368": {
"id": "e-s300-t368",
"ids": ["s300", "t368"],
"labels": ["O-SAM"],
"manual": false
},
"e-s301-t333": {
"id": "e-s301-t333",
"ids": ["s301", "t333"],
"labels": [],
"manual": false
},
"e-s8-t114": {
"id": "e-s8-t114",
"ids": ["s8", "t114"],
"labels": [],
"manual": false
},
"e-s9-t115": {
"id": "e-s9-t115",
"ids": ["s9", "t115"],
"labels": [],
"manual": false
},
"e-s11-t215": {
"id": "e-s11-t215",
"ids": ["s11", "t215"],
"labels": ["O-SAM"],
"manual": false
},
"e-s12-t118": {
"id": "e-s12-t118",
"ids": ["s12", "t118"],
"labels": [],
"manual": false
},
"e-s302-t334": {
"id": "e-s302-t334",
"ids": ["s302", "t334"],
"labels": [],
"manual": false
},
"e-s303-t335": {
"id": "e-s303-t335",
"ids": ["s303", "t335"],
"labels": [],
"manual": false
},
"e-s14-t120": {
"id": "e-s14-t120",
"ids": ["s14", "t120"],
"labels": [],
"manual": false
},
"e-s15-t121": {
"id": "e-s15-t121",
"ids": ["s15", "t121"],
"labels": [],
"manual": false
},
"e-s16-t122": {
"id": "e-s16-t122",
"ids": ["s16", "t122"],
"labels": [],
"manual": false
},
"e-s17-t217": {
"id": "e-s17-t217",
"ids": ["s17", "t217"],
"labels": ["B-PRED"],
"manual": false
},
"e-s18-t219": {
"id": "e-s18-t219",
"ids": ["s18", "t219"],
"labels": ["O-SAM"],
"manual": false
},
"e-s19-t293": {
"id": "e-s19-t293",
"ids": ["s19", "t293"],
"labels": [],
"manual": false
},
"e-s20-t372": {
"id": "e-s20-t372",
"ids": ["s20", "t372"],
"labels": [],
"manual": false
},
"e-s22-t229": {
"id": "e-s22-t229",
"ids": ["s22", "t229"],
"labels": ["O-ZAIM"],
"manual": false
},
"e-s304-t336": {
"id": "e-s304-t336",
"ids": ["s304", "t336"],
"labels": [],
"manual": false
},
"e-s305-t337": {
"id": "e-s305-t337",
"ids": ["s305", "t337"],
"labels": [],
"manual": false
},
"e-s24-t230": {
"id": "e-s24-t230",
"ids": ["s24", "t230"],
"labels": ["B-PRISL"],
"manual": false
},
"e-s25-t131": {
"id": "e-s25-t131",
"ids": ["s25", "t131"],
"labels": [],
"manual": false
},
"e-s26-t132": {
"id": "e-s26-t132",
"ids": ["s26", "t132"],
"labels": [],
"manual": false
},
"e-s27-t133": {
"id": "e-s27-t133",
"ids": ["s27", "t133"],
"labels": [],
"manual": false
},
"e-s306-t338": {
"id": "e-s306-t338",
"ids": ["s306", "t338"],
"labels": [],
"manual": false
},
"e-s307-t339": {
"id": "e-s307-t339",
"ids": ["s307", "t339"],
"labels": [],
"manual": false
},
"e-s29-t231": {
"id": "e-s29-t231",
"ids": ["s29", "t231"],
"labels": ["B-PRISL"],
"manual": false
},
"e-s30-t136": {
"id": "e-s30-t136",
"ids": ["s30", "t136"],
"labels": [],
"manual": false
},
"e-s31-t137": {
"id": "e-s31-t137",
"ids": ["s31", "t137"],
"labels": [],
"manual": false
},
"e-s309-t340": {
"id": "e-s309-t340",
"ids": ["s309", "t340"],
"labels": [],
"manual": false
},
"e-s311-t341": {
"id": "e-s311-t341",
"ids": ["s311", "t341"],
"labels": [],
"manual": false
},
"e-s33-t139": {
"id": "e-s33-t139",
"ids": ["s33", "t139"],
"labels": [],
"manual": false
},
"e-s34-t140": {
"id": "e-s34-t140",
"ids": ["s34", "t140"],
"labels": [],
"manual": false
},
"e-s35-t141": {
"id": "e-s35-t141",
"ids": ["s35", "t141"],
"labels": [],
"manual": false
},
"e-s36-t142": {
"id": "e-s36-t142",
"ids": ["s36", "t142"],
"labels": [],
"manual": false
},
"e-s37-t143": {
"id": "e-s37-t143",
"ids": ["s37", "t143"],
"labels": [],
"manual": false
},
"e-s38-t144": {
"id": "e-s38-t144",
"ids": ["s38", "t144"],
"labels": [],
"manual": false
},
"e-s39-t145": {
"id": "e-s39-t145",
"ids": ["s39", "t145"],
"labels": [],
"manual": false
},
"e-s40-t146": {
"id": "e-s40-t146",
"ids": ["s40", "t146"],
"labels": [],
"manual": false
},
"e-s312-t342": {
"id": "e-s312-t342",
"ids": ["s312", "t342"],
"labels": [],
"manual": false
},
"e-s313-t343": {
"id": "e-s313-t343",
"ids": ["s313", "t343"],
"labels": [],
"manual": false
},
"e-s314-t344": {
"id": "e-s314-t344",
"ids": ["s314", "t344"],
"labels": [],
"manual": false
},
"e-s316-t346": {
"id": "e-s316-t346",
"ids": ["s316", "t346"],
"labels": [],
"manual": false
},
"e-s317-t347": {
"id": "e-s317-t347",
"ids": ["s317", "t347"],
"labels": [],
"manual": false
},
"e-s43-t149": {
"id": "e-s43-t149",
"ids": ["s43", "t149"],
"labels": [],
"manual": false
},
"e-s44-t150": {
"id": "e-s44-t150",
"ids": ["s44", "t150"],
"labels": [],
"manual": false
},
"e-s45-t151": {
"id": "e-s45-t151",
"ids": ["s45", "t151"],
"labels": [],
"manual": false
},
"e-s46-t152": {
"id": "e-s46-t152",
"ids": ["s46", "t152"],
"labels": [],
"manual": false
},
"e-s47-t153": {
"id": "e-s47-t153",
"ids": ["s47", "t153"],
"labels": [],
"manual": false
},
"e-s48-t154": {
"id": "e-s48-t154",
"ids": ["s48", "t154"],
"labels": [],
"manual": false
},
"e-s49-t155": {
"id": "e-s49-t155",
"ids": ["s49", "t155"],
"labels": [],
"manual": false
},
"e-s50-t156": {
"id": "e-s50-t156",
"ids": ["s50", "t156"],
"labels": [],
"manual": false
},
"e-s51-t157": {
"id": "e-s51-t157",
"ids": ["s51", "t157"],
"labels": [],
"manual": false
},
"e-s318-t348": {
"id": "e-s318-t348",
"ids": ["s318", "t348"],
"labels": ["B-SAM"],
"manual": false
},
"e-s319-t349": {
"id": "e-s319-t349",
"ids": ["s319", "t349"],
"labels": [],
"manual": false
},
"e-s53-t159": {
"id": "e-s53-t159",
"ids": ["s53", "t159"],
"labels": [],
"manual": false
},
"e-s54-t244": {
"id": "e-s54-t244",
"ids": ["s54", "t244"],
"labels": ["O-ZAIM"],
"manual": false
},
"e-s55-t297": {
"id": "e-s55-t297",
"ids": ["s55", "t297"],
"labels": ["O-GLAG"],
"manual": false
},
"e-s56-t162": {
"id": "e-s56-t162",
"ids": ["s56", "t162"],
"labels": [],
"manual": false
},
"e-s57-t384": {
"id": "e-s57-t384",
"ids": ["s57", "t384"],
"labels": ["B-GLAG", "O-GLAG"],
"manual": false
},
"e-s58-t164": {
"id": "e-s58-t164",
"ids": ["s58", "t164"],
"labels": [],
"manual": false
},
"e-s59-t388": {
"id": "e-s59-t388",
"ids": ["s59", "t388"],
"labels": [],
"manual": false
},
"e-s60-t389": {
"id": "e-s60-t389",
"ids": ["s60", "t389"],
"labels": [],
"manual": false
},
"e-s61-t390": {
"id": "e-s61-t390",
"ids": ["s61", "t390"],
"labels": [],
"manual": false
},
"e-s320-t350": {
"id": "e-s320-t350",
"ids": ["s320", "t350"],
"labels": [],
"manual": false
},
"e-s321-t351": {
"id": "e-s321-t351",
"ids": ["s321", "t351"],
"labels": [],
"manual": false
},
"e-s63-t352": {
"id": "e-s63-t352",
"ids": ["s63", "t352"],
"labels": [],
"manual": false
},
"e-s64-t170": {
"id": "e-s64-t170",
"ids": ["s64", "t170"],
"labels": [],
"manual": false
},
"e-s65-t255": {
"id": "e-s65-t255",
"ids": ["s65", "t255"],
"labels": ["Z-CRK"],
"manual": false
},
"e-s66-t172": {
"id": "e-s66-t172",
"ids": ["s66", "t172"],
"labels": [],
"manual": false
},
"e-s67-t391": {
"id": "e-s67-t391",
"ids": ["s67", "t391"],
"labels": ["O-GLAG"],
"manual": false
},
"e-s68-t174": {
"id": "e-s68-t174",
"ids": ["s68", "t174"],
"labels": [],
"manual": false
},
"e-s69-t175": {
"id": "e-s69-t175",
"ids": ["s69", "t175"],
"labels": [],
"manual": false
},
"e-s70-t176": {
"id": "e-s70-t176",
"ids": ["s70", "t176"],
"labels": [],
"manual": false
},
"e-s71-t177": {
"id": "e-s71-t177",
"ids": ["s71", "t177"],
"labels": [],
"manual": false
},
"e-s72-t178": {
"id": "e-s72-t178",
"ids": ["s72", "t178"],
"labels": [],
"manual": false
},
"e-s73-t392": {
"id": "e-s73-t392",
"ids": ["s73", "t392"],
"labels": [],
"manual": false
},
"e-s74-t265": {
"id": "e-s74-t265",
"ids": ["s74", "t265"],
"labels": [],
"manual": false
},
"e-s75-t398": {
"id": "e-s75-t398",
"ids": ["s75", "t398"],
"labels": [],
"manual": false
},
"e-s76-t395": {
"id": "e-s76-t395",
"ids": ["s76", "t395"],
"labels": [],
"manual": false
},
"e-s322-t354": {
"id": "e-s322-t354",
"ids": ["s322", "t354"],
"labels": [],
"manual": false
},
"e-s323-t355": {
"id": "e-s323-t355",
"ids": ["s323", "t355"],
"labels": [],
"manual": false
},
"e-s78-t184": {
"id": "e-s78-t184",
"ids": ["s78", "t184"],
"labels": [],
"manual": false
},
"e-s79-t185": {
"id": "e-s79-t185",
"ids": ["s79", "t185"],
"labels": [],
"manual": false
},
"e-s80-t186": {
"id": "e-s80-t186",
"ids": ["s80", "t186"],
"labels": [],
"manual": false
},
"e-s81-t187": {
"id": "e-s81-t187",
"ids": ["s81", "t187"],
"labels": [],
"manual": false
},
"e-s82-t188": {
"id": "e-s82-t188",
"ids": ["s82", "t188"],
"labels": [],
"manual": false
},
"e-s83-t189": {
"id": "e-s83-t189",
"ids": ["s83", "t189"],
"labels": [],
"manual": false
},
"e-s84-t190": {
"id": "e-s84-t190",
"ids": ["s84", "t190"],
"labels": [],
"manual": false
},
"e-s324-t356": {
"id": "e-s324-t356",
"ids": ["s324", "t356"],
"labels": ["O-SAM"],
"manual": false
},
"e-s325-t357": {
"id": "e-s325-t357",
"ids": ["s325", "t357"],
"labels": [],
"manual": false
},
"e-s86-t192": {
"id": "e-s86-t192",
"ids": ["s86", "t192"],
"labels": [],
"manual": false
},
"e-s87-t193": {
"id": "e-s87-t193",
"ids": ["s87", "t193"],
"labels": [],
"manual": false
},
"e-s88-t271": {
"id": "e-s88-t271",
"ids": ["s88", "t271"],
"labels": ["O-GLAG"],
"manual": false
},
"e-s89-t195": {
"id": "e-s89-t195",
"ids": ["s89", "t195"],
"labels": [],
"manual": false
},
"e-s326-t358": {
"id": "e-s326-t358",
"ids": ["s326", "t358"],
"labels": [],
"manual": false
},
"e-s327-t359": {
"id": "e-s327-t359",
"ids": ["s327", "t359"],
"labels": [],
"manual": false
},
"e-s91-t197": {
"id": "e-s91-t197",
"ids": ["s91", "t197"],
"labels": [],
"manual": false
},
"e-s92-t198": {
"id": "e-s92-t198",
"ids": ["s92", "t198"],
"labels": [],
"manual": false
},
"e-s93-t273": {
"id": "e-s93-t273",
"ids": ["s93", "t273"],
"labels": ["B-GLAG"],
"manual": false
},
"e-s328-t360": {
"id": "e-s328-t360",
"ids": ["s328", "t360"],
"labels": [],
"manual": false
},
"e-s329-t361": {
"id": "e-s329-t361",
"ids": ["s329", "t361"],
"labels": [],
"manual": false
},
"e-s95-t201": {
"id": "e-s95-t201",
"ids": ["s95", "t201"],
"labels": [],
"manual": false
},
"e-s96-t202": {
"id": "e-s96-t202",
"ids": ["s96", "t202"],
"labels": [],
"manual": false
},
"e-s97-t203": {
"id": "e-s97-t203",
"ids": ["s97", "t203"],
"labels": [],
"manual": false
},
"e-s99-t279": {
"id": "e-s99-t279",
"ids": ["s99", "t279"],
"labels": ["O-GLAG"],
"manual": false
},
"e-s100-t282": {
"id": "e-s100-t282",
"ids": ["s100", "t282"],
"labels": [],
"manual": false
},
"e-s101-t207": {
"id": "e-s101-t207",
"ids": ["s101", "t207"],
"labels": [],
"manual": false
},
"e-s102-t208": {
"id": "e-s102-t208",
"ids": ["s102", "t208"],
"labels": [],
"manual": false
},
"e-s103-t209": {
"id": "e-s103-t209",
"ids": ["s103", "t209"],
"labels": [],
"manual": false
},
"e-s104-t286": {
"id": "e-s104-t286",
"ids": ["s104", "t286"],
"labels": ["O-GLAG"],
"manual": false
},
"e-s330-t362": {
"id": "e-s330-t362",
"ids": ["s330", "t362"],
"labels": ["B-SAM"],
"manual": false
},
"e-s331-t363": {
"id": "e-s331-t363",
"ids": ["s331", "t363"],
"labels": [],
"manual": false
},
"e-t353": {
"id": "e-t353",
"ids": ["t353"],
"labels": ["Z-LOC"],
"manual": false
}
}
}
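
The svala file above holds three parts: `source` tokens (the learner's text), `target` tokens (the corrected text), and `edges` that link token ids and carry correction labels. A minimal sketch of reading that structure, assuming only the keys visible in this file; `aligned_pairs` is a hypothetical helper, not part of the repository, and the sample is a three-edge excerpt of the data above:

```python
import json

# Excerpt of the file above, reduced to three edges for illustration.
SAMPLE = """
{
  "source": [{"id": "s10", "text": "s "}, {"id": "s11", "text": "mamom "}],
  "target": [{"id": "t214", "text": "z "}, {"id": "t215", "text": "mamo "},
             {"id": "t353", "text": ". "}],
  "edges": {
    "e-s10-t214": {"id": "e-s10-t214", "ids": ["s10", "t214"],
                   "labels": ["Z-CRK"], "manual": true},
    "e-s11-t215": {"id": "e-s11-t215", "ids": ["s11", "t215"],
                   "labels": ["O-SAM"], "manual": false},
    "e-t353": {"id": "e-t353", "ids": ["t353"],
               "labels": ["Z-LOC"], "manual": false}
  }
}
"""


def aligned_pairs(svala):
    """Return (source text, target text, labels) for every edge."""
    source = {tok["id"]: tok["text"].strip() for tok in svala["source"]}
    target = {tok["id"]: tok["text"].strip() for tok in svala["target"]}
    rows = []
    for edge in svala["edges"].values():
        # An edge may reference only one side, e.g. an inserted token.
        src = " ".join(source[i] for i in edge["ids"] if i in source)
        tgt = " ".join(target[i] for i in edge["ids"] if i in target)
        rows.append((src, tgt, edge["labels"]))
    return rows


rows = aligned_pairs(json.loads(SAMPLE))
# Keep only edges that carry at least one correction label.
corrections = [row for row in rows if row[2]]
```

Looking tokens up by id rather than by position keeps the sketch independent of token order, which matters here because the ids in the file are not consecutive.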

@@ -0,0 +1,963 @@
{
"source": [
{"id": "s0", "text": "Pero "},
{"id": "s1", "text": "Antič "},
{"id": "s2", "text": "je "},
{"id": "s3", "text": "najpomembnejši "},
{"id": "s4", "text": "človek "},
{"id": "s5", "text": "v "},
{"id": "s6", "text": "zgodovini "},
{"id": "s7", "text": "sporta "},
{"id": "s8", "text": "v "},
{"id": "s9", "text": "Makedoniji "},
{"id": "s10", "text": ". "},
{"id": "s11", "text": "To "},
{"id": "s12", "text": "je "},
{"id": "s13", "text": "moje "},
{"id": "s14", "text": "mnenje "},
{"id": "s15", "text": ". "},
{"id": "s16", "text": "Pero "},
{"id": "s17", "text": "Antič "},
{"id": "s18", "text": "je "},
{"id": "s19", "text": "makedonski "},
{"id": "s20", "text": "reprezentant "},
{"id": "s21", "text": ". "},
{"id": "s22", "text": "On "},
{"id": "s23", "text": "je "},
{"id": "s24", "text": "prvi "},
{"id": "s25", "text": "makedonski "},
{"id": "s26", "text": "košarkar "},
{"id": "s27", "text": "ki "},
{"id": "s28", "text": "je "},
{"id": "s29", "text": "igral "},
{"id": "s30", "text": "v "},
{"id": "s31", "text": "NBA "},
{"id": "s32", "text": ". "},
{"id": "s33", "text": "Igral "},
{"id": "s34", "text": "je "},
{"id": "s35", "text": "za "},
{"id": "s36", "text": "Atlanta "},
{"id": "s37", "text": "Hawks "},
{"id": "s38", "text": "in "},
{"id": "s39", "text": "je "},
{"id": "s40", "text": "tekmoval "},
{"id": "s41", "text": "3 "},
{"id": "s42", "text": "leta "},
{"id": "s43", "text": ". "},
{"id": "s44", "text": "Tudi "},
{"id": "s45", "text": "je "},
{"id": "s46", "text": "igral "},
{"id": "s47", "text": "za "},
{"id": "s48", "text": "Crveno "},
{"id": "s49", "text": "Zvezdo "},
{"id": "s50", "text": ", "},
{"id": "s51", "text": "Fenerbahče "},
{"id": "s52", "text": ", "},
{"id": "s53", "text": "Olimpijakos "},
{"id": "s54", "text": ". "},
{"id": "s55", "text": "Ko "},
{"id": "s56", "text": "je "},
{"id": "s57", "text": "tekmoval "},
{"id": "s58", "text": "za "},
{"id": "s59", "text": "Olimpijakos "},
{"id": "s60", "text": "dvakrat "},
{"id": "s61", "text": "je "},
{"id": "s62", "text": "osvojil "},
{"id": "s63", "text": "Euroligo "},
{"id": "s64", "text": ". "},
{"id": "s65", "text": "Pero "},
{"id": "s66", "text": "Antič "},
{"id": "s67", "text": "je "},
{"id": "s68", "text": "bil "},
{"id": "s69", "text": "med "},
{"id": "s70", "text": "najboljšimi "},
{"id": "s71", "text": "na "},
{"id": "s72", "text": "Evropskem "},
{"id": "s73", "text": "turniru "},
{"id": "s74", "text": "v "},
{"id": "s75", "text": "košarki "},
{"id": "s76", "text": "in "},
{"id": "s77", "text": "Makedonija "},
{"id": "s78", "text": "je "},
{"id": "s79", "text": "končala "},
{"id": "s80", "text": "na "},
{"id": "s81", "text": "4 "},
{"id": "s82", "text": "mestu "},
{"id": "s83", "text": ". "},
{"id": "s84", "text": "V "},
{"id": "s85", "text": "NBA "},
{"id": "s86", "text": "svoj "},
{"id": "s87", "text": "največi "},
{"id": "s88", "text": "uspeh "},
{"id": "s89", "text": "je "},
{"id": "s90", "text": "imal "},
{"id": "s91", "text": "kot "},
{"id": "s92", "text": "je "},
{"id": "s93", "text": "prišel "},
{"id": "s94", "text": "do "},
{"id": "s95", "text": "finale "},
{"id": "s96", "text": "Istočne "},
{"id": "s97", "text": "konferencije "},
{"id": "s98", "text": ", "},
{"id": "s99", "text": "ampak "},
{"id": "s100", "text": "Atlanta "},
{"id": "s101", "text": "je "},
{"id": "s102", "text": "izgubilo "},
{"id": "s103", "text": "to "},
{"id": "s104", "text": "serijo "},
{"id": "s105", "text": ". "},
{"id": "s106", "text": "Pero "},
{"id": "s107", "text": "Antič "},
{"id": "s108", "text": "je "},
{"id": "s109", "text": "skoraj "},
{"id": "s110", "text": "bil "},
{"id": "s111", "text": "izbran "},
{"id": "s112", "text": "za "},
{"id": "s113", "text": "presednik "},
{"id": "s114", "text": "makedonsko "},
{"id": "s115", "text": "košarkarsko "},
{"id": "s116", "text": "federacijo "},
{"id": "s117", "text": ". "}
],
"target": [
{"id": "t0", "text": "Pero "},
{"id": "t1", "text": "Antič "},
{"id": "t2", "text": "je "},
{"id": "t3", "text": "najpomembnejši "},
{"id": "t4", "text": "človek "},
{"id": "t5", "text": "v "},
{"id": "t6", "text": "zgodovini "},
{"id": "t119", "text": "športa "},
{"id": "t8", "text": "v "},
{"id": "t9", "text": "Makedoniji "},
{"id": "t10", "text": ". "},
{"id": "t11", "text": "To "},
{"id": "t12", "text": "je "},
{"id": "t13", "text": "moje "},
{"id": "t14", "text": "mnenje "},
{"id": "t15", "text": ". "},
{"id": "t16", "text": "Pero "},
{"id": "t17", "text": "Antič "},
{"id": "t18", "text": "je "},
{"id": "t19", "text": "makedonski "},
{"id": "t20", "text": "reprezentant "},
{"id": "t21", "text": ". "},
{"id": "t22", "text": "On "},
{"id": "t23", "text": "je "},
{"id": "t24", "text": "prvi "},
{"id": "t25", "text": "makedonski "},
{"id": "t26", "text": "košarkar "},
{"id": "t121", "text": ", "},
{"id": "t122", "text": "ki "},
{"id": "t28", "text": "je "},
{"id": "t29", "text": "igral "},
{"id": "t30", "text": "v "},
{"id": "t31", "text": "NBA "},
{"id": "t32", "text": ". "},
{"id": "t33", "text": "Igral "},
{"id": "t34", "text": "je "},
{"id": "t35", "text": "za "},
{"id": "t36", "text": "Atlanta "},
{"id": "t37", "text": "Hawks "},
{"id": "t38", "text": "in "},
{"id": "t39", "text": "je "},
{"id": "t40", "text": "tekmoval "},
{"id": "t41", "text": "3 "},
{"id": "t42", "text": "leta "},
{"id": "t43", "text": ". "},
{"id": "t128", "text": "Igral "},
{"id": "t132", "text": "je "},
{"id": "t135", "text": "tudi "},
{"id": "t137", "text": "za "},
{"id": "t48", "text": "Crveno "},
{"id": "t139", "text": "zvezdo "},
{"id": "t50", "text": ", "},
{"id": "t51", "text": "Fenerbahče "},
{"id": "t52", "text": ", "},
{"id": "t140", "text": "Olimpiakos "},
{"id": "t54", "text": ". "},
{"id": "t55", "text": "Ko "},
{"id": "t56", "text": "je "},
{"id": "t57", "text": "tekmoval "},
{"id": "t58", "text": "za "},
{"id": "t141", "text": "Olimpiakos "},
{"id": "t146", "text": ", "},
{"id": "t148", "text": "je "},
{"id": "t150", "text": "dvakrat "},
{"id": "t151", "text": "osvojil "},
{"id": "t207", "text": "Evroligo "},
{"id": "t64", "text": ". "},
{"id": "t65", "text": "Pero "},
{"id": "t66", "text": "Antič "},
{"id": "t67", "text": "je "},
{"id": "t68", "text": "bil "},
{"id": "t69", "text": "med "},
{"id": "t70", "text": "najboljšimi "},
{"id": "t71", "text": "na "},
{"id": "t153", "text": "evropskem "},
{"id": "t154", "text": "turnirju "},
{"id": "t74", "text": "v "},
{"id": "t75", "text": "košarki "},
{"id": "t76", "text": "in "},
{"id": "t77", "text": "Makedonija "},
{"id": "t78", "text": "je "},
{"id": "t79", "text": "končala "},
{"id": "t80", "text": "na "},
{"id": "t81", "text": "4 "},
{"id": "t156", "text": ". "},
{"id": "t157", "text": "mestu "},
{"id": "t83", "text": ". "},
{"id": "t84", "text": "V "},
{"id": "t85", "text": "NBA "},
{"id": "t161", "text": "je "},
{"id": "t160", "text": "svoj "},
{"id": "t162", "text": "največji "},
{"id": "t88", "text": "uspeh "},
{"id": "t165", "text": "imel "},
{"id": "t167", "text": ", "},
{"id": "t169", "text": "ko "},
{"id": "t92", "text": "je "},
{"id": "t93", "text": "prišel "},
{"id": "t94", "text": "do "},
{"id": "t171", "text": "finala "},
{"id": "t180", "text": "vzhodne "},
{"id": "t185", "text": "konference "},
{"id": "t98", "text": ", "},
{"id": "t99", "text": "ampak "},
{"id": "t100", "text": "Atlanta "},
{"id": "t101", "text": "je "},
{"id": "t102", "text": "izgubilo "},
{"id": "t103", "text": "to "},
{"id": "t104", "text": "serijo "},
{"id": "t105", "text": ". "},
{"id": "t106", "text": "Pero "},
{"id": "t107", "text": "Antič "},
{"id": "t108", "text": "je "},
{"id": "t189", "text": "pred "},
{"id": "t198", "text": "kratkim "},
{"id": "t192", "text": "bil "},
{"id": "t111", "text": "izbran "},
{"id": "t112", "text": "za "},
{"id": "t208", "text": "predsednika "},
{"id": "t201", "text": "makedonske "},
{"id": "t203", "text": "košarkarske "},
{"id": "t205", "text": "federacije "},
{"id": "t117", "text": ". "}
],
"edges": {
"e-s44-s45-s46-t128-t132-t135": {
"id": "e-s44-s45-s46-t128-t132-t135",
"ids": ["s44", "s45", "s46", "t128", "t132", "t135"],
"labels": ["S-BR"],
"manual": true
},
"e-s60-s61-t148-t150": {
"id": "e-s60-s61-t148-t150",
"ids": ["s60", "s61", "t148", "t150"],
"labels": ["S-BR"],
"manual": true
},
"e-s89-t161": {
"id": "e-s89-t161",
"ids": ["s89", "t161"],
"labels": ["S-BR"],
"manual": true
},
"e-s109-t189-t198": {
"id": "e-s109-t189-t198",
"ids": ["s109", "t189", "t198"],
"labels": ["B-OST"],
"manual": true
},
"e-s0-t0": {
"id": "e-s0-t0",
"ids": ["s0", "t0"],
"labels": [],
"manual": false
},
"e-s1-t1": {
"id": "e-s1-t1",
"ids": ["s1", "t1"],
"labels": [],
"manual": false
},
"e-s2-t2": {
"id": "e-s2-t2",
"ids": ["s2", "t2"],
"labels": [],
"manual": false
},
"e-s3-t3": {
"id": "e-s3-t3",
"ids": ["s3", "t3"],
"labels": [],
"manual": false
},
"e-s4-t4": {
"id": "e-s4-t4",
"ids": ["s4", "t4"],
"labels": [],
"manual": false
},
"e-s5-t5": {
"id": "e-s5-t5",
"ids": ["s5", "t5"],
"labels": [],
"manual": false
},
"e-s6-t6": {
"id": "e-s6-t6",
"ids": ["s6", "t6"],
"labels": [],
"manual": false
},
"e-s7-t119": {
"id": "e-s7-t119",
"ids": ["s7", "t119"],
"labels": ["Z-CRK"],
"manual": false
},
"e-s8-t8": {
"id": "e-s8-t8",
"ids": ["s8", "t8"],
"labels": [],
"manual": false
},
"e-s9-t9": {
"id": "e-s9-t9",
"ids": ["s9", "t9"],
"labels": [],
"manual": false
},
"e-s10-t10": {
"id": "e-s10-t10",
"ids": ["s10", "t10"],
"labels": [],
"manual": false
},
"e-s11-t11": {
"id": "e-s11-t11",
"ids": ["s11", "t11"],
"labels": [],
"manual": false
},
"e-s12-t12": {
"id": "e-s12-t12",
"ids": ["s12", "t12"],
"labels": [],
"manual": false
},
"e-s13-t13": {
"id": "e-s13-t13",
"ids": ["s13", "t13"],
"labels": [],
"manual": false
},
"e-s14-t14": {
"id": "e-s14-t14",
"ids": ["s14", "t14"],
"labels": [],
"manual": false
},
"e-s15-t15": {
"id": "e-s15-t15",
"ids": ["s15", "t15"],
"labels": [],
"manual": false
},
"e-s16-t16": {
"id": "e-s16-t16",
"ids": ["s16", "t16"],
"labels": [],
"manual": false
},
"e-s17-t17": {
"id": "e-s17-t17",
"ids": ["s17", "t17"],
"labels": [],
"manual": false
},
"e-s18-t18": {
"id": "e-s18-t18",
"ids": ["s18", "t18"],
"labels": [],
"manual": false
},
"e-s19-t19": {
"id": "e-s19-t19",
"ids": ["s19", "t19"],
"labels": [],
"manual": false
},
"e-s20-t20": {
"id": "e-s20-t20",
"ids": ["s20", "t20"],
"labels": [],
"manual": false
},
"e-s21-t21": {
"id": "e-s21-t21",
"ids": ["s21", "t21"],
"labels": [],
"manual": false
},
"e-s22-t22": {
"id": "e-s22-t22",
"ids": ["s22", "t22"],
"labels": [],
"manual": false
},
"e-s23-t23": {
"id": "e-s23-t23",
"ids": ["s23", "t23"],
"labels": [],
"manual": false
},
"e-s24-t24": {
"id": "e-s24-t24",
"ids": ["s24", "t24"],
"labels": [],
"manual": false
},
"e-s25-t25": {
"id": "e-s25-t25",
"ids": ["s25", "t25"],
"labels": [],
"manual": false
},
"e-s26-t26": {
"id": "e-s26-t26",
"ids": ["s26", "t26"],
"labels": [],
"manual": false
},
"e-s27-t122": {
"id": "e-s27-t122",
"ids": ["s27", "t122"],
"labels": [],
"manual": false
},
"e-s28-t28": {
"id": "e-s28-t28",
"ids": ["s28", "t28"],
"labels": [],
"manual": false
},
"e-s29-t29": {
"id": "e-s29-t29",
"ids": ["s29", "t29"],
"labels": [],
"manual": false
},
"e-s30-t30": {
"id": "e-s30-t30",
"ids": ["s30", "t30"],
"labels": [],
"manual": false
},
"e-s31-t31": {
"id": "e-s31-t31",
"ids": ["s31", "t31"],
"labels": [],
"manual": false
},
"e-s32-t32": {
"id": "e-s32-t32",
"ids": ["s32", "t32"],
"labels": [],
"manual": false
},
"e-s33-t33": {
"id": "e-s33-t33",
"ids": ["s33", "t33"],
"labels": [],
"manual": false
},
"e-s34-t34": {
"id": "e-s34-t34",
"ids": ["s34", "t34"],
"labels": [],
"manual": false
},
"e-s35-t35": {
"id": "e-s35-t35",
"ids": ["s35", "t35"],
"labels": [],
"manual": false
},
"e-s36-t36": {
"id": "e-s36-t36",
"ids": ["s36", "t36"],
"labels": [],
"manual": false
},
"e-s37-t37": {
"id": "e-s37-t37",
"ids": ["s37", "t37"],
"labels": [],
"manual": false
},
"e-s38-t38": {
"id": "e-s38-t38",
"ids": ["s38", "t38"],
"labels": [],
"manual": false
},
"e-s39-t39": {
"id": "e-s39-t39",
"ids": ["s39", "t39"],
"labels": [],
"manual": false
},
"e-s40-t40": {
"id": "e-s40-t40",
"ids": ["s40", "t40"],
"labels": [],
"manual": false
},
"e-s41-t41": {
"id": "e-s41-t41",
"ids": ["s41", "t41"],
"labels": [],
"manual": false
},
"e-s42-t42": {
"id": "e-s42-t42",
"ids": ["s42", "t42"],
"labels": [],
"manual": false
},
"e-s43-t43": {
"id": "e-s43-t43",
"ids": ["s43", "t43"],
"labels": [],
"manual": false
},
"e-s47-t137": {
"id": "e-s47-t137",
"ids": ["s47", "t137"],
"labels": [],
"manual": false
},
"e-s48-t48": {
"id": "e-s48-t48",
"ids": ["s48", "t48"],
"labels": [],
"manual": false
},
"e-s49-t139": {
"id": "e-s49-t139",
"ids": ["s49", "t139"],
"labels": ["Z-MV"],
"manual": false
},
"e-s50-t50": {
"id": "e-s50-t50",
"ids": ["s50", "t50"],
"labels": [],
"manual": false
},
"e-s51-t51": {
"id": "e-s51-t51",
"ids": ["s51", "t51"],
"labels": [],
"manual": false
},
"e-s52-t52": {
"id": "e-s52-t52",
"ids": ["s52", "t52"],
"labels": [],
"manual": false
},
"e-s53-t140": {
"id": "e-s53-t140",
"ids": ["s53", "t140"],
"labels": ["Z-CRK"],
"manual": false
},
"e-s54-t54": {
"id": "e-s54-t54",
"ids": ["s54", "t54"],
"labels": [],
"manual": false
},
"e-s55-t55": {
"id": "e-s55-t55",
"ids": ["s55", "t55"],
"labels": [],
"manual": false
},
"e-s56-t56": {
"id": "e-s56-t56",
"ids": ["s56", "t56"],
"labels": [],
"manual": false
},
"e-s57-t57": {
"id": "e-s57-t57",
"ids": ["s57", "t57"],
"labels": [],
"manual": false
},
"e-s58-t58": {
"id": "e-s58-t58",
"ids": ["s58", "t58"],
"labels": [],
"manual": false
},
"e-s59-t141": {
"id": "e-s59-t141",
"ids": ["s59", "t141"],
"labels": ["Z-CRK"],
"manual": false
},
"e-s62-t151": {
"id": "e-s62-t151",
"ids": ["s62", "t151"],
"labels": [],
"manual": false
},
"e-s63-t207": {
"id": "e-s63-t207",
"ids": ["s63", "t207"],
"labels": ["Z-CRK"],
"manual": false
},
"e-s64-t64": {
"id": "e-s64-t64",
"ids": ["s64", "t64"],
"labels": [],
"manual": false
},
"e-s65-t65": {
"id": "e-s65-t65",
"ids": ["s65", "t65"],
"labels": [],
"manual": false
},
"e-s66-t66": {
"id": "e-s66-t66",
"ids": ["s66", "t66"],
"labels": [],
"manual": false
},
"e-s67-t67": {
"id": "e-s67-t67",
"ids": ["s67", "t67"],
"labels": [],
"manual": false
},
"e-s68-t68": {
"id": "e-s68-t68",
"ids": ["s68", "t68"],
"labels": [],
"manual": false
},
"e-s69-t69": {
"id": "e-s69-t69",
"ids": ["s69", "t69"],
"labels": [],
"manual": false
},
"e-s70-t70": {
"id": "e-s70-t70",
"ids": ["s70", "t70"],
"labels": [],
"manual": false
},
"e-s71-t71": {
"id": "e-s71-t71",
"ids": ["s71", "t71"],
"labels": [],
"manual": false
},
"e-s72-t153": {
"id": "e-s72-t153",
"ids": ["s72", "t153"],
"labels": ["Z-MV"],
"manual": false
},
"e-s73-t154": {
"id": "e-s73-t154",
"ids": ["s73", "t154"],
"labels": ["O-SAM"],
"manual": false
},
"e-s74-t74": {
"id": "e-s74-t74",
"ids": ["s74", "t74"],
"labels": [],
"manual": false
},
"e-s75-t75": {
"id": "e-s75-t75",
"ids": ["s75", "t75"],
"labels": [],
"manual": false
},
"e-s76-t76": {
"id": "e-s76-t76",
"ids": ["s76", "t76"],
"labels": [],
"manual": false
},
"e-s77-t77": {
"id": "e-s77-t77",
"ids": ["s77", "t77"],
"labels": [],
"manual": false
},
"e-s78-t78": {
"id": "e-s78-t78",
"ids": ["s78", "t78"],
"labels": [],
"manual": false
},
"e-s79-t79": {
"id": "e-s79-t79",
"ids": ["s79", "t79"],
"labels": [],
"manual": false
},
"e-s80-t80": {
"id": "e-s80-t80",
"ids": ["s80", "t80"],
"labels": [],
"manual": false
},
"e-s81-t81": {
"id": "e-s81-t81",
"ids": ["s81", "t81"],
"labels": [],
"manual": false
},
"e-s82-t157": {
"id": "e-s82-t157",
"ids": ["s82", "t157"],
"labels": [],
"manual": false
},
"e-s83-t83": {
"id": "e-s83-t83",
"ids": ["s83", "t83"],
"labels": [],
"manual": false
},
"e-s84-t84": {
"id": "e-s84-t84",
"ids": ["s84", "t84"],
"labels": [],
"manual": false
},
"e-s85-t85": {
"id": "e-s85-t85",
"ids": ["s85", "t85"],
"labels": [],
"manual": false
},
"e-s86-t160": {
"id": "e-s86-t160",
"ids": ["s86", "t160"],
"labels": [],
"manual": false
},
"e-s87-t162": {
"id": "e-s87-t162",
"ids": ["s87", "t162"],
"labels": ["O-PRID"],
"manual": false
},
"e-s88-t88": {
"id": "e-s88-t88",
"ids": ["s88", "t88"],
"labels": [],
"manual": false
},
"e-s90-t165": {
"id": "e-s90-t165",
"ids": ["s90", "t165"],
"labels": ["O-GLAG"],
"manual": false
},
"e-s91-t169": {
"id": "e-s91-t169",
"ids": ["s91", "t169"],
"labels": ["B-VEZ"],
"manual": false
},
"e-s92-t92": {
"id": "e-s92-t92",
"ids": ["s92", "t92"],
"labels": [],
"manual": false
},
"e-s93-t93": {
"id": "e-s93-t93",
"ids": ["s93", "t93"],
"labels": [],
"manual": false
},
"e-s94-t94": {
"id": "e-s94-t94",
"ids": ["s94", "t94"],
"labels": [],
"manual": false
},
"e-s95-t171": {
"id": "e-s95-t171",
"ids": ["s95", "t171"],
"labels": ["O-SAM"],
"manual": false
},
"e-s96-t180": {
"id": "e-s96-t180",
"ids": ["s96", "t180"],
"labels": ["B-PRID"],
"manual": false
},
"e-s97-t185": {
"id": "e-s97-t185",
"ids": ["s97", "t185"],
"labels": ["B-SAM"],
"manual": false
},
"e-s98-t98": {
"id": "e-s98-t98",
"ids": ["s98", "t98"],
"labels": [],
"manual": false
},
"e-s99-t99": {
"id": "e-s99-t99",
"ids": ["s99", "t99"],
"labels": [],
"manual": false
},
"e-s100-t100": {
"id": "e-s100-t100",
"ids": ["s100", "t100"],
"labels": [],
"manual": false
},
"e-s101-t101": {
"id": "e-s101-t101",
"ids": ["s101", "t101"],
"labels": [],
"manual": false
},
"e-s102-t102": {
"id": "e-s102-t102",
"ids": ["s102", "t102"],
"labels": [],
"manual": false
},
"e-s103-t103": {
"id": "e-s103-t103",
"ids": ["s103", "t103"],
"labels": [],
"manual": false
},
"e-s104-t104": {
"id": "e-s104-t104",
"ids": ["s104", "t104"],
"labels": [],
"manual": false
},
"e-s105-t105": {
"id": "e-s105-t105",
"ids": ["s105", "t105"],
"labels": [],
"manual": false
},
"e-s106-t106": {
"id": "e-s106-t106",
"ids": ["s106", "t106"],
"labels": [],
"manual": false
},
"e-s107-t107": {
"id": "e-s107-t107",
"ids": ["s107", "t107"],
"labels": [],
"manual": false
},
"e-s108-t108": {
"id": "e-s108-t108",
"ids": ["s108", "t108"],
"labels": [],
"manual": false
},
"e-s110-t192": {
"id": "e-s110-t192",
"ids": ["s110", "t192"],
"labels": [],
"manual": false
},
"e-s111-t111": {
"id": "e-s111-t111",
"ids": ["s111", "t111"],
"labels": [],
"manual": false
},
"e-s112-t112": {
"id": "e-s112-t112",
"ids": ["s112", "t112"],
"labels": [],
"manual": false
},
"e-s113-t208": {
"id": "e-s113-t208",
"ids": ["s113", "t208"],
"labels": ["O-SAM", "Z-LOC"],
"manual": false
},
"e-s114-t201": {
"id": "e-s114-t201",
"ids": ["s114", "t201"],
"labels": ["O-PRID"],
"manual": false
},
"e-s115-t203": {
"id": "e-s115-t203",
"ids": ["s115", "t203"],
"labels": ["O-PRID"],
"manual": false
},
"e-s116-t205": {
"id": "e-s116-t205",
"ids": ["s116", "t205"],
"labels": ["O-PRID"],
"manual": false
},
"e-s117-t117": {
"id": "e-s117-t117",
"ids": ["s117", "t117"],
"labels": [],
"manual": false
},
"e-t121": {
"id": "e-t121",
"ids": ["t121"],
"labels": ["Z-LOC"],
"manual": false
},
"e-t146": {
"id": "e-t146",
"ids": ["t146"],
"labels": ["Z-LOC"],
"manual": false
},
"e-t156": {
"id": "e-t156",
"ids": ["t156"],
"labels": ["Z-LOC"],
"manual": false
},
"e-t167": {
"id": "e-t167",
"ids": ["t167"],
"labels": ["Z-LOC"],
"manual": false
}
}
}

@ -0,0 +1,551 @@
{
"source": [
{"id": "s0", "text": "Ko "},
{"id": "s1", "text": "sem "},
{"id": "s2", "text": "imela "},
{"id": "s3", "text": "štirinajst "},
{"id": "s4", "text": "let "},
{"id": "s5", "text": "sem "},
{"id": "s6", "text": "šla "},
{"id": "s7", "text": "v "},
{"id": "s8", "text": "Španijo "},
{"id": "s9", "text": "z "},
{"id": "s10", "text": "svojo "},
{"id": "s11", "text": "plesno "},
{"id": "s12", "text": "skupino "},
{"id": "s13", "text": "in "},
{"id": "s14", "text": "to "},
{"id": "s15", "text": "je "},
{"id": "s16", "text": "še "},
{"id": "s17", "text": "vedno "},
{"id": "s18", "text": "moje "},
{"id": "s19", "text": "najlepše "},
{"id": "s20", "text": "potovanje "},
{"id": "s21", "text": ". "},
{"id": "s22", "text": "Prenočevali "},
{"id": "s23", "text": "smo "},
{"id": "s24", "text": "v "},
{"id": "s25", "text": "lepom "},
{"id": "s26", "text": "hotelu "},
{"id": "s27", "text": "ampak "},
{"id": "s28", "text": "ne "},
{"id": "s29", "text": "morem "},
{"id": "s30", "text": "se "},
{"id": "s31", "text": "spomniti "},
{"id": "s32", "text": "kako "},
{"id": "s33", "text": "je "},
{"id": "s34", "text": "ime "},
{"id": "s35", "text": "mesta "},
{"id": "s36", "text": ". "},
{"id": "s37", "text": "Hrana "},
{"id": "s38", "text": "mi "},
{"id": "s39", "text": "je "},
{"id": "s40", "text": "bila "},
{"id": "s41", "text": "zelo "},
{"id": "s42", "text": "všeč "},
{"id": "s43", "text": "in "},
{"id": "s44", "text": "spomnim "},
{"id": "s45", "text": "se "},
{"id": "s46", "text": "sladoleda "},
{"id": "s47", "text": ", "},
{"id": "s48", "text": "ki "},
{"id": "s49", "text": "sem "},
{"id": "s50", "text": "tam "},
{"id": "s51", "text": "jedla "},
{"id": "s52", "text": ". "},
{"id": "s53", "text": "Vsak "},
{"id": "s54", "text": "dan "},
{"id": "s55", "text": "smo "},
{"id": "s56", "text": "šli "},
{"id": "s57", "text": "na "},
{"id": "s58", "text": "plažo "},
{"id": "s59", "text": "in "},
{"id": "s60", "text": "nisem "},
{"id": "s61", "text": "skrbila "},
{"id": "s62", "text": "za "},
{"id": "s63", "text": "svoje "},
{"id": "s64", "text": "probleme "},
{"id": "s65", "text": ". "}
],
"target": [
{"id": "t0", "text": "Ko "},
{"id": "t1", "text": "sem "},
{"id": "t2", "text": "imela "},
{"id": "t3", "text": "štirinajst "},
{"id": "t4", "text": "let "},
{"id": "t67", "text": ". "},
{"id": "t68", "text": "sem "},
{"id": "t6", "text": "šla "},
{"id": "t7", "text": "v "},
{"id": "t8", "text": "Španijo "},
{"id": "t70", "text": "s "},
{"id": "t10", "text": "svojo "},
{"id": "t11", "text": "plesno "},
{"id": "t12", "text": "skupino "},
{"id": "t13", "text": "in "},
{"id": "t14", "text": "to "},
{"id": "t15", "text": "je "},
{"id": "t16", "text": "še "},
{"id": "t17", "text": "vedno "},
{"id": "t18", "text": "moje "},
{"id": "t19", "text": "najlepše "},
{"id": "t20", "text": "potovanje "},
{"id": "t21", "text": ". "},
{"id": "t22", "text": "Prenočevali "},
{"id": "t23", "text": "smo "},
{"id": "t24", "text": "v "},
{"id": "t72", "text": "lepem "},
{"id": "t26", "text": "hotelu "},
{"id": "t74", "text": ", "},
{"id": "t75", "text": "ampak "},
{"id": "t28", "text": "ne "},
{"id": "t29", "text": "morem "},
{"id": "t30", "text": "se "},
{"id": "t31", "text": "spomniti "},
{"id": "t77", "text": ", "},
{"id": "t78", "text": "kako "},
{"id": "t33", "text": "je "},
{"id": "t34", "text": "ime "},
{"id": "t35", "text": "mesta "},
{"id": "t36", "text": ". "},
{"id": "t37", "text": "Hrana "},
{"id": "t38", "text": "mi "},
{"id": "t39", "text": "je "},
{"id": "t40", "text": "bila "},
{"id": "t41", "text": "zelo "},
{"id": "t42", "text": "všeč "},
{"id": "t43", "text": "in "},
{"id": "t44", "text": "spomnim "},
{"id": "t45", "text": "se "},
{"id": "t46", "text": "sladoleda "},
{"id": "t47", "text": ", "},
{"id": "t48", "text": "ki "},
{"id": "t49", "text": "sem "},
{"id": "t82", "text": "ga "},
{"id": "t81", "text": "tam "},
{"id": "t51", "text": "jedla "},
{"id": "t52", "text": ". "},
{"id": "t53", "text": "Vsak "},
{"id": "t54", "text": "dan "},
{"id": "t55", "text": "smo "},
{"id": "t56", "text": "šli "},
{"id": "t57", "text": "na "},
{"id": "t58", "text": "plažo "},
{"id": "t59", "text": "in "},
{"id": "t86", "text": "moji "},
{"id": "t96", "text": "problemi "},
{"id": "t100", "text": "me "},
{"id": "t106", "text": "niso "},
{"id": "t115", "text": "skrbeli "},
{"id": "t109", "text": ". "}
],
"edges": {
"e-s9-t70": {
"id": "e-s9-t70",
"ids": ["s9", "t70"],
"labels": ["Z-CRK"],
"manual": true
},
"e-s60-s61-s62-s63-s64-t100-t106-t115-t86-t96": {
"id": "e-s60-s61-s62-s63-s64-t100-t106-t115-t86-t96",
"ids": [
"s60",
"s61",
"s62",
"s63",
"s64",
"t100",
"t106",
"t115",
"t86",
"t96"
],
"labels": ["S-STR"],
"manual": true
},
"e-s0-t0": {
"id": "e-s0-t0",
"ids": ["s0", "t0"],
"labels": [],
"manual": false
},
"e-s1-t1": {
"id": "e-s1-t1",
"ids": ["s1", "t1"],
"labels": [],
"manual": false
},
"e-s2-t2": {
"id": "e-s2-t2",
"ids": ["s2", "t2"],
"labels": [],
"manual": false
},
"e-s3-t3": {
"id": "e-s3-t3",
"ids": ["s3", "t3"],
"labels": [],
"manual": false
},
"e-s4-t4": {
"id": "e-s4-t4",
"ids": ["s4", "t4"],
"labels": [],
"manual": false
},
"e-s5-t68": {
"id": "e-s5-t68",
"ids": ["s5", "t68"],
"labels": [],
"manual": false
},
"e-s6-t6": {
"id": "e-s6-t6",
"ids": ["s6", "t6"],
"labels": [],
"manual": false
},
"e-s7-t7": {
"id": "e-s7-t7",
"ids": ["s7", "t7"],
"labels": [],
"manual": false
},
"e-s8-t8": {
"id": "e-s8-t8",
"ids": ["s8", "t8"],
"labels": [],
"manual": false
},
"e-s10-t10": {
"id": "e-s10-t10",
"ids": ["s10", "t10"],
"labels": [],
"manual": false
},
"e-s11-t11": {
"id": "e-s11-t11",
"ids": ["s11", "t11"],
"labels": [],
"manual": false
},
"e-s12-t12": {
"id": "e-s12-t12",
"ids": ["s12", "t12"],
"labels": [],
"manual": false
},
"e-s13-t13": {
"id": "e-s13-t13",
"ids": ["s13", "t13"],
"labels": [],
"manual": false
},
"e-s14-t14": {
"id": "e-s14-t14",
"ids": ["s14", "t14"],
"labels": [],
"manual": false
},
"e-s15-t15": {
"id": "e-s15-t15",
"ids": ["s15", "t15"],
"labels": [],
"manual": false
},
"e-s16-t16": {
"id": "e-s16-t16",
"ids": ["s16", "t16"],
"labels": [],
"manual": false
},
"e-s17-t17": {
"id": "e-s17-t17",
"ids": ["s17", "t17"],
"labels": [],
"manual": false
},
"e-s18-t18": {
"id": "e-s18-t18",
"ids": ["s18", "t18"],
"labels": [],
"manual": false
},
"e-s19-t19": {
"id": "e-s19-t19",
"ids": ["s19", "t19"],
"labels": [],
"manual": false
},
"e-s20-t20": {
"id": "e-s20-t20",
"ids": ["s20", "t20"],
"labels": [],
"manual": false
},
"e-s21-t21": {
"id": "e-s21-t21",
"ids": ["s21", "t21"],
"labels": [],
"manual": false
},
"e-s22-t22": {
"id": "e-s22-t22",
"ids": ["s22", "t22"],
"labels": [],
"manual": false
},
"e-s23-t23": {
"id": "e-s23-t23",
"ids": ["s23", "t23"],
"labels": [],
"manual": false
},
"e-s24-t24": {
"id": "e-s24-t24",
"ids": ["s24", "t24"],
"labels": [],
"manual": false
},
"e-s25-t72": {
"id": "e-s25-t72",
"ids": ["s25", "t72"],
"labels": ["O-PRID"],
"manual": false
},
"e-s26-t26": {
"id": "e-s26-t26",
"ids": ["s26", "t26"],
"labels": [],
"manual": false
},
"e-s27-t75": {
"id": "e-s27-t75",
"ids": ["s27", "t75"],
"labels": [],
"manual": false
},
"e-s28-t28": {
"id": "e-s28-t28",
"ids": ["s28", "t28"],
"labels": [],
"manual": false
},
"e-s29-t29": {
"id": "e-s29-t29",
"ids": ["s29", "t29"],
"labels": [],
"manual": false
},
"e-s30-t30": {
"id": "e-s30-t30",
"ids": ["s30", "t30"],
"labels": [],
"manual": false
},
"e-s31-t31": {
"id": "e-s31-t31",
"ids": ["s31", "t31"],
"labels": [],
"manual": false
},
"e-s32-t78": {
"id": "e-s32-t78",
"ids": ["s32", "t78"],
"labels": [],
"manual": false
},
"e-s33-t33": {
"id": "e-s33-t33",
"ids": ["s33", "t33"],
"labels": [],
"manual": false
},
"e-s34-t34": {
"id": "e-s34-t34",
"ids": ["s34", "t34"],
"labels": [],
"manual": false
},
"e-s35-t35": {
"id": "e-s35-t35",
"ids": ["s35", "t35"],
"labels": [],
"manual": false
},
"e-s36-t36": {
"id": "e-s36-t36",
"ids": ["s36", "t36"],
"labels": [],
"manual": false
},
"e-s37-t37": {
"id": "e-s37-t37",
"ids": ["s37", "t37"],
"labels": [],
"manual": false
},
"e-s38-t38": {
"id": "e-s38-t38",
"ids": ["s38", "t38"],
"labels": [],
"manual": false
},
"e-s39-t39": {
"id": "e-s39-t39",
"ids": ["s39", "t39"],
"labels": [],
"manual": false
},
"e-s40-t40": {
"id": "e-s40-t40",
"ids": ["s40", "t40"],
"labels": [],
"manual": false
},
"e-s41-t41": {
"id": "e-s41-t41",
"ids": ["s41", "t41"],
"labels": [],
"manual": false
},
"e-s42-t42": {
"id": "e-s42-t42",
"ids": ["s42", "t42"],
"labels": [],
"manual": false
},
"e-s43-t43": {
"id": "e-s43-t43",
"ids": ["s43", "t43"],
"labels": [],
"manual": false
},
"e-s44-t44": {
"id": "e-s44-t44",
"ids": ["s44", "t44"],
"labels": [],
"manual": false
},
"e-s45-t45": {
"id": "e-s45-t45",
"ids": ["s45", "t45"],
"labels": [],
"manual": false
},
"e-s46-t46": {
"id": "e-s46-t46",
"ids": ["s46", "t46"],
"labels": [],
"manual": false
},
"e-s47-t47": {
"id": "e-s47-t47",
"ids": ["s47", "t47"],
"labels": [],
"manual": false
},
"e-s48-t48": {
"id": "e-s48-t48",
"ids": ["s48", "t48"],
"labels": [],
"manual": false
},
"e-s49-t49": {
"id": "e-s49-t49",
"ids": ["s49", "t49"],
"labels": [],
"manual": false
},
"e-s50-t81": {
"id": "e-s50-t81",
"ids": ["s50", "t81"],
"labels": [],
"manual": false
},
"e-s51-t51": {
"id": "e-s51-t51",
"ids": ["s51", "t51"],
"labels": [],
"manual": false
},
"e-s52-t52": {
"id": "e-s52-t52",
"ids": ["s52", "t52"],
"labels": [],
"manual": false
},
"e-s53-t53": {
"id": "e-s53-t53",
"ids": ["s53", "t53"],
"labels": [],
"manual": false
},
"e-s54-t54": {
"id": "e-s54-t54",
"ids": ["s54", "t54"],
"labels": [],
"manual": false
},
"e-s55-t55": {
"id": "e-s55-t55",
"ids": ["s55", "t55"],
"labels": [],
"manual": false
},
"e-s56-t56": {
"id": "e-s56-t56",
"ids": ["s56", "t56"],
"labels": [],
"manual": false
},
"e-s57-t57": {
"id": "e-s57-t57",
"ids": ["s57", "t57"],
"labels": [],
"manual": false
},
"e-s58-t58": {
"id": "e-s58-t58",
"ids": ["s58", "t58"],
"labels": [],
"manual": false
},
"e-s59-t59": {
"id": "e-s59-t59",
"ids": ["s59", "t59"],
"labels": [],
"manual": false
},
"e-s65-t109": {
"id": "e-s65-t109",
"ids": ["s65", "t109"],
"labels": [],
"manual": false
},
"e-t67": {
"id": "e-t67",
"ids": ["t67"],
"labels": ["Z-LOC"],
"manual": false
},
"e-t74": {
"id": "e-t74",
"ids": ["t74"],
"labels": ["Z-LOC"],
"manual": false
},
"e-t77": {
"id": "e-t77",
"ids": ["t77"],
"labels": ["Z-LOC"],
"manual": false
},
"e-t82": {
"id": "e-t82",
"ids": ["t82"],
"labels": ["S-IZP"],
"manual": false
}
}
}

@ -0,0 +1,832 @@
{
"source": [
{"id": "s0", "text": "Spoznali "},
{"id": "s1", "text": "smo "},
{"id": "s2", "text": "veliko "},
{"id": "s3", "text": "novih "},
{"id": "s4", "text": "oseb "},
{"id": "s5", "text": "iz "},
{"id": "s6", "text": "drugih "},
{"id": "s7", "text": "držav "},
{"id": "s8", "text": ", "},
{"id": "s9", "text": "ampak "},
{"id": "s10", "text": "to "},
{"id": "s11", "text": "potovanje "},
{"id": "s12", "text": "je "},
{"id": "s13", "text": "bilo "},
{"id": "s14", "text": "posebno "},
{"id": "s15", "text": ", "},
{"id": "s16", "text": "ker "},
{"id": "s17", "text": "sem "},
{"id": "s18", "text": "bila "},
{"id": "s19", "text": "z "},
{"id": "s20", "text": "svoji "},
{"id": "s21", "text": "prijatelji "},
{"id": "s22", "text": ", "},
{"id": "s23", "text": "s "},
{"id": "s24", "text": "kateri "},
{"id": "s25", "text": "se "},
{"id": "s26", "text": "ne "},
{"id": "s27", "text": "vidim "},
{"id": "s28", "text": "več "},
{"id": "s29", "text": ". "},
{"id": "s30", "text": "Po "},
{"id": "s31", "text": "povratku "},
{"id": "s32", "text": "doma "},
{"id": "s33", "text": "nam "},
{"id": "s34", "text": "se "},
{"id": "s35", "text": "pokvaril "},
{"id": "s36", "text": "avtobus "},
{"id": "s37", "text": "v "},
{"id": "s38", "text": "Italiji "},
{"id": "s39", "text": "in "},
{"id": "s40", "text": "smo "},
{"id": "s41", "text": "na "},
{"id": "s42", "text": "eni "},
{"id": "s43", "text": "bencinski "},
{"id": "s44", "text": "črpalki "},
{"id": "s45", "text": "dolgo "},
{"id": "s46", "text": "čakali "},
{"id": "s47", "text": "da "},
{"id": "s48", "text": "novi "},
{"id": "s49", "text": "avtobus "},
{"id": "s50", "text": "pride "},
{"id": "s51", "text": "po "},
{"id": "s52", "text": "nas "},
{"id": "s53", "text": ". "},
{"id": "s54", "text": "Bilo "},
{"id": "s55", "text": "je "},
{"id": "s56", "text": "hladno "},
{"id": "s57", "text": "in "},
{"id": "s58", "text": "bili "},
{"id": "s59", "text": "smo "},
{"id": "s60", "text": "lačni "},
{"id": "s61", "text": "in "},
{"id": "s62", "text": "utrujeni "},
{"id": "s63", "text": ", "},
{"id": "s64", "text": "ampak "},
{"id": "s65", "text": "smo "},
{"id": "s66", "text": "vsi "},
{"id": "s67", "text": "bili "},
{"id": "s68", "text": "skupaj "},
{"id": "s69", "text": ". "},
{"id": "s70", "text": "Ko "},
{"id": "s71", "text": "smo "},
{"id": "s72", "text": "čakali "},
{"id": "s73", "text": "na "},
{"id": "s74", "text": "bus "},
{"id": "s75", "text": ", "},
{"id": "s76", "text": "sem "},
{"id": "s77", "text": "bila "},
{"id": "s78", "text": "jezna "},
{"id": "s79", "text": "ker "},
{"id": "s80", "text": "sem "},
{"id": "s81", "text": "mislila "},
{"id": "s82", "text": "da "},
{"id": "s83", "text": "je "},
{"id": "s84", "text": "to "},
{"id": "s85", "text": "bil "},
{"id": "s86", "text": "slab "},
{"id": "s87", "text": "konec "},
{"id": "s88", "text": "mojih "},
{"id": "s89", "text": "počitnic "},
{"id": "s90", "text": ", "},
{"id": "s91", "text": "ampak "},
{"id": "s92", "text": "zdaj "},
{"id": "s93", "text": "mi "},
{"id": "s94", "text": "se "},
{"id": "s95", "text": "zdi "},
{"id": "s96", "text": "kot "},
{"id": "s97", "text": "ena "},
{"id": "s98", "text": "super "},
{"id": "s99", "text": "zgodba "},
{"id": "s100", "text": ". "}
],
"target": [
{"id": "t0", "text": "Spoznali "},
{"id": "t1", "text": "smo "},
{"id": "t2", "text": "veliko "},
{"id": "t3", "text": "novih "},
{"id": "t4", "text": "oseb "},
{"id": "t5", "text": "iz "},
{"id": "t6", "text": "drugih "},
{"id": "t7", "text": "držav "},
{"id": "t8", "text": ", "},
{"id": "t9", "text": "ampak "},
{"id": "t10", "text": "to "},
{"id": "t11", "text": "potovanje "},
{"id": "t12", "text": "je "},
{"id": "t13", "text": "bilo "},
{"id": "t14", "text": "posebno "},
{"id": "t15", "text": ", "},
{"id": "t16", "text": "ker "},
{"id": "t17", "text": "sem "},
{"id": "t102", "text": "bila "},
{"id": "t103", "text": "s "},
{"id": "t105", "text": "svojimi "},
{"id": "t21", "text": "prijatelji "},
{"id": "t22", "text": ", "},
{"id": "t23", "text": "s "},
{"id": "t107", "text": "katerimi "},
{"id": "t25", "text": "se "},
{"id": "t26", "text": "ne "},
{"id": "t27", "text": "vidim "},
{"id": "t28", "text": "več "},
{"id": "t29", "text": ". "},
{"id": "t30", "text": "Po "},
{"id": "t31", "text": "povratku "},
{"id": "t110", "text": "domov "},
{"id": "t114", "text": "se "},
{"id": "t113", "text": "nam "},
{"id": "t117", "text": "je "},
{"id": "t35", "text": "pokvaril "},
{"id": "t36", "text": "avtobus "},
{"id": "t37", "text": "v "},
{"id": "t38", "text": "Italiji "},
{"id": "t39", "text": "in "},
{"id": "t40", "text": "smo "},
{"id": "t41", "text": "na "},
{"id": "t121", "text": "neki "},
{"id": "t43", "text": "bencinski "},
{"id": "t44", "text": "črpalki "},
{"id": "t45", "text": "dolgo "},
{"id": "t46", "text": "čakali "},
{"id": "t123", "text": ", "},
{"id": "t124", "text": "da "},
{"id": "t48", "text": "novi "},
{"id": "t49", "text": "avtobus "},
{"id": "t50", "text": "pride "},
{"id": "t51", "text": "po "},
{"id": "t52", "text": "nas "},
{"id": "t53", "text": ". "},
{"id": "t54", "text": "Bilo "},
{"id": "t55", "text": "je "},
{"id": "t56", "text": "hladno "},
{"id": "t57", "text": "in "},
{"id": "t58", "text": "bili "},
{"id": "t59", "text": "smo "},
{"id": "t60", "text": "lačni "},
{"id": "t61", "text": "in "},
{"id": "t62", "text": "utrujeni "},
{"id": "t63", "text": ", "},
{"id": "t64", "text": "ampak "},
{"id": "t65", "text": "smo "},
{"id": "t66", "text": "vsi "},
{"id": "t67", "text": "bili "},
{"id": "t68", "text": "skupaj "},
{"id": "t69", "text": ". "},
{"id": "t70", "text": "Ko "},
{"id": "t71", "text": "smo "},
{"id": "t72", "text": "čakali "},
{"id": "t73", "text": "na "},
{"id": "t74", "text": "bus "},
{"id": "t75", "text": ", "},
{"id": "t76", "text": "sem "},
{"id": "t77", "text": "bila "},
{"id": "t78", "text": "jezna "},
{"id": "t126", "text": ", "},
{"id": "t127", "text": "ker "},
{"id": "t80", "text": "sem "},
{"id": "t81", "text": "mislila "},
{"id": "t129", "text": ", "},
{"id": "t130", "text": "da "},
{"id": "t83", "text": "je "},
{"id": "t84", "text": "to "},
{"id": "t85", "text": "bil "},
{"id": "t86", "text": "slab "},
{"id": "t87", "text": "konec "},
{"id": "t88", "text": "mojih "},
{"id": "t89", "text": "počitnic "},
{"id": "t90", "text": ", "},
{"id": "t91", "text": "ampak "},
{"id": "t92", "text": "zdaj "},
{"id": "t133", "text": "se "},
{"id": "t134", "text": "mi "},
{"id": "t135", "text": "zdi "},
{"id": "t96", "text": "kot "},
{"id": "t97", "text": "ena "},
{"id": "t98", "text": "super "},
{"id": "t99", "text": "zgodba "},
{"id": "t100", "text": ". "}
],
"edges": {
"e-s19-t103": {
"id": "e-s19-t103",
"ids": ["s19", "t103"],
"labels": ["Z-CRK"],
"manual": true
},
"e-t117": {
"id": "e-t117",
"ids": ["t117"],
"labels": ["S-IZP"],
"manual": true
},
"e-s33-s34-t113-t114": {
"id": "e-s33-s34-t113-t114",
"ids": ["s33", "s34", "t113", "t114"],
"labels": ["S-BR"],
"manual": true
},
"e-s93-s94-t133-t134": {
"id": "e-s93-s94-t133-t134",
"ids": ["s93", "s94", "t133", "t134"],
"labels": ["S-BR"],
"manual": true
},
"e-s0-t0": {
"id": "e-s0-t0",
"ids": ["s0", "t0"],
"labels": [],
"manual": false
},
"e-s1-t1": {
"id": "e-s1-t1",
"ids": ["s1", "t1"],
"labels": [],
"manual": false
},
"e-s2-t2": {
"id": "e-s2-t2",
"ids": ["s2", "t2"],
"labels": [],
"manual": false
},
"e-s3-t3": {
"id": "e-s3-t3",
"ids": ["s3", "t3"],
"labels": [],
"manual": false
},
"e-s4-t4": {
"id": "e-s4-t4",
"ids": ["s4", "t4"],
"labels": [],
"manual": false
},
"e-s5-t5": {
"id": "e-s5-t5",
"ids": ["s5", "t5"],
"labels": [],
"manual": false
},
"e-s6-t6": {
"id": "e-s6-t6",
"ids": ["s6", "t6"],
"labels": [],
"manual": false
},
"e-s7-t7": {
"id": "e-s7-t7",
"ids": ["s7", "t7"],
"labels": [],
"manual": false
},
"e-s8-t8": {
"id": "e-s8-t8",
"ids": ["s8", "t8"],
"labels": [],
"manual": false
},
"e-s9-t9": {
"id": "e-s9-t9",
"ids": ["s9", "t9"],
"labels": [],
"manual": false
},
"e-s10-t10": {
"id": "e-s10-t10",
"ids": ["s10", "t10"],
"labels": [],
"manual": false
},
"e-s11-t11": {
"id": "e-s11-t11",
"ids": ["s11", "t11"],
"labels": [],
"manual": false
},
"e-s12-t12": {
"id": "e-s12-t12",
"ids": ["s12", "t12"],
"labels": [],
"manual": false
},
"e-s13-t13": {
"id": "e-s13-t13",
"ids": ["s13", "t13"],
"labels": [],
"manual": false
},
"e-s14-t14": {
"id": "e-s14-t14",
"ids": ["s14", "t14"],
"labels": [],
"manual": false
},
"e-s15-t15": {
"id": "e-s15-t15",
"ids": ["s15", "t15"],
"labels": [],
"manual": false
},
"e-s16-t16": {
"id": "e-s16-t16",
"ids": ["s16", "t16"],
"labels": [],
"manual": false
},
"e-s17-t17": {
"id": "e-s17-t17",
"ids": ["s17", "t17"],
"labels": [],
"manual": false
},
"e-s18-t102": {
"id": "e-s18-t102",
"ids": ["s18", "t102"],
"labels": [],
"manual": false
},
"e-s20-t105": {
"id": "e-s20-t105",
"ids": ["s20", "t105"],
"labels": ["O-ZAIM"],
"manual": false
},
"e-s21-t21": {
"id": "e-s21-t21",
"ids": ["s21", "t21"],
"labels": [],
"manual": false
},
"e-s22-t22": {
"id": "e-s22-t22",
"ids": ["s22", "t22"],
"labels": [],
"manual": false
},
"e-s23-t23": {
"id": "e-s23-t23",
"ids": ["s23", "t23"],
"labels": [],
"manual": false
},
"e-s24-t107": {
"id": "e-s24-t107",
"ids": ["s24", "t107"],
"labels": ["O-ZAIM"],
"manual": false
},
"e-s25-t25": {
"id": "e-s25-t25",
"ids": ["s25", "t25"],
"labels": [],
"manual": false
},
"e-s26-t26": {
"id": "e-s26-t26",
"ids": ["s26", "t26"],
"labels": [],
"manual": false
},
"e-s27-t27": {
"id": "e-s27-t27",
"ids": ["s27", "t27"],
"labels": [],
"manual": false
},
"e-s28-t28": {
"id": "e-s28-t28",
"ids": ["s28", "t28"],
"labels": [],
"manual": false
},
"e-s29-t29": {
"id": "e-s29-t29",
"ids": ["s29", "t29"],
"labels": [],
"manual": false
},
"e-s30-t30": {
"id": "e-s30-t30",
"ids": ["s30", "t30"],
"labels": [],
"manual": false
},
"e-s31-t31": {
"id": "e-s31-t31",
"ids": ["s31", "t31"],
"labels": [],
"manual": false
},
"e-s32-t110": {
"id": "e-s32-t110",
"ids": ["s32", "t110"],
"labels": ["B-PRISL"],
"manual": false
},
"e-s35-t35": {
"id": "e-s35-t35",
"ids": ["s35", "t35"],
"labels": [],
"manual": false
},
"e-s36-t36": {
"id": "e-s36-t36",
"ids": ["s36", "t36"],
"labels": [],
"manual": false
},
"e-s37-t37": {
"id": "e-s37-t37",
"ids": ["s37", "t37"],
"labels": [],
"manual": false
},
"e-s38-t38": {
"id": "e-s38-t38",
"ids": ["s38", "t38"],
"labels": [],
"manual": false
},
"e-s39-t39": {
"id": "e-s39-t39",
"ids": ["s39", "t39"],
"labels": [],
"manual": false
},
"e-s40-t40": {
"id": "e-s40-t40",
"ids": ["s40", "t40"],
"labels": [],
"manual": false
},
"e-s41-t41": {
"id": "e-s41-t41",
"ids": ["s41", "t41"],
"labels": [],
"manual": false
},
"e-s42-t121": {
"id": "e-s42-t121",
"ids": ["s42", "t121"],
"labels": ["B-ZAIM"],
"manual": false
},
"e-s43-t43": {
"id": "e-s43-t43",
"ids": ["s43", "t43"],
"labels": [],
"manual": false
},
"e-s44-t44": {
"id": "e-s44-t44",
"ids": ["s44", "t44"],
"labels": [],
"manual": false
},
"e-s45-t45": {
"id": "e-s45-t45",
"ids": ["s45", "t45"],
"labels": [],
"manual": false
},
"e-s46-t46": {
"id": "e-s46-t46",
"ids": ["s46", "t46"],
"labels": [],
"manual": false
},
"e-s47-t124": {
"id": "e-s47-t124",
"ids": ["s47", "t124"],
"labels": [],
"manual": false
},
"e-s48-t48": {
"id": "e-s48-t48",
"ids": ["s48", "t48"],
"labels": [],
"manual": false
},
"e-s49-t49": {
"id": "e-s49-t49",
"ids": ["s49", "t49"],
"labels": [],
"manual": false
},
"e-s50-t50": {
"id": "e-s50-t50",
"ids": ["s50", "t50"],
"labels": [],
"manual": false
},
"e-s51-t51": {
"id": "e-s51-t51",
"ids": ["s51", "t51"],
"labels": [],
"manual": false
},
"e-s52-t52": {
"id": "e-s52-t52",
"ids": ["s52", "t52"],
"labels": [],
"manual": false
},
"e-s53-t53": {
"id": "e-s53-t53",
"ids": ["s53", "t53"],
"labels": [],
"manual": false
},
"e-s54-t54": {
"id": "e-s54-t54",
"ids": ["s54", "t54"],
"labels": [],
"manual": false
},
"e-s55-t55": {
"id": "e-s55-t55",
"ids": ["s55", "t55"],
"labels": [],
"manual": false
},
"e-s56-t56": {
"id": "e-s56-t56",
"ids": ["s56", "t56"],
"labels": [],
"manual": false
},
"e-s57-t57": {
"id": "e-s57-t57",
"ids": ["s57", "t57"],
"labels": [],
"manual": false
},
"e-s58-t58": {
"id": "e-s58-t58",
"ids": ["s58", "t58"],
"labels": [],
"manual": false
},
"e-s59-t59": {
"id": "e-s59-t59",
"ids": ["s59", "t59"],
"labels": [],
"manual": false
},
"e-s60-t60": {
"id": "e-s60-t60",
"ids": ["s60", "t60"],
"labels": [],
"manual": false
},
"e-s61-t61": {
"id": "e-s61-t61",
"ids": ["s61", "t61"],
"labels": [],
"manual": false
},
"e-s62-t62": {
"id": "e-s62-t62",
"ids": ["s62", "t62"],
"labels": [],
"manual": false
},
"e-s63-t63": {
"id": "e-s63-t63",
"ids": ["s63", "t63"],
"labels": [],
"manual": false
},
"e-s64-t64": {
"id": "e-s64-t64",
"ids": ["s64", "t64"],
"labels": [],
"manual": false
},
"e-s65-t65": {
"id": "e-s65-t65",
"ids": ["s65", "t65"],
"labels": [],
"manual": false
},
"e-s66-t66": {
"id": "e-s66-t66",
"ids": ["s66", "t66"],
"labels": [],
"manual": false
},
"e-s67-t67": {
"id": "e-s67-t67",
"ids": ["s67", "t67"],
"labels": [],
"manual": false
},
"e-s68-t68": {
"id": "e-s68-t68",
"ids": ["s68", "t68"],
"labels": [],
"manual": false
},
"e-s69-t69": {
"id": "e-s69-t69",
"ids": ["s69", "t69"],
"labels": [],
"manual": false
},
"e-s70-t70": {
"id": "e-s70-t70",
"ids": ["s70", "t70"],
"labels": [],
"manual": false
},
"e-s71-t71": {
"id": "e-s71-t71",
"ids": ["s71", "t71"],
"labels": [],
"manual": false
},
"e-s72-t72": {
"id": "e-s72-t72",
"ids": ["s72", "t72"],
"labels": [],
"manual": false
},
"e-s73-t73": {
"id": "e-s73-t73",
"ids": ["s73", "t73"],
"labels": [],
"manual": false
},
"e-s74-t74": {
"id": "e-s74-t74",
"ids": ["s74", "t74"],
"labels": [],
"manual": false
},
"e-s75-t75": {
"id": "e-s75-t75",
"ids": ["s75", "t75"],
"labels": [],
"manual": false
},
"e-s76-t76": {
"id": "e-s76-t76",
"ids": ["s76", "t76"],
"labels": [],
"manual": false
},
"e-s77-t77": {
"id": "e-s77-t77",
"ids": ["s77", "t77"],
"labels": [],
"manual": false
},
"e-s78-t78": {
"id": "e-s78-t78",
"ids": ["s78", "t78"],
"labels": [],
"manual": false
},
"e-s79-t127": {
"id": "e-s79-t127",
"ids": ["s79", "t127"],
"labels": [],
"manual": false
},
"e-s80-t80": {
"id": "e-s80-t80",
"ids": ["s80", "t80"],
"labels": [],
"manual": false
},
"e-s81-t81": {
"id": "e-s81-t81",
"ids": ["s81", "t81"],
"labels": [],
"manual": false
},
"e-s82-t130": {
"id": "e-s82-t130",
"ids": ["s82", "t130"],
"labels": [],
"manual": false
},
"e-s83-t83": {
"id": "e-s83-t83",
"ids": ["s83", "t83"],
"labels": [],
"manual": false
},
"e-s84-t84": {
"id": "e-s84-t84",
"ids": ["s84", "t84"],
"labels": [],
"manual": false
},
"e-s85-t85": {
"id": "e-s85-t85",
"ids": ["s85", "t85"],
"labels": [],
"manual": false
},
"e-s86-t86": {
"id": "e-s86-t86",
"ids": ["s86", "t86"],
"labels": [],
"manual": false
},
"e-s87-t87": {
"id": "e-s87-t87",
"ids": ["s87", "t87"],
"labels": [],
"manual": false
},
"e-s88-t88": {
"id": "e-s88-t88",
"ids": ["s88", "t88"],
"labels": [],
"manual": false
},
"e-s89-t89": {
"id": "e-s89-t89",
"ids": ["s89", "t89"],
"labels": [],
"manual": false
},
"e-s90-t90": {
"id": "e-s90-t90",
"ids": ["s90", "t90"],
"labels": [],
"manual": false
},
"e-s91-t91": {
"id": "e-s91-t91",
"ids": ["s91", "t91"],
"labels": [],
"manual": false
},
"e-s92-t92": {
"id": "e-s92-t92",
"ids": ["s92", "t92"],
"labels": [],
"manual": false
},
"e-s95-t135": {
"id": "e-s95-t135",
"ids": ["s95", "t135"],
"labels": [],
"manual": false
},
"e-s96-t96": {
"id": "e-s96-t96",
"ids": ["s96", "t96"],
"labels": [],
"manual": false
},
"e-s97-t97": {
"id": "e-s97-t97",
"ids": ["s97", "t97"],
"labels": [],
"manual": false
},
"e-s98-t98": {
"id": "e-s98-t98",
"ids": ["s98", "t98"],
"labels": [],
"manual": false
},
"e-s99-t99": {
"id": "e-s99-t99",
"ids": ["s99", "t99"],
"labels": [],
"manual": false
},
"e-s100-t100": {
"id": "e-s100-t100",
"ids": ["s100", "t100"],
"labels": [],
"manual": false
},
"e-t123": {
"id": "e-t123",
"ids": ["t123"],
"labels": ["Z-LOC"],
"manual": false
},
"e-t126": {
"id": "e-t126",
"ids": ["t126"],
"labels": ["Z-LOC"],
"manual": false
},
"e-t129": {
"id": "e-t129",
"ids": ["t129"],
"labels": ["Z-LOC"],
"manual": false
}
}
}

@ -0,0 +1,549 @@
{
"source": [
{"id": "s0", "text": "Zdaj "},
{"id": "s1", "text": "živim "},
{"id": "s2", "text": "v "},
{"id": "s3", "text": "Ljubljani "},
{"id": "s4", "text": ". "},
{"id": "s5", "text": "Moj "},
{"id": "s6", "text": "naslov "},
{"id": "s7", "text": "je "},
{"id": "s8", "text": "[XNaslovX] "},
{"id": "s9", "text": ". "},
{"id": "s10", "text": "Študiram "},
{"id": "s11", "text": "na "},
{"id": "s12", "text": "[XFakultetaX] "},
{"id": "s13", "text": ". "},
{"id": "s14", "text": "Študiram "},
{"id": "s15", "text": "[XStudijskaSmerX] "},
{"id": "s16", "text": "in "},
{"id": "s17", "text": "sem "},
{"id": "s18", "text": "1 "},
{"id": "s19", "text": "letnik "},
{"id": "s20", "text": "2 "},
{"id": "s21", "text": "stopnja "},
{"id": "s22", "text": ". "},
{"id": "s23", "text": "V "},
{"id": "s24", "text": "Ljubljani "},
{"id": "s25", "text": "sem "},
{"id": "s26", "text": "prišla "},
{"id": "s27", "text": "v "},
{"id": "s28", "text": "Septembru "},
{"id": "s29", "text": ". "},
{"id": "s30", "text": "Na "},
{"id": "s31", "text": "koncu "},
{"id": "s32", "text": "Septembra "},
{"id": "s33", "text": ". "},
{"id": "s34", "text": "Slovenija "},
{"id": "s35", "text": "mi "},
{"id": "s36", "text": "je "},
{"id": "s37", "text": "všeč "},
{"id": "s38", "text": ". "},
{"id": "s39", "text": "Tudi "},
{"id": "s40", "text": "Ljubjana "},
{"id": "s41", "text": ". "},
{"id": "s42", "text": "Dežala "},
{"id": "s43", "text": "je "},
{"id": "s44", "text": "zelo "},
{"id": "s45", "text": "slikovita "},
{"id": "s46", "text": ". "},
{"id": "s47", "text": "Ima "},
{"id": "s48", "text": "čudovito "},
{"id": "s49", "text": "naravo "},
{"id": "s50", "text": ". "},
{"id": "s51", "text": "Všeč "},
{"id": "s52", "text": "so "},
{"id": "s53", "text": "mi "},
{"id": "s54", "text": "centar "},
{"id": "s55", "text": "mesta "},
{"id": "s56", "text": ", "},
{"id": "s57", "text": "stari "},
{"id": "s58", "text": "grad "},
{"id": "s59", "text": ". "},
{"id": "s60", "text": "Vse "},
{"id": "s61", "text": "je "},
{"id": "s62", "text": "zeleno "},
{"id": "s63", "text": ", "},
{"id": "s64", "text": "hribe "},
{"id": "s65", "text": ", "},
{"id": "s66", "text": "reki "},
{"id": "s67", "text": ". "}
],
"target": [
{"id": "t0", "text": "Zdaj "},
{"id": "t1", "text": "živim "},
{"id": "t2", "text": "v "},
{"id": "t3", "text": "Ljubljani "},
{"id": "t4", "text": ". "},
{"id": "t5", "text": "Moj "},
{"id": "t6", "text": "naslov "},
{"id": "t7", "text": "je "},
{"id": "t8", "text": "[XNaslovX] "},
{"id": "t9", "text": ". "},
{"id": "t10", "text": "Študiram "},
{"id": "t11", "text": "na "},
{"id": "t12", "text": "[XFakultetaX] "},
{"id": "t13", "text": ". "},
{"id": "t14", "text": "Študiram "},
{"id": "t15", "text": "[XStudijskaSmerX] "},
{"id": "t16", "text": "in "},
{"id": "t17", "text": "sem "},
{"id": "t69", "text": "v "},
{"id": "t70", "text": "1 "},
{"id": "t72", "text": ". "},
{"id": "t74", "text": "letniku "},
{"id": "t20", "text": "2 "},
{"id": "t76", "text": ". "},
{"id": "t79", "text": "stopnje "},
{"id": "t22", "text": ". "},
{"id": "t23", "text": "V "},
{"id": "t81", "text": "Ljubljano "},
{"id": "t25", "text": "sem "},
{"id": "t26", "text": "prišla "},
{"id": "t27", "text": "v "},
{"id": "t83", "text": "septembru "},
{"id": "t29", "text": ". "},
{"id": "t30", "text": "Na "},
{"id": "t31", "text": "koncu "},
{"id": "t85", "text": "septembra "},
{"id": "t33", "text": ". "},
{"id": "t34", "text": "Slovenija "},
{"id": "t35", "text": "mi "},
{"id": "t36", "text": "je "},
{"id": "t37", "text": "všeč "},
{"id": "t38", "text": ". "},
{"id": "t39", "text": "Tudi "},
{"id": "t86", "text": "Ljubljana "},
{"id": "t41", "text": ". "},
{"id": "t88", "text": "Dežela "},
{"id": "t43", "text": "je "},
{"id": "t44", "text": "zelo "},
{"id": "t45", "text": "slikovita "},
{"id": "t46", "text": ". "},
{"id": "t47", "text": "Ima "},
{"id": "t48", "text": "čudovito "},
{"id": "t49", "text": "naravo "},
{"id": "t50", "text": ". "},
{"id": "t51", "text": "Všeč "},
{"id": "t52", "text": "so "},
{"id": "t53", "text": "mi "},
{"id": "t90", "text": "center "},
{"id": "t55", "text": "mesta "},
{"id": "t56", "text": ", "},
{"id": "t57", "text": "stari "},
{"id": "t58", "text": "grad "},
{"id": "t59", "text": ". "},
{"id": "t60", "text": "Vse "},
{"id": "t61", "text": "je "},
{"id": "t62", "text": "zeleno "},
{"id": "t63", "text": ", "},
{"id": "t92", "text": "hribi "},
{"id": "t65", "text": ", "},
{"id": "t94", "text": "reke "},
{"id": "t67", "text": ". "}
],
"edges": {
"e-s18-s19-s20-s21-t20-t69-t70-t72-t74-t76-t79": {
"id": "e-s18-s19-s20-s21-t20-t69-t70-t72-t74-t76-t79",
"ids": [
"s18",
"s19",
"s20",
"s21",
"t20",
"t69",
"t70",
"t72",
"t74",
"t76",
"t79"
],
"labels": ["O-SAM", "S-STR", "Z-LOC"],
"manual": true
},
"e-s0-t0": {
"id": "e-s0-t0",
"ids": ["s0", "t0"],
"labels": [],
"manual": false
},
"e-s1-t1": {
"id": "e-s1-t1",
"ids": ["s1", "t1"],
"labels": [],
"manual": false
},
"e-s2-t2": {
"id": "e-s2-t2",
"ids": ["s2", "t2"],
"labels": [],
"manual": false
},
"e-s3-t3": {
"id": "e-s3-t3",
"ids": ["s3", "t3"],
"labels": [],
"manual": false
},
"e-s4-t4": {
"id": "e-s4-t4",
"ids": ["s4", "t4"],
"labels": [],
"manual": false
},
"e-s5-t5": {
"id": "e-s5-t5",
"ids": ["s5", "t5"],
"labels": [],
"manual": false
},
"e-s6-t6": {
"id": "e-s6-t6",
"ids": ["s6", "t6"],
"labels": [],
"manual": false
},
"e-s7-t7": {
"id": "e-s7-t7",
"ids": ["s7", "t7"],
"labels": [],
"manual": false
},
"e-s8-t8": {
"id": "e-s8-t8",
"ids": ["s8", "t8"],
"labels": [],
"manual": false
},
"e-s9-t9": {
"id": "e-s9-t9",
"ids": ["s9", "t9"],
"labels": [],
"manual": false
},
"e-s10-t10": {
"id": "e-s10-t10",
"ids": ["s10", "t10"],
"labels": [],
"manual": false
},
"e-s11-t11": {
"id": "e-s11-t11",
"ids": ["s11", "t11"],
"labels": [],
"manual": false
},
"e-s12-t12": {
"id": "e-s12-t12",
"ids": ["s12", "t12"],
"labels": [],
"manual": false
},
"e-s13-t13": {
"id": "e-s13-t13",
"ids": ["s13", "t13"],
"labels": [],
"manual": false
},
"e-s14-t14": {
"id": "e-s14-t14",
"ids": ["s14", "t14"],
"labels": [],
"manual": false
},
"e-s15-t15": {
"id": "e-s15-t15",
"ids": ["s15", "t15"],
"labels": [],
"manual": false
},
"e-s16-t16": {
"id": "e-s16-t16",
"ids": ["s16", "t16"],
"labels": [],
"manual": false
},
"e-s17-t17": {
"id": "e-s17-t17",
"ids": ["s17", "t17"],
"labels": [],
"manual": false
},
"e-s22-t22": {
"id": "e-s22-t22",
"ids": ["s22", "t22"],
"labels": [],
"manual": false
},
"e-s23-t23": {
"id": "e-s23-t23",
"ids": ["s23", "t23"],
"labels": [],
"manual": false
},
"e-s24-t81": {
"id": "e-s24-t81",
"ids": ["s24", "t81"],
"labels": ["O-SAM"],
"manual": false
},
"e-s25-t25": {
"id": "e-s25-t25",
"ids": ["s25", "t25"],
"labels": [],
"manual": false
},
"e-s26-t26": {
"id": "e-s26-t26",
"ids": ["s26", "t26"],
"labels": [],
"manual": false
},
"e-s27-t27": {
"id": "e-s27-t27",
"ids": ["s27", "t27"],
"labels": [],
"manual": false
},
"e-s28-t83": {
"id": "e-s28-t83",
"ids": ["s28", "t83"],
"labels": ["Z-MV"],
"manual": false
},
"e-s29-t29": {
"id": "e-s29-t29",
"ids": ["s29", "t29"],
"labels": [],
"manual": false
},
"e-s30-t30": {
"id": "e-s30-t30",
"ids": ["s30", "t30"],
"labels": [],
"manual": false
},
"e-s31-t31": {
"id": "e-s31-t31",
"ids": ["s31", "t31"],
"labels": [],
"manual": false
},
"e-s32-t85": {
"id": "e-s32-t85",
"ids": ["s32", "t85"],
"labels": ["Z-MV"],
"manual": false
},
"e-s33-t33": {
"id": "e-s33-t33",
"ids": ["s33", "t33"],
"labels": [],
"manual": false
},
"e-s34-t34": {
"id": "e-s34-t34",
"ids": ["s34", "t34"],
"labels": [],
"manual": false
},
"e-s35-t35": {
"id": "e-s35-t35",
"ids": ["s35", "t35"],
"labels": [],
"manual": false
},
"e-s36-t36": {
"id": "e-s36-t36",
"ids": ["s36", "t36"],
"labels": [],
"manual": false
},
"e-s37-t37": {
"id": "e-s37-t37",
"ids": ["s37", "t37"],
"labels": [],
"manual": false
},
"e-s38-t38": {
"id": "e-s38-t38",
"ids": ["s38", "t38"],
"labels": [],
"manual": false
},
"e-s39-t39": {
"id": "e-s39-t39",
"ids": ["s39", "t39"],
"labels": [],
"manual": false
},
"e-s40-t86": {
"id": "e-s40-t86",
"ids": ["s40", "t86"],
"labels": ["Z-CRK"],
"manual": false
},
"e-s41-t41": {
"id": "e-s41-t41",
"ids": ["s41", "t41"],
"labels": [],
"manual": false
},
"e-s42-t88": {
"id": "e-s42-t88",
"ids": ["s42", "t88"],
"labels": ["B-SAM"],
"manual": false
},
"e-s43-t43": {
"id": "e-s43-t43",
"ids": ["s43", "t43"],
"labels": [],
"manual": false
},
"e-s44-t44": {
"id": "e-s44-t44",
"ids": ["s44", "t44"],
"labels": [],
"manual": false
},
"e-s45-t45": {
"id": "e-s45-t45",
"ids": ["s45", "t45"],
"labels": [],
"manual": false
},
"e-s46-t46": {
"id": "e-s46-t46",
"ids": ["s46", "t46"],
"labels": [],
"manual": false
},
"e-s47-t47": {
"id": "e-s47-t47",
"ids": ["s47", "t47"],
"labels": [],
"manual": false
},
"e-s48-t48": {
"id": "e-s48-t48",
"ids": ["s48", "t48"],
"labels": [],
"manual": false
},
"e-s49-t49": {
"id": "e-s49-t49",
"ids": ["s49", "t49"],
"labels": [],
"manual": false
},
"e-s50-t50": {
"id": "e-s50-t50",
"ids": ["s50", "t50"],
"labels": [],
"manual": false
},
"e-s51-t51": {
"id": "e-s51-t51",
"ids": ["s51", "t51"],
"labels": [],
"manual": false
},
"e-s52-t52": {
"id": "e-s52-t52",
"ids": ["s52", "t52"],
"labels": [],
"manual": false
},
"e-s53-t53": {
"id": "e-s53-t53",
"ids": ["s53", "t53"],
"labels": [],
"manual": false
},
"e-s54-t90": {
"id": "e-s54-t90",
"ids": ["s54", "t90"],
"labels": ["B-SAM"],
"manual": false
},
"e-s55-t55": {
"id": "e-s55-t55",
"ids": ["s55", "t55"],
"labels": [],
"manual": false
},
"e-s56-t56": {
"id": "e-s56-t56",
"ids": ["s56", "t56"],
"labels": [],
"manual": false
},
"e-s57-t57": {
"id": "e-s57-t57",
"ids": ["s57", "t57"],
"labels": [],
"manual": false
},
"e-s58-t58": {
"id": "e-s58-t58",
"ids": ["s58", "t58"],
"labels": [],
"manual": false
},
"e-s59-t59": {
"id": "e-s59-t59",
"ids": ["s59", "t59"],
"labels": [],
"manual": false
},
"e-s60-t60": {
"id": "e-s60-t60",
"ids": ["s60", "t60"],
"labels": [],
"manual": false
},
"e-s61-t61": {
"id": "e-s61-t61",
"ids": ["s61", "t61"],
"labels": [],
"manual": false
},
"e-s62-t62": {
"id": "e-s62-t62",
"ids": ["s62", "t62"],
"labels": [],
"manual": false
},
"e-s63-t63": {
"id": "e-s63-t63",
"ids": ["s63", "t63"],
"labels": [],
"manual": false
},
"e-s64-t92": {
"id": "e-s64-t92",
"ids": ["s64", "t92"],
"labels": ["O-SAM"],
"manual": false
},
"e-s65-t65": {
"id": "e-s65-t65",
"ids": ["s65", "t65"],
"labels": [],
"manual": false
},
"e-s66-t94": {
"id": "e-s66-t94",
"ids": ["s66", "t94"],
"labels": ["O-SAM"],
"manual": false
},
"e-s67-t67": {
"id": "e-s67-t67",
"ids": ["s67", "t67"],
"labels": [],
"manual": false
}
}
}

@ -0,0 +1,184 @@
{
"source": [
{"id": "s0", "text": "Nije "},
{"id": "s1", "text": "mi "},
{"id": "s2", "text": "všeč "},
{"id": "s3", "text": "vreme "},
{"id": "s4", "text": ". "},
{"id": "s5", "text": "Pogosto "},
{"id": "s6", "text": "dždi "},
{"id": "s7", "text": "ali "},
{"id": "s8", "text": "je "},
{"id": "s9", "text": "maglevito "},
{"id": "s10", "text": ". "},
{"id": "s11", "text": "Preveč "},
{"id": "s12", "text": "ima "},
{"id": "s13", "text": "promet "},
{"id": "s14", "text": "po "},
{"id": "s15", "text": "cesti "},
{"id": "s16", "text": ". "},
{"id": "s17", "text": "Hrana "},
{"id": "s18", "text": "je "},
{"id": "s19", "text": "brez "},
{"id": "s20", "text": "okusa "},
{"id": "s21", "text": ". "}
],
"target": [
{"id": "t23", "text": "Ni "},
{"id": "t1", "text": "mi "},
{"id": "t2", "text": "všeč "},
{"id": "t3", "text": "vreme "},
{"id": "t4", "text": ". "},
{"id": "t5", "text": "Pogosto "},
{"id": "t29", "text": "dežuje "},
{"id": "t7", "text": "ali "},
{"id": "t8", "text": "je "},
{"id": "t37", "text": "megleno "},
{"id": "t10", "text": ". "},
{"id": "t11", "text": "Preveč "},
{"id": "t40", "text": "je "},
{"id": "t45", "text": "prometa "},
{"id": "t46", "text": "po "},
{"id": "t15", "text": "cesti "},
{"id": "t16", "text": ". "},
{"id": "t17", "text": "Hrana "},
{"id": "t18", "text": "je "},
{"id": "t19", "text": "brez "},
{"id": "t20", "text": "okusa "},
{"id": "t21", "text": ". "}
],
"edges": {
"e-s12-t40": {
"id": "e-s12-t40",
"ids": ["s12", "t40"],
"labels": ["B-GLAG"],
"manual": true
},
"e-s0-t23": {
"id": "e-s0-t23",
"ids": ["s0", "t23"],
"labels": ["O-GLAG"],
"manual": false
},
"e-s1-t1": {
"id": "e-s1-t1",
"ids": ["s1", "t1"],
"labels": [],
"manual": false
},
"e-s2-t2": {
"id": "e-s2-t2",
"ids": ["s2", "t2"],
"labels": [],
"manual": false
},
"e-s3-t3": {
"id": "e-s3-t3",
"ids": ["s3", "t3"],
"labels": [],
"manual": false
},
"e-s4-t4": {
"id": "e-s4-t4",
"ids": ["s4", "t4"],
"labels": [],
"manual": false
},
"e-s5-t5": {
"id": "e-s5-t5",
"ids": ["s5", "t5"],
"labels": [],
"manual": false
},
"e-s6-t29": {
"id": "e-s6-t29",
"ids": ["s6", "t29"],
"labels": ["B-GLAG"],
"manual": false
},
"e-s7-t7": {
"id": "e-s7-t7",
"ids": ["s7", "t7"],
"labels": [],
"manual": false
},
"e-s8-t8": {
"id": "e-s8-t8",
"ids": ["s8", "t8"],
"labels": [],
"manual": false
},
"e-s9-t37": {
"id": "e-s9-t37",
"ids": ["s9", "t37"],
"labels": ["B-PRISL"],
"manual": false
},
"e-s10-t10": {
"id": "e-s10-t10",
"ids": ["s10", "t10"],
"labels": [],
"manual": false
},
"e-s11-t11": {
"id": "e-s11-t11",
"ids": ["s11", "t11"],
"labels": [],
"manual": false
},
"e-s13-t45": {
"id": "e-s13-t45",
"ids": ["s13", "t45"],
"labels": ["O-SAM"],
"manual": false
},
"e-s14-t46": {
"id": "e-s14-t46",
"ids": ["s14", "t46"],
"labels": [],
"manual": false
},
"e-s15-t15": {
"id": "e-s15-t15",
"ids": ["s15", "t15"],
"labels": [],
"manual": false
},
"e-s16-t16": {
"id": "e-s16-t16",
"ids": ["s16", "t16"],
"labels": [],
"manual": false
},
"e-s17-t17": {
"id": "e-s17-t17",
"ids": ["s17", "t17"],
"labels": [],
"manual": false
},
"e-s18-t18": {
"id": "e-s18-t18",
"ids": ["s18", "t18"],
"labels": [],
"manual": false
},
"e-s19-t19": {
"id": "e-s19-t19",
"ids": ["s19", "t19"],
"labels": [],
"manual": false
},
"e-s20-t20": {
"id": "e-s20-t20",
"ids": ["s20", "t20"],
"labels": [],
"manual": false
},
"e-s21-t21": {
"id": "e-s21-t21",
"ids": ["s21", "t21"],
"labels": [],
"manual": false
}
}
}

@ -0,0 +1,234 @@
{
"source": [
{"id": "s0", "text": "Kolegi "},
{"id": "s1", "text": "so "},
{"id": "s2", "text": "mi "},
{"id": "s3", "text": "Slovenci "},
{"id": "s4", "text": ", "},
{"id": "s5", "text": "samo "},
{"id": "s6", "text": "jaz "},
{"id": "s7", "text": "sem "},
{"id": "s8", "text": "iz "},
{"id": "s9", "text": "Makedonije "},
{"id": "s10", "text": ". "},
{"id": "s11", "text": "Oni "},
{"id": "s12", "text": "so "},
{"id": "s13", "text": "zelo "},
{"id": "s14", "text": "dobri "},
{"id": "s15", "text": ". "},
{"id": "s16", "text": "Pomagajo "},
{"id": "s17", "text": "mi "},
{"id": "s18", "text": ". "},
{"id": "s19", "text": "Slovenci "},
{"id": "s20", "text": "so "},
{"id": "s21", "text": "gostoljubivi "},
{"id": "s22", "text": ". "},
{"id": "s23", "text": "Niso "},
{"id": "s24", "text": "visoki "},
{"id": "s25", "text": "preveč "},
{"id": "s26", "text": "Slovenci "},
{"id": "s27", "text": ", "},
{"id": "s28", "text": "Slovenke "},
{"id": "s29", "text": "so "},
{"id": "s30", "text": "nizke "},
{"id": "s31", "text": ". "}
],
"target": [
{"id": "t36", "text": "Moji "},
{"id": "t39", "text": "kolegi "},
{"id": "t1", "text": "so "},
{"id": "t40", "text": "Slovenci "},
{"id": "t4", "text": ", "},
{"id": "t5", "text": "samo "},
{"id": "t6", "text": "jaz "},
{"id": "t7", "text": "sem "},
{"id": "t8", "text": "iz "},
{"id": "t9", "text": "Makedonije "},
{"id": "t10", "text": ". "},
{"id": "t11", "text": "Oni "},
{"id": "t12", "text": "so "},
{"id": "t13", "text": "zelo "},
{"id": "t14", "text": "dobri "},
{"id": "t15", "text": ". "},
{"id": "t16", "text": "Pomagajo "},
{"id": "t17", "text": "mi "},
{"id": "t18", "text": ". "},
{"id": "t19", "text": "Slovenci "},
{"id": "t20", "text": "so "},
{"id": "t43", "text": "gostoljubni "},
{"id": "t45", "text": ". "},
{"id": "t53", "text": "Slovenci "},
{"id": "t55", "text": "niso "},
{"id": "t63", "text": "preveč "},
{"id": "t64", "text": "visoki "},
{"id": "t65", "text": ", "},
{"id": "t28", "text": "Slovenke "},
{"id": "t29", "text": "so "},
{"id": "t30", "text": "nizke "},
{"id": "t31", "text": ". "}
],
"edges": {
"e-s0-s1-s2-t1-t36-t39": {
"id": "e-s0-s1-s2-t1-t36-t39",
"ids": ["s0", "s1", "s2", "t1", "t36", "t39"],
"labels": ["S-STR"],
"manual": true
},
"e-s23-s24-s25-s26-t53-t55-t63-t64": {
"id": "e-s23-s24-s25-s26-t53-t55-t63-t64",
"ids": ["s23", "s24", "s25", "s26", "t53", "t55", "t63", "t64"],
"labels": ["S-BR"],
"manual": true
},
"e-s3-t40": {
"id": "e-s3-t40",
"ids": ["s3", "t40"],
"labels": [],
"manual": false
},
"e-s4-t4": {
"id": "e-s4-t4",
"ids": ["s4", "t4"],
"labels": [],
"manual": false
},
"e-s5-t5": {
"id": "e-s5-t5",
"ids": ["s5", "t5"],
"labels": [],
"manual": false
},
"e-s6-t6": {
"id": "e-s6-t6",
"ids": ["s6", "t6"],
"labels": [],
"manual": false
},
"e-s7-t7": {
"id": "e-s7-t7",
"ids": ["s7", "t7"],
"labels": [],
"manual": false
},
"e-s8-t8": {
"id": "e-s8-t8",
"ids": ["s8", "t8"],
"labels": [],
"manual": false
},
"e-s9-t9": {
"id": "e-s9-t9",
"ids": ["s9", "t9"],
"labels": [],
"manual": false
},
"e-s10-t10": {
"id": "e-s10-t10",
"ids": ["s10", "t10"],
"labels": [],
"manual": false
},
"e-s11-t11": {
"id": "e-s11-t11",
"ids": ["s11", "t11"],
"labels": [],
"manual": false
},
"e-s12-t12": {
"id": "e-s12-t12",
"ids": ["s12", "t12"],
"labels": [],
"manual": false
},
"e-s13-t13": {
"id": "e-s13-t13",
"ids": ["s13", "t13"],
"labels": [],
"manual": false
},
"e-s14-t14": {
"id": "e-s14-t14",
"ids": ["s14", "t14"],
"labels": [],
"manual": false
},
"e-s15-t15": {
"id": "e-s15-t15",
"ids": ["s15", "t15"],
"labels": [],
"manual": false
},
"e-s16-t16": {
"id": "e-s16-t16",
"ids": ["s16", "t16"],
"labels": [],
"manual": false
},
"e-s17-t17": {
"id": "e-s17-t17",
"ids": ["s17", "t17"],
"labels": [],
"manual": false
},
"e-s18-t18": {
"id": "e-s18-t18",
"ids": ["s18", "t18"],
"labels": [],
"manual": false
},
"e-s19-t19": {
"id": "e-s19-t19",
"ids": ["s19", "t19"],
"labels": [],
"manual": false
},
"e-s20-t20": {
"id": "e-s20-t20",
"ids": ["s20", "t20"],
"labels": [],
"manual": false
},
"e-s21-t43": {
"id": "e-s21-t43",
"ids": ["s21", "t43"],
"labels": ["B-PRID"],
"manual": false
},
"e-s22-t45": {
"id": "e-s22-t45",
"ids": ["s22", "t45"],
"labels": [],
"manual": false
},
"e-s27-t65": {
"id": "e-s27-t65",
"ids": ["s27", "t65"],
"labels": [],
"manual": false
},
"e-s28-t28": {
"id": "e-s28-t28",
"ids": ["s28", "t28"],
"labels": [],
"manual": false
},
"e-s29-t29": {
"id": "e-s29-t29",
"ids": ["s29", "t29"],
"labels": [],
"manual": false
},
"e-s30-t30": {
"id": "e-s30-t30",
"ids": ["s30", "t30"],
"labels": [],
"manual": false
},
"e-s31-t31": {
"id": "e-s31-t31",
"ids": ["s31", "t31"],
"labels": [],
"manual": false
}
}
}

File diff suppressed because it is too large

File diff suppressed because it is too large

@ -0,0 +1,884 @@
{
"source": [
{"id": "s0", "text": "Imam "},
{"id": "s1", "text": "dve "},
{"id": "s2", "text": "nečakinje "},
{"id": "s3", "text": ", "},
{"id": "s4", "text": "ena "},
{"id": "s5", "text": "je "},
{"id": "s6", "text": "na "},
{"id": "s7", "text": "dva "},
{"id": "s8", "text": "leta "},
{"id": "s9", "text": "starejša "},
{"id": "s10", "text": "od "},
{"id": "s11", "text": "druge "},
{"id": "s12", "text": ". "},
{"id": "s135", "text": "[XImeX] "},
{"id": "s14", "text": "je "},
{"id": "s15", "text": "starejša "},
{"id": "s16", "text": ", "},
{"id": "s17", "text": "sedaj "},
{"id": "s18", "text": "že "},
{"id": "s19", "text": "je "},
{"id": "s20", "text": "5 "},
{"id": "s21", "text": "let "},
{"id": "s22", "text": "stara "},
{"id": "s23", "text": "in "},
{"id": "s24", "text": "njena "},
{"id": "s25", "text": "mama "},
{"id": "s26", "text": "je "},
{"id": "s27", "text": "vedno "},
{"id": "s28", "text": "vklopila "},
{"id": "s29", "text": "risanke "},
{"id": "s30", "text": "med "},
{"id": "s31", "text": "obrokom "},
{"id": "s32", "text": ", "},
{"id": "s33", "text": "potem "},
{"id": "s34", "text": "pa "},
{"id": "s35", "text": "dala "},
{"id": "s36", "text": "igrati "},
{"id": "s37", "text": "s "},
{"id": "s38", "text": "tablico "},
{"id": "s39", "text": "tekom "},
{"id": "s40", "text": "dneva "},
{"id": "s41", "text": ". "},
{"id": "s42", "text": "S "},
{"id": "s43", "text": "časom "},
{"id": "s44", "text": "je "},
{"id": "s45", "text": "zasledila "},
{"id": "s46", "text": ", "},
{"id": "s47", "text": "da "},
{"id": "s48", "text": "otrok "},
{"id": "s49", "text": "ne "},
{"id": "s50", "text": "more "},
{"id": "s51", "text": "jesti "},
{"id": "s52", "text": "brez "},
{"id": "s53", "text": "vklopljene "},
{"id": "s54", "text": "risanke "},
{"id": "s55", "text": "in "},
{"id": "s56", "text": "zelo "},
{"id": "s57", "text": "veliko "},
{"id": "s58", "text": "časa "},
{"id": "s59", "text": "igra "},
{"id": "s60", "text": "videoigre "},
{"id": "s61", "text": "in "},
{"id": "s62", "text": "ne "},
{"id": "s63", "text": "želi "},
{"id": "s64", "text": "iti "},
{"id": "s65", "text": "ven "},
{"id": "s66", "text": ". "},
{"id": "s67", "text": "Ko "},
{"id": "s68", "text": "se "},
{"id": "s69", "text": "je "},
{"id": "s70", "text": "pojavil "},
{"id": "s71", "text": "drugi "},
{"id": "s72", "text": "otrok "},
{"id": "s73", "text": ", "},
{"id": "s74", "text": "poskusila "},
{"id": "s75", "text": "odstraniti "},
{"id": "s76", "text": "ta "},
{"id": "s77", "text": "problem "},
{"id": "s78", "text": "in "},
{"id": "s79", "text": "dati "},
{"id": "s80", "text": "manj "},
{"id": "s81", "text": "igrati "},
{"id": "s82", "text": "na "},
{"id": "s83", "text": "tablici "},
{"id": "s84", "text": "in "},
{"id": "s85", "text": "manj "},
{"id": "s86", "text": "gledati "},
{"id": "s87", "text": "televezijo "},
{"id": "s88", "text": ". "},
{"id": "s89", "text": "Upam "},
{"id": "s90", "text": ", "},
{"id": "s91", "text": "da "},
{"id": "s92", "text": "njej "},
{"id": "s93", "text": "je "},
{"id": "s94", "text": "uspelo "},
{"id": "s95", "text": ", "},
{"id": "s96", "text": "ker "},
{"id": "s97", "text": "sedaj "},
{"id": "s98", "text": "je "},
{"id": "s99", "text": "tako "},
{"id": "s100", "text": "važno "},
{"id": "s101", "text": "preživljati "},
{"id": "s102", "text": "več "},
{"id": "s103", "text": "časa "},
{"id": "s104", "text": "zunaj "},
{"id": "s105", "text": "in "},
{"id": "s106", "text": "se "},
{"id": "s107", "text": "aktivno "},
{"id": "s108", "text": "gibati "},
{"id": "s109", "text": ". "}
],
"target": [
{"id": "t0", "text": "Imam "},
{"id": "t1", "text": "dve "},
{"id": "t111", "text": "nečakinji "},
{"id": "t3", "text": ", "},
{"id": "t4", "text": "ena "},
{"id": "t5", "text": "je "},
{"id": "t114", "text": "dve "},
{"id": "t116", "text": "leti "},
{"id": "t9", "text": "starejša "},
{"id": "t10", "text": "od "},
{"id": "t11", "text": "druge "},
{"id": "t12", "text": ". "},
{"id": "t126", "text": "[XImeX] "},
{"id": "t14", "text": "je "},
{"id": "t15", "text": "starejša "},
{"id": "t16", "text": ", "},
{"id": "t17", "text": "sedaj "},
{"id": "t138", "text": "je "},
{"id": "t145", "text": "stara "},
{"id": "t146", "text": "že "},
{"id": "t147", "text": "5 "},
{"id": "t161", "text": "let "},
{"id": "t162", "text": "in "},
{"id": "t24", "text": "njena "},
{"id": "t25", "text": "mama "},
{"id": "t26", "text": "je "},
{"id": "t27", "text": "vedno "},
{"id": "t28", "text": "vklopila "},
{"id": "t29", "text": "risanke "},
{"id": "t30", "text": "med "},
{"id": "t31", "text": "obrokom "},
{"id": "t32", "text": ", "},
{"id": "t33", "text": "potem "},
{"id": "t34", "text": "pa "},
{"id": "t166", "text": "ji "},
{"id": "t170", "text": "je "},
{"id": "t169", "text": "dala "},
{"id": "t36", "text": "igrati "},
{"id": "t37", "text": "s "},
{"id": "t38", "text": "tablico "},
{"id": "t39", "text": "tekom "},
{"id": "t40", "text": "dneva "},
{"id": "t41", "text": ". "},
{"id": "t42", "text": "S "},
{"id": "t43", "text": "časom "},
{"id": "t44", "text": "je "},
{"id": "t179", "text": "ugotovila "},
{"id": "t46", "text": ", "},
{"id": "t47", "text": "da "},
{"id": "t48", "text": "otrok "},
{"id": "t49", "text": "ne "},
{"id": "t50", "text": "more "},
{"id": "t51", "text": "jesti "},
{"id": "t52", "text": "brez "},
{"id": "t53", "text": "vklopljene "},
{"id": "t54", "text": "risanke "},
{"id": "t55", "text": "in "},
{"id": "t56", "text": "zelo "},
{"id": "t57", "text": "veliko "},
{"id": "t58", "text": "časa "},
{"id": "t59", "text": "igra "},
{"id": "t60", "text": "videoigre "},
{"id": "t61", "text": "in "},
{"id": "t62", "text": "ne "},
{"id": "t63", "text": "želi "},
{"id": "t64", "text": "iti "},
{"id": "t65", "text": "ven "},
{"id": "t66", "text": ". "},
{"id": "t67", "text": "Ko "},
{"id": "t68", "text": "se "},
{"id": "t69", "text": "je "},
{"id": "t70", "text": "pojavil "},
{"id": "t71", "text": "drugi "},
{"id": "t72", "text": "otrok "},
{"id": "t73", "text": ", "},
{"id": "t150", "text": "je "},
{"id": "t151", "text": "poskusila "},
{"id": "t75", "text": "odstraniti "},
{"id": "t76", "text": "ta "},
{"id": "t77", "text": "problem "},
{"id": "t78", "text": "in "},
{"id": "t79", "text": "dati "},
{"id": "t80", "text": "manj "},
{"id": "t81", "text": "igrati "},
{"id": "t82", "text": "na "},
{"id": "t83", "text": "tablici "},
{"id": "t84", "text": "in "},
{"id": "t85", "text": "manj "},
{"id": "t86", "text": "gledati "},
{"id": "t181", "text": "televizijo "},
{"id": "t88", "text": ". "},
{"id": "t89", "text": "Upam "},
{"id": "t90", "text": ", "},
{"id": "t91", "text": "da "},
{"id": "t154", "text": "ji "},
{"id": "t93", "text": "je "},
{"id": "t94", "text": "uspelo "},
{"id": "t95", "text": ", "},
{"id": "t96", "text": "ker "},
{"id": "t158", "text": "je "},
{"id": "t157", "text": "sedaj "},
{"id": "t159", "text": "tako "},
{"id": "t100", "text": "važno "},
{"id": "t101", "text": "preživljati "},
{"id": "t102", "text": "več "},
{"id": "t103", "text": "časa "},
{"id": "t104", "text": "zunaj "},
{"id": "t105", "text": "in "},
{"id": "t106", "text": "se "},
{"id": "t107", "text": "aktivno "},
{"id": "t108", "text": "gibati "},
{"id": "t109", "text": ". "}
],
"edges": {
"e-s23-t162": {
"id": "e-s23-t162",
"ids": ["s23", "t162"],
"labels": [],
"manual": true
},
"e-s18-s19-s20-s21-s22-t138-t145-t146-t147-t161": {
"id": "e-s18-s19-s20-s21-s22-t138-t145-t146-t147-t161",
"ids": [
"s18",
"s19",
"s20",
"s21",
"s22",
"t138",
"t145",
"t146",
"t147",
"t161"
],
"labels": ["S-BR"],
"manual": true
},
"e-s97-s98-t157-t158": {
"id": "e-s97-s98-t157-t158",
"ids": ["s97", "s98", "t157", "t158"],
"labels": ["S-BR"],
"manual": true
},
"e-s0-t0": {
"id": "e-s0-t0",
"ids": ["s0", "t0"],
"labels": [],
"manual": false
},
"e-s1-t1": {
"id": "e-s1-t1",
"ids": ["s1", "t1"],
"labels": [],
"manual": false
},
"e-s2-t111": {
"id": "e-s2-t111",
"ids": ["s2", "t111"],
"labels": ["O-SAM"],
"manual": false
},
"e-s3-t3": {
"id": "e-s3-t3",
"ids": ["s3", "t3"],
"labels": [],
"manual": false
},
"e-s4-t4": {
"id": "e-s4-t4",
"ids": ["s4", "t4"],
"labels": [],
"manual": false
},
"e-s5-t5": {
"id": "e-s5-t5",
"ids": ["s5", "t5"],
"labels": [],
"manual": false
},
"e-s6": {"id": "e-s6", "ids": ["s6"], "labels": ["S-ODV"], "manual": false},
"e-s7-t114": {
"id": "e-s7-t114",
"ids": ["s7", "t114"],
"labels": ["O-OST", "POV"],
"manual": false
},
"e-s8-t116": {
"id": "e-s8-t116",
"ids": ["s8", "t116"],
"labels": ["O-SAM"],
"manual": false
},
"e-s9-t9": {
"id": "e-s9-t9",
"ids": ["s9", "t9"],
"labels": [],
"manual": false
},
"e-s10-t10": {
"id": "e-s10-t10",
"ids": ["s10", "t10"],
"labels": [],
"manual": false
},
"e-s11-t11": {
"id": "e-s11-t11",
"ids": ["s11", "t11"],
"labels": [],
"manual": false
},
"e-s12-t12": {
"id": "e-s12-t12",
"ids": ["s12", "t12"],
"labels": [],
"manual": false
},
"e-s135-t126": {
"id": "e-s135-t126",
"ids": ["s135", "t126"],
"labels": [],
"manual": false
},
"e-s14-t14": {
"id": "e-s14-t14",
"ids": ["s14", "t14"],
"labels": [],
"manual": false
},
"e-s15-t15": {
"id": "e-s15-t15",
"ids": ["s15", "t15"],
"labels": [],
"manual": false
},
"e-s16-t16": {
"id": "e-s16-t16",
"ids": ["s16", "t16"],
"labels": [],
"manual": false
},
"e-s17-t17": {
"id": "e-s17-t17",
"ids": ["s17", "t17"],
"labels": [],
"manual": false
},
"e-s24-t24": {
"id": "e-s24-t24",
"ids": ["s24", "t24"],
"labels": [],
"manual": false
},
"e-s25-t25": {
"id": "e-s25-t25",
"ids": ["s25", "t25"],
"labels": [],
"manual": false
},
"e-s26-t26": {
"id": "e-s26-t26",
"ids": ["s26", "t26"],
"labels": [],
"manual": false
},
"e-s27-t27": {
"id": "e-s27-t27",
"ids": ["s27", "t27"],
"labels": [],
"manual": false
},
"e-s28-t28": {
"id": "e-s28-t28",
"ids": ["s28", "t28"],
"labels": [],
"manual": false
},
"e-s29-t29": {
"id": "e-s29-t29",
"ids": ["s29", "t29"],
"labels": [],
"manual": false
},
"e-s30-t30": {
"id": "e-s30-t30",
"ids": ["s30", "t30"],
"labels": [],
"manual": false
},
"e-s31-t31": {
"id": "e-s31-t31",
"ids": ["s31", "t31"],
"labels": [],
"manual": false
},
"e-s32-t32": {
"id": "e-s32-t32",
"ids": ["s32", "t32"],
"labels": [],
"manual": false
},
"e-s33-t33": {
"id": "e-s33-t33",
"ids": ["s33", "t33"],
"labels": [],
"manual": false
},
"e-s34-t34": {
"id": "e-s34-t34",
"ids": ["s34", "t34"],
"labels": [],
"manual": false
},
"e-s35-t169": {
"id": "e-s35-t169",
"ids": ["s35", "t169"],
"labels": [],
"manual": false
},
"e-s36-t36": {
"id": "e-s36-t36",
"ids": ["s36", "t36"],
"labels": [],
"manual": false
},
"e-s37-t37": {
"id": "e-s37-t37",
"ids": ["s37", "t37"],
"labels": [],
"manual": false
},
"e-s38-t38": {
"id": "e-s38-t38",
"ids": ["s38", "t38"],
"labels": [],
"manual": false
},
"e-s39-t39": {
"id": "e-s39-t39",
"ids": ["s39", "t39"],
"labels": [],
"manual": false
},
"e-s40-t40": {
"id": "e-s40-t40",
"ids": ["s40", "t40"],
"labels": [],
"manual": false
},
"e-s41-t41": {
"id": "e-s41-t41",
"ids": ["s41", "t41"],
"labels": [],
"manual": false
},
"e-s42-t42": {
"id": "e-s42-t42",
"ids": ["s42", "t42"],
"labels": [],
"manual": false
},
"e-s43-t43": {
"id": "e-s43-t43",
"ids": ["s43", "t43"],
"labels": [],
"manual": false
},
"e-s44-t44": {
"id": "e-s44-t44",
"ids": ["s44", "t44"],
"labels": [],
"manual": false
},
"e-s45-t179": {
"id": "e-s45-t179",
"ids": ["s45", "t179"],
"labels": ["B-GLAG"],
"manual": false
},
"e-s46-t46": {
"id": "e-s46-t46",
"ids": ["s46", "t46"],
"labels": [],
"manual": false
},
"e-s47-t47": {
"id": "e-s47-t47",
"ids": ["s47", "t47"],
"labels": [],
"manual": false
},
"e-s48-t48": {
"id": "e-s48-t48",
"ids": ["s48", "t48"],
"labels": [],
"manual": false
},
"e-s49-t49": {
"id": "e-s49-t49",
"ids": ["s49", "t49"],
"labels": [],
"manual": false
},
"e-s50-t50": {
"id": "e-s50-t50",
"ids": ["s50", "t50"],
"labels": [],
"manual": false
},
"e-s51-t51": {
"id": "e-s51-t51",
"ids": ["s51", "t51"],
"labels": [],
"manual": false
},
"e-s52-t52": {
"id": "e-s52-t52",
"ids": ["s52", "t52"],
"labels": [],
"manual": false
},
"e-s53-t53": {
"id": "e-s53-t53",
"ids": ["s53", "t53"],
"labels": [],
"manual": false
},
"e-s54-t54": {
"id": "e-s54-t54",
"ids": ["s54", "t54"],
"labels": [],
"manual": false
},
"e-s55-t55": {
"id": "e-s55-t55",
"ids": ["s55", "t55"],
"labels": [],
"manual": false
},
"e-s56-t56": {
"id": "e-s56-t56",
"ids": ["s56", "t56"],
"labels": [],
"manual": false
},
"e-s57-t57": {
"id": "e-s57-t57",
"ids": ["s57", "t57"],
"labels": [],
"manual": false
},
"e-s58-t58": {
"id": "e-s58-t58",
"ids": ["s58", "t58"],
"labels": [],
"manual": false
},
"e-s59-t59": {
"id": "e-s59-t59",
"ids": ["s59", "t59"],
"labels": [],
"manual": false
},
"e-s60-t60": {
"id": "e-s60-t60",
"ids": ["s60", "t60"],
"labels": [],
"manual": false
},
"e-s61-t61": {
"id": "e-s61-t61",
"ids": ["s61", "t61"],
"labels": [],
"manual": false
},
"e-s62-t62": {
"id": "e-s62-t62",
"ids": ["s62", "t62"],
"labels": [],
"manual": false
},
"e-s63-t63": {
"id": "e-s63-t63",
"ids": ["s63", "t63"],
"labels": [],
"manual": false
},
"e-s64-t64": {
"id": "e-s64-t64",
"ids": ["s64", "t64"],
"labels": [],
"manual": false
},
"e-s65-t65": {
"id": "e-s65-t65",
"ids": ["s65", "t65"],
"labels": [],
"manual": false
},
"e-s66-t66": {
"id": "e-s66-t66",
"ids": ["s66", "t66"],
"labels": [],
"manual": false
},
"e-s67-t67": {
"id": "e-s67-t67",
"ids": ["s67", "t67"],
"labels": [],
"manual": false
},
"e-s68-t68": {
"id": "e-s68-t68",
"ids": ["s68", "t68"],
"labels": [],
"manual": false
},
"e-s69-t69": {
"id": "e-s69-t69",
"ids": ["s69", "t69"],
"labels": [],
"manual": false
},
"e-s70-t70": {
"id": "e-s70-t70",
"ids": ["s70", "t70"],
"labels": [],
"manual": false
},
"e-s71-t71": {
"id": "e-s71-t71",
"ids": ["s71", "t71"],
"labels": [],
"manual": false
},
"e-s72-t72": {
"id": "e-s72-t72",
"ids": ["s72", "t72"],
"labels": [],
"manual": false
},
"e-s73-t73": {
"id": "e-s73-t73",
"ids": ["s73", "t73"],
"labels": [],
"manual": false
},
"e-s74-t151": {
"id": "e-s74-t151",
"ids": ["s74", "t151"],
"labels": [],
"manual": false
},
"e-s75-t75": {
"id": "e-s75-t75",
"ids": ["s75", "t75"],
"labels": [],
"manual": false
},
"e-s76-t76": {
"id": "e-s76-t76",
"ids": ["s76", "t76"],
"labels": [],
"manual": false
},
"e-s77-t77": {
"id": "e-s77-t77",
"ids": ["s77", "t77"],
"labels": [],
"manual": false
},
"e-s78-t78": {
"id": "e-s78-t78",
"ids": ["s78", "t78"],
"labels": [],
"manual": false
},
"e-s79-t79": {
"id": "e-s79-t79",
"ids": ["s79", "t79"],
"labels": [],
"manual": false
},
"e-s80-t80": {
"id": "e-s80-t80",
"ids": ["s80", "t80"],
"labels": [],
"manual": false
},
"e-s81-t81": {
"id": "e-s81-t81",
"ids": ["s81", "t81"],
"labels": [],
"manual": false
},
"e-s82-t82": {
"id": "e-s82-t82",
"ids": ["s82", "t82"],
"labels": [],
"manual": false
},
"e-s83-t83": {
"id": "e-s83-t83",
"ids": ["s83", "t83"],
"labels": [],
"manual": false
},
"e-s84-t84": {
"id": "e-s84-t84",
"ids": ["s84", "t84"],
"labels": [],
"manual": false
},
"e-s85-t85": {
"id": "e-s85-t85",
"ids": ["s85", "t85"],
"labels": [],
"manual": false
},
"e-s86-t86": {
"id": "e-s86-t86",
"ids": ["s86", "t86"],
"labels": [],
"manual": false
},
"e-s87-t181": {
"id": "e-s87-t181",
"ids": ["s87", "t181"],
"labels": ["Z-CRK"],
"manual": false
},
"e-s88-t88": {
"id": "e-s88-t88",
"ids": ["s88", "t88"],
"labels": [],
"manual": false
},
"e-s89-t89": {
"id": "e-s89-t89",
"ids": ["s89", "t89"],
"labels": [],
"manual": false
},
"e-s90-t90": {
"id": "e-s90-t90",
"ids": ["s90", "t90"],
"labels": [],
"manual": false
},
"e-s91-t91": {
"id": "e-s91-t91",
"ids": ["s91", "t91"],
"labels": [],
"manual": false
},
"e-s92-t154": {
"id": "e-s92-t154",
"ids": ["s92", "t154"],
"labels": ["B-ZAIM"],
"manual": false
},
"e-s93-t93": {
"id": "e-s93-t93",
"ids": ["s93", "t93"],
"labels": [],
"manual": false
},
"e-s94-t94": {
"id": "e-s94-t94",
"ids": ["s94", "t94"],
"labels": [],
"manual": false
},
"e-s95-t95": {
"id": "e-s95-t95",
"ids": ["s95", "t95"],
"labels": [],
"manual": false
},
"e-s96-t96": {
"id": "e-s96-t96",
"ids": ["s96", "t96"],
"labels": [],
"manual": false
},
"e-s99-t159": {
"id": "e-s99-t159",
"ids": ["s99", "t159"],
"labels": [],
"manual": false
},
"e-s100-t100": {
"id": "e-s100-t100",
"ids": ["s100", "t100"],
"labels": [],
"manual": false
},
"e-s101-t101": {
"id": "e-s101-t101",
"ids": ["s101", "t101"],
"labels": [],
"manual": false
},
"e-s102-t102": {
"id": "e-s102-t102",
"ids": ["s102", "t102"],
"labels": [],
"manual": false
},
"e-s103-t103": {
"id": "e-s103-t103",
"ids": ["s103", "t103"],
"labels": [],
"manual": false
},
"e-s104-t104": {
"id": "e-s104-t104",
"ids": ["s104", "t104"],
"labels": [],
"manual": false
},
"e-s105-t105": {
"id": "e-s105-t105",
"ids": ["s105", "t105"],
"labels": [],
"manual": false
},
"e-s106-t106": {
"id": "e-s106-t106",
"ids": ["s106", "t106"],
"labels": [],
"manual": false
},
"e-s107-t107": {
"id": "e-s107-t107",
"ids": ["s107", "t107"],
"labels": [],
"manual": false
},
"e-s108-t108": {
"id": "e-s108-t108",
"ids": ["s108", "t108"],
"labels": [],
"manual": false
},
"e-s109-t109": {
"id": "e-s109-t109",
"ids": ["s109", "t109"],
"labels": [],
"manual": false
},
"e-t166": {
"id": "e-t166",
"ids": ["t166"],
"labels": ["S-IZP"],
"manual": false
},
"e-t170": {
"id": "e-t170",
"ids": ["t170"],
"labels": ["S-IZP"],
"manual": false
},
"e-t150": {
"id": "e-t150",
"ids": ["t150"],
"labels": ["S-IZP"],
"manual": false
}
}
}

@ -0,0 +1,389 @@
{
"source": [
{"id": "s0", "text": "Moje "},
{"id": "s1", "text": "mnenje "},
{"id": "s2", "text": "je "},
{"id": "s3", "text": ", "},
{"id": "s4", "text": "da "},
{"id": "s5", "text": "starši "},
{"id": "s6", "text": "morajo "},
{"id": "s7", "text": "sami "},
{"id": "s8", "text": "manj "},
{"id": "s9", "text": "gledati "},
{"id": "s10", "text": "televizijo "},
{"id": "s11", "text": "in "},
{"id": "s12", "text": "manj "},
{"id": "s13", "text": "vklopiti "},
{"id": "s14", "text": "jo "},
{"id": "s15", "text": "otrokam "},
{"id": "s16", "text": ". "},
{"id": "s17", "text": "Naj "},
{"id": "s18", "text": "sami "},
{"id": "s19", "text": "starši "},
{"id": "s20", "text": "več "},
{"id": "s21", "text": "časa "},
{"id": "s22", "text": "preživljajo "},
{"id": "s23", "text": "in "},
{"id": "s24", "text": "se "},
{"id": "s25", "text": "igrajo "},
{"id": "s26", "text": "z "},
{"id": "s27", "text": "svojimi "},
{"id": "s28", "text": "otroki "},
{"id": "s29", "text": ", "},
{"id": "s30", "text": "ker "},
{"id": "s31", "text": "za "},
{"id": "s32", "text": "nas "},
{"id": "s33", "text": "je "},
{"id": "s34", "text": "tako "},
{"id": "s35", "text": "pomembna "},
{"id": "s36", "text": "pozornost "},
{"id": "s37", "text": "starših "},
{"id": "s38", "text": ". "},
{"id": "s39", "text": "Seveda "},
{"id": "s40", "text": ", "},
{"id": "s41", "text": "ko "},
{"id": "s42", "text": "so "},
{"id": "s43", "text": "straši "},
{"id": "s44", "text": "zelo "},
{"id": "s45", "text": "utrujeni "},
{"id": "s46", "text": "lahko "},
{"id": "s47", "text": "izkoriščajo "},
{"id": "s48", "text": "možnost "},
{"id": "s49", "text": "varuške-televizije "},
{"id": "s50", "text": ". "}
],
"target": [
{"id": "t0", "text": "Moje "},
{"id": "t1", "text": "mnenje "},
{"id": "t2", "text": "je "},
{"id": "t3", "text": ", "},
{"id": "t4", "text": "da "},
{"id": "t5", "text": "starši "},
{"id": "t6", "text": "morajo "},
{"id": "t7", "text": "sami "},
{"id": "t8", "text": "manj "},
{"id": "t9", "text": "gledati "},
{"id": "t10", "text": "televizijo "},
{"id": "t11", "text": "in "},
{"id": "t57", "text": "jo "},
{"id": "t58", "text": "manj "},
{"id": "t13", "text": "vklopiti "},
{"id": "t61", "text": "otrokom "},
{"id": "t16", "text": ". "},
{"id": "t17", "text": "Naj "},
{"id": "t18", "text": "sami "},
{"id": "t19", "text": "starši "},
{"id": "t74", "text": "preživljajo "},
{"id": "t64", "text": "več "},
{"id": "t76", "text": "časa "},
{"id": "t77", "text": "in "},
{"id": "t24", "text": "se "},
{"id": "t79", "text": "igrajo "},
{"id": "t80", "text": "s "},
{"id": "t27", "text": "svojimi "},
{"id": "t28", "text": "otroki "},
{"id": "t29", "text": ", "},
{"id": "t30", "text": "ker "},
{"id": "t84", "text": "je "},
{"id": "t83", "text": "za "},
{"id": "t86", "text": "nas "},
{"id": "t87", "text": "tako "},
{"id": "t35", "text": "pomembna "},
{"id": "t36", "text": "pozornost "},
{"id": "t91", "text": "staršev "},
{"id": "t38", "text": ". "},
{"id": "t39", "text": "Seveda "},
{"id": "t40", "text": ", "},
{"id": "t41", "text": "ko "},
{"id": "t42", "text": "so "},
{"id": "t93", "text": "starši "},
{"id": "t44", "text": "zelo "},
{"id": "t45", "text": "utrujeni "},
{"id": "t95", "text": ", "},
{"id": "t96", "text": "lahko "},
{"id": "t108", "text": "izkoristijo "},
{"id": "t48", "text": "možnost "},
{"id": "t51", "text": "varuške "},
{"id": "t53", "text": "- "},
{"id": "t54", "text": "televizije "},
{"id": "t50", "text": ". "}
],
"edges": {
"e-s12-s13-s14-t13-t57-t58": {
"id": "e-s12-s13-s14-t13-t57-t58",
"ids": ["s12", "s13", "s14", "t13", "t57", "t58"],
"labels": ["S-BR"],
"manual": true
},
"e-s20-s21-s22-t64-t74-t76": {
"id": "e-s20-s21-s22-t64-t74-t76",
"ids": ["s20", "s21", "s22", "t64", "t74", "t76"],
"labels": ["S-BR"],
"manual": true
},
"e-s26-t80": {
"id": "e-s26-t80",
"ids": ["s26", "t80"],
"labels": ["Z-CRK"],
"manual": true
},
"e-s31-s32-s33-t83-t84-t86": {
"id": "e-s31-s32-s33-t83-t84-t86",
"ids": ["s31", "s32", "s33", "t83", "t84", "t86"],
"labels": ["S-BR"],
"manual": true
},
"e-s0-t0": {
"id": "e-s0-t0",
"ids": ["s0", "t0"],
"labels": [],
"manual": false
},
"e-s1-t1": {
"id": "e-s1-t1",
"ids": ["s1", "t1"],
"labels": [],
"manual": false
},
"e-s2-t2": {
"id": "e-s2-t2",
"ids": ["s2", "t2"],
"labels": [],
"manual": false
},
"e-s3-t3": {
"id": "e-s3-t3",
"ids": ["s3", "t3"],
"labels": [],
"manual": false
},
"e-s4-t4": {
"id": "e-s4-t4",
"ids": ["s4", "t4"],
"labels": [],
"manual": false
},
"e-s5-t5": {
"id": "e-s5-t5",
"ids": ["s5", "t5"],
"labels": [],
"manual": false
},
"e-s6-t6": {
"id": "e-s6-t6",
"ids": ["s6", "t6"],
"labels": [],
"manual": false
},
"e-s7-t7": {
"id": "e-s7-t7",
"ids": ["s7", "t7"],
"labels": [],
"manual": false
},
"e-s8-t8": {
"id": "e-s8-t8",
"ids": ["s8", "t8"],
"labels": [],
"manual": false
},
"e-s9-t9": {
"id": "e-s9-t9",
"ids": ["s9", "t9"],
"labels": [],
"manual": false
},
"e-s10-t10": {
"id": "e-s10-t10",
"ids": ["s10", "t10"],
"labels": [],
"manual": false
},
"e-s11-t11": {
"id": "e-s11-t11",
"ids": ["s11", "t11"],
"labels": [],
"manual": false
},
"e-s15-t61": {
"id": "e-s15-t61",
"ids": ["s15", "t61"],
"labels": ["O-SAM"],
"manual": false
},
"e-s16-t16": {
"id": "e-s16-t16",
"ids": ["s16", "t16"],
"labels": [],
"manual": false
},
"e-s17-t17": {
"id": "e-s17-t17",
"ids": ["s17", "t17"],
"labels": [],
"manual": false
},
"e-s18-t18": {
"id": "e-s18-t18",
"ids": ["s18", "t18"],
"labels": [],
"manual": false
},
"e-s19-t19": {
"id": "e-s19-t19",
"ids": ["s19", "t19"],
"labels": [],
"manual": false
},
"e-s23-t77": {
"id": "e-s23-t77",
"ids": ["s23", "t77"],
"labels": [],
"manual": false
},
"e-s24-t24": {
"id": "e-s24-t24",
"ids": ["s24", "t24"],
"labels": [],
"manual": false
},
"e-s25-t79": {
"id": "e-s25-t79",
"ids": ["s25", "t79"],
"labels": [],
"manual": false
},
"e-s27-t27": {
"id": "e-s27-t27",
"ids": ["s27", "t27"],
"labels": [],
"manual": false
},
"e-s28-t28": {
"id": "e-s28-t28",
"ids": ["s28", "t28"],
"labels": [],
"manual": false
},
"e-s29-t29": {
"id": "e-s29-t29",
"ids": ["s29", "t29"],
"labels": [],
"manual": false
},
"e-s30-t30": {
"id": "e-s30-t30",
"ids": ["s30", "t30"],
"labels": [],
"manual": false
},
"e-s34-t87": {
"id": "e-s34-t87",
"ids": ["s34", "t87"],
"labels": [],
"manual": false
},
"e-s35-t35": {
"id": "e-s35-t35",
"ids": ["s35", "t35"],
"labels": [],
"manual": false
},
"e-s36-t36": {
"id": "e-s36-t36",
"ids": ["s36", "t36"],
"labels": [],
"manual": false
},
"e-s37-t91": {
"id": "e-s37-t91",
"ids": ["s37", "t91"],
"labels": ["O-SAM"],
"manual": false
},
"e-s38-t38": {
"id": "e-s38-t38",
"ids": ["s38", "t38"],
"labels": [],
"manual": false
},
"e-s39-t39": {
"id": "e-s39-t39",
"ids": ["s39", "t39"],
"labels": [],
"manual": false
},
"e-s40-t40": {
"id": "e-s40-t40",
"ids": ["s40", "t40"],
"labels": [],
"manual": false
},
"e-s41-t41": {
"id": "e-s41-t41",
"ids": ["s41", "t41"],
"labels": [],
"manual": false
},
"e-s42-t42": {
"id": "e-s42-t42",
"ids": ["s42", "t42"],
"labels": [],
"manual": false
},
"e-s43-t93": {
"id": "e-s43-t93",
"ids": ["s43", "t93"],
"labels": ["Z-CRK"],
"manual": false
},
"e-s44-t44": {
"id": "e-s44-t44",
"ids": ["s44", "t44"],
"labels": [],
"manual": false
},
"e-s45-t45": {
"id": "e-s45-t45",
"ids": ["s45", "t45"],
"labels": [],
"manual": false
},
"e-s46-t96": {
"id": "e-s46-t96",
"ids": ["s46", "t96"],
"labels": [],
"manual": false
},
"e-s47-t108": {
"id": "e-s47-t108",
"ids": ["s47", "t108"],
"labels": ["B-GLAG"],
"manual": false
},
"e-s48-t48": {
"id": "e-s48-t48",
"ids": ["s48", "t48"],
"labels": [],
"manual": false
},
"e-s49-t51-t53-t54": {
"id": "e-s49-t51-t53-t54",
"ids": ["s49", "t51", "t53", "t54"],
"labels": ["Z-SN"],
"manual": false
},
"e-s50-t50": {
"id": "e-s50-t50",
"ids": ["s50", "t50"],
"labels": [],
"manual": false
},
"e-t95": {
"id": "e-t95",
"ids": ["t95"],
"labels": ["Z-LOC"],
"manual": false
}
}
}

@ -0,0 +1,311 @@
{
"source": [
{"id": "s0", "text": "Srečno "},
{"id": "s1", "text": "Novo "},
{"id": "s2", "text": "Leto "},
{"id": "s3", "text": "in "},
{"id": "s4", "text": "vessel "},
{"id": "s5", "text": "Božič "},
{"id": "s6", "text": ". \n"},
{"id": "s7", "text": "Torej "},
{"id": "s8", "text": ", "},
{"id": "s9", "text": "začnimo "},
{"id": "s10", "text": "z "},
{"id": "s11", "text": "najpomembnejšim "},
{"id": "s12", "text": ": "},
{"id": "s13", "text": "v "},
{"id": "s14", "text": "Ukrajini "},
{"id": "s15", "text": "Novo "},
{"id": "s16", "text": "Leto "},
{"id": "s17", "text": "je "},
{"id": "s18", "text": "bolj "},
{"id": "s19", "text": "pomemben "},
{"id": "s20", "text": "praznik "},
{"id": "s21", "text": ", "},
{"id": "s22", "text": "za "},
{"id": "s23", "text": "Bozič "},
{"id": "s24", "text": ". "},
{"id": "s25", "text": "In "},
{"id": "s26", "text": "Božič "},
{"id": "s27", "text": "mi "},
{"id": "s28", "text": "imamo "},
{"id": "s29", "text": "7 "},
{"id": "s30", "text": "januara "},
{"id": "s31", "text": ", "},
{"id": "s32", "text": "ne "},
{"id": "s33", "text": "pa "},
{"id": "s34", "text": "25 "},
{"id": "s35", "text": "decembera "},
{"id": "s36", "text": ". "}
],
"target": [
{"id": "t0", "text": "Srečno "},
{"id": "t38", "text": "novo "},
{"id": "t40", "text": "leto "},
{"id": "t3", "text": "in "},
{"id": "t41", "text": "vesel "},
{"id": "t43", "text": "božič "},
{"id": "t6", "text": ". \n"},
{"id": "t7", "text": "Torej "},
{"id": "t8", "text": ", "},
{"id": "t9", "text": "začnimo "},
{"id": "t10", "text": "z "},
{"id": "t11", "text": "najpomembnejšim "},
{"id": "t12", "text": ": "},
{"id": "t13", "text": "v "},
{"id": "t14", "text": "Ukrajini "},
{"id": "t46", "text": "je "},
{"id": "t49", "text": "novo "},
{"id": "t51", "text": "leto "},
{"id": "t52", "text": "bolj "},
{"id": "t19", "text": "pomemben "},
{"id": "t20", "text": "praznik "},
{"id": "t58", "text": "kot "},
{"id": "t65", "text": "božič "},
{"id": "t24", "text": ". "},
{"id": "t25", "text": "In "},
{"id": "t68", "text": "božič "},
{"id": "t75", "text": "imamo "},
{"id": "t77", "text": "mi "},
{"id": "t71", "text": "7 "},
{"id": "t79", "text": ". "},
{"id": "t81", "text": "januarja "},
{"id": "t31", "text": ", "},
{"id": "t32", "text": "ne "},
{"id": "t33", "text": "pa "},
{"id": "t34", "text": "25 "},
{"id": "t83", "text": ". "},
{"id": "t85", "text": "decembra "},
{"id": "t36", "text": ". "}
],
"edges": {
"e-s17-t46": {
"id": "e-s17-t46",
"ids": ["s17", "t46"],
"labels": ["S-BR"],
"manual": true
},
"e-s22-t58": {
"id": "e-s22-t58",
"ids": ["s22", "t58"],
"labels": ["B-VEZ"],
"manual": true
},
"e-s27-s28-t75-t77": {
"id": "e-s27-s28-t75-t77",
"ids": ["s27", "s28", "t75", "t77"],
"labels": ["S-BR"],
"manual": true
},
"e-s0-t0": {
"id": "e-s0-t0",
"ids": ["s0", "t0"],
"labels": [],
"manual": false
},
"e-s1-t38": {
"id": "e-s1-t38",
"ids": ["s1", "t38"],
"labels": ["Z-MV"],
"manual": false
},
"e-s2-t40": {
"id": "e-s2-t40",
"ids": ["s2", "t40"],
"labels": ["Z-MV"],
"manual": false
},
"e-s3-t3": {
"id": "e-s3-t3",
"ids": ["s3", "t3"],
"labels": [],
"manual": false
},
"e-s4-t41": {
"id": "e-s4-t41",
"ids": ["s4", "t41"],
"labels": ["Z-CRK"],
"manual": false
},
"e-s5-t43": {
"id": "e-s5-t43",
"ids": ["s5", "t43"],
"labels": ["Z-MV"],
"manual": false
},
"e-s6-t6": {
"id": "e-s6-t6",
"ids": ["s6", "t6"],
"labels": [],
"manual": false
},
"e-s7-t7": {
"id": "e-s7-t7",
"ids": ["s7", "t7"],
"labels": [],
"manual": false
},
"e-s8-t8": {
"id": "e-s8-t8",
"ids": ["s8", "t8"],
"labels": [],
"manual": false
},
"e-s9-t9": {
"id": "e-s9-t9",
"ids": ["s9", "t9"],
"labels": [],
"manual": false
},
"e-s10-t10": {
"id": "e-s10-t10",
"ids": ["s10", "t10"],
"labels": [],
"manual": false
},
"e-s11-t11": {
"id": "e-s11-t11",
"ids": ["s11", "t11"],
"labels": [],
"manual": false
},
"e-s12-t12": {
"id": "e-s12-t12",
"ids": ["s12", "t12"],
"labels": [],
"manual": false
},
"e-s13-t13": {
"id": "e-s13-t13",
"ids": ["s13", "t13"],
"labels": [],
"manual": false
},
"e-s14-t14": {
"id": "e-s14-t14",
"ids": ["s14", "t14"],
"labels": [],
"manual": false
},
"e-s15-t49": {
"id": "e-s15-t49",
"ids": ["s15", "t49"],
"labels": ["Z-MV"],
"manual": false
},
"e-s16-t51": {
"id": "e-s16-t51",
"ids": ["s16", "t51"],
"labels": ["Z-MV"],
"manual": false
},
"e-s18-t52": {
"id": "e-s18-t52",
"ids": ["s18", "t52"],
"labels": [],
"manual": false
},
"e-s19-t19": {
"id": "e-s19-t19",
"ids": ["s19", "t19"],
"labels": [],
"manual": false
},
"e-s20-t20": {
"id": "e-s20-t20",
"ids": ["s20", "t20"],
"labels": [],
"manual": false
},
"e-s21": {
"id": "e-s21",
"ids": ["s21"],
"labels": ["Z-LOC"],
"manual": false
},
"e-s23-t65": {
"id": "e-s23-t65",
"ids": ["s23", "t65"],
"labels": ["Z-CRK", "Z-MV"],
"manual": false
},
"e-s24-t24": {
"id": "e-s24-t24",
"ids": ["s24", "t24"],
"labels": [],
"manual": false
},
"e-s25-t25": {
"id": "e-s25-t25",
"ids": ["s25", "t25"],
"labels": [],
"manual": false
},
"e-s26-t68": {
"id": "e-s26-t68",
"ids": ["s26", "t68"],
"labels": ["Z-MV"],
"manual": false
},
"e-s29-t71": {
"id": "e-s29-t71",
"ids": ["s29", "t71"],
"labels": [],
"manual": false
},
"e-s30-t81": {
"id": "e-s30-t81",
"ids": ["s30", "t81"],
"labels": ["O-SAM"],
"manual": false
},
"e-s31-t31": {
"id": "e-s31-t31",
"ids": ["s31", "t31"],
"labels": [],
"manual": false
},
"e-s32-t32": {
"id": "e-s32-t32",
"ids": ["s32", "t32"],
"labels": [],
"manual": false
},
"e-s33-t33": {
"id": "e-s33-t33",
"ids": ["s33", "t33"],
"labels": [],
"manual": false
},
"e-s34-t34": {
"id": "e-s34-t34",
"ids": ["s34", "t34"],
"labels": [],
"manual": false
},
"e-s35-t85": {
"id": "e-s35-t85",
"ids": ["s35", "t85"],
"labels": ["O-SAM"],
"manual": false
},
"e-s36-t36": {
"id": "e-s36-t36",
"ids": ["s36", "t36"],
"labels": [],
"manual": false
},
"e-t79": {
"id": "e-t79",
"ids": ["t79"],
"labels": ["Z-LOC"],
"manual": false
},
"e-t83": {
"id": "e-t83",
"ids": ["t83"],
"labels": ["Z-LOC"],
"manual": false
}
}
}

@ -0,0 +1,475 @@
{
"source": [
{"id": "s0", "text": "Na "},
{"id": "s1", "text": "Novo "},
{"id": "s2", "text": "Leto "},
{"id": "s3", "text": "vse "},
{"id": "s4", "text": "imajo "},
{"id": "s5", "text": "praznično "},
{"id": "s6", "text": "mizo "},
{"id": "s7", "text": ". "},
{"id": "s8", "text": "Glavna "},
{"id": "s9", "text": "jed "},
{"id": "s10", "text": "je "},
{"id": "s11", "text": "običajno "},
{"id": "s12", "text": "francoska "},
{"id": "s13", "text": "solata "},
{"id": "s14", "text": ". "},
{"id": "s15", "text": "Običajno "},
{"id": "s16", "text": "praznujejo "},
{"id": "s17", "text": "z "},
{"id": "s18", "text": "družino "},
{"id": "s19", "text": ". "},
{"id": "s20", "text": "Ob "},
{"id": "s21", "text": "12 "},
{"id": "s22", "text": "pogosto "},
{"id": "s23", "text": "ljudi "},
{"id": "s24", "text": "grejo "},
{"id": "s25", "text": "na "},
{"id": "s26", "text": "ulico "},
{"id": "s27", "text": "in "},
{"id": "s28", "text": "streljajo "},
{"id": "s29", "text": "ognjemet "},
{"id": "s30", "text": ". "},
{"id": "s31", "text": "Potem "},
{"id": "s32", "text": "otroki "},
{"id": "s33", "text": "najdejo "},
{"id": "s34", "text": "darila "},
{"id": "s35", "text": "pod "},
{"id": "s36", "text": "» "},
{"id": "s37", "text": "Božnično "},
{"id": "s38", "text": "« "},
{"id": "s39", "text": "jelko "},
{"id": "s40", "text": ", "},
{"id": "s41", "text": "ampak "},
{"id": "s42", "text": "mi "},
{"id": "s43", "text": "imamo "},
{"id": "s44", "text": "» "},
{"id": "s45", "text": "Novo-letno "},
{"id": "s46", "text": "« "},
{"id": "s47", "text": "jelko "},
{"id": "s48", "text": ". "},
{"id": "s49", "text": "Mislim "},
{"id": "s50", "text": ", "},
{"id": "s51", "text": "da "},
{"id": "s52", "text": "včasih "},
{"id": "s53", "text": "ljudje "},
{"id": "s54", "text": "dajejo "},
{"id": "s55", "text": "darila "},
{"id": "s56", "text": "prijatiljam "},
{"id": "s57", "text": ". "}
],
"target": [
{"id": "t0", "text": "Na "},
{"id": "t59", "text": "novo "},
{"id": "t61", "text": "leto "},
{"id": "t68", "text": "imajo "},
{"id": "t73", "text": "vsi "},
{"id": "t77", "text": "praznični "},
{"id": "t83", "text": "obed "},
{"id": "t84", "text": ". "},
{"id": "t8", "text": "Glavna "},
{"id": "t9", "text": "jed "},
{"id": "t10", "text": "je "},
{"id": "t11", "text": "običajno "},
{"id": "t12", "text": "francoska "},
{"id": "t13", "text": "solata "},
{"id": "t14", "text": ". "},
{"id": "t15", "text": "Običajno "},
{"id": "t16", "text": "praznujejo "},
{"id": "t17", "text": "z "},
{"id": "t18", "text": "družino "},
{"id": "t19", "text": ". "},
{"id": "t20", "text": "Ob "},
{"id": "t21", "text": "12 "},
{"id": "t86", "text": ". "},
{"id": "t87", "text": "pogosto "},
{"id": "t90", "text": "ljudje "},
{"id": "t24", "text": "grejo "},
{"id": "t25", "text": "na "},
{"id": "t26", "text": "ulico "},
{"id": "t27", "text": "in "},
{"id": "t95", "text": "imajo "},
{"id": "t29", "text": "ognjemet "},
{"id": "t30", "text": ". "},
{"id": "t31", "text": "Potem "},
{"id": "t97", "text": "otroci "},
{"id": "t33", "text": "najdejo "},
{"id": "t34", "text": "darila "},
{"id": "t99", "text": "pod "},
{"id": "t105", "text": "božično "},
{"id": "t39", "text": "jelko "},
{"id": "t40", "text": ", "},
{"id": "t41", "text": "ampak "},
{"id": "t42", "text": "mi "},
{"id": "t107", "text": "imamo "},
{"id": "t113", "text": "novoletno "},
{"id": "t47", "text": "jelko "},
{"id": "t48", "text": ". "},
{"id": "t49", "text": "Mislim "},
{"id": "t50", "text": ", "},
{"id": "t51", "text": "da "},
{"id": "t52", "text": "včasih "},
{"id": "t53", "text": "ljudje "},
{"id": "t54", "text": "dajejo "},
{"id": "t55", "text": "darila "},
{"id": "t117", "text": "prijateljem "},
{"id": "t57", "text": ". "}
],
"edges": {
"e-s3-t73": {
"id": "e-s3-t73",
"ids": ["s3", "t73"],
"labels": ["O-PRID", "S-BR"],
"manual": true
},
"e-s5-t77": {
"id": "e-s5-t77",
"ids": ["s5", "t77"],
"labels": ["O-PRID", "POV"],
"manual": true
},
"e-s6-t83": {
"id": "e-s6-t83",
"ids": ["s6", "t83"],
"labels": ["B-SAM"],
"manual": true
},
"e-s0-t0": {
"id": "e-s0-t0",
"ids": ["s0", "t0"],
"labels": [],
"manual": false
},
"e-s1-t59": {
"id": "e-s1-t59",
"ids": ["s1", "t59"],
"labels": ["Z-MV"],
"manual": false
},
"e-s2-t61": {
"id": "e-s2-t61",
"ids": ["s2", "t61"],
"labels": ["Z-MV"],
"manual": false
},
"e-s4-t68": {
"id": "e-s4-t68",
"ids": ["s4", "t68"],
"labels": [],
"manual": false
},
"e-s7-t84": {
"id": "e-s7-t84",
"ids": ["s7", "t84"],
"labels": [],
"manual": false
},
"e-s8-t8": {
"id": "e-s8-t8",
"ids": ["s8", "t8"],
"labels": [],
"manual": false
},
"e-s9-t9": {
"id": "e-s9-t9",
"ids": ["s9", "t9"],
"labels": [],
"manual": false
},
"e-s10-t10": {
"id": "e-s10-t10",
"ids": ["s10", "t10"],
"labels": [],
"manual": false
},
"e-s11-t11": {
"id": "e-s11-t11",
"ids": ["s11", "t11"],
"labels": [],
"manual": false
},
"e-s12-t12": {
"id": "e-s12-t12",
"ids": ["s12", "t12"],
"labels": [],
"manual": false
},
"e-s13-t13": {
"id": "e-s13-t13",
"ids": ["s13", "t13"],
"labels": [],
"manual": false
},
"e-s14-t14": {
"id": "e-s14-t14",
"ids": ["s14", "t14"],
"labels": [],
"manual": false
},
"e-s15-t15": {
"id": "e-s15-t15",
"ids": ["s15", "t15"],
"labels": [],
"manual": false
},
"e-s16-t16": {
"id": "e-s16-t16",
"ids": ["s16", "t16"],
"labels": [],
"manual": false
},
"e-s17-t17": {
"id": "e-s17-t17",
"ids": ["s17", "t17"],
"labels": [],
"manual": false
},
"e-s18-t18": {
"id": "e-s18-t18",
"ids": ["s18", "t18"],
"labels": [],
"manual": false
},
"e-s19-t19": {
"id": "e-s19-t19",
"ids": ["s19", "t19"],
"labels": [],
"manual": false
},
"e-s20-t20": {
"id": "e-s20-t20",
"ids": ["s20", "t20"],
"labels": [],
"manual": false
},
"e-s21-t21": {
"id": "e-s21-t21",
"ids": ["s21", "t21"],
"labels": [],
"manual": false
},
"e-s22-t87": {
"id": "e-s22-t87",
"ids": ["s22", "t87"],
"labels": [],
"manual": false
},
"e-s23-t90": {
"id": "e-s23-t90",
"ids": ["s23", "t90"],
"labels": ["O-SAM"],
"manual": false
},
"e-s24-t24": {
"id": "e-s24-t24",
"ids": ["s24", "t24"],
"labels": [],
"manual": false
},
"e-s25-t25": {
"id": "e-s25-t25",
"ids": ["s25", "t25"],
"labels": [],
"manual": false
},
"e-s26-t26": {
"id": "e-s26-t26",
"ids": ["s26", "t26"],
"labels": [],
"manual": false
},
"e-s27-t27": {
"id": "e-s27-t27",
"ids": ["s27", "t27"],
"labels": [],
"manual": false
},
"e-s28-t95": {
"id": "e-s28-t95",
"ids": ["s28", "t95"],
"labels": ["B-GLAG"],
"manual": false
},
"e-s29-t29": {
"id": "e-s29-t29",
"ids": ["s29", "t29"],
"labels": [],
"manual": false
},
"e-s30-t30": {
"id": "e-s30-t30",
"ids": ["s30", "t30"],
"labels": [],
"manual": false
},
"e-s31-t31": {
"id": "e-s31-t31",
"ids": ["s31", "t31"],
"labels": [],
"manual": false
},
"e-s32-t97": {
"id": "e-s32-t97",
"ids": ["s32", "t97"],
"labels": ["O-SAM"],
"manual": false
},
"e-s33-t33": {
"id": "e-s33-t33",
"ids": ["s33", "t33"],
"labels": [],
"manual": false
},
"e-s34-t34": {
"id": "e-s34-t34",
"ids": ["s34", "t34"],
"labels": [],
"manual": false
},
"e-s35-t99": {
"id": "e-s35-t99",
"ids": ["s35", "t99"],
"labels": [],
"manual": false
},
"e-s36": {
"id": "e-s36",
"ids": ["s36"],
"labels": ["Z-LOC"],
"manual": false
},
"e-s37-t105": {
"id": "e-s37-t105",
"ids": ["s37", "t105"],
"labels": ["B-PRID", "Z-MV"],
"manual": false
},
"e-s38": {
"id": "e-s38",
"ids": ["s38"],
"labels": ["Z-LOC"],
"manual": false
},
"e-s39-t39": {
"id": "e-s39-t39",
"ids": ["s39", "t39"],
"labels": [],
"manual": false
},
"e-s40-t40": {
"id": "e-s40-t40",
"ids": ["s40", "t40"],
"labels": [],
"manual": false
},
"e-s41-t41": {
"id": "e-s41-t41",
"ids": ["s41", "t41"],
"labels": [],
"manual": false
},
"e-s42-t42": {
"id": "e-s42-t42",
"ids": ["s42", "t42"],
"labels": [],
"manual": false
},
"e-s43-t107": {
"id": "e-s43-t107",
"ids": ["s43", "t107"],
"labels": [],
"manual": false
},
"e-s44": {
"id": "e-s44",
"ids": ["s44"],
"labels": ["Z-LOC"],
"manual": false
},
"e-s45-t113": {
"id": "e-s45-t113",
"ids": ["s45", "t113"],
"labels": ["Z-MV", "Z-SN"],
"manual": false
},
"e-s46": {
"id": "e-s46",
"ids": ["s46"],
"labels": ["Z-LOC"],
"manual": false
},
"e-s47-t47": {
"id": "e-s47-t47",
"ids": ["s47", "t47"],
"labels": [],
"manual": false
},
"e-s48-t48": {
"id": "e-s48-t48",
"ids": ["s48", "t48"],
"labels": [],
"manual": false
},
"e-s49-t49": {
"id": "e-s49-t49",
"ids": ["s49", "t49"],
"labels": [],
"manual": false
},
"e-s50-t50": {
"id": "e-s50-t50",
"ids": ["s50", "t50"],
"labels": [],
"manual": false
},
"e-s51-t51": {
"id": "e-s51-t51",
"ids": ["s51", "t51"],
"labels": [],
"manual": false
},
"e-s52-t52": {
"id": "e-s52-t52",
"ids": ["s52", "t52"],
"labels": [],
"manual": false
},
"e-s53-t53": {
"id": "e-s53-t53",
"ids": ["s53", "t53"],
"labels": [],
"manual": false
},
"e-s54-t54": {
"id": "e-s54-t54",
"ids": ["s54", "t54"],
"labels": [],
"manual": false
},
"e-s55-t55": {
"id": "e-s55-t55",
"ids": ["s55", "t55"],
"labels": [],
"manual": false
},
"e-s56-t117": {
"id": "e-s56-t117",
"ids": ["s56", "t117"],
"labels": ["O-SAM"],
"manual": false
},
"e-s57-t57": {
"id": "e-s57-t57",
"ids": ["s57", "t57"],
"labels": [],
"manual": false
},
"e-t86": {
"id": "e-t86",
"ids": ["t86"],
"labels": ["Z-LOC"],
"manual": false
}
}
}
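
The svala sample files in this diff share one shape: `source` and `target` token lists plus an `edges` map whose `ids` name the tokens each edge connects. A minimal sketch of how such a file could be inspected follows. The helper name `summarize_edges` is hypothetical and is not part of the repository; the three sample edges are copied from the file above.

```python
def summarize_edges(svala):
    """Group edge ids by whether they touch source tokens, target tokens, or both."""
    summary = {"aligned": [], "source_only": [], "target_only": []}
    for edge_id, edge in svala["edges"].items():
        # Token ids start with "s" (source) or "t" (target) in these files.
        has_source = any(token_id.startswith("s") for token_id in edge["ids"])
        has_target = any(token_id.startswith("t") for token_id in edge["ids"])
        if has_source and has_target:
            summary["aligned"].append(edge_id)
        elif has_source:
            summary["source_only"].append(edge_id)
        else:
            summary["target_only"].append(edge_id)
    return summary


# Three edges taken from the sample above, written as a Python dict.
sample = {
    "edges": {
        "e-s44": {"id": "e-s44", "ids": ["s44"], "labels": ["Z-LOC"], "manual": False},
        "e-s45-t113": {"id": "e-s45-t113", "ids": ["s45", "t113"],
                       "labels": ["Z-MV", "Z-SN"], "manual": False},
        "e-t86": {"id": "e-t86", "ids": ["t86"], "labels": ["Z-LOC"], "manual": False},
    }
}

summary = summarize_edges(sample)
print(summary)
```

For a file on disk, the same function would take the result of `json.load`. In the sample, the source-only edge `e-s44` belongs to a quotation mark that has no counterpart in the target, and the target-only edge `e-t86` belongs to a full stop that appears only in the target.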


@ -0,0 +1,288 @@
{
"source": [
{"id": "s0", "text": "Jaz "},
{"id": "s1", "text": "ne "},
{"id": "s2", "text": "maram "},
{"id": "s3", "text": "zimo "},
{"id": "s4", "text": ", "},
{"id": "s5", "text": "vendar "},
{"id": "s6", "text": "Novo "},
{"id": "s7", "text": "leto "},
{"id": "s8", "text": "za "},
{"id": "s9", "text": "me "},
{"id": "s10", "text": "je "},
{"id": "s11", "text": "vsaj "},
{"id": "s12", "text": "nekaj "},
{"id": "s13", "text": "veselja "},
{"id": "s14", "text": ". "},
{"id": "s15", "text": "Kot "},
{"id": "s16", "text": "sem "},
{"id": "s17", "text": "bil "},
{"id": "s18", "text": "otrokom "},
{"id": "s19", "text": ", "},
{"id": "s20", "text": "je "},
{"id": "s21", "text": "bilo "},
{"id": "s22", "text": "zanimivo "},
{"id": "s23", "text": "in "},
{"id": "s24", "text": "zabavno "},
{"id": "s25", "text": "prejemati "},
{"id": "s26", "text": "darila "},
{"id": "s27", "text": ". "},
{"id": "s28", "text": "Zdaj "},
{"id": "s29", "text": "to "},
{"id": "s30", "text": "je "},
{"id": "s31", "text": "zelo "},
{"id": "s32", "text": "težko "},
{"id": "s33", "text": "izbrati "},
{"id": "s34", "text": "darila "},
{"id": "s35", "text": ". "}
],
"target": [
{"id": "t0", "text": "Jaz "},
{"id": "t1", "text": "ne "},
{"id": "t2", "text": "maram "},
{"id": "t37", "text": "zime "},
{"id": "t4", "text": ", "},
{"id": "t5", "text": "vendar "},
{"id": "t41", "text": "je "},
{"id": "t43", "text": "novo "},
{"id": "t7", "text": "leto "},
{"id": "t44", "text": "zame "},
{"id": "t45", "text": "vsaj "},
{"id": "t12", "text": "nekaj "},
{"id": "t13", "text": "veselja "},
{"id": "t14", "text": ". "},
{"id": "t46", "text": "Ko "},
{"id": "t16", "text": "sem "},
{"id": "t17", "text": "bil "},
{"id": "t48", "text": "otrok "},
{"id": "t19", "text": ", "},
{"id": "t20", "text": "je "},
{"id": "t21", "text": "bilo "},
{"id": "t22", "text": "zanimivo "},
{"id": "t23", "text": "in "},
{"id": "t24", "text": "zabavno "},
{"id": "t25", "text": "prejemati "},
{"id": "t26", "text": "darila "},
{"id": "t27", "text": ". "},
{"id": "t49", "text": "Zdaj "},
{"id": "t30", "text": "je "},
{"id": "t31", "text": "zelo "},
{"id": "t32", "text": "težko "},
{"id": "t33", "text": "izbrati "},
{"id": "t34", "text": "darila "},
{"id": "t35", "text": ". "}
],
"edges": {
"e-s10-t41": {
"id": "e-s10-t41",
"ids": ["s10", "t41"],
"labels": ["S-BR"],
"manual": true
},
"e-s0-t0": {
"id": "e-s0-t0",
"ids": ["s0", "t0"],
"labels": [],
"manual": false
},
"e-s1-t1": {
"id": "e-s1-t1",
"ids": ["s1", "t1"],
"labels": [],
"manual": false
},
"e-s2-t2": {
"id": "e-s2-t2",
"ids": ["s2", "t2"],
"labels": [],
"manual": false
},
"e-s3-t37": {
"id": "e-s3-t37",
"ids": ["s3", "t37"],
"labels": ["O-SAM"],
"manual": false
},
"e-s4-t4": {
"id": "e-s4-t4",
"ids": ["s4", "t4"],
"labels": [],
"manual": false
},
"e-s5-t5": {
"id": "e-s5-t5",
"ids": ["s5", "t5"],
"labels": [],
"manual": false
},
"e-s6-t43": {
"id": "e-s6-t43",
"ids": ["s6", "t43"],
"labels": ["Z-MV"],
"manual": false
},
"e-s7-t7": {
"id": "e-s7-t7",
"ids": ["s7", "t7"],
"labels": [],
"manual": false
},
"e-s8-s9-t44": {
"id": "e-s8-s9-t44",
"ids": ["s8", "s9", "t44"],
"labels": ["Z-SN"],
"manual": false
},
"e-s11-t45": {
"id": "e-s11-t45",
"ids": ["s11", "t45"],
"labels": [],
"manual": false
},
"e-s12-t12": {
"id": "e-s12-t12",
"ids": ["s12", "t12"],
"labels": [],
"manual": false
},
"e-s13-t13": {
"id": "e-s13-t13",
"ids": ["s13", "t13"],
"labels": [],
"manual": false
},
"e-s14-t14": {
"id": "e-s14-t14",
"ids": ["s14", "t14"],
"labels": [],
"manual": false
},
"e-s15-t46": {
"id": "e-s15-t46",
"ids": ["s15", "t46"],
"labels": ["B-VEZ"],
"manual": false
},
"e-s16-t16": {
"id": "e-s16-t16",
"ids": ["s16", "t16"],
"labels": [],
"manual": false
},
"e-s17-t17": {
"id": "e-s17-t17",
"ids": ["s17", "t17"],
"labels": [],
"manual": false
},
"e-s18-t48": {
"id": "e-s18-t48",
"ids": ["s18", "t48"],
"labels": ["O-SAM"],
"manual": false
},
"e-s19-t19": {
"id": "e-s19-t19",
"ids": ["s19", "t19"],
"labels": [],
"manual": false
},
"e-s20-t20": {
"id": "e-s20-t20",
"ids": ["s20", "t20"],
"labels": [],
"manual": false
},
"e-s21-t21": {
"id": "e-s21-t21",
"ids": ["s21", "t21"],
"labels": [],
"manual": false
},
"e-s22-t22": {
"id": "e-s22-t22",
"ids": ["s22", "t22"],
"labels": [],
"manual": false
},
"e-s23-t23": {
"id": "e-s23-t23",
"ids": ["s23", "t23"],
"labels": [],
"manual": false
},
"e-s24-t24": {
"id": "e-s24-t24",
"ids": ["s24", "t24"],
"labels": [],
"manual": false
},
"e-s25-t25": {
"id": "e-s25-t25",
"ids": ["s25", "t25"],
"labels": [],
"manual": false
},
"e-s26-t26": {
"id": "e-s26-t26",
"ids": ["s26", "t26"],
"labels": [],
"manual": false
},
"e-s27-t27": {
"id": "e-s27-t27",
"ids": ["s27", "t27"],
"labels": [],
"manual": false
},
"e-s28-t49": {
"id": "e-s28-t49",
"ids": ["s28", "t49"],
"labels": [],
"manual": false
},
"e-s29": {
"id": "e-s29",
"ids": ["s29"],
"labels": ["S-ODV"],
"manual": false
},
"e-s30-t30": {
"id": "e-s30-t30",
"ids": ["s30", "t30"],
"labels": [],
"manual": false
},
"e-s31-t31": {
"id": "e-s31-t31",
"ids": ["s31", "t31"],
"labels": [],
"manual": false
},
"e-s32-t32": {
"id": "e-s32-t32",
"ids": ["s32", "t32"],
"labels": [],
"manual": false
},
"e-s33-t33": {
"id": "e-s33-t33",
"ids": ["s33", "t33"],
"labels": [],
"manual": false
},
"e-s34-t34": {
"id": "e-s34-t34",
"ids": ["s34", "t34"],
"labels": [],
"manual": false
},
"e-s35-t35": {
"id": "e-s35-t35",
"ids": ["s35", "t35"],
"labels": [],
"manual": false
}
}
}


@ -0,0 +1,622 @@
{
"source": [
{"id": "s0", "text": "Zelo "},
{"id": "s1", "text": "rada "},
{"id": "s2", "text": "imam "},
{"id": "s3", "text": "slovanske "},
{"id": "s4", "text": "jezike "},
{"id": "s5", "text": ", "},
{"id": "s6", "text": "saj "},
{"id": "s7", "text": "je "},
{"id": "s8", "text": "tudi "},
{"id": "s9", "text": "moj "},
{"id": "s10", "text": "materni "},
{"id": "s11", "text": "jezik "},
{"id": "s12", "text": ", "},
{"id": "s13", "text": "ukrajinščina "},
{"id": "s14", "text": ", "},
{"id": "s15", "text": "slovanski "},
{"id": "s16", "text": "jezik "},
{"id": "s17", "text": ". "},
{"id": "s18", "text": "Ko "},
{"id": "s19", "text": "sem "},
{"id": "s20", "text": "bila "},
{"id": "s21", "text": "v "},
{"id": "s22", "text": "šoli "},
{"id": "s23", "text": ", "},
{"id": "s24", "text": "veliko "},
{"id": "s25", "text": "slišala "},
{"id": "s26", "text": "o "},
{"id": "s27", "text": "Sloveniji "},
{"id": "s28", "text": ": "},
{"id": "s29", "text": "majhni "},
{"id": "s30", "text": ", "},
{"id": "s31", "text": "prijetni "},
{"id": "s32", "text": "in "},
{"id": "s33", "text": "neverjetno "},
{"id": "s34", "text": "lepi "},
{"id": "s35", "text": "deželi "},
{"id": "s36", "text": ". "},
{"id": "s37", "text": "Tako "},
{"id": "s38", "text": "zaradi "},
{"id": "s39", "text": "moje "},
{"id": "s40", "text": "ljubezni "},
{"id": "s41", "text": "do "},
{"id": "s42", "text": "slovanskih "},
{"id": "s43", "text": "jezikov "},
{"id": "s44", "text": "in "},
{"id": "s45", "text": "Slovenije "},
{"id": "s46", "text": ", "},
{"id": "s47", "text": "slovenščino "},
{"id": "s48", "text": "študiram "},
{"id": "s49", "text": "že "},
{"id": "s50", "text": "tri "},
{"id": "s51", "text": "leta "},
{"id": "s52", "text": ". "},
{"id": "s53", "text": "Všeč "},
{"id": "s54", "text": "mi "},
{"id": "s55", "text": "je "},
{"id": "s56", "text": "praksa "},
{"id": "s57", "text": "prevajanja "},
{"id": "s58", "text": ". "},
{"id": "s59", "text": "Na "},
{"id": "s60", "text": "univerzi "},
{"id": "s61", "text": "pogosto "},
{"id": "s62", "text": "prevajam "},
{"id": "s63", "text": "iz "},
{"id": "s64", "text": "slovenščine "},
{"id": "s65", "text": "v "},
{"id": "s66", "text": "ukrajinščino "},
{"id": "s67", "text": ". "},
{"id": "s68", "text": "V "},
{"id": "s69", "text": "prihodnosti "},
{"id": "s70", "text": "želim "},
{"id": "s71", "text": "delati "},
{"id": "s72", "text": "kot "},
{"id": "s73", "text": "slovensko-ukrajinska "},
{"id": "s74", "text": "prevajalka "},
{"id": "s75", "text": ". "}
],
"target": [
{"id": "t0", "text": "Zelo "},
{"id": "t1", "text": "rada "},
{"id": "t2", "text": "imam "},
{"id": "t3", "text": "slovanske "},
{"id": "t4", "text": "jezike "},
{"id": "t5", "text": ", "},
{"id": "t6", "text": "saj "},
{"id": "t7", "text": "je "},
{"id": "t8", "text": "tudi "},
{"id": "t9", "text": "moj "},
{"id": "t10", "text": "materni "},
{"id": "t11", "text": "jezik "},
{"id": "t12", "text": ", "},
{"id": "t13", "text": "ukrajinščina "},
{"id": "t14", "text": ", "},
{"id": "t15", "text": "slovanski "},
{"id": "t16", "text": "jezik "},
{"id": "t17", "text": ". "},
{"id": "t18", "text": "Ko "},
{"id": "t19", "text": "sem "},
{"id": "t20", "text": "bila "},
{"id": "t21", "text": "v "},
{"id": "t22", "text": "šoli "},
{"id": "t23", "text": ", "},
{"id": "t79", "text": "sem "},
{"id": "t80", "text": "veliko "},
{"id": "t25", "text": "slišala "},
{"id": "t26", "text": "o "},
{"id": "t27", "text": "Sloveniji "},
{"id": "t28", "text": ": "},
{"id": "t29", "text": "majhni "},
{"id": "t30", "text": ", "},
{"id": "t31", "text": "prijetni "},
{"id": "t32", "text": "in "},
{"id": "t33", "text": "neverjetno "},
{"id": "t34", "text": "lepi "},
{"id": "t35", "text": "deželi "},
{"id": "t36", "text": ". "},
{"id": "t37", "text": "Tako "},
{"id": "t38", "text": "zaradi "},
{"id": "t39", "text": "moje "},
{"id": "t40", "text": "ljubezni "},
{"id": "t41", "text": "do "},
{"id": "t42", "text": "slovanskih "},
{"id": "t43", "text": "jezikov "},
{"id": "t44", "text": "in "},
{"id": "t82", "text": "Slovenije "},
{"id": "t83", "text": "slovenščino "},
{"id": "t48", "text": "študiram "},
{"id": "t49", "text": "že "},
{"id": "t50", "text": "tri "},
{"id": "t51", "text": "leta "},
{"id": "t52", "text": ". "},
{"id": "t53", "text": "Všeč "},
{"id": "t54", "text": "mi "},
{"id": "t55", "text": "je "},
{"id": "t56", "text": "praksa "},
{"id": "t57", "text": "prevajanja "},
{"id": "t58", "text": ". "},
{"id": "t59", "text": "Na "},
{"id": "t60", "text": "univerzi "},
{"id": "t61", "text": "pogosto "},
{"id": "t62", "text": "prevajam "},
{"id": "t63", "text": "iz "},
{"id": "t64", "text": "slovenščine "},
{"id": "t65", "text": "v "},
{"id": "t66", "text": "ukrajinščino "},
{"id": "t67", "text": ". "},
{"id": "t68", "text": "V "},
{"id": "t69", "text": "prihodnosti "},
{"id": "t70", "text": "želim "},
{"id": "t71", "text": "delati "},
{"id": "t72", "text": "kot "},
{"id": "t73", "text": "slovensko-ukrajinska "},
{"id": "t74", "text": "prevajalka "},
{"id": "t75", "text": ". "}
],
"edges": {
"e-s0-t0": {
"id": "e-s0-t0",
"ids": ["s0", "t0"],
"labels": [],
"manual": false
},
"e-s1-t1": {
"id": "e-s1-t1",
"ids": ["s1", "t1"],
"labels": [],
"manual": false
},
"e-s2-t2": {
"id": "e-s2-t2",
"ids": ["s2", "t2"],
"labels": [],
"manual": false
},
"e-s3-t3": {
"id": "e-s3-t3",
"ids": ["s3", "t3"],
"labels": [],
"manual": false
},
"e-s4-t4": {
"id": "e-s4-t4",
"ids": ["s4", "t4"],
"labels": [],
"manual": false
},
"e-s5-t5": {
"id": "e-s5-t5",
"ids": ["s5", "t5"],
"labels": [],
"manual": false
},
"e-s6-t6": {
"id": "e-s6-t6",
"ids": ["s6", "t6"],
"labels": [],
"manual": false
},
"e-s7-t7": {
"id": "e-s7-t7",
"ids": ["s7", "t7"],
"labels": [],
"manual": false
},
"e-s8-t8": {
"id": "e-s8-t8",
"ids": ["s8", "t8"],
"labels": [],
"manual": false
},
"e-s9-t9": {
"id": "e-s9-t9",
"ids": ["s9", "t9"],
"labels": [],
"manual": false
},
"e-s10-t10": {
"id": "e-s10-t10",
"ids": ["s10", "t10"],
"labels": [],
"manual": false
},
"e-s11-t11": {
"id": "e-s11-t11",
"ids": ["s11", "t11"],
"labels": [],
"manual": false
},
"e-s12-t12": {
"id": "e-s12-t12",
"ids": ["s12", "t12"],
"labels": [],
"manual": false
},
"e-s13-t13": {
"id": "e-s13-t13",
"ids": ["s13", "t13"],
"labels": [],
"manual": false
},
"e-s14-t14": {
"id": "e-s14-t14",
"ids": ["s14", "t14"],
"labels": [],
"manual": false
},
"e-s15-t15": {
"id": "e-s15-t15",
"ids": ["s15", "t15"],
"labels": [],
"manual": false
},
"e-s16-t16": {
"id": "e-s16-t16",
"ids": ["s16", "t16"],
"labels": [],
"manual": false
},
"e-s17-t17": {
"id": "e-s17-t17",
"ids": ["s17", "t17"],
"labels": [],
"manual": false
},
"e-s18-t18": {
"id": "e-s18-t18",
"ids": ["s18", "t18"],
"labels": [],
"manual": false
},
"e-s19-t19": {
"id": "e-s19-t19",
"ids": ["s19", "t19"],
"labels": [],
"manual": false
},
"e-s20-t20": {
"id": "e-s20-t20",
"ids": ["s20", "t20"],
"labels": [],
"manual": false
},
"e-s21-t21": {
"id": "e-s21-t21",
"ids": ["s21", "t21"],
"labels": [],
"manual": false
},
"e-s22-t22": {
"id": "e-s22-t22",
"ids": ["s22", "t22"],
"labels": [],
"manual": false
},
"e-s23-t23": {
"id": "e-s23-t23",
"ids": ["s23", "t23"],
"labels": [],
"manual": false
},
"e-s24-t80": {
"id": "e-s24-t80",
"ids": ["s24", "t80"],
"labels": [],
"manual": false
},
"e-s25-t25": {
"id": "e-s25-t25",
"ids": ["s25", "t25"],
"labels": [],
"manual": false
},
"e-s26-t26": {
"id": "e-s26-t26",
"ids": ["s26", "t26"],
"labels": [],
"manual": false
},
"e-s27-t27": {
"id": "e-s27-t27",
"ids": ["s27", "t27"],
"labels": [],
"manual": false
},
"e-s28-t28": {
"id": "e-s28-t28",
"ids": ["s28", "t28"],
"labels": [],
"manual": false
},
"e-s29-t29": {
"id": "e-s29-t29",
"ids": ["s29", "t29"],
"labels": [],
"manual": false
},
"e-s30-t30": {
"id": "e-s30-t30",
"ids": ["s30", "t30"],
"labels": [],
"manual": false
},
"e-s31-t31": {
"id": "e-s31-t31",
"ids": ["s31", "t31"],
"labels": [],
"manual": false
},
"e-s32-t32": {
"id": "e-s32-t32",
"ids": ["s32", "t32"],
"labels": [],
"manual": false
},
"e-s33-t33": {
"id": "e-s33-t33",
"ids": ["s33", "t33"],
"labels": [],
"manual": false
},
"e-s34-t34": {
"id": "e-s34-t34",
"ids": ["s34", "t34"],
"labels": [],
"manual": false
},
"e-s35-t35": {
"id": "e-s35-t35",
"ids": ["s35", "t35"],
"labels": [],
"manual": false
},
"e-s36-t36": {
"id": "e-s36-t36",
"ids": ["s36", "t36"],
"labels": [],
"manual": false
},
"e-s37-t37": {
"id": "e-s37-t37",
"ids": ["s37", "t37"],
"labels": [],
"manual": false
},
"e-s38-t38": {
"id": "e-s38-t38",
"ids": ["s38", "t38"],
"labels": [],
"manual": false
},
"e-s39-t39": {
"id": "e-s39-t39",
"ids": ["s39", "t39"],
"labels": [],
"manual": false
},
"e-s40-t40": {
"id": "e-s40-t40",
"ids": ["s40", "t40"],
"labels": [],
"manual": false
},
"e-s41-t41": {
"id": "e-s41-t41",
"ids": ["s41", "t41"],
"labels": [],
"manual": false
},
"e-s42-t42": {
"id": "e-s42-t42",
"ids": ["s42", "t42"],
"labels": [],
"manual": false
},
"e-s43-t43": {
"id": "e-s43-t43",
"ids": ["s43", "t43"],
"labels": [],
"manual": false
},
"e-s44-t44": {
"id": "e-s44-t44",
"ids": ["s44", "t44"],
"labels": [],
"manual": false
},
"e-s45-t82": {
"id": "e-s45-t82",
"ids": ["s45", "t82"],
"labels": [],
"manual": false
},
"e-s46": {
"id": "e-s46",
"ids": ["s46"],
"labels": ["Z-LOC"],
"manual": false
},
"e-s47-t83": {
"id": "e-s47-t83",
"ids": ["s47", "t83"],
"labels": [],
"manual": false
},
"e-s48-t48": {
"id": "e-s48-t48",
"ids": ["s48", "t48"],
"labels": [],
"manual": false
},
"e-s49-t49": {
"id": "e-s49-t49",
"ids": ["s49", "t49"],
"labels": [],
"manual": false
},
"e-s50-t50": {
"id": "e-s50-t50",
"ids": ["s50", "t50"],
"labels": [],
"manual": false
},
"e-s51-t51": {
"id": "e-s51-t51",
"ids": ["s51", "t51"],
"labels": [],
"manual": false
},
"e-s52-t52": {
"id": "e-s52-t52",
"ids": ["s52", "t52"],
"labels": [],
"manual": false
},
"e-s53-t53": {
"id": "e-s53-t53",
"ids": ["s53", "t53"],
"labels": [],
"manual": false
},
"e-s54-t54": {
"id": "e-s54-t54",
"ids": ["s54", "t54"],
"labels": [],
"manual": false
},
"e-s55-t55": {
"id": "e-s55-t55",
"ids": ["s55", "t55"],
"labels": [],
"manual": false
},
"e-s56-t56": {
"id": "e-s56-t56",
"ids": ["s56", "t56"],
"labels": [],
"manual": false
},
"e-s57-t57": {
"id": "e-s57-t57",
"ids": ["s57", "t57"],
"labels": [],
"manual": false
},
"e-s58-t58": {
"id": "e-s58-t58",
"ids": ["s58", "t58"],
"labels": [],
"manual": false
},
"e-s59-t59": {
"id": "e-s59-t59",
"ids": ["s59", "t59"],
"labels": [],
"manual": false
},
"e-s60-t60": {
"id": "e-s60-t60",
"ids": ["s60", "t60"],
"labels": [],
"manual": false
},
"e-s61-t61": {
"id": "e-s61-t61",
"ids": ["s61", "t61"],
"labels": [],
"manual": false
},
"e-s62-t62": {
"id": "e-s62-t62",
"ids": ["s62", "t62"],
"labels": [],
"manual": false
},
"e-s63-t63": {
"id": "e-s63-t63",
"ids": ["s63", "t63"],
"labels": [],
"manual": false
},
"e-s64-t64": {
"id": "e-s64-t64",
"ids": ["s64", "t64"],
"labels": [],
"manual": false
},
"e-s65-t65": {
"id": "e-s65-t65",
"ids": ["s65", "t65"],
"labels": [],
"manual": false
},
"e-s66-t66": {
"id": "e-s66-t66",
"ids": ["s66", "t66"],
"labels": [],
"manual": false
},
"e-s67-t67": {
"id": "e-s67-t67",
"ids": ["s67", "t67"],
"labels": [],
"manual": false
},
"e-s68-t68": {
"id": "e-s68-t68",
"ids": ["s68", "t68"],
"labels": [],
"manual": false
},
"e-s69-t69": {
"id": "e-s69-t69",
"ids": ["s69", "t69"],
"labels": [],
"manual": false
},
"e-s70-t70": {
"id": "e-s70-t70",
"ids": ["s70", "t70"],
"labels": [],
"manual": false
},
"e-s71-t71": {
"id": "e-s71-t71",
"ids": ["s71", "t71"],
"labels": [],
"manual": false
},
"e-s72-t72": {
"id": "e-s72-t72",
"ids": ["s72", "t72"],
"labels": [],
"manual": false
},
"e-s73-t73": {
"id": "e-s73-t73",
"ids": ["s73", "t73"],
"labels": [],
"manual": false
},
"e-s74-t74": {
"id": "e-s74-t74",
"ids": ["s74", "t74"],
"labels": [],
"manual": false
},
"e-s75-t75": {
"id": "e-s75-t75",
"ids": ["s75", "t75"],
"labels": [],
"manual": false
},
"e-t79": {
"id": "e-t79",
"ids": ["t79"],
"labels": ["S-IZP"],
"manual": false
}
}
}


@ -0,0 +1,208 @@
{
"source": [
{"id": "s0", "text": "Na "},
{"id": "s1", "text": "58 "},
{"id": "s2", "text": ". "},
{"id": "s3", "text": "seminarju "},
{"id": "s4", "text": "slovenskega "},
{"id": "s5", "text": "jezika "},
{"id": "s6", "text": ", "},
{"id": "s7", "text": "literature "},
{"id": "s8", "text": "in "},
{"id": "s9", "text": "kulture "},
{"id": "s10", "text": "hotela "},
{"id": "s11", "text": "bi "},
{"id": "s12", "text": "izvedeti "},
{"id": "s13", "text": "več "},
{"id": "s14", "text": "o "},
{"id": "s15", "text": "slovenski "},
{"id": "s16", "text": "kulturi "},
{"id": "s17", "text": "in "},
{"id": "s18", "text": "izboljšati "},
{"id": "s19", "text": "svojo "},
{"id": "s20", "text": "slovenščino "},
{"id": "s21", "text": ", "},
{"id": "s22", "text": "predvsem "},
{"id": "s23", "text": "izgovorjavo "},
{"id": "s24", "text": ". "}
],
"target": [
{"id": "t0", "text": "Na "},
{"id": "t1", "text": "58 "},
{"id": "t2", "text": ". "},
{"id": "t3", "text": "seminarju "},
{"id": "t4", "text": "slovenskega "},
{"id": "t5", "text": "jezika "},
{"id": "t6", "text": ", "},
{"id": "t7", "text": "literature "},
{"id": "t8", "text": "in "},
{"id": "t9", "text": "kulture "},
{"id": "t27", "text": "bi "},
{"id": "t28", "text": "hotela "},
{"id": "t29", "text": "izvedeti "},
{"id": "t13", "text": "več "},
{"id": "t14", "text": "o "},
{"id": "t15", "text": "slovenski "},
{"id": "t16", "text": "kulturi "},
{"id": "t17", "text": "in "},
{"id": "t18", "text": "izboljšati "},
{"id": "t19", "text": "svojo "},
{"id": "t20", "text": "slovenščino "},
{"id": "t21", "text": ", "},
{"id": "t22", "text": "predvsem "},
{"id": "t23", "text": "izgovorjavo "},
{"id": "t24", "text": ". "}
],
"edges": {
"e-s11-t27": {
"id": "e-s11-t27",
"ids": ["s11", "t27"],
"labels": ["S-BR"],
"manual": true
},
"e-s0-t0": {
"id": "e-s0-t0",
"ids": ["s0", "t0"],
"labels": [],
"manual": false
},
"e-s1-t1": {
"id": "e-s1-t1",
"ids": ["s1", "t1"],
"labels": [],
"manual": false
},
"e-s2-t2": {
"id": "e-s2-t2",
"ids": ["s2", "t2"],
"labels": [],
"manual": false
},
"e-s3-t3": {
"id": "e-s3-t3",
"ids": ["s3", "t3"],
"labels": [],
"manual": false
},
"e-s4-t4": {
"id": "e-s4-t4",
"ids": ["s4", "t4"],
"labels": [],
"manual": false
},
"e-s5-t5": {
"id": "e-s5-t5",
"ids": ["s5", "t5"],
"labels": [],
"manual": false
},
"e-s6-t6": {
"id": "e-s6-t6",
"ids": ["s6", "t6"],
"labels": [],
"manual": false
},
"e-s7-t7": {
"id": "e-s7-t7",
"ids": ["s7", "t7"],
"labels": [],
"manual": false
},
"e-s8-t8": {
"id": "e-s8-t8",
"ids": ["s8", "t8"],
"labels": [],
"manual": false
},
"e-s9-t9": {
"id": "e-s9-t9",
"ids": ["s9", "t9"],
"labels": [],
"manual": false
},
"e-s10-t28": {
"id": "e-s10-t28",
"ids": ["s10", "t28"],
"labels": [],
"manual": false
},
"e-s12-t29": {
"id": "e-s12-t29",
"ids": ["s12", "t29"],
"labels": [],
"manual": false
},
"e-s13-t13": {
"id": "e-s13-t13",
"ids": ["s13", "t13"],
"labels": [],
"manual": false
},
"e-s14-t14": {
"id": "e-s14-t14",
"ids": ["s14", "t14"],
"labels": [],
"manual": false
},
"e-s15-t15": {
"id": "e-s15-t15",
"ids": ["s15", "t15"],
"labels": [],
"manual": false
},
"e-s16-t16": {
"id": "e-s16-t16",
"ids": ["s16", "t16"],
"labels": [],
"manual": false
},
"e-s17-t17": {
"id": "e-s17-t17",
"ids": ["s17", "t17"],
"labels": [],
"manual": false
},
"e-s18-t18": {
"id": "e-s18-t18",
"ids": ["s18", "t18"],
"labels": [],
"manual": false
},
"e-s19-t19": {
"id": "e-s19-t19",
"ids": ["s19", "t19"],
"labels": [],
"manual": false
},
"e-s20-t20": {
"id": "e-s20-t20",
"ids": ["s20", "t20"],
"labels": [],
"manual": false
},
"e-s21-t21": {
"id": "e-s21-t21",
"ids": ["s21", "t21"],
"labels": [],
"manual": false
},
"e-s22-t22": {
"id": "e-s22-t22",
"ids": ["s22", "t22"],
"labels": [],
"manual": false
},
"e-s23-t23": {
"id": "e-s23-t23",
"ids": ["s23", "t23"],
"labels": [],
"manual": false
},
"e-s24-t24": {
"id": "e-s24-t24",
"ids": ["s24", "t24"],
"labels": [],
"manual": false
}
}
}
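
Each token in these files carries its own trailing space, so the running text of either side can be rebuilt by concatenating the `text` fields. The sketch below shows that, and also pulls out the edges that carry labels. The helpers `render` and `labelled_edges` are hypothetical names, not functions from the repository; the tokens and edges are copied from the file above.

```python
def render(tokens):
    """Join token texts; every token already ends with its own whitespace."""
    return "".join(token["text"] for token in tokens).rstrip()


def labelled_edges(svala):
    """Return only the edges that carry at least one label."""
    return {edge_id: edge["labels"]
            for edge_id, edge in svala["edges"].items()
            if edge["labels"]}


# A three-token excerpt of the file above (word-order correction).
sample = {
    "source": [
        {"id": "s10", "text": "hotela "},
        {"id": "s11", "text": "bi "},
        {"id": "s12", "text": "izvedeti "},
    ],
    "target": [
        {"id": "t27", "text": "bi "},
        {"id": "t28", "text": "hotela "},
        {"id": "t29", "text": "izvedeti "},
    ],
    "edges": {
        "e-s11-t27": {"id": "e-s11-t27", "ids": ["s11", "t27"],
                      "labels": ["S-BR"], "manual": True},
        "e-s10-t28": {"id": "e-s10-t28", "ids": ["s10", "t28"],
                      "labels": [], "manual": False},
        "e-s12-t29": {"id": "e-s12-t29", "ids": ["s12", "t29"],
                      "labels": [], "manual": False},
    },
}

source_text = render(sample["source"])
target_text = render(sample["target"])
labels = labelled_edges(sample)
print(source_text, "->", target_text, labels)
```

Only the edge that links the moved token `bi` is labelled (`S-BR`); the other two edges link unchanged tokens and have empty label lists.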

4
requirements.txt Executable file

@ -0,0 +1,4 @@
classla==1.1.0
conllu==4.5.2
lxml==4.6.3
conversion-utils @ git+https://gitea.cjvt.si/generic/conversion_utils@master

0
solar2svala.py Normal file → Executable file

0
src/__init__.py Normal file → Executable file

0
src/annotate/__init__.py Executable file

61
src/annotate/annotate.py Executable file

@ -0,0 +1,61 @@
import os
import pickle
import classla


def annotate(tokenized_source_divs, tokenized_target_divs, args):
    # Reuse previously pickled annotations unless an overwrite was requested.
    if os.path.exists(args.annotation_interprocessing) and not args.overwrite_annotation:
        print('READING ANNOTATIONS...')
        with open(args.annotation_interprocessing, 'rb') as rp:
            annotated_source_divs, annotated_target_divs = pickle.load(rp)
            return annotated_source_divs, annotated_target_divs

    # The input is already tokenized and passed to the pipeline as CoNLL-U.
    nlp = classla.Pipeline('sl', pos_use_lexicon=True, pos_lemma_pretag=False, tokenize_pretokenized="conllu",
                           type='standard_jos')

    annotated_source_divs = []
    complete_source_conllu = ''
    print('ANNOTATING SOURCE...')
    for i, div_tuple in enumerate(tokenized_source_divs):
        # Progress in percent.
        print(f'{str(i*100/len(tokenized_source_divs))}')
        div_name, div = div_tuple
        annotated_source_pars = []
        for par_tuple in div:
            par_name, par = par_tuple
            annotated_source_sens = []
            for sen in par:
                # Empty sentences are kept as empty strings.
                source_conllu_annotated = nlp(sen).to_conll() if sen else ''
                annotated_source_sens.append(source_conllu_annotated)
                complete_source_conllu += source_conllu_annotated
            annotated_source_pars.append((par_name, annotated_source_sens))
        annotated_source_divs.append((div_name, annotated_source_pars))

    annotated_target_divs = []
    complete_target_conllu = ''
    print('ANNOTATING TARGET...')
    for i, div_tuple in enumerate(tokenized_target_divs):
        # Progress in percent.
        print(f'{str(i * 100 / len(tokenized_target_divs))}')
        div_name, div = div_tuple
        annotated_target_pars = []
        for par_tuple in div:
            par_name, par = par_tuple
            annotated_target_sens = []
            for sen in par:
                # if sen.count('\n') <= 2:
                #     print('HERE!!!!')
                # Target sentences with two or fewer newlines are not annotated.
                target_conllu_annotated = nlp(sen).to_conll() if sen and sen.count('\n') > 2 else ''
                annotated_target_sens.append(target_conllu_annotated)
                complete_target_conllu += target_conllu_annotated
            annotated_target_pars.append((par_name, annotated_target_sens))
        annotated_target_divs.append((div_name, annotated_target_pars))

    # Write the concatenated CoNLL-U output of both sides to the results folder.
    with open(os.path.join(args.results_folder, "source.conllu"), 'w') as sf:
        sf.write(complete_source_conllu)

    with open(os.path.join(args.results_folder, "target.conllu"), 'w') as sf:
        sf.write(complete_target_conllu)

    # Cache the annotations so that later runs can skip this step.
    with open(args.annotation_interprocessing, 'wb') as wp:
        pickle.dump((annotated_source_divs, annotated_target_divs), wp)

    return annotated_source_divs, annotated_target_divs
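
The `annotate` function above follows a checkpoint pattern: read a pickle if it exists and no overwrite was requested, otherwise compute the result and store it. A minimal standalone sketch of that pattern is given below. The helper `load_or_compute` and the stand-in `expensive` function are hypothetical and are not part of the repository; only `os`, `pickle` and `tempfile` from the standard library are used.

```python
import os
import pickle
import tempfile


def load_or_compute(cache_path, compute, overwrite=False):
    """Return the pickled result at cache_path, or compute and store it."""
    if os.path.exists(cache_path) and not overwrite:
        with open(cache_path, "rb") as rp:
            return pickle.load(rp)
    result = compute()
    with open(cache_path, "wb") as wp:
        pickle.dump(result, wp)
    return result


calls = []


def expensive():
    # Stand-in for the annotation step; records how often it really runs.
    calls.append(1)
    return ([("div1", [])], [("div1", [])])


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "processing.annotation")
    first = load_or_compute(path, expensive)                  # computed and cached
    second = load_or_compute(path, expensive)                 # read from the pickle
    third = load_or_compute(path, expensive, overwrite=True)  # recomputed

print(len(calls))
```

Pickle files should only be loaded from trusted locations, since unpickling can execute arbitrary code.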

178
src/create_tei.py Normal file → Executable file

@ -6,6 +6,36 @@ from conversion_utils.translate_conllu_jos import get_syn_map
from lxml import etree
kost_translations = {
    "Author": "Author",
    "Sex": "Sex",
    "Year of birth": "YearOfBirth",
    "Country": "Country",
    "Employment status": "EmploymentStatus",
    "Completed education": "CompletedEducation",
    "Current school": "CurrentSchool",
    "First language": "FirstLang",
    "Knowledge of other languages": "OtherLang",
    "Duration of Slovene language learning": "DurSlvLearning",
    "Experience with Slovene before current program": "ExpSlv",
    "Language proficiency in Slovene": "ProficSlv",
    "Life in Slovenija before this current program": "LifeSlovenia",
    "Location of Slovene language learning": "LocSlvLearning",
    "Creation date": "CreationDate",
    "Teacher": "Teacher",
    "Academic year": "AcademicYear",
    "Grade": "Grade",
    "Input type": "InputType",
    "Program type": "ProgramType",
    "Program subtype": "ProgramSubtype",
    "Slovene textbooks used": "SloveneTextbooks",
    "Study cycle": "StudyCycle",
    "Study year": "StudyYear",
    "Task setting": "TaskSetting",
    "Topic": "Topic",
    "Instruction": "Instruction"
}
labels_mapper = {
"B/GLAG/moči_morati": "B/GLAG/moči-morati",
"B/MEN/besedna_družina": "B/MEN/besedna-družina",
@ -176,9 +206,8 @@ class Sentence:
class Paragraph:
def __init__(self, _id, _doc_id, is_source):
def __init__(self, _id, _doc_id):
self._id = _id if _id is not None else 'no-id'
_doc_id += 's' if is_source else 't'
self._doc_id = _doc_id if _doc_id is not None else ''
self.sentences = []
@ -231,25 +260,41 @@ class TeiDocument:
tag_usage.set('gi', tag)
tag_usage.set('occurs', str(count))
for (paras, bibl, div_id), (_, _, corresp_div_id) in zip(self.divs, self.corresp_divs):
for (paras, div_id, metadata), (_, corresp_div_id, _) in zip(self.divs, self.corresp_divs):
div = etree.Element('div')
set_xml_attr(div, 'id', div_id)
div.set('corresp', f'#{corresp_div_id}')
bibl = create_bibl(metadata)
div.append(bibl)
for para in paras:
div.append(para.as_xml())
body.append(div)
return root
def add_paragraph(self, paragraph):
self.paragraphs.append(paragraph)
def create_bibl(metadata):
    bibl = etree.Element('bibl')
    bibl.set('n', metadata['Text ID'])
    for k, v in metadata.items():
        # The text ID is already stored in @n; empty values are skipped.
        if k == 'Text ID' or not v:
            continue
        note = etree.Element('note')
        if k not in kost_translations:
            # Unknown column: build a key by capitalizing and joining its words.
            key = ''.join([el.capitalize() for el in k.split()])
        else:
            key = kost_translations[k]
        note.set('ana', f'#{key}')
        note.text = f'{v}'
        bibl.append(note)
    return bibl
def convert_bibl(bibl):
etree_bibl = etree.Element('bibl')
# etree_bibl.set('corresp', bibl.get('corresp'))
etree_bibl.set('n', bibl.get('n'))
for bibl_el in bibl:
etree_bibl_el = etree.Element(bibl_el.tag)
@ -282,17 +327,23 @@ def build_complete_tei(etree_source, etree_target, etree_links):
text = etree.Element('text')
group = etree.Element('group')
print('P3')
group.append(list(etree_source[0])[1])
group.insert(len(group),
list(etree_source[0])[1])
print('P4')
group.append(list(etree_target[0])[1])
group.insert(len(group),
list(etree_target[0])[1])
print('P5')
text.append(group)
text.insert(len(text),
group)
print('P6')
root.append(tei_header)
root.insert(len(root),
tei_header)
print('P7')
root.append(text)
root.insert(len(root),
text)
print('P8')
root.append(etree_links)
root.insert(len(root),
etree_links)
print('P9')
return root
@ -301,47 +352,49 @@ def build_links(all_edges):
body = etree.Element('standOff')
for document_edges in all_edges:
# mine paragraphs
for paragraph_edges in document_edges:
for sentence_edges in paragraph_edges:
s = etree.Element('linkGrp')
p = etree.Element('linkGrp')
corresp_source_id = ''
corresp_target_id = ''
sentence_id = ''
corresp_source_id = ''
corresp_target_id = ''
corresp = []
for token_edges in sentence_edges:
if not corresp_source_id and len(token_edges['source_ids']) > 0:
random_source_id = token_edges['source_ids'][0]
corresp_source_id = '#'
corresp_source_id += '.'.join(random_source_id.split('.')[:3])
corresp.append(corresp_source_id)
if not corresp_target_id and len(token_edges['target_ids']) > 0:
random_target_id = token_edges['target_ids'][0]
corresp_target_id = '#'
corresp_target_id += '.'.join(random_target_id.split('.')[:3])
corresp.append(corresp_target_id)
link = etree.Element('link')
# translate labels
labels_list = []
for label in token_edges['labels']:
if label in labels_mapper:
labels_list.append(labels_mapper[label])
else:
labels_list.append(label)
labels = '|'.join(labels_list) if len(labels_list) > 0 else 'ID'
link.set('type', labels)
link.set('target', ' '.join(['#' + source for source in token_edges['source_ids']] + ['#' + source for source in token_edges['target_ids']]))
for token_edges in paragraph_edges:
if not corresp_source_id and len(token_edges['source_ids']) > 0:
random_source_id = token_edges['source_ids'][0]
corresp_source_id = '#'
corresp_source_id += '.'.join(random_source_id.split('.')[:2])
if not corresp_target_id and len(token_edges['target_ids']) > 0:
random_target_id = token_edges['target_ids'][0]
corresp_target_id = '#'
corresp_target_id += '.'.join(random_target_id.split('.')[:2])
s.append(link)
s.set('type', 'CORR')
targFunc = []
if corresp_source_id:
targFunc.append('orig')
if corresp_target_id:
targFunc.append('reg')
s.set('targFunc', f'{" ".join(targFunc)}')
s.set('corresp', f'{" ".join(corresp)}')
body.append(s)
link = etree.Element('link')
# translate labels
labels_list = []
for label in token_edges['labels']:
if label in labels_mapper:
labels_list.append(labels_mapper[label])
else:
labels_list.append(label)
labels = '|'.join(labels_list) if len(labels_list) > 0 else 'ID'
link.set('type', labels)
link.set('target', ' '.join(['#' + source for source in token_edges['source_ids']] + ['#' + source for source in token_edges['target_ids']]))
p.append(link)
corresp = []
if corresp_source_id:
corresp.append(corresp_source_id)
if corresp_target_id:
corresp.append(corresp_target_id)
p.set('type', 'CORR')
targFunc = []
if corresp_source_id:
targFunc.append('orig')
if corresp_target_id:
targFunc.append('reg')
p.set('targFunc', f'{" ".join(targFunc)}')
p.set('corresp', f'{" ".join(corresp)}')
body.append(p)
return body
@ -365,8 +418,8 @@ def is_metaline(line):
return False
def construct_paragraph_from_list(doc_id, para_id, etree_source_sentences, source_id):
para = Paragraph(para_id, doc_id, source_id)
def construct_paragraph_from_list(doc_id, para_id, etree_source_sentences):
para = Paragraph(para_id, doc_id)
for sentence in etree_source_sentences:
para.add_sentence(sentence)
@ -374,29 +427,6 @@ def construct_paragraph_from_list(doc_id, para_id, etree_source_sentences, sourc
return para
def construct_paragraph(doc_id, para_id, conllu_lines, is_source):
para = Paragraph(para_id, doc_id, is_source)
sent_id = None
sent_buffer = []
for line in conllu_lines:
if is_metaline(line):
key, val = parse_metaline(line)
if key == 'sent_id':
if len(sent_buffer) > 0:
para.add_sentence(construct_sentence(sent_id, sent_buffer))
sent_buffer = []
sent_id = val
elif not line.isspace():
sent_buffer.append(line)
if len(sent_buffer) > 0:
para.add_sentence(construct_sentence(sent_id, sent_buffer))
return para
def construct_sentence_from_list(sent_id, object_list, is_source):
sentence = Sentence(sent_id)
converter = Converter()
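
The new `create_bibl` helper in this file turns one row of text metadata into a `<bibl>` element with one `<note>` per non-empty column. The following is a minimal sketch of that idea, not the project's code: it uses the standard library's `xml.etree.ElementTree` instead of the `etree` module imported in the diff, and `TRANSLATIONS` and `build_bibl` are hypothetical names, with `TRANSLATIONS` a two-entry excerpt of the mapper.

```python
import xml.etree.ElementTree as ET

# Hypothetical two-entry excerpt of the column-name mapper shown in the diff.
TRANSLATIONS = {'First language': 'FirstLang', 'Academic year': 'AcademicYear'}


def build_bibl(metadata, translations=TRANSLATIONS):
    """Build a <bibl> element with one <note> per non-empty metadata column."""
    bibl = ET.Element('bibl')
    bibl.set('n', metadata['Text ID'])
    for column, value in metadata.items():
        # The text ID is already stored in @n; empty values are skipped.
        if column == 'Text ID' or not value:
            continue
        if column in translations:
            key = translations[column]
        else:
            # Unknown columns fall back to a CamelCase form of the header.
            key = ''.join(word.capitalize() for word in column.split())
        note = ET.SubElement(bibl, 'note')
        note.set('ana', f'#{key}')
        note.text = str(value)
    return bibl


row = {'Text ID': 'KOST-0001', 'First language': 'srbščina',
       'Academic year': '', 'Some new column': 'x'}
bibl = build_bibl(row)
print(ET.tostring(bibl, encoding='unicode'))
```

The fallback branch means a column missing from the mapper still produces a usable `ana` value, at the cost of silently inventing a category name, so it may be worth logging such columns.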

0
src/read/__init__.py Executable file

94
src/read/hand_fixes.py Executable file

@ -0,0 +1,94 @@
from collections import deque
HAND_FIXES = {'§§§pisala': ['§', '§', '§', 'pisala'], '§§§poldne': ['§', '§', '§', 'poldne'], '§§§o': ['§', '§', '§', 'o'], '§§§mimi': ['§', '§', '§', 'mimi'], '§§§nil': ['§', '§', '§', 'nil'], '§§§ela': ['§', '§', '§', 'ela'], 'sam§§§': ['sam', '§', '§', '§'], 'globa觧§': ['globač', '§', '§', '§'], 'sin.': ['sin', '.'], '§§§oveduje': ['§', '§', '§', 'oveduje'], 'na§§§': ['na', '§', '§', '§'], '§§§ka§§§': ['§', '§', '§', 'ka', '§', '§', '§'], '§§§e§§§': ['§', '§', '§', 'e', '§', '§', '§'], '§§§': ['§', '§', '§'], 'ljubezni.': ['ljubezni', '.'], '12.': ['12', '.'], '16.': ['16', '.'], 'st.': ['st', '.'], 'S.': ['S', '.'], 'pr.': ['pr', '.'], 'n.': ['n', '.'], '19:30': ['19', ':', '30'], '9.': ['9', '.'], '6:35': ['6', ':', '35'], 'itd.': ['itd', '.'], 'Sv.': ['Sv', '.'], 'npr.': ['npr', '.'], 'sv.': ['sv', '.'], '12:00': ['12', ':', '00'], "sram'vali": ['sram', "'", 'vali'], '18:00': ['18', ':', '00'], 'J.': ['J', '.'], '5:45': ['5', ':', '45'], '17.': ['17', '.'], '9.00h': ['9', '.', '00h'], 'H.': ['H', '.'], '1.': ['1', '.'], '6.': ['6', '.'], '7:10': ['7', ':', '10'], 'g.': ['g', '.'], 'Oz.': ['Oz', '.'], '20:00': ['20', ':', '00'], '17.4.2010': ['17.', '4.', '2010'], 'ga.': ['ga', '.'], 'prof.': ['prof', '.'], '6:45': ['6', ':', '45'], '19.': ['19', '.'], '3.': ['3', '.'], 'tj.': ['tj', '.'], 'Prof.': ['Prof', '.'], '8.': ['8', '.'], '9:18': ['9', ':', '18'], 'ipd.': ['ipd', '.'], '7.': ['7', '.'], 'št.': ['št', '.'], 'oz.': ['oz', '.'], 'R.': ['R', '.'], '13:30': ['13', ':', '30'], '5.': ['5', '.'], '...': ['.', '.', '.'], 'plavali.': ['plavali', '.'], '[XImeX]': ['[', 'XImeX', ']'], '[XimeX]': ['[', 'XimeX', ']'], 'hipoteze:': ['hipoteze', ':'], 'prehrano?': ['prehrano', '?'], '68-letna': ['68', '-', 'letna'], 'pojma:': ['pojma', ':'], '[XKrajX]': ['[', 'XKrajX', ']'], '3/4': ['3', '/', '4'], 'I-phonea': ['I', '-', 'phonea'], 'kredita:': ['kredita', ':'], '[XFakultetaX]': ['[', 'XFakultetaX', ']'], 'športno-eleganten': ['športno', '-', 'eleganten'], 
'[XStudijskaSmerX]': ['[', 'XStudijskaSmerX', ']'], '[XNaslovX]': ['[', 'XNaslovX', ']'], '(tudi': ['(', 'tudi'], 'kupujem)': ['kupujem', ')'], '[XPriimekX]': ['[', 'XPriimekX', ']'], '[XPodjetjeX]': ['[', 'XPodjetjeX', ']'], 'Zagreb,': ['Zagreb', ','], 'Budimpešto.': ['Budimpešto', '.'], 'žalost.': ['žalost', '.'], '....': ['.', '.', '.', '.'], '[XStevilkaX]': ['[', 'XStevilkaX', ']'], 'e-naslov': ['e', '-', 'naslov'], '[XEnaslovX]': ['[', 'XEnaslovX', ']'], 'e-pošto': ['e', '-', 'pošto'], '[XDatumX]': ['[', 'XDatumX', ']'], 'eno-sobno': ['eno', '-', 'sobno'], 'lgbtq-prijazna': ['lgbtq', '-', 'prijazna'], 'lgbtq-prijaznega': ['lgbtq', '-', 'prijaznega'], 'Covid-19': ['Covid', '-', '19'], ',,,': [',', ',', ','], 'e-maila': ['e', '-', 'maila'], 'T&d': ['T', '&', 'd'], 'Spider-Man': ['Spider', '-', 'Man'], '12-strani': ['12', '-', 'strani'], 'turbo-folk': ['turbo', '-', 'folk'], 'Cp-čkar': ['Cp', '-', 'čkar'], '46-letnik': ['46', '-', 'letnik'], '40-letna': ['40', '-', 'letna'], '18-19h': ['18', '-', '19h'], '[XSvojilniPridevnikX]': ['[', 'XSvojilniPridevnikX', ']'], 'COVID-19': ['COVID', '-', '19'], '"sims"': ['"', 'sims', '"'], '2021/22': ['2021', '/', '22'], '2020/21': ['2020', '/', '21'], 'leto2021/22': ['leto2021', '/', '22'], 'H&m': ['H', '&', 'm'], 'high-street': ['high', '-', 'street'], 'H&M-u': ['H', '&', 'M-u'], 'H&M': ['H', '&', 'M'], 'srčno-žilnih': ['srčno', '-', 'žilnih'], 'srčno-žilni': ['srčno', '-', 'žilni'], ':))': [':)', ')'], 'You-Tube-ju': ['You', '-', 'Tube-ju'], '37,8%': ['37', ',', '8%'], '23,8%': ['23', ',', '8%'], '17,6%': ['17', ',', '6%'], '12,6%': ['12', ',', '6%'], '58,2%': ['58', ',', '2%'], '76,2%': ['76', ',', '2%']}
# , '37,8%': ['37', ',', '8%'], '23,8%': ['23', ',', '8%'], '17,6%': ['17', ',', '6%'], '12,6%': ['12', ',', '6%'], '58,2%': ['58', ',', '2%'], '76,2%': ['76', ',', '2%']
SVALA_HAND_FIXES_MERGE = {('oz', '.'): 'oz.', ('Npr', '.'): 'Npr.', ('npr', '.'): 'npr.', ('1', '.'): '1.', ('2', '.'): '2.', ('3', '.'): '3.', ('m', '.'): 'm.', ('itn', '.'): 'itn.', ('max', '.'): 'max.', ('4', '.'): '4.', ('cca', '.'): 'cca.', ('30', '.'): '30.', ('mlad', '.'): 'mlad.', (':)', ')'): ':))', ('sv', '.'): 'sv.', ('p', '.'): 'p.'}
OBELIKS_HAND_FIXES_MERGE = {'2015.': ['2015', '.']}
def merge_svala_data_elements(svala_data_object, i, mask_len):
final_text = ''
involved_sources = []
involved_targets = []
involved_edges = []
for el in svala_data_object.svala_data['source'][i - mask_len + 1:i + 1]:
# check whether merge won't cause further (unnoticed) issues later
edges = svala_data_object.links_ids_mapper[el['id']]
if len(edges) != 1:
raise ValueError('Incorrect number of edges!')
edge = svala_data_object.svala_data['edges'][edges[0]]
# TODO check if or len(edge['labels']) != 0 has to be added
if len(edge['source_ids']) != 1 or len(edge['target_ids']) != 1:
raise ValueError('Possible errors - CHECK!')
final_text += el['text']
involved_sources.append(edge['source_ids'][0])
involved_targets.append(edge['target_ids'][0])
involved_edges.append(edge['id'])
# erase merged svala elements
svala_data_object.svala_data['source'][i - mask_len + 1]['text'] = final_text
svala_data_object.svala_data['source'] = [el for el in svala_data_object.svala_data['source'] if
el['id'] not in involved_sources[1:]]
for el in svala_data_object.svala_data['target']:
if el['id'] == involved_targets[0]:
el['text'] = final_text
break
svala_data_object.svala_data['target'] = [el for el in svala_data_object.svala_data['target'] if
el['id'] not in involved_targets[1:]]
svala_data_object.svala_data['edges'] = {k: v for k, v in svala_data_object.svala_data['edges'].items() if
v['id'] not in involved_edges[1:]}
i -= len(involved_sources[1:])
return i
def apply_svala_handfixes(svala_data_object):
hand_fix_mask = []
for key in SVALA_HAND_FIXES_MERGE.keys():
if len(key) not in hand_fix_mask:
hand_fix_mask.append(len(key))
remember_length = max(hand_fix_mask)
q = deque()
i = 0
for el in svala_data_object.svala_data['source']:
q.append(el['text'])
if len(q) > remember_length:
q.popleft()
for mask_len in hand_fix_mask:
list_q = list(q)
if len(list_q) - mask_len >= 0:
key = tuple(list_q[remember_length - mask_len:])
if key in SVALA_HAND_FIXES_MERGE:
i = merge_svala_data_elements(svala_data_object, i, mask_len)
i += 1
def apply_obeliks_handfixes(tokenized_paragraph):
for t_i in range(len(tokenized_paragraph)):
sen = tokenized_paragraph[t_i]
i = 0
error = False
for tok in sen:
# if tok['text'] == ',,,':
# tok['text'] = ','
if tok['text'] in OBELIKS_HAND_FIXES_MERGE:
error = True
break
i += 1
if error:
new_sen = []
new_id = 1
for t in sen:
if t['text'] in OBELIKS_HAND_FIXES_MERGE:
for ex_t in OBELIKS_HAND_FIXES_MERGE[t['text']]:
new_sen.append({'id': tuple([new_id]), 'text': ex_t})
new_id += 1
else:
new_sen.append({'id': tuple([new_id]), 'text': t['text']})
new_id += 1
tokenized_paragraph[t_i] = new_sen
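
`apply_svala_handfixes` slides a fixed-size window over the source tokens and re-joins sequences listed in `SVALA_HAND_FIXES_MERGE`. The sketch below shows the same windowing idea on a plain list of strings, under simplifying assumptions: it ignores svala ids and edges entirely, and `MERGE_TABLE` and `merge_split_tokens` are hypothetical names with a reduced table.

```python
from collections import deque

# Hypothetical, reduced merge table; the real SVALA_HAND_FIXES_MERGE is larger.
MERGE_TABLE = {('npr', '.'): 'npr.', ('oz', '.'): 'oz.', (':)', ')'): ':))'}


def merge_split_tokens(tokens, merge_table=MERGE_TABLE):
    """Return a new token list with known split sequences re-joined."""
    window_sizes = sorted({len(key) for key in merge_table}, reverse=True)
    longest = max(window_sizes)
    window = deque(maxlen=longest)  # keeps only the last `longest` tokens
    merged = []
    for token in tokens:
        window.append(token)
        merged.append(token)
        for size in window_sizes:
            if len(window) < size:
                continue
            key = tuple(list(window)[-size:])
            if key in merge_table:
                # Replace the last `size` output tokens with the merged form.
                del merged[-size:]
                merged.append(merge_table[key])
                # Reset the window so merged pieces are not matched again.
                window.clear()
                break
    return merged


print(merge_split_tokens(['To', 'je', 'npr', '.', 'res', ':)', ')']))
```

Building a new list avoids the index bookkeeping (`i -= len(involved_sources[1:])`) that the in-place version needs, but the real function must also rewrite the target tokens and drop the merged edges, which this sketch does not attempt.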

281
src/read/merge.py Executable file

@ -0,0 +1,281 @@
from src.read.read import read_raw_text, map_svala_tokenized
from conllu import TokenList
def create_edges_list(target_ids, links_ids_mapper):
target_edges = []
target_edges_set = []
for target_sentence in target_ids:
target_sentence_edges = []
for target_id in target_sentence:
target_sentence_edges.extend(links_ids_mapper[target_id])
target_edges.append(target_sentence_edges)
target_edges_set.append(set(target_sentence_edges))
return target_edges, target_edges_set
SKIP_IDS = ['solar2284s.1.1.1']
def create_edges(raw_edges, source_par, target_par):
source_mapper = {el['svala_id']: source[1] + '.' + str(el['id']) for source in source_par for el in source[0]}
target_mapper = {el['svala_id']: target[1] + '.' + str(el['id']) for target in target_par for el in target[0]}
# actually add edges
edges = []
for _, edge in raw_edges.items():
labels = edge['labels']
source_ids = [source_mapper[el] for el in edge['ids'] if el in source_mapper]
target_ids = [target_mapper[el] for el in edge['ids'] if el in target_mapper]
edges.append({'source_ids': source_ids, 'target_ids': target_ids, 'labels': labels})
return edges
def update_ids(pretag, in_list):
for el in in_list:
el['id'] = f'{pretag}.{el["id"]}'
def create_conllu(interest_list, sentence_string_id):
conllu_result = TokenList([{"id": token_i + 1, "form": token['token'], "lemma": None, "upos": None, "xpos": None, "feats": None,
"head": None, "deprel": None, "deps": None, "misc": "SpaceAfter=No"} if not token['space_after']
else {"id": token_i + 1, "form": token['token'], "lemma": None, "upos": None, "xpos": None,
"feats": None, "head": None, "deprel": None, "deps": None, "misc": None} for token_i, token in
enumerate(interest_list)])
# Delete last SpaceAfter
misc = conllu_result[len(conllu_result) - 1]['misc'] if len(conllu_result) > 0 else None
if misc is not None:
misc_split = misc.split('|')
if misc is not None and misc == 'SpaceAfter=No':
conllu_result[len(conllu_result) - 1]['misc'] = None
elif misc is not None and 'SpaceAfter=No' in misc_split:
conllu_result[len(conllu_result) - 1]['misc'] = '|'.join([el for el in misc_split if el != 'SpaceAfter=No'])
conllu_result.metadata = {"sent_id": sentence_string_id}
return conllu_result.serialize()
def add_error_token_source_target_only(el, out_list, sentence_string_id, out_list_i, is_source, s_t_id):
sentence_string_id_split = sentence_string_id.split('.')
source_token_id = f'{sentence_string_id_split[0]}s.{".".join(sentence_string_id_split[1:])}.{out_list_i}' if is_source \
else f'{sentence_string_id_split[0]}t.{".".join(sentence_string_id_split[1:])}.{out_list_i}'
token_tag = 'w' if el.tag.startswith('w') else 'pc'
out_list.append({'token': el.text, 'tag': token_tag, 'ana': el.attrib['ana'], 'id': source_token_id, 'space_after': False, 'svala_id': s_t_id})
def add_errors_source_target_only(svala_i, source_i, target_i, error, source, target, svala_data, sentence_string_id):
# solar5.7
for el in error:
if el.tag.startswith('w') or el.tag.startswith('pc'):
ind = str(svala_i)
source_id = "s" + ind
add_error_token_source_target_only(el, source, sentence_string_id, source_i, True, source_id)
source_i += 1
svala_i += 1
elif el.tag.startswith('c') and len(source) > 0:
source[-1]['space_after'] = True
elif el.tag.startswith('p'):
for p_el in el:
if p_el.tag.startswith('w') or p_el.tag.startswith('pc'):
ind = str(svala_i)
target_id = "t" + ind
add_error_token_source_target_only(p_el, target, sentence_string_id, target_i, False, target_id)
target_i += 1
svala_i += 1
elif p_el.tag.startswith('c') and len(target) > 0:
target[-1]['space_after'] = True
elif el.tag.startswith('u2'):
for el_l2 in el:
if el_l2.tag.startswith('w') or el_l2.tag.startswith('pc'):
ind = str(svala_i)
source_id = "s" + ind
add_error_token_source_target_only(el_l2, source, sentence_string_id, source_i, True, source_id)
source_i += 1
svala_i += 1
elif el_l2.tag.startswith('c') and len(source) > 0:
source[-1]['space_after'] = True
elif el_l2.tag.startswith('u3'):
for el_l3 in el_l2:
if el_l3.tag.startswith('w') or el_l3.tag.startswith('pc'):
ind = str(svala_i)
source_id = "s" + ind
add_error_token_source_target_only(el_l3, source, sentence_string_id, source_i, True, source_id)
source_i += 1
svala_i += 1
elif el_l3.tag.startswith('c') and len(source) > 0:
source[-1]['space_after'] = True
elif el_l3.tag.startswith('u4'):
for el_l4 in el_l3:
if el_l4.tag.startswith('w') or el_l4.tag.startswith('pc'):
ind = str(svala_i)
source_id = "s" + ind
add_error_token_source_target_only(el_l4, source, sentence_string_id, source_i, True, source_id)
source_i += 1
svala_i += 1
elif el_l4.tag.startswith('c') and len(source) > 0:
source[-1]['space_after'] = True
elif el_l4.tag.startswith('u5'):
for el_l5 in el_l4:
if el_l5.tag.startswith('w') or el_l5.tag.startswith('pc'):
ind = str(svala_i)
source_id = "s" + ind
add_error_token_source_target_only(el_l5, source, sentence_string_id, source_i, True, source_id)
source_i += 1
svala_i += 1
elif el_l5.tag.startswith('c') and len(source) > 0:
source[-1]['space_after'] = True
for p_el in el:
if p_el.tag.startswith('w') or p_el.tag.startswith('pc'):
ind = str(svala_i)
target_id = "t" + ind
add_error_token_source_target_only(p_el, target, sentence_string_id, target_i, False, target_id)
target_i += 1
svala_i += 1
elif p_el.tag.startswith('c') and len(target) > 0:
target[-1]['space_after'] = True
return svala_i, source_i, target_i
def add_source(svala_i, source_i, sentence_string_id_split, source, el):
source_id = "s" + svala_i
source_token_id = f'{sentence_string_id_split[0]}s.{".".join(sentence_string_id_split[1:])}.{source_i}'
token_tag = 'w' if el.tag.startswith('w') else 'pc'
source.append({'token': el.text, 'tag': token_tag, 'ana': el.attrib['ana'], 'id': source_token_id,
'space_after': False, 'svala_id': source_id})
def add_target(svala_i, target_i, sentence_string_id_split, target, el):
target_id = "t" + svala_i
target_token_id = f'{sentence_string_id_split[0]}t.{".".join(sentence_string_id_split[1:])}.{target_i}'
token_tag = 'w' if el.tag.startswith('w') else 'pc'
target.append({'token': el.text, 'tag': token_tag, 'ana': el.attrib['ana'], 'id': target_token_id,
'space_after': False, 'svala_id': target_id})
def merge(sentences, paragraph, svala_i, svala_data, add_errors_func, source_raw_text, target_raw_text, nlp_tokenize):
if source_raw_text is not None:
text = read_raw_text(source_raw_text)
raw_text, source_tokenized, metadocument = nlp_tokenize.processors['tokenize']._tokenizer.tokenize(text) if text else ([], [], [])
_, source_res = map_svala_tokenized(svala_data['source'], source_tokenized, 0)
if target_raw_text is not None:
text = read_raw_text(target_raw_text)
raw_text, target_tokenized, metadocument = nlp_tokenize.processors['tokenize']._tokenizer.tokenize(text) if text else ([], [], [])
_, target_res = map_svala_tokenized(svala_data['target'], target_tokenized, 0)
par_source = []
par_target = []
sentences_len = len(sentences)
source_conllus = []
target_conllus = []
if source_raw_text is not None:
sentences_len = max(sentences_len, len(source_res))
if target_raw_text is not None:
sentences_len = max(sentences_len, len(target_res))
for sentence_id in range(sentences_len):
source = []
target = []
sentence_id += 1
source_i = 1
target_i = 1
sentence_string_id = paragraph.attrib['{http://www.w3.org/XML/1998/namespace}id'] + f'.{sentence_id}'
sentence_string_id_split = sentence_string_id.split('.')
if sentence_id - 1 < len(sentences):
sentence = sentences[sentence_id - 1]
for el in sentence:
if el.tag.startswith('w'):
if source_raw_text is None:
add_source(str(svala_i), source_i, sentence_string_id_split, source, el)
if target_raw_text is None:
add_target(str(svala_i), target_i, sentence_string_id_split, target, el)
svala_i += 1
source_i += 1
target_i += 1
elif el.tag.startswith('pc'):
if source_raw_text is None:
add_source(str(svala_i), source_i, sentence_string_id_split, source, el)
if target_raw_text is None:
add_target(str(svala_i), target_i, sentence_string_id_split, target, el)
svala_i += 1
source_i += 1
target_i += 1
elif el.tag.startswith('u'):
if source_raw_text is None or target_raw_text is None:
svala_i, source_i, target_i = add_errors_source_target_only(svala_i, source_i, target_i, el, source, target, svala_data, sentence_string_id)
else:
svala_i, source_i, target_i = add_errors_func(svala_i, source_i, target_i, el, source, target,
svala_data, sentence_string_id)
elif el.tag.startswith('c'):
if len(source) > 0:
source[-1]['space_after'] = True
if len(target) > 0:
target[-1]['space_after'] = True
if source_raw_text is not None and sentence_id - 1 < len(source_res):
source = source_res[sentence_id - 1]
update_ids(f'{sentence_string_id_split[0]}s.{".".join(sentence_string_id_split[1:])}', source)
par_source.append(source)
source_conllu = ''
if len(source) > 0:
source_conllu = create_conllu(source, sentence_string_id)
if target_raw_text is not None and sentence_id - 1 < len(target_res):
target = target_res[sentence_id - 1]
update_ids(f'{sentence_string_id_split[0]}t.{".".join(sentence_string_id_split[1:])}', target)
par_target.append(target)
if source_raw_text is None:
par_source.append(source)
if target_raw_text is None:
par_target.append(target)
target_conllu = ''
if len(target) > 0:
target_conllu = create_conllu(target, sentence_string_id)
if source_raw_text is None or len(source_conllus) < len(par_source):
source_conllus.append(source_conllu)
if target_raw_text is None or len(target_conllus) < len(par_target):
target_conllus.append(target_conllu)
sentence_edges = create_edges(svala_data, par_source, par_target)
return sentence_edges, source_conllus, target_conllus
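
`create_conllu` removes `SpaceAfter=No` from the last token of a sentence, since nothing follows it within that sentence. A minimal sketch of just that step, working on plain MISC strings rather than a `conllu.TokenList`; `strip_trailing_space_after` is a hypothetical helper name.

```python
def strip_trailing_space_after(misc_values):
    """Drop `SpaceAfter=No` from the last token's MISC value.

    `misc_values` holds one MISC string (or None) per token, in order.
    A new list is returned; the input is left unchanged.
    """
    if not misc_values:
        return []
    result = list(misc_values)
    last = result[-1]
    if last is not None:
        # MISC is a `|`-separated list of features; keep everything else.
        kept = [part for part in last.split('|') if part != 'SpaceAfter=No']
        result[-1] = '|'.join(kept) if kept else None
    return result


print(strip_trailing_space_after(['SpaceAfter=No', None, 'SpaceAfter=No']))
```

Handling the single-feature and multi-feature cases through one split-and-filter pass replaces the two separate branches in the original without changing the outcome for the inputs that code produces.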

86
src/read/read.py Executable file

@ -0,0 +1,86 @@
import re
from src.read.hand_fixes import HAND_FIXES, apply_obeliks_handfixes, SVALA_HAND_FIXES_MERGE
def read_raw_text(path):
print(path)
try:
with open(path, 'r', encoding='utf-8') as rf:
return rf.read()
except UnicodeError:
try:
with open(path, 'r', encoding='utf-16') as rf:
return rf.read()
except UnicodeError:
with open(path, 'r', encoding="windows-1250") as rf:
return rf.read()
def map_svala_tokenized(svala_data_part, tokenized_paragraph, sent_i):
# apply handfixes for obeliks
apply_obeliks_handfixes(tokenized_paragraph)
paragraph_res = []
wierd_sign_count = 0
svala_data_i = 0
for i in range(sent_i, len(tokenized_paragraph)):
sentence = tokenized_paragraph[i]
sentence_res = []
sentence_id = 0
for tok in sentence:
tag = 'pc' if 'xpos' in tok and tok['xpos'] == 'Z' else 'w'
if 'misc' in tok:
assert tok['misc'] == 'SpaceAfter=No'
space_after = 'misc' not in tok
if len(svala_data_part) <= svala_data_i:
# if sentence does not end add it anyway
# TODO i error?
if sentence_res:
paragraph_res.append(sentence_res)
return i, paragraph_res
if svala_data_part[svala_data_i]['text'] != tok['text']:
key = svala_data_part[svala_data_i]['text']
if key not in HAND_FIXES:
if key.startswith('§§§') and key.endswith('§§§'):
HAND_FIXES[key] = ['§', '§', '§', key[3:-3], '§', '§', '§']
elif key.startswith('§§§'):
HAND_FIXES[key] = ['§', '§', '§', key[3:]]
elif key.endswith('§§§'):
HAND_FIXES[key] = [key[:-3], '§', '§', '§']
else:
if len(key) < len(tok['text']):
print('HAND_FIXES_MERGE:')
print(f", ('{tok['text'][:len(key)]}', '{tok['text'][len(key):]}'): '{tok['text']}'")
SVALA_HAND_FIXES_MERGE[(tok['text'][:len(key)], tok['text'][len(key):])] = tok['text']
else:
print('HAND_FIXES OLD:')
print(f", '{key}': ['{key[:len(tok['text'])]}', '{key[len(tok['text']):]}']")
print('HAND_FIXES NEW:')
reg = re.findall(r"[\w]+|[^\s\w]", key)
print(f", '{key}': {str(reg)}")
HAND_FIXES[key] = re.findall(r"[\w]+|[^\s\w]", key)
print(f'key: {key} ; tok[text]: {tok["text"]}')
if tok['text'] == HAND_FIXES[key][wierd_sign_count]:
wierd_sign_count += 1
if wierd_sign_count < len(HAND_FIXES[key]):
continue
else:
tok['text'] = key
wierd_sign_count = 0
elif key in ['[XKrajX]']:
tok['text'] = '[XKrajX]'
elif key in ['[XImeX]']:
tok['text'] = '[XImeX]'
else:
print(f'key: {key} ; tok[text]: {tok["text"]}')
raise ValueError('Word mismatch!')
sentence_id += 1
sentence_res.append({'token': tok['text'], 'tag': tag, 'id': sentence_id, 'space_after': space_after, 'svala_id': svala_data_part[svala_data_i]['id']})
svala_data_i += 1
paragraph_res.append(sentence_res)
return sent_i, paragraph_res
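
When a svala token is missing from `HAND_FIXES`, `map_svala_tokenized` generates a split on the fly with `re.findall(r"[\w]+|[^\s\w]", key)`. The sketch below isolates that pattern so its behaviour can be checked on a few tokens taken from the table; `fallback_split` is a hypothetical wrapper name.

```python
import re

# Same pattern as in map_svala_tokenized: a run of word characters, or any
# single character that is neither whitespace nor a word character.
TOKEN_PATTERN = re.compile(r"[\w]+|[^\s\w]")


def fallback_split(key):
    """Split an unmatched svala token into word runs and single symbols."""
    return TOKEN_PATTERN.findall(key)


for sample in ['športno-eleganten', '19:30', '[XImeX]', '§§§pisala', '37,8%']:
    print(sample, fallback_split(sample))
```

The generated split does not always agree with the hand-written entries: for `'37,8%'` the pattern yields four pieces (`'37'`, `','`, `'8'`, `'%'`), while `HAND_FIXES` lists `['37', ',', '8%']`. The fallback is therefore a starting point for new entries rather than a replacement for the table.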

388
src/read/read_and_merge.py Executable file

@ -0,0 +1,388 @@
import json
import logging
import os
import pickle
import queue
import string
from collections import deque
import classla
from src.read.hand_fixes import apply_svala_handfixes
from src.read.merge import merge, create_conllu, create_edges
from src.read.read import read_raw_text, map_svala_tokenized
from src.read.svala_data import SvalaData
alphabet = list(map(chr, range(97, 123)))
def add_error_token(el, out_list, sentence_string_id, out_list_i, out_list_ids, is_source, s_t_id):
sentence_string_id_split = sentence_string_id.split('.')
source_token_id = f'{sentence_string_id_split[0]}s.{".".join(sentence_string_id_split[1:])}.{out_list_i}' if is_source \
else f'{sentence_string_id_split[0]}t.{".".join(sentence_string_id_split[1:])}.{out_list_i}'
token_tag = 'w' if el.tag.startswith('w') else 'pc'
lemma = el.attrib['lemma'] if token_tag == 'w' else el.text
out_list.append({'token': el.text, 'tag': token_tag, 'ana': el.attrib['ana'], 'lemma': lemma, 'id': source_token_id, 'space_after': False, 'svala_id': s_t_id})
out_list_ids.append(source_token_id)
def add_errors(svala_i, source_i, target_i, error, source, target, svala_data, sentence_string_id, edges=None):
source_edge_ids = []
target_edge_ids = []
source_ids = []
target_ids = []
# solar5.7
for el in error:
if el.tag.startswith('w') or el.tag.startswith('pc'):
ind = str(svala_i)
source_id = "s" + ind
source_edge_ids.append(source_id)
add_error_token(el, source, sentence_string_id, source_i, source_ids, True, source_id)
source_i += 1
svala_i += 1
elif el.tag.startswith('c') and len(source) > 0:
source[-1]['space_after'] = True
elif el.tag.startswith('p'):
for p_el in el:
if p_el.tag.startswith('w') or p_el.tag.startswith('pc'):
ind = str(svala_i)
target_id = "t" + ind
target_edge_ids.append(target_id)
add_error_token(p_el, target, sentence_string_id, target_i, target_ids, False, target_id)
target_i += 1
svala_i += 1
elif p_el.tag.startswith('c') and len(target) > 0:
target[-1]['space_after'] = True
elif el.tag.startswith('u2'):
for el_l2 in el:
if el_l2.tag.startswith('w') or el_l2.tag.startswith('pc'):
ind = str(svala_i)
source_id = "s" + ind
source_edge_ids.append(source_id)
add_error_token(el_l2, source, sentence_string_id, source_i, source_ids, True, source_id)
source_i += 1
svala_i += 1
elif el_l2.tag.startswith('c') and len(source) > 0:
source[-1]['space_after'] = True
elif el_l2.tag.startswith('u3'):
for el_l3 in el_l2:
if el_l3.tag.startswith('w') or el_l3.tag.startswith('pc'):
ind = str(svala_i)
source_id = "s" + ind
source_edge_ids.append(source_id)
add_error_token(el_l3, source, sentence_string_id, source_i, source_ids, True, source_id)
source_i += 1
svala_i += 1
elif el_l3.tag.startswith('c') and len(source) > 0:
source[-1]['space_after'] = True
elif el_l3.tag.startswith('u4'):
for el_l4 in el_l3:
if el_l4.tag.startswith('w') or el_l4.tag.startswith('pc'):
ind = str(svala_i)
source_id = "s" + ind
source_edge_ids.append(source_id)
add_error_token(el_l4, source, sentence_string_id, source_i, source_ids, True, source_id)
source_i += 1
svala_i += 1
elif el_l4.tag.startswith('c') and len(source) > 0:
source[-1]['space_after'] = True
elif el_l4.tag.startswith('u5'):
for el_l5 in el_l4:
if el_l5.tag.startswith('w') or el_l5.tag.startswith('pc'):
ind = str(svala_i)
source_id = "s" + ind
source_edge_ids.append(source_id)
add_error_token(el_l5, source, sentence_string_id, source_i, source_ids, True, source_id)
source_i += 1
svala_i += 1
elif el_l5.tag.startswith('c') and len(source) > 0:
source[-1]['space_after'] = True
if edges is not None:
edge_ids = sorted(source_edge_ids) + sorted(target_edge_ids)
edge_id = "e-" + "-".join(edge_ids)
edges.append({'source_ids': source_ids, 'target_ids': target_ids, 'labels': svala_data['edges'][edge_id]['labels']})
return svala_i, source_i, target_i
def create_target(svala_data_object, source_tokenized):
source_tokenized_dict = {}
for i, sent in enumerate(source_tokenized):
for tok in sent:
tok['sent_id'] = i + 1
source_tokenized_dict[tok['svala_id']] = tok
links_ids_mapper, edges_of_one_type = svala_data_object.links_ids_mapper, svala_data_object.edges_of_one_type
curr_sententence = 1
source_curr_sentence = 1
target_tokenized = []
target_sent_tokenized = []
tok_i = 1
for i, token in enumerate(svala_data_object.svala_data['target']):
edge_id = links_ids_mapper[token['id']]
if len(edge_id) > 1:
print(f'Unexpected: target token {token["id"]} is linked to more than one edge.')
edge_id = edge_id[0]
edge = svala_data_object.svala_data['edges'][edge_id]
source_word_ids = []
target_word_ids = []
for word_id in edge['ids']:
if word_id[0] == 's':
source_word_ids.append(word_id)
if word_id[0] == 't':
target_word_ids.append(word_id)
token_text = token['text']
new_sentence = False
if len(source_word_ids) == 1:
source_id = source_word_ids[0]
source_token = source_tokenized_dict[source_id]
if source_token['sent_id'] != source_curr_sentence:
source_curr_sentence = source_token['sent_id']
if source_token['id'] == 1 and len(target_sent_tokenized) > 1:
target_tokenized.append(target_sent_tokenized)
target_sent_tokenized = []
curr_sententence += 1
tok_i = 1
# check if words are equal and update
if token_text == source_token['token']:
target_token = {
'token': source_token['token'],
'tag': source_token['tag'],
'id': tok_i,
'space_after': source_token['space_after'],
'svala_id': token['id'],
'sent_id': curr_sententence,
}
else:
# Check for punctuation mismatch.
if token_text in string.punctuation:
tag = 'pc'
else:
tag = 'w'
target_token = {
'token': token_text,
'tag': tag,
'id': tok_i,
'space_after': source_token['space_after'],
'svala_id': token['id'],
'sent_id': curr_sententence,
}
else:
space_after = True
if token_text in string.punctuation:
tag = 'pc'
if token_text in '!?.,):;]}':
if len(target_sent_tokenized) == 0:
raise ValueError('Sentence length = 0!')
target_sent_tokenized[-1]['space_after'] = False
if token_text in '!?.':
new_sentence = True
# Handle cases like `...`
if len(svala_data_object.svala_data['target']) > i + 1 and svala_data_object.svala_data['target'][i+1]['text'] in '.?!':
new_sentence = False
elif token_text in '([{':
space_after = False
else:
tag = 'w'
target_token = {
'token': token_text,
'tag': tag,
'id': tok_i,
'space_after': space_after,
'svala_id': token['id'],
'sent_id': curr_sententence,
}
target_sent_tokenized.append(target_token)
if new_sentence:
target_tokenized.append(target_sent_tokenized)
target_sent_tokenized = []
curr_sententence += 1
tok_i = 0
tok_i += 1
target_tokenized.append(target_sent_tokenized)
return target_tokenized
def fake_svala_data(source_tokenized):
source_res, target_res, generated_edges = [], [], {}
edge_id = 0
for sent in source_tokenized:
source_sent = []
target_sent = []
for tok in sent:
tok_id = tok['id'][0]
tok_tag = 'w' if 'xpos' not in tok or tok['xpos'] != 'Z' else 'pc'
source_svala_id = 's' + str(edge_id)
target_svala_id = 't' + str(edge_id)
space_after = not ('misc' in tok and tok['misc'] == 'SpaceAfter=No')
source_sent.append({
'token': tok['text'],
'tag': tok_tag,
'id': tok_id,
'space_after': space_after,
'svala_id': source_svala_id
})
target_sent.append({
'token': tok['text'],
'tag': tok_tag,
'id': tok_id,
'space_after': space_after,
'svala_id': target_svala_id
})
generated_edges[f'e-{source_svala_id}-{target_svala_id}'] = {
'id': f'e-{source_svala_id}-{target_svala_id}',
'ids': [source_svala_id, target_svala_id],
'labels': [],
'manual': False,
'source_ids': [source_svala_id],
'target_ids': [target_svala_id]
}
edge_id += 1
source_res.append(source_sent)
target_res.append(target_sent)
return source_res, target_res, generated_edges


def tokenize(args):
    # Reuse the cached tokenization unless an overwrite was requested.
    if os.path.exists(args.tokenization_interprocessing) and not args.overwrite_tokenization:
        print('READING TOKENIZATION...')
        with open(args.tokenization_interprocessing, 'rb') as rp:
            tokenized_source_divs, tokenized_target_divs, document_edges = pickle.load(rp)
        return tokenized_source_divs, tokenized_target_divs, document_edges

    print('TOKENIZING...')
    nlp_tokenize = classla.Pipeline('sl', processors='tokenize', pos_lemma_pretag=True)

    tokenized_divs = {}
    all_js_filenames = [sorted(filenames) for folder, _, filenames in os.walk(args.svala_folder)][0]

    for text_folder, _, text_filenames in os.walk(args.raw_text):
        text_filenames = sorted(text_filenames)
        for text_filename_i, text_filename in enumerate(text_filenames):
            text_file = read_raw_text(os.path.join(args.raw_text, text_filename))
            raw_text, source_tokenized, metadocument = nlp_tokenize.processors['tokenize']._tokenizer.tokenize(
                text_file) if text_file else ([], [], [])
            source_sent_i = 0

            filenames = [filename for filename in all_js_filenames if filename.startswith(text_filename[:-4])]
            # new_text_filename = '-'.join(filename[:-5].split('-')[:3]) + '.txt'
            if filenames:
                for filename in filenames:
                    svala_path = os.path.join(args.svala_folder, filename)
                    print(svala_path)
                    with open(svala_path, encoding='utf-8') as jf:
                        svala_data = json.load(jf)

                    svala_data_object = SvalaData(svala_data)

                    apply_svala_handfixes(svala_data_object)

                    source_sent_i, source_res = map_svala_tokenized(svala_data_object.svala_data['source'], source_tokenized, source_sent_i)
                    target_res = create_target(svala_data_object, source_res)

                    if text_filename not in tokenized_divs:
                        tokenized_divs[text_filename] = []

                    tokenized_divs[text_filename].append((filename, source_res, target_res, svala_data_object.svala_data['edges']))
            else:
                # No svala file exists for this text: generate identical source and target with one-to-one edges.
                filename = text_filename[:-4] + '.json'
                source_res, target_res, generated_edges = fake_svala_data(source_tokenized)
                if text_filename not in tokenized_divs:
                    tokenized_divs[text_filename] = []

                tokenized_divs[text_filename].append((filename, source_res, target_res, generated_edges))

            logging.info(f'Tokenizing at {text_filename_i * 100 / len(text_filenames)} %')

    tokenized_source_divs = []
    tokenized_target_divs = []
    document_edges = []

    for div_id in tokenized_divs.keys():
        paragraph_edges = []
        tokenized_source_paragraphs = []
        tokenized_target_paragraphs = []
        for tokenized_para in tokenized_divs[div_id]:
            paragraph_name, source_res, target_res, edges = tokenized_para
            split_para_name = paragraph_name[:-5].split('-')
            div_name = '-'.join(split_para_name[:-1]) if len(split_para_name) == 4 else '-'.join(split_para_name)
            par_name = split_para_name[-1] if len(split_para_name) == 4 else '1'
            # A paragraph name must be either a number or a letter from `alphabet`.
            # (The previous condition could never fail.)
            assert par_name.isnumeric() or par_name in alphabet, 'Incorrect paragraph name!'
            if par_name in alphabet:
                par_name = str(alphabet.index(par_name) + 10)

            source_paragraphs = []
            target_paragraphs = []
            sen_source = []
            sen_target = []
            for sen_i, sen in enumerate(source_res):
                source_sen_name = f'{div_name}s.{par_name}.{str(sen_i + 1)}'
                source_conllu = create_conllu(sen, source_sen_name)
                source_paragraphs.append(source_conllu)
                sen_source.append((sen, source_sen_name))

            for sen_i, sen in enumerate(target_res):
                target_sen_name = f'{div_name}t.{par_name}.{str(sen_i + 1)}'
                target_conllu = create_conllu(sen, target_sen_name)
                target_paragraphs.append(target_conllu)
                sen_target.append((sen, target_sen_name))

            tokenized_source_paragraphs.append((par_name, source_paragraphs))
            tokenized_target_paragraphs.append((par_name, target_paragraphs))
            paragraph_edges.append(create_edges(edges, sen_source, sen_target))

        tokenized_source_divs.append((div_name+'s', tokenized_source_paragraphs))
        tokenized_target_divs.append((div_name+'t', tokenized_target_paragraphs))
        document_edges.append(paragraph_edges)

    with open(args.tokenization_interprocessing, 'wb') as wp:
        pickle.dump((tokenized_source_divs, tokenized_target_divs, document_edges), wp)

    return tokenized_source_divs, tokenized_target_divs, document_edges
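
When a raw text has no matching svala file, `fake_svala_data` pairs every source token with an identical target token and links the two with a generated edge. The sketch below reproduces only that edge layout on a plain list of strings. The helper name `identity_edges` and the sample sentence are illustrative assumptions, not part of the repository.

```python
def identity_edges(tokens):
    """Build one source-to-target edge per token.

    Hypothetical helper: it mirrors the dictionary layout that
    fake_svala_data generates, but works on a flat list of tokens.
    """
    edges = {}
    for i, _ in enumerate(tokens):
        source_id, target_id = f's{i}', f't{i}'
        edge_key = f'e-{source_id}-{target_id}'
        edges[edge_key] = {
            'id': edge_key,
            'ids': [source_id, target_id],
            'labels': [],          # no corrections, so no error labels
            'manual': False,       # generated, not hand-annotated
            'source_ids': [source_id],
            'target_ids': [target_id],
        }
    return edges


edges = identity_edges(['Danes', 'je', 'lepo', '.'])
print(list(edges))
```

Because the counter is shared by the `s` and `t` identifiers, every generated edge is one-to-one, which is why such texts carry no correction labels.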

48
src/read/svala_data.py Executable file

@ -0,0 +1,48 @@
from collections import deque
from src.read.hand_fixes import SVALA_HAND_FIXES_MERGE


class SvalaData():
    def __init__(self, svala_data):
        for el in svala_data['source']:
            el['text'] = el['text'].strip()
            if el['text'] == '':
                print(f"Empty source token text (id: {el.get('id')})")
        for el in svala_data['target']:
            el['text'] = el['text'].strip()
            if el['text'] == '':
                print(f"Empty target token text (id: {el.get('id')})")
        self.svala_data = svala_data
        self.links_ids_mapper, self.edges_of_one_type = self.create_ids_mapper(svala_data)

    @staticmethod
    def create_ids_mapper(svala_data):
        # create links to ids mapper
        links_ids_mapper = {}
        edges_of_one_type = set()

        for k, v in svala_data['edges'].items():
            has_source = False
            has_target = False
            v['source_ids'] = []
            v['target_ids'] = []
            for el in v['ids']:
                # create edges of one type
                if el[0] == 's':
                    v['source_ids'].append(el)
                    has_source = True
                if el[0] == 't':
                    v['target_ids'].append(el)
                    has_target = True

                # create links_ids_mapper
                if el not in links_ids_mapper:
                    links_ids_mapper[el] = []
                links_ids_mapper[el].append(k)
            # Token texts are stripped in __init__, so a lone whitespace token
            # arrives here as an empty string; compare against the stripped text.
            if not has_source or not has_target or (
                    len(svala_data['source']) == 1 and svala_data['source'][0]['text'].strip() == '') \
                    or (len(svala_data['target']) == 1 and svala_data['target'][0]['text'].strip() == ''):
                edges_of_one_type.add(k)

        return links_ids_mapper, edges_of_one_type
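
As a rough illustration of what `create_ids_mapper` computes, the sketch below builds the same two structures from a hand-written edge dictionary: a mapping from each token id to the edges that mention it, and the set of edges that touch only one side. The function name `map_ids_to_edges` and the sample edges are assumptions made for this example; the single-whitespace-token special case from the original is left out.

```python
def map_ids_to_edges(edges):
    """Return (token id -> edge keys, keys of edges touching only one side).

    Simplified, hypothetical variant of SvalaData.create_ids_mapper.
    """
    ids_to_edges = {}
    one_sided = set()
    for edge_key, edge in edges.items():
        has_source = any(token_id.startswith('s') for token_id in edge['ids'])
        has_target = any(token_id.startswith('t') for token_id in edge['ids'])
        for token_id in edge['ids']:
            ids_to_edges.setdefault(token_id, []).append(edge_key)
        if not (has_source and has_target):
            one_sided.add(edge_key)
    return ids_to_edges, one_sided


sample_edges = {
    'e-s1-t1': {'ids': ['s1', 't1']},     # ordinary aligned pair
    'e-s2': {'ids': ['s2']},              # source token with no target
    'e-t2-t3': {'ids': ['t2', 't3']},     # target tokens with no source
}
ids_to_edges, one_sided = map_ids_to_edges(sample_edges)
print(ids_to_edges)
print(sorted(one_sided))
```

Edges that land in the one-sided set correspond to deletions (source only) or insertions (target only).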

0
src/write/__init__.py Executable file

210
src/write/write.py Executable file

@ -0,0 +1,210 @@
import copy
import csv
import json
import os
from lxml import etree
import conllu
from src.create_tei import construct_sentence_from_list, \
    construct_paragraph_from_list, TeiDocument, build_tei_etrees, build_links, build_complete_tei, convert_bibl


def form_paragraphs(annotated_source_divs, metadata):
    etree_source_divs = []
    # Defined up front so the return statement also works for an empty input.
    div_name = ''
    for div_i, div_tuple in enumerate(annotated_source_divs):
        div_name, div = div_tuple
        if div_name[:-1] not in metadata:
            # No metadata for this text: report its id and skip it.
            print(div_name[:-1])
            continue
        div_metadata = metadata[div_name[:-1]]

        etree_source_paragraphs = []

        for par_i, paragraph_tuple in enumerate(div):
            par_name, paragraph = paragraph_tuple
            etree_source_sentences = []

            for sentence_id, sentence in enumerate(paragraph):
                if len(sentence) > 0:
                    conllu_parsed = conllu.parse(sentence)[0]
                    etree_source_sentences.append(
                        construct_sentence_from_list(str(sentence_id + 1), conllu_parsed, True))

            etree_source_paragraphs.append(construct_paragraph_from_list(div_name, par_name, etree_source_sentences))

        etree_source_divs.append((etree_source_paragraphs, div_name, div_metadata))

    return etree_source_divs, div_name


def read_metadata(args):
    texts_metadata = []
    with open(args.texts_metadata, 'r', encoding='utf-8', newline='') as file:
        csvreader = csv.reader(file, delimiter='|', quotechar='"')
        column_names = []
        for i, row in enumerate(csvreader):
            if i == 0:
                column_names = row
                continue
            else:
                row_dict = {}
                for j, content in enumerate(row):
                    row_dict[column_names[j]] = content.strip()
                texts_metadata.append(row_dict)

    # handle teachers
    teachers_metadata = {}
    with open(args.teachers_metadata, 'r', encoding='utf-8', newline='') as file:
        csvreader = csv.reader(file, delimiter='\t', quotechar='"')
        column_names = []
        for i, row in enumerate(csvreader):
            if i == 0:
                column_names = row
                continue
            else:
                row_dict = {}
                for j, content in enumerate(row):
                    row_dict[column_names[j]] = content
                row_dict['Ime in priimek'] = row_dict['Ime in priimek'].strip()
                teachers_metadata[row_dict['Ime in priimek']] = row_dict

    # handle authors
    authors_metadata = {}
    with open(args.authors_metadata, 'r', encoding='utf-8', newline='') as file:
        csvreader = csv.reader(file, delimiter='|', quotechar='"')
        column_names = []
        for i, row in enumerate(csvreader):
            if i == 0:
                column_names = row
                continue
            elif i == 1:
                # The second row holds sub-column names; merge them with the
                # (possibly blank, i.e. spanned) top-level names from row one.
                active_column_name = ''
                for j, sub_name in enumerate(row):
                    if column_names[j]:
                        active_column_name = column_names[j]
                    if sub_name:
                        column_names[j] = f'{active_column_name} - {sub_name}'
                continue
            elif i == 2:
                # The third row is skipped.
                continue
            else:
                row_dict = {}
                for j, content in enumerate(row):
                    row_dict[column_names[j]] = content.strip()
                row_dict['Ime in priimek'] = row_dict['Ime in priimek'].strip()
                authors_metadata[row_dict['Ime in priimek']] = row_dict

    translations = {}
    with open(args.translations, 'r', encoding='utf-8', newline='') as file:
        csvreader = csv.reader(file, delimiter='\t', quotechar='"')
        for row in csvreader:
            translations[row[0]] = row[1]

    return texts_metadata, authors_metadata, teachers_metadata, translations


def process_metadata(args):
    texts_metadata, authors_metadata, teachers_metadata, translations = read_metadata(args)

    metadata = {}
    for document_metadata in texts_metadata:
        document_metadata['Tvorec'] = document_metadata['Tvorec'].strip()
        if document_metadata['Tvorec'] not in authors_metadata:
            if document_metadata['Tvorec']:
                print(document_metadata['Tvorec'])
            continue
        author_metadata = authors_metadata[document_metadata['Tvorec']]
        metadata_el = {}
        for attribute_name_sl, attribute_name_en in translations.items():
            if attribute_name_sl in document_metadata:
                if attribute_name_sl == 'Ocena':
                    grade = f'{document_metadata[attribute_name_sl]} od {document_metadata["Najvišja možna ocena"]}' if document_metadata[attribute_name_sl] and document_metadata["Najvišja možna ocena"] else ''
                    metadata_el[attribute_name_en] = grade
                elif attribute_name_sl == 'Tvorec':
                    metadata_el[attribute_name_en] = author_metadata['Koda tvorca']
                elif attribute_name_sl == 'Učitelj':
                    metadata_el[attribute_name_en] = teachers_metadata[document_metadata['Učitelj']]['Koda'] if document_metadata['Učitelj'] in teachers_metadata else None
                else:
                    metadata_el[attribute_name_en] = document_metadata[attribute_name_sl]
            elif attribute_name_sl in author_metadata:
                metadata_el[attribute_name_en] = author_metadata[attribute_name_sl]
            elif attribute_name_sl == 'Ime šole, Fakulteta':
                curr_school = []
                if author_metadata["Trenutno šolanje - Ime šole"]:
                    curr_school.append(author_metadata["Trenutno šolanje - Ime šole"])
                if author_metadata["Trenutno šolanje - Fakulteta"]:
                    curr_school.append(author_metadata["Trenutno šolanje - Fakulteta"])
                metadata_el['Current school'] = ', '.join(curr_school)
            elif attribute_name_sl == 'Stopnja študija':
                metadata_el[attribute_name_en] = author_metadata['Trenutno šolanje - Stopnja študija']
            elif attribute_name_sl == 'Leto študija':
                metadata_el[attribute_name_en] = author_metadata['Trenutno šolanje - Leto študija']
            elif attribute_name_sl == 'Ostali jeziki':
                metadata_el[attribute_name_en] = ','.join([k[16:] for k, v in author_metadata.items() if k[:13] == 'Ostali jeziki' and v == 'ja'])
            elif attribute_name_sl == 'Kje učenje':
                metadata_el[attribute_name_en] = author_metadata['Življenje v Sloveniji pred tem programom - Kje?']
            elif attribute_name_sl == 'Koliko časa učenje?':
                metadata_el[attribute_name_en] = author_metadata['Življenje v Sloveniji pred tem programom - Koliko časa?']
            elif attribute_name_sl == 'Učbeniki':
                metadata_el[attribute_name_en] = author_metadata['Učenje slovenščine pred tem programom - Učbeniki']
            elif attribute_name_sl == 'Kje?':
                metadata_el[attribute_name_en] = author_metadata['Učenje slovenščine pred L+ - Kje?']
            elif attribute_name_sl == 'Koliko časa?':
                metadata_el[attribute_name_en] = author_metadata['Učenje slovenščine pred L+ - Koliko čas?']
            else:
                raise Exception(f'{attribute_name_sl} not found!')

        metadata[metadata_el['Text ID']] = metadata_el

    return metadata


def write_tei(annotated_source_divs, annotated_target_divs, document_edges, args):
    print('BUILDING LINKS...')
    etree_links = build_links(document_edges)

    with open(os.path.join(args.results_folder, "links.xml"), 'w', encoding='utf-8') as tf:
        tf.write(etree.tostring(etree_links, pretty_print=True, encoding='utf-8').decode())

    with open(os.path.join(args.results_folder, "links.json"), 'w', encoding='utf-8') as jf:
        json.dump(document_edges, jf, ensure_ascii=False, indent="  ")

    print('WRITING TEI...')
    etree_source_documents = []
    etree_target_documents = []

    print('PREPARING METADATA FOR BIBL...')
    metadata = process_metadata(args)

    print('WRITING SOURCE FILES...')
    etree_source_divs, source_div_name = form_paragraphs(annotated_source_divs, metadata)

    print('WRITING TARGET FILES...')
    etree_target_divs, target_div_name = form_paragraphs(annotated_target_divs, metadata)

    print('APPENDING DOCUMENT...')
    etree_source_documents.append(
        TeiDocument(source_div_name,
                    etree_source_divs, etree_target_divs))
    etree_target_documents.append(
        TeiDocument(target_div_name,
                    etree_target_divs, etree_source_divs))

    print('BUILDING TEI DOCUMENTS...')
    etree_source = build_tei_etrees(etree_source_documents)
    etree_target = build_tei_etrees(etree_target_documents)

    # To reduce RAM usage, you may run the remainder in two passes. First, write
    # everything except the complete tree (comment out the complete-tree code).
    # Second, write only the complete tree (comment out the "Writing all but
    # complete" section and the `deepcopy` calls).
    print('Writing all but complete')
    with open(os.path.join(args.results_folder, "source.xml"), 'w', encoding='utf-8') as sf:
        sf.write(etree.tostring(etree_source[0], pretty_print=True, encoding='utf-8').decode())

    with open(os.path.join(args.results_folder, "target.xml"), 'w', encoding='utf-8') as tf:
        tf.write(etree.tostring(etree_target[0], pretty_print=True, encoding='utf-8').decode())

    print('COMPLETE TREE CREATION...')
    complete_etree = build_complete_tei(copy.deepcopy(etree_source), copy.deepcopy(etree_target), etree_links)
    # complete_etree = build_complete_tei(etree_source, etree_target, etree_links)

    print('WRITING COMPLETE TREE')
    with open(os.path.join(args.results_folder, "complete.xml"), 'w', encoding='utf-8') as tf:
        tf.write(etree.tostring(complete_etree, pretty_print=True, encoding='utf-8').decode())
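
The authors CSV read by `read_metadata` appears to use a two-row header: the first row carries group names that span several columns (blank cells continue the previous group), and the second row carries sub-column names. A minimal sketch of that merge follows. The function name `merge_header_rows` and the sample rows are assumptions for illustration; only the joining rule (`'<group> - <sub-name>'`) is taken from the code above.

```python
def merge_header_rows(top_row, sub_row):
    """Combine a spanning header row with its sub-header row.

    Hypothetical helper: a blank cell in top_row inherits the most recent
    non-blank group name, and a non-blank sub-name is appended after ' - '.
    """
    merged = list(top_row)
    active_group = ''
    for j, sub_name in enumerate(sub_row):
        if top_row[j]:
            active_group = top_row[j]
        if sub_name:
            merged[j] = f'{active_group} - {sub_name}'
    return merged


top = ['Ime in priimek', 'Trenutno šolanje', '', '']
sub = ['', 'Ime šole', 'Fakulteta', 'Leto študija']
columns = merge_header_rows(top, sub)
print(columns)
```

Keys such as `'Trenutno šolanje - Ime šole'`, which `process_metadata` looks up, are produced by exactly this kind of merge.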

1208
svala2tei.py Normal file → Executable file

File diff suppressed because it is too large

1226
svala2tei_solar.py Executable file

File diff suppressed because it is too large

0
svala_formatter/__init__.py Normal file → Executable file

0
svala_formatter/copy_generated_files.py Normal file → Executable file

0
svala_formatter/copy_svala_handchecked_files.py Normal file → Executable file

0
svala_formatter/generate_text.py Normal file → Executable file

0
svala_formatter/svala1.0.1_compare_hand_changes.py Normal file → Executable file

0
tag_selection.py Normal file → Executable file

6
txt2svala.py Normal file → Executable file

@ -47,11 +47,11 @@ def main(args):
 if __name__ == '__main__':
     parser = argparse.ArgumentParser(
-        description='Read already processed xmls, erase entries without examples and limit gigafida examples to 1 per entry.')
+        description='Converts raw text into svala format.')
     parser.add_argument('--input_folder', default='data/txt/input',
-                        help='input file in (gz or xml currently). If none, then just database is loaded')
+                        help='Path to folder containing raw texts.')
     parser.add_argument('--output_folder', default='data/txt/output',
-                        help='input file in (gz or xml currently). If none, then just database is loaded')
+                        help='Path to folder that will contain svala formatted texts.')
     args = parser.parse_args()
     start = time.time()