# classla-api
## Description
This tool exposes the classla library as an HTTP API. It offers several endpoints with preset classla settings, as well as one that accepts custom settings.
## Slovenian Standard UD
Preset classla settings:
```json
{
"lang": "sl",
"pos_use_lexicon": true
}
```
Usage example:
```commandline
curl -X POST -d '{"text": "France Prešeren je rojen v Vrbi."}' https://orodja.cjvt.si/oznacevalnik/standard-ud
```
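The same call can be made from Python. The following is a minimal sketch using only the standard library, not part of this tool: the helper names `build_request` and `annotate` are hypothetical, the base URL is taken from the curl examples, and the response is returned as raw text because its format is not described here. Like `curl -d`, `urllib` sends the body with the default `application/x-www-form-urlencoded` content type when no header is set.

```python
import json
import urllib.request

# Base URL copied from the curl examples in this README.
BASE_URL = "https://orodja.cjvt.si/oznacevalnik"


def build_request(preset: str, text: str) -> urllib.request.Request:
    """Build (but do not send) the POST request for a preset endpoint."""
    # ensure_ascii=False keeps characters such as "š" as UTF-8, as in the curl call.
    body = json.dumps({"text": text}, ensure_ascii=False).encode("utf-8")
    return urllib.request.Request(f"{BASE_URL}/{preset}", data=body, method="POST")


def annotate(preset: str, text: str, timeout: float = 30.0) -> str:
    """Send the request and return the raw response body as text."""
    request = build_request(preset, text)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read().decode("utf-8")
```

Calling `annotate("standard-ud", "France Prešeren je rojen v Vrbi.")` would then send the same request as the curl command above.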
## Slovenian Standard JOS
Preset classla settings:
```json
{
"lang": "sl",
"pos_use_lexicon": true,
"type": "standard_jos"
}
```
Usage example:
```commandline
curl -X POST -d '{"text": "France Prešeren je rojen v Vrbi."}' https://orodja.cjvt.si/oznacevalnik/standard-jos
```
## Slovenian Nonstandard UD
Preset classla settings:
```json
{
"lang": "sl",
"pos_use_lexicon": true,
"type": "nonstandard"
}
```
Usage example:
```commandline
curl -X POST -d '{"text": "kva smo mi zurali zadnje leto v zagrebu..."}' https://orodja.cjvt.si/oznacevalnik/nonstandard-ud
```
## Slovenian Nonstandard JOS
Preset classla settings:
```json
{
"lang": "sl",
"pos_use_lexicon": true,
"processors": {
"tokenize": "nonstandard",
"lemma": "nonstandard",
"pos": "nonstandard",
"depparse": "standard_jos",
"ner": "nonstandard"
}
}
```
Usage example:
```commandline
curl -X POST -d '{"text": "kva smo mi zurali zadnje leto v zagrebu..."}' https://orodja.cjvt.si/oznacevalnik/nonstandard-jos
```
## Croatian Standard UD
Preset classla settings:
```json
{
"lang": "hr"
}
```
Usage example:
```commandline
curl -X POST -d '{"text": "Ante Starčević rođen je u Velikom Žitniku."}' https://orodja.cjvt.si/oznacevalnik/hr-standard-ud
```
## Croatian Nonstandard UD
Preset classla settings:
```json
{
"lang": "hr",
"type": "nonstandard"
}
```
Usage example:
```commandline
curl -X POST -d '{"text": "kaj sam ja tulumaril jucer u ljubljani..."}' https://orodja.cjvt.si/oznacevalnik/hr-nonstandard-ud
```
## Serbian Standard UD
Preset classla settings:
```json
{
"lang": "sr"
}
```
Usage example:
```commandline
curl -X POST -d '{"text": "Slobodan Jovanović rođen je u Novom Sadu."}' https://orodja.cjvt.si/oznacevalnik/sr-standard-ud
```
## Serbian Nonstandard UD
Preset classla settings:
```json
{
"lang": "sr",
"type": "nonstandard"
}
```
Usage example:
```commandline
curl -X POST -d '{"text": "ne mogu da verujem kakvo je zezanje bilo prosle godine u zagrebu..."}' https://orodja.cjvt.si/oznacevalnik/sr-nonstandard-ud
```
## Bulgarian Standard UD
Preset classla settings:
```json
{
"lang": "bg"
}
```
Usage example:
```commandline
curl -X POST -d '{"text": "Алеко Константинов е роден в Свищов."}' https://orodja.cjvt.si/oznacevalnik/bg-standard-ud
```
## Macedonian Standard UD
Preset classla settings:
```json
{
"lang": "mk"
}
```
Usage example:
```commandline
curl -X POST -d '{"text": "Крсте Петков Мисирков е роден во Постол."}' https://orodja.cjvt.si/oznacevalnik/mk-standard-ud
```
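For scripting against several presets, the endpoints and settings listed above can be collected in one lookup table. This is only an illustrative sketch: the names `PRESETS`, `preset_url` and `presets_for_language` are hypothetical, and the values are copied from the sections above rather than read from the service.

```python
# Base URL copied from the curl examples in this README.
BASE_URL = "https://orodja.cjvt.si/oznacevalnik"

# Endpoint name -> preset classla settings, as documented in the sections above.
PRESETS = {
    "standard-ud": {"lang": "sl", "pos_use_lexicon": True},
    "standard-jos": {"lang": "sl", "pos_use_lexicon": True, "type": "standard_jos"},
    "nonstandard-ud": {"lang": "sl", "pos_use_lexicon": True, "type": "nonstandard"},
    "nonstandard-jos": {
        "lang": "sl",
        "pos_use_lexicon": True,
        "processors": {
            "tokenize": "nonstandard",
            "lemma": "nonstandard",
            "pos": "nonstandard",
            "depparse": "standard_jos",
            "ner": "nonstandard",
        },
    },
    "hr-standard-ud": {"lang": "hr"},
    "hr-nonstandard-ud": {"lang": "hr", "type": "nonstandard"},
    "sr-standard-ud": {"lang": "sr"},
    "sr-nonstandard-ud": {"lang": "sr", "type": "nonstandard"},
    "bg-standard-ud": {"lang": "bg"},
    "mk-standard-ud": {"lang": "mk"},
}


def preset_url(name: str) -> str:
    """Return the full endpoint URL for a documented preset name."""
    if name not in PRESETS:
        raise KeyError(f"unknown preset {name!r}; choose from {sorted(PRESETS)}")
    return f"{BASE_URL}/{name}"


def presets_for_language(lang: str) -> list[str]:
    """List the preset names available for a language code such as 'sl'."""
    return [name for name, settings in PRESETS.items() if settings["lang"] == lang]
```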
## Custom settings
Custom settings may be used, but they must comply with what the library allows (see https://github.com/clarinsi/classla).
### Warning: Using custom settings is slow! It may take more than 30 s to get a result.
Usage example:
```commandline
curl -X POST -d '{"text": "France Prešeren je rojen v Vrbi.", "settings": {"lang": "sl", "pos_lemma_pretag": false}}' https://orodja.cjvt.si/oznacevalnik/custom-settings
```
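Because a custom-settings call may take more than 30 s, a client should allow a generous timeout. The sketch below shows one way to do this from Python with the standard library. The function names are hypothetical and the 120-second default is an arbitrary assumption, not a documented limit. The response is returned as raw text because its format is not described here.

```python
import json
import urllib.request

# Endpoint copied from the curl example above.
CUSTOM_URL = "https://orodja.cjvt.si/oznacevalnik/custom-settings"


def build_custom_request(text: str, settings: dict) -> urllib.request.Request:
    """Build (but do not send) a POST request carrying text plus custom classla settings."""
    payload = {"text": text, "settings": settings}
    body = json.dumps(payload, ensure_ascii=False).encode("utf-8")
    return urllib.request.Request(CUSTOM_URL, data=body, method="POST")


def annotate_custom(text: str, settings: dict, timeout: float = 120.0) -> str:
    """Send the request with a long timeout and return the raw response body."""
    request = build_custom_request(text, settings)
    # timeout is in seconds; 120 is an assumed safety margin above the 30 s warning.
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read().decode("utf-8")
```

For example, `annotate_custom("France Prešeren je rojen v Vrbi.", {"lang": "sl", "pos_lemma_pretag": False})` sends the same payload as the curl command above.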