2016-03-17T10:26:25Z
http://clarino.uib.no/oai
oai:clarino.uib.no:SIKOR_free_sma_20151010
2016-03-17T10:21:32Z
Giellatekno
Ciprian-Virgil Gerstenberger
2015-12-14
clarin.eu:cr1:p_1407745711925
Clarino UiT
corpus
SIKOR South Saami free corpus
The SIKOR South Saami free corpus is a monolingual text corpus of South Saami that contains administrative, law, religious, non-fiction, and fiction texts. It was created by the Giellatekno and Divvun research groups, Department of Linguistics, UiT The Arctic University of Norway, together with members of the language community. In particular, the following colleagues have contributed to the creation of the resource: Ciprian Gerstenberger, Børre Gaup, Risten-Birje Steinfjell, Lene Antonsen, Trond Trosterud, and Maja Kappfjell. Linguistically, the data set (58,407 sentences; 646,273 tokens) features word form, lemma, morphosyntactic analysis, and dependency relations between tokens. The corpus has been automatically processed and linguistically analyzed with the Giellatekno/Divvun tools and may therefore contain incorrect annotations. If you find any errors, the creators would appreciate your feedback at giellatekno@uit.no and feedback@divvun.no.
Please note that the Giellatekno resources are dynamic in nature. A stable "snapshot" is deposited at regular intervals in the CLARINO Bergen repository for download. To make sure you have the most up-to-date version, please contact Giellatekno (see Contact Info in metadata).
SIKOR_sma_free_20151010
2015-10-10
Public
Creative Commons (CC)
Creative_Commons-BY (CC-BY)
http://creativecommons.org/licenses/by/4.0/
BY
organization
creator
contact
Giellatekno, Saami Language Technology
Giellatekno
Department of Linguistics, UiT The Arctic University of Norway
giellatekno@uit.no
http://giellatekno.uit.no/index.eng.html
organization
creator
contact
The Divvun group at UiT
Divvun
Department of Linguistics, UiT The Arctic University of Norway
feedback@divvun.no
http://divvun.no
2015-12-14
2016-03-17
person
contact
creator
Gerstenberger
Ciprian-Virgil
male
Norges arktiske universitet
The Arctic University of Norway
UiT
Giellatekno - Saami Language Technology
ciprian.gerstenberger@uit.no
http://ansatte.uit.no/ciprian.gerstenberger
2010
organization
creator
contact
Giellatekno, Saami Language Technology
Giellatekno
Department of Linguistics, UiT The Arctic University of Norway
giellatekno@uit.no
http://giellatekno.uit.no/index.eng.html
http://divvun.no
Tromsø
Norway
organization
creator
contact
The Divvun group at UiT
Divvun
Department of Linguistics, UiT The Arctic University of Norway
feedback@divvun.no
http://divvun.no
Tromsø
Norway
Written Corpus
text
monolingual
sma
South Saami
writtenLanguage
58,407
sentences
646,273
tokens
lemmatization
morphosyntacticAnnotation-posTagging
syntacticAnnotation-treebanks
dependency structures
textGenre
administrative
textGenre
unstandardised
law
textGenre
unstandardised
religious
textGenre
factual prose
textGenre
fiction and drama
1993-2015