dc.contributor.author Gammeltoft, Peder
dc.date.accessioned 2021-01-29T12:21:00Z
dc.date.available 2021-01-29T12:21:00Z
dc.date.issued 2021-01-29
dc.identifier.uri http://hdl.handle.net/11509/140
dc.description Randomized extraction of the New Norwegian Corpus (Nynorskkorpuset). Contains sentences in New Norwegian (Nynorsk) from the year 2000 onward. The data is tab-separated with one word per line; each word is lemmatized and morphologically tagged, and year and domain information is given. Annotation was done with the Oslo-Bergen Tagger. Sentences in the Bokmål standard have been removed. This corpus is intended for use in the development of language technology. Size: 3.3 million sentences, 57.5 million words.
dc.language.iso nno
dc.publisher University of Bergen Library
dc.rights Creative Commons - Attribution 3.0 Unported (CC BY 3.0)
dc.rights.uri http://creativecommons.org/licenses/by/3.0/
dc.rights.label CC
dc.source.uri http://spraksamlingene.no/
dc.subject Sentences
dc.subject Nynorsk
dc.title Randomized extraction of the New Norwegian corpus
dc.type corpus
metashare.ResourceInfo#ContentInfo.mediaType text
hidden false
hasMetadata false
has.files yes
branding Clarino
demo.uri http://no2014.uib.no/korpuset/conc_enkeltsok.htm
contact.person Peder Gammeltoft peder.gammeltoft@uib.no University of Bergen Library
contact.person Paul Meurer paul.meurer@uib.no University of Bergen Library
size.info 3300000 sentences
files.size 454682983
files.count 1


 Files in this item

This item is distributed under Creative Commons and licensed under: Creative Commons - Attribution 3.0 Unported (CC BY 3.0). Attribution required.

Name nnk-2000-scrambled.zip
Size 433.62 MB
Format application/zip
Description Extraction of Nynorskkorpuset
File preview
    • nnk-2000-scrambled.tsv (2 GB)
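
Because the archive unpacks to a file of about 2 GB, it may be convenient to stream the tab-separated rows straight from the zip instead of extracting it first. The following is a minimal sketch, not part of the record: the function name iter_rows is hypothetical, and it assumes that the member sits at the top level of the archive under the name shown in the preview and that the text is UTF-8 encoded. The record does not document the column order, so each row is returned as a plain list of fields.

```python
import csv
import io
import zipfile
from typing import Iterator, List


def iter_rows(zip_path: str,
              member: str = "nnk-2000-scrambled.tsv") -> Iterator[List[str]]:
    """Yield each line of the TSV member as a list of tab-separated fields.

    Assumptions (not confirmed by the record): the member name and its
    location at the archive root, and UTF-8 encoding.
    """
    with zipfile.ZipFile(zip_path) as archive:
        # Open the member as a stream; nothing is written to disk.
        with archive.open(member) as raw:
            text = io.TextIOWrapper(raw, encoding="utf-8", newline="")
            # QUOTE_NONE keeps quotation marks in the corpus text literal.
            reader = csv.reader(text, delimiter="\t", quoting=csv.QUOTE_NONE)
            for fields in reader:
                yield fields
```

A caller could, for example, loop over iter_rows("nnk-2000-scrambled.zip") and inspect the first few rows to determine which columns hold the word form, lemma, tag, year, and domain.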
