dc.contributor.author | Gammeltoft, Peder |
dc.date.accessioned | 2021-01-29T12:21:00Z |
dc.date.available | 2021-01-29T12:21:00Z |
dc.date.issued | 2021-01-29 |
dc.identifier.uri | http://hdl.handle.net/11509/140 |
dc.description | Randomized extraction of the New Norwegian Corpus (Nynorskkorpuset). Contains sentences in New Norwegian (Nynorsk) from the year 2000 and later. The format is tab-separated with one word per line; each word is lemmatized and morphologically tagged, and year and domain information is given. Annotation was done with the Oslo-Bergen Tagger. Sentences in the Bokmål standard have been removed. This corpus is intended for use in the development of language technology. Size: 3.3 million sentences, 57.5 million words. |
dc.language.iso | nno |
dc.publisher | University of Bergen Library |
dc.rights | Creative Commons - Attribution 3.0 Unported (CC BY 3.0) |
dc.rights.uri | http://creativecommons.org/licenses/by/3.0/ |
dc.rights.label | CC |
dc.source.uri | http://spraksamlingene.no/ |
dc.subject | Sentences |
dc.subject | Nynorsk |
dc.title | Randomized extraction of the New Norwegian corpus |
dc.type | corpus |
metashare.ResourceInfo#ContentInfo.mediaType | text |
hidden | false |
hasMetadata | false |
has.files | yes |
branding | Clarino |
demo.uri | http://no2014.uib.no/korpuset/conc_enkeltsok.htm |
contact.person | Peder Gammeltoft peder.gammeltoft@uib.no University of Bergen Library |
contact.person | Paul Meurer paul.meurer@uib.no University of Bergen Library |
size.info | 3300000 sentences |
files.size | 454682983 |
files.count | 1 |
Files in this item
This item is distributed under Creative Commons and licensed under: Creative Commons - Attribution 3.0 Unported (CC BY 3.0)
- Name: nnk-2000-scrambled.zip
- Size: 433.62 MB
- Format: application/zip
- Description: Extraction of Nynorskkorpuset
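
The description says the corpus is tab-separated with one word per line, but this record does not give the column order, how year and domain information is encoded, or how sentences are delimited. The following is only a minimal sketch of reading such lines in Python. The column names in `ASSUMED_COLUMNS`, the treatment of blank lines, and the sample rows are all hypothetical and would need to be checked against the actual file.

```python
from typing import Iterable, Iterator

# Hypothetical column layout: the record does not state the real order.
ASSUMED_COLUMNS = ("word", "lemma", "tags")


def read_tokens(lines: Iterable[str], columns=ASSUMED_COLUMNS) -> Iterator[dict]:
    """Yield one dict per non-empty tab-separated line."""
    for raw in lines:
        line = raw.rstrip("\r\n")
        if not line.strip():
            # Assumption: blank lines carry no token and can be skipped.
            continue
        fields = line.split("\t")
        # Pair the assumed column names with the fields that are present.
        token = dict(zip(columns, fields))
        # Keep any further fields (e.g. year or domain) without guessing names.
        if len(fields) > len(columns):
            token["extra"] = fields[len(columns):]
        yield token


# Made-up sample rows for illustration only, not taken from the corpus.
sample = ["Eg\teg\tpron\n", "les\tlese\tverb pres\n", "\n"]
tokens = list(read_tokens(sample))
```

For the real data, the lines would come from the file inside nnk-2000-scrambled.zip; its internal name and character encoding are not stated in this record.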