A randomized extract from the New Norwegian Corpus (Nynorskkorpuset).
Contains sentences in New Norwegian (Nynorsk) from 2000 onward. The data is tab-separated, with one word per line; each word is lemmatized and morphologically tagged, and year and domain information is provided. Annotation was done with the Oslo-Bergen tagger. Sentences in the Bokmål standard have been removed.
This corpus is intended for use in the development of language technology.
Size: 3.3 million sentences, 57.5 million words.
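
The tab-separated, one-word-per-line layout described above could be read with a short script along the following lines. This is only a minimal sketch: the description does not state the column order, how sentences are delimited, or where the year and domain information is stored, so the column names in ASSUMED_COLUMNS, the blank-line sentence boundary, and the function name read_sentences are all assumptions made for illustration. Check them against the actual files before use.

```python
import io

# Hypothetical column names; the corpus description does not specify
# the column order, so adjust this list to match the real files.
ASSUMED_COLUMNS = ["form", "lemma", "tag"]


def read_sentences(lines, columns=ASSUMED_COLUMNS):
    """Yield sentences as lists of token dicts, one dict per input line.

    Assumption (not confirmed by the description): a blank line marks
    the end of a sentence.
    """
    sentence = []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if not line.strip():
            # Blank line: close the current sentence, if any.
            if sentence:
                yield sentence
                sentence = []
            continue
        fields = line.split("\t")
        token = dict(zip(columns, fields))
        # Keep any fields beyond the assumed columns instead of dropping them.
        token["extra"] = fields[len(columns):]
        sentence.append(token)
    if sentence:
        yield sentence


# Made-up sample lines in the assumed layout, for demonstration only.
sample = "Eg\teg\tpron\nles\tlese\tverb\n\nHo\tho\tpron\n"
sentences = list(read_sentences(io.StringIO(sample)))
```

For a real file, pass an open file handle (for example one opened with encoding="utf-8", which is also an assumption) instead of the in-memory sample. Because the function is a generator, the 3.3 million sentences do not need to be held in memory at once.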