dc.contributor.author | Rosén, Victoria |
dc.date.accessioned | 2015-10-09T07:37:23Z |
dc.date.available | 2015-10-09T07:37:23Z |
dc.date.issued | 2015-08-17 |
dc.identifier.uri | http://hdl.handle.net/11509/90 |
dc.description | In the INESS project, texts in Norwegian Bokmål and Nynorsk are parsed with the NorGram grammar and lexicon. When text is parsed, there will always be words that are unknown to the morphological analyzer and/or the lexicon. INESS has therefore developed an intelligent browser-based preprocessing interface which facilitates, among other things, the efficient treatment of unknown word forms. The list of word forms that have not been automatically recognized is manually inspected. While some of these result from OCR errors and others are simply typos, most unrecognized word forms are productive compounds, words occurring only in MWEs (multiword expressions), names, foreign words, neologisms, interjections, dialect words, and systematic or intended misspellings. To read more about the types of lexical units registered, please refer to the documentation at http://clarino.uib.no/iness/page?page-id=Text_preprocessing. |
dc.language.iso | nor |
dc.publisher | http://clarino.uib.no/iness/ |
dc.rights | Creative Commons - Attribution 3.0 Unported (CC BY 3.0) |
dc.rights.uri | http://creativecommons.org/licenses/by/3.0/ |
dc.rights.label | CC |
dc.source.uri | http://clarino.uib.no/iness/page |
dc.subject | Lexical Conceptual Resource |
dc.title | INESS list of lexical units unknown to the NorGram lexicon |
dc.type | lexicalConceptualResource |
metashare.ResourceInfo#ContentInfo.detailedType | computationalLexicon |
metashare.ResourceInfo#ContentInfo.mediaType | text |
hidden | false |
hasMetadata | false |
has.files | yes |
branding | Clarino |
demo.uri | http://clarino.uib.no/iness/extracted-words |
contact.person | Victoria Rosén iness@uib.no University of Bergen, Department of Linguistic, Literary and Aesthetic Studies |
sponsor | The Research Council of Norway under the Infrastruktur program 000000 INESS (Infrastructure for the Exploration of Syntax and Semantics) nationalFunds |
size.info | 1 units |
files.size | 1778275 |
files.count | 1 |
This item is distributed under Creative Commons and licensed under: Creative Commons - Attribution 3.0 Unported (CC BY 3.0)