Language Resource Search - SHACHI: Language Resource Metadata Database

Language resource #: 3330 Results 1251 - 1260 of 2023

C-003633: MultiSemCor Corpus 1.1
The MultiSemCor project aims at creating a semantically annotated corpus by utilizing information in the English SemCor corpus, a subset of the English Brown Corpus containing almost 700,000 running words. MultiSemCor is an English/Italian parallel corpus, aligned at the word level and annotated with POS, lemma and word sense, consisting of 116 English texts aligned with their corresponding 116 Italian translations, for a total of about 464,000 running words. It is also an aligned parallel corpus lexically annotated with a shared inventory of word senses. The corpus uses the same sense inventory as the one used by MultiWordNet.
- hasPart: C-003632: SemCor 1.6
- conformsTo: D-000825: WordNet
- conformsTo: D-000378: MultiWordNet database (included semantic fields)
C-003634: SemCor 1.7
The SemCor 1.7 corpus consists of 352 semantically annotated texts from the Brown Corpus. All words in SemCor are tagged for POS, and more than 200,000 content words are lemmatized and sense-tagged. SemCor 1.7 was created automatically from SemCor 1.6 by mapping WordNet 1.6 senses to WordNet 1.7 senses (SemCor 1.6 itself was manually annotated).
- references: D-000825: WordNet
- references: C-003632: SemCor 1.6
- replaces: C-003632: SemCor 1.6
- isReplacedBy: C-003635: SemCor 1.7.1
C-003635: SemCor 1.7.1
The SemCor 1.7.1 corpus consists of 352 semantically annotated texts from the Brown Corpus. All words in SemCor are tagged for POS, and more than 200,000 content words are lemmatized and sense-tagged. SemCor 1.7.1 was created automatically from SemCor 1.6 by mapping WordNet 1.6 senses to WordNet 1.7.1 senses (SemCor 1.6 itself was manually annotated).
- references: D-000825: WordNet
- references: C-003632: SemCor 1.6
- replaces: C-003634: SemCor 1.7
- isReplacedBy: C-003636: SemCor 2.0
C-003636: SemCor 2.0
The SemCor 2.0 corpus consists of 352 semantically annotated texts from the Brown Corpus. All words in SemCor are tagged for POS, and more than 200,000 content words are lemmatized and sense-tagged. SemCor 2.0 was created automatically from SemCor 1.6 by mapping WordNet 1.6 senses to WordNet 2.0 senses (SemCor 1.6 itself was manually annotated).
- references: D-000825: WordNet
- references: C-003632: SemCor 1.6
- replaces: C-003635: SemCor 1.7.1
- isReplacedBy: C-003637: SemCor 2.1
C-003637: SemCor 2.1
The SemCor 2.1 corpus consists of 352 semantically annotated texts from the Brown Corpus. All words in SemCor are tagged for POS, and more than 200,000 content words are lemmatized and sense-tagged. SemCor 2.1 was created automatically from SemCor 1.6 by mapping WordNet 1.6 senses to WordNet 2.1 senses (SemCor 1.6 itself was manually annotated).
- references: D-000825: WordNet
- references: C-003632: SemCor 1.6
- replaces: C-003636: SemCor 2.0
- isReplacedBy: C-003638: SemCor 3.0
C-003638: SemCor 3.0
The SemCor 3.0 corpus consists of 352 semantically annotated texts from the Brown Corpus. All words in SemCor are tagged for POS, and more than 200,000 content words are lemmatized and sense-tagged. SemCor 3.0 was created automatically from SemCor 1.6 by mapping WordNet 1.6 senses to WordNet 3.0 senses (SemCor 1.6 itself was manually annotated).
- references: D-000825: WordNet
- references: C-003632: SemCor 1.6
- replaces: C-003637: SemCor 2.1
C-003639: Wordnet Domains 3.2
WordNet Domains was created by augmenting Princeton University's WordNet with domain labels (e.g. Sport, Politics, Medicine) based on the Dewey Decimal Classification. Each synset is annotated with at least one semantic domain label drawn from a set of about two hundred labels. The information in the Domains corpus is complementary to what is already in WordNet. In the Domains corpus, senses of the same word may be grouped into homogeneous clusters, which reduces word polysemy in WordNet. The corpus includes WordNet-Affect, which contains an additional hierarchy of "affective" domain labels.
- references: D-000825: WordNet
- hasPart: WordNet-Affect
- conformsTo: Dewey Decimal Classification
C-003640: Corpus of Written British Creole
This is a corpus of written Caribbean Creole (Jamaican Creole) English consisting of about 12,000 words. The data is annotated with a set of contrastive tags that mark differences in spelling, lexis, and discoursal and grammatical structure between Standard English and the Creole texts. In other words, tags are mainly used only where a word or structure in the corpus would not be expected in a Standard English text. The text types in the corpus include poems, novels/fiction, plays, and miscellaneous writings including advertisements and graffiti.
C-003641: AUTONOMATA Spoken Names Corpus
The AUTONOMATA Spoken Names Corpus is a speech database of about 5000 Dutch and Flemish first names, surnames, street names, city names and control words. The package includes the corresponding phonetic transcriptions. The 240 speakers (120 Dutch and 120 Flemish) comprise 120 native and 120 non-native speakers.
- conformsTo: C-001711: Corpus Gesproken Nederlands
C-003642: COREA-coreferentiecorpus
The COREA corpus is a Dutch and Flemish text corpus annotated for coreference relations among names, pronouns, and noun phrases. The corpus was developed under the COREA project, a two-year project to develop a robust system for assigning coreference relations automatically.
- references: C-001711: Corpus Gesproken Nederlands
- references: C-003643: D-Coi-corpus
