Language Resource Search - SHACHI: Language Resource Metadata Database

Language resource #: 3330 Results 1241 - 1250 of 2023

C-003622: Nihongo-no Goitokusei (Lexical properties of Japanese) Vol. 9
NTT "Nihongo-no Goitokusei" is a psycholinguistic database of Japanese words consisting of nine sublexicons grouped into four sets. The database is useful not only for psycholinguistic research but also for developing technologies in fields such as education, speech processing and natural language processing. The Fourth Release (Volume 9) is an addition to Vol. 1 "Word Familiarity," providing familiarity ratings for approximately 30,000 new entries from Gakken Kokugo Daijiten (Gakken Great Japanese Dictionary) 2nd Edition. Search software and text-format files are also included in the package.
- references: Gakken Kokugo Daijiten 2nd Edition
- isPartOf: D-003476: Nihongo-no Goitokusei (Lexical properties of Japanese)
- hasVersion: C-003623: Kihongo Database Gogibetsu Tango Shinmitsudo
- hasVersion: D-003458: Nihongo-no Goitokusei (Lexical properties of Japanese) Vol. 1
- hasVersion: D-003460: Nihongo-no Goitokusei (Lexical properties of Japanese) Vol. 2
- hasVersion: D-003462: Nihongo-no Goitokusei (Lexical properties of Japanese) Vol. 3
- hasVersion: D-003464: Nihongo-no Goitokusei (Lexical properties of Japanese) Vol. 4
- hasVersion: D-003466: Nihongo-no Goitokusei (Lexical properties of Japanese) Vol. 5
- hasVersion: D-003468: Nihongo-no Goitokusei (Lexical properties of Japanese) Vol. 6
- hasVersion: D-003470: CD-ROM Nihongo-no Goitokusei (Lexical properties of Japanese) Vol. 7
- hasVersion: D-003472: Nihongo-no Goitokusei (Lexical properties of Japanese) Vol. 8
- hasVersion: D-003474: CD-ROM Nihongo-no Goitokusei (Lexical properties of Japanese) Vols. 1-6
C-003623: Kihongo Database Gogibetsu Tango Shinmitsudo
The database provides familiarity ratings based on word meanings for approximately 28,000 Japanese words (approximately 45,000 word meanings). It can be used as basic data for linguistic, educational, psycholinguistic and other studies.
- hasVersion: D-003476: Nihongo-no Goitokusei (Lexical properties of Japanese)
- hasVersion: D-003458: Nihongo-no Goitokusei (Lexical properties of Japanese) Vol. 1
- hasVersion: D-003460: Nihongo-no Goitokusei (Lexical properties of Japanese) Vol. 2
- hasVersion: D-003462: Nihongo-no Goitokusei (Lexical properties of Japanese) Vol. 3
- hasVersion: D-003464: Nihongo-no Goitokusei (Lexical properties of Japanese) Vol. 4
- hasVersion: D-003466: Nihongo-no Goitokusei (Lexical properties of Japanese) Vol. 5
- hasVersion: D-003468: Nihongo-no Goitokusei (Lexical properties of Japanese) Vol. 6
- hasVersion: D-003470: CD-ROM Nihongo-no Goitokusei (Lexical properties of Japanese) Vol. 7
- hasVersion: D-003472: Nihongo-no Goitokusei (Lexical properties of Japanese) Vol. 8
- hasVersion: C-003622: Nihongo-no Goitokusei (Lexical properties of Japanese) Vol. 9
- hasVersion: D-003474: CD-ROM Nihongo-no Goitokusei (Lexical properties of Japanese) Vols. 1-6
C-003624: Turin University Treebank 1.1
TUT is a morphologically, syntactically and semantically annotated corpus of Italian sentences. It consists of two Italian subcorpora (a Civil law corpus (1100 sentences) and a Newspaper corpus (1100 sentences)) and an English corpus (200 sentences) that helps non-Italian speakers understand the annotation scheme. The Italian corpora are annotated in two formats: TUT format and Penn Treebank format. TUT format is dependency-oriented and aims at capturing the richness of the predicate-argument structure. The English corpus is annotated only in TUT format.
- conformsTo: C-001546: Treebank-2
- conformsTo: ILEX
C-003625: Tübingen Partially Parsed Corpus of Written German
TüPP-D/Z is a collection of newspaper articles written in German, automatically annotated with clause structure, topological fields, chunks and some low-level annotation, including POS, morphological ambiguity classes and information about some regular types of named entities, including numerical expressions such as dates, numbers and units. The raw text of the corpus consists of more than 200 million words.
C-003626: Tübingen Treebank of Spoken German
The TüBa-D/S treebank was built under the Verbmobil project, a long-term machine translation project for spontaneous speech funded by the Ministry for Education, Science, Research, and Technology (BMBF) in Germany. It contains syntactically annotated transcribed spontaneous dialogues in German consisting of approximately 38,000 sentences (360,000 words). The annotation scheme distinguishes four levels of syntactic constituency: the lexical level, the phrasal level, the level of topological fields, and the clausal level. The treebank is available in three formats: NEGRA export format, XML format and Penn Treebank format.
- hasVersion: C-003627: Tübingen Treebank of Written German Release 4
- hasVersion: C-003628: Tübingen Treebank of Spoken English
- hasVersion: C-003629: Tübingen Treebank of Spoken Japanese
- references: C-000188: VERBMOBIL - VM CD 1.1 (new edition)
- references: C-000189: VERBMOBIL - VM CD 12.1 (new edition)
- references: C-000190: VERBMOBIL - VM CD 14.1 (new edition)
- references: C-000191: VERBMOBIL - VM CD 2.1 (new edition)
- references: C-000192: VERBMOBIL - VM CD 3.1 (new edition)
- references: C-000193: VERBMOBIL - VM CD 4.1 (new edition)
- references: C-000194: VERBMOBIL - VM CD 5.1 (new edition)
- references: C-000196: VERBMOBIL - VM CD 7.1 (new edition)
- references: C-000197: VERBMOBIL - VM CD S 1.0 (original edition)
- references: C-000198: VERBMOBIL II - VM CD 22.1 - VM22.1 (BAS edition)
- references: C-000199: VERBMOBIL II - VM CD 24.1 - VM24.1 (BAS edition)
- references: C-000203: VERBMOBIL II - VM CD 29.1 - VM29.1 (BAS edition)
- references: C-000207: VERBMOBIL II - VM CD 38.1 - VM38.1 (BAS edition)
- references: C-000208: VERBMOBIL II - VM CD 39.1 - VM39.1 (BAS edition)
- references: C-000209: VERBMOBIL II - VM CD 48.1 - VM48.1 (BAS edition)
- references: C-000211: VERBMOBIL II - VM CD20.1 - VM20.1 (new edition)
- references: C-000212: VERBMOBIL II - VM CD21.1 - VM21.1 (new edition)
- references: C-000370: VERBMOBIL II - VM CD 63.0 - VM63.0 (original edition)
- references: C-000373: VERBMOBIL II - VM CD 65.0 - VM65.0 (original edition)
- references: C-000374: VERBMOBIL II - VM CD 53.1 - VM53.1 (BAS edition)
- references: C-000375: VERBMOBIL II - VM CD 60.1 - VM60.1 (BAS edition)
- references: C-000376: VERBMOBIL II - VM CD 61.1 - VM61.1 (BAS edition)
- references: C-000377: VERBMOBIL II - VM CD 64.0 - VM64.0 (original edition)
C-003627: Tübingen Treebank of Written German Release 4
The TüBa-D/Z treebank is a syntactically annotated German newspaper corpus consisting of approximately 36,000 sentences (640,000 words). The annotation represents information on inflectional morphology, syntactic constituency, grammatical functions, (complex) named entities, and anaphora and coreference relations. The corpus is still under development (as of November 2008), and releases of more data will follow.
- replaces: Tübingen Treebank of Written German Release 3
C-003628: Tübingen Treebank of Spoken English
The TüBa-E/S treebank was built under the Verbmobil project, a long-term machine translation project for spontaneous speech funded by the Ministry for Education, Science, Research, and Technology (BMBF) in Germany. It contains syntactically annotated transcribed spontaneous dialogues in English consisting of approximately 30,000 sentences (310,000 words). The manual syntactic annotation is HPSG-oriented and based on three levels of syntactic constituency: the lexical level, the phrasal level and the clausal level. The treebank is available in three formats: NEGRA export format, XML format and Penn Treebank format.
- hasVersion: C-003626: Tübingen Treebank of Spoken German
- hasVersion: C-003629: Tübingen Treebank of Spoken Japanese
- references: C-000195: VERBMOBIL - VM CD 6.1 (new edition)
- references: C-001565: VERBMOBIL - VM CD 8.1 (new edition)
- references: C-001564: VERBMOBIL - VM CD 13.1 (new edition)
- references: C-001567: VERBMOBIL II - VM CD 23.1 - VM23.1 (BAS edition)
- references: C-001568: VERBMOBIL II - VM CD 28.1 - VM28.1 (BAS edition)
- references: C-001569: VERBMOBIL II - VM CD 30.1 - VM30.1 (BAS edition)
- references: C-001570: VERBMOBIL II - VM CD 31.1 - VM31.1 (BAS edition)
- references: C-001571: VERBMOBIL II - VM CD 32.1 - VM32.1 (BAS edition)
- references: C-001572: VERBMOBIL II - VM CD 42.1 - VM42.1 (BAS edition)
- references: C-001573: VERBMOBIL II - VM CD 43.1 - VM43.1 (BAS edition)
- references: C-000210: VERBMOBIL II - VM CD 50.1 - VM50.1 (BAS edition)
C-003629: Tübingen Treebank of Spoken Japanese
The TüBa-J/S treebank was built under the Verbmobil project, a long-term machine translation project for spontaneous speech funded by the Ministry for Education, Science, Research, and Technology (BMBF) in Germany. It contains syntactically annotated transcribed spontaneous dialogues in Japanese consisting of approximately 18,000 sentences (160,000 words). The speech data was romanized and manually annotated. The syntactic annotation is HPSG-oriented and based on three levels of syntactic constituency: the lexical level, the phrasal level and the clausal level. The treebank is available in two formats: NEGRA export format and CoNLL-X Shared Task dependency format.
- hasVersion: C-003626: Tübingen Treebank of Spoken German
- hasVersion: C-003628: Tübingen Treebank of Spoken English
- references: C-000368: VERBMOBIL II - VM CD 16.1 - VM16.1 (new edition)
- references: C-000371: VERBMOBIL II - VM CD 17.1 - VM17.1 (new edition)
- references: C-000367: VERBMOBIL II - VM CD 18.1 - VM18.1 (new edition)
- references: C-000372: VERBMOBIL II - VM CD 19.1 - VM19.1 (new edition)
- references: C-000200: VERBMOBIL II - VM CD 25.1 - VM25.1 (BAS edition)
- references: C-000201: VERBMOBIL II - VM CD 26.1 - VM26.1 (BAS edition)
- references: C-000202: VERBMOBIL II - VM CD 27.1 - VM27.1 (BAS edition)
- references: C-000204: VERBMOBIL II - VM CD 33.1 - VM33.1 (BAS edition)
- references: C-000205: VERBMOBIL II - VM CD 34.1 - VM34.1 (BAS edition)
- references: C-000206: VERBMOBIL II - VM CD 35.1 - VM35.1 (BAS edition)
- references: N-001197: VERBMOBIL II - VM CD 44.1 - VM44.1 (BAS edition)
- references: C-000365: VERBMOBIL II - VM CD 45.1 - VM45.1 (BAS edition)
- references: C-001574: VERBMOBIL II - VM CD 46.1 - VM46.1 (BAS edition)
- references: C-000369: VERBMOBIL II - VM CD 62.1 - VM62.1 (BAS edition)
C-003631: Princeton WordNet Gloss Corpus
The corpus is a set of annotated, disambiguated glosses: word forms from the definitions in WordNet's synsets are manually linked to the context-appropriate sense in WordNet. The corpus is provided in two formats: the merged format (all annotations combined in a single file) and the standoff format (annotations stored in documents separate from the gloss text).
- references: D-000825: WordNet
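The difference between the merged and standoff formats mentioned in the entry above can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the gloss text, the `Annotation` class, the bracketed inline notation and the sense labels are hypothetical and do not reproduce the actual file layout of the Princeton WordNet Gloss Corpus.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Annotation:
    """A standoff annotation: character offsets into the text plus a label."""
    start: int
    end: int
    sense: str  # hypothetical sense label, not a real WordNet sense key


def annotate(text: str, word: str, sense: str) -> Annotation:
    """Build an annotation for the first occurrence of `word` in `text`."""
    start = text.index(word)
    return Annotation(start, start + len(word), sense)


def merge(text: str, annotations: list[Annotation]) -> str:
    """Combine text and standoff annotations into one inline (merged) string."""
    parts = []
    cursor = 0
    for ann in sorted(annotations, key=lambda a: a.start):
        parts.append(text[cursor:ann.start])                      # untouched text
        parts.append(f"[{text[ann.start:ann.end]}|{ann.sense}]")  # tagged span
        cursor = ann.end
    parts.append(text[cursor:])
    return "".join(parts)


# Standoff: the gloss text stays unmodified; annotations live in a separate list.
gloss = "a domesticated carnivorous mammal"
standoff = [
    annotate(gloss, "mammal", "sense-B"),
    annotate(gloss, "domesticated", "sense-A"),
]

# Merged: text and annotations combined in a single representation.
merged = merge(gloss, standoff)
print(merged)
```

In the standoff style the original text is never altered, so several independent annotation layers can point at the same gloss; the merged style is easier to read and process as a single file.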
C-003632: SemCor 1.6
The SemCor corpus 1.6 is a subcorpus of WordNet 1.6 and consists of 352 texts. All the words in SemCor are tagged for POS, and more than 200,000 content words are lemmatized and sense-tagged according to WordNet 1.6. The semantic tagging of SemCor 1.6 was done manually, whereas all other versions, such as SemCor 1.7, were created automatically.
- references: D-000825: WordNet
- references: C-000751: Brown Corpus
- isPartOf: C-003633: MultiSemCor Corpus 1.1
- isReplacedBy: C-003634: SemCor 1.7
- isPartOf: D-000825: WordNet
