Language resource #: 3330
-
C-003643: D-Coi-corpus
The corpus consists of newspaper text, texts intended for the general public obtained from government websites, journals, brochures, legal texts, manuals etc. The texts have been annotated with dependency relations according to the guidelines of the D-Coi-project, a preparatory project which aimed to produce a blueprint and the tools needed for the construction of a 500-million-word reference corpus of contemporary written Dutch.
- isReferencedBy: C-003642: COREA-coreferentiecorpus
- references: C-003645: Twente Nieuws Corpus
-
C-003644: DuELME
DuELME is one of the results of the IRME (Identification and Representation of Multiword Expressions) project. It contains lexical descriptions of 5,000 multiword expressions (MWEs) and is designed to be highly theory- and implementation-independent. It is intended mainly for use in various Dutch NLP systems.
- references: C-003645: Twente Nieuws Corpus
-
C-003645: Twente Nieuws Corpus
The TwNC is a multifaceted Dutch news corpus, comprising about 530 million words of text data and some audio data useful for language model training. The data includes text data from newspaper and magazine articles, and text and audio data from subtitling and autocues/transcripts of broadcast news shows.
- isReferencedBy: C-003644: DuELME
- isReferencedBy: C-003643: D-Coi-corpus
-
C-003646: Malay Concordance Project
The MCP contains over 4.8 million words (including over 100,000 verses) from more than 130 sources of pre-modern Malay written text. These texts can be searched on-line to provide useful information about contexts in which words are used, where particular terms or names occur in texts, and patterns of morphology and syntax.
-
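C-003648: NTCIR Datasets / Test Collections
- hasPart: C-003740: NTCIR-1 (Test Collection for Information Retrieval / Term Extraction Research)
- hasPart: C-003741: NTCIR-2 (Test Collection for Information Retrieval)
- hasPart: C-003742: NTCIR-2 SUMM (Test Collection for Automatic Text Summarization)
- hasPart: C-003743: NTCIR-2 SUMM TAO (Data for Automatic Summarization, created by TAO)
- hasPart: C-003744: NTCIR-3 CLIR (Test Collection for Information Retrieval / Cross-Language Retrieval)
- hasPart: C-003745: NTCIR-3 PATENT (Patent Retrieval Test Collection)
- hasPart: C-003746: NTCIR-3 QA (Test Collection for Question Answering)
- hasPart: C-003747: NTCIR-3 SUMM (Test Collection for Automatic Text Summarization)
- hasPart: C-003748: NTCIR-3 WEB (Test Collection for Web Retrieval Evaluation)
- hasPart: C-003749: NTCIR-4 CLIR Cross-Language Information Retrieval Test Collection
- hasPart: C-003750: NTCIR-4 Patent Retrieval Task Test Collection
- hasPart: C-003751: NTCIR-4 QAC2 (Question Answering Test Collection)
- hasPart: C-003752: NTCIR-4 WEB (Test Collection for Web Retrieval Evaluation, Task Document Data)
- hasPart: C-003753: NTCIR-5 CLIR Cross-Language Information Retrieval Test Collection
- hasPart: C-003754: NTCIR-5 CLQA Multilingual Question Answering Test Collection
- hasPart: C-003755: NTCIR-5 Patent Retrieval Task Test Collection
- hasPart: C-003756: NTCIR-5 QAC Question Answering Test Collection
- hasPart: C-003757: NTCIR-5 WEB Retrieval Evaluation Test Collection
- hasPart: C-003758: NTCIR-6 CLIR Cross-Language Information Retrieval Test Collection
- hasPart: C-003759: NTCIR-6 CLQA Multilingual Question Answering Test Collection
- hasPart: C-003760: NTCIR-6 OPINION Opinion Analysis Task Test Collection
- hasPart: C-003761: NTCIR-6 Patent Retrieval Task Test Collection
- hasPart: C-003762: NTCIR-6 QAC Question Answering Test Collection
- hasPart: C-003763: NTCIR-6 MuST "Summarization and Visualization of Trend Information" Test Collection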
-
C-003650: Utsunomiya University Spoken Dialogue Database for Paralinguistic Information Studies
Utsunomiya University (UU) Spoken Dialogue Database for Paralinguistic Information Studies (UUDB) is a collection of spontaneous and expressive dialogue speech.
-
C-003652: Japanese phonetically-balanced word speech database
A Japanese phonetically balanced word speech database containing 1,542 words.
-
C-003655: NTT Infant Speech Database
The NTT Infant Speech Database was developed from longitudinal recordings of utterances by infants and their parents. It is a large database annotated with many kinds of information, such as utterance transcriptions, phoneme labels, and fundamental frequency.
-
C-003658: Computer Assisted System for TEaching & Learning / Japanese
A collection of various databases for teaching and learning Japanese, including a kanji dictionary, a kanji stroke order dictionary, a word dictionary, a technical term dictionary ("Education Ministry Technical Terms"), a sample sentence dictionary, a Japanese-English dictionary, a sound/illustration dictionary, Kodansha Gendai Shinsho / Shochiku "Otoko wa Tsurai yo" screenplay, and a text database of Sakyo Komatsu's works.
-
C-003662: Educational Research Information Database : Questions of the high school entrance examination
A database of the public high school entrance examination questions administered by local Boards of Education between 1991 and 2007. (Data up to 2000 are available at present. The rest will be added as soon as they are ready.)