Language Resource Search - SHACHI: Language Resource Metadata Database

Language resource #: 3330 Results 1021 - 1030 of 2023

C-003215: Newcastle Corpus
The Newcastle Corpus contains recorded speech in French by learners of French, collected as part of an ongoing project. Thirty learners have been recruited from four different sixth form colleges and are being recorded performing a variety of oral tasks: five tasks on a one-to-one basis with a researcher, and a paired discussion task with another learner from the same college. For comparison, speech by 15 native French speakers is also included; they were recorded performing the same six oral tasks as the learners. The corpus also contains morphosyntactically tagged transcripts.
- isPartOf: N-003064: French Learner Language Oral Corpora (FLLOC)
C-003216: BioCaster ontology
The BioCaster ontology aims to support multilingual surveillance of disease outbreak news for public health workers, clinicians and researchers in the biomedical sciences. The ontology has been developed jointly by groups at the National Institute of Informatics, the National Institute of Infectious Diseases, Vietnam National University (HCM), Okayama University, the National Institute of Genetics and Kasetsart University. The ontology covers vocabulary for infectious diseases, pathogenic agents, signs and symptoms, syndromes, control measures, hosts, transmission modes and eventually the host genes. Terms are provided in six languages (Chinese, English, Korean, Japanese, Thai and Vietnamese).
- isReferencedBy: [Reference] A multilingual ontology for infectious disease outbreak surveillance: rationale, design and challenges (Nigel Collier, Ai Kawazoe, Lihua Jin, Mika Shigematsu, Dinh Dien, Roberto Barrero, Koichi Takeuchi, Asanee Kawtrakul, 2007)
- isReferencedBy: [Reference] BioCaster Project Working Report on English Names Entity Annotation Ver. 2.4 (http://biocaster.nii.ac.jp/files/BioCaster_working_report_English%20NE_ver24.pdf)
C-003221: News articles
A collection of news-agency articles in English and Malay, categorised into various fields.
C-003222: Collection of Malay Speech Sentences
A collection of spoken Malay sentences, used in a text-to-speech project.
C-003223: UEA Corpus
The UEA Corpus is a corpus of multi-participant French and English oppositional talk (i.e. talk in which speakers express opposing views). The corpus represents the communicative behaviour, in comparable oppositional interactions, of undergraduate learners of French and English (French for English students, English for French students, English for English students, and French for French students). As of January 2008, the corpus contains only transcripts of recorded speech; it includes neither sound files nor tagged transcripts, which may be added at a later date.
- isPartOf: N-003064: French Learner Language Oral Corpora (FLLOC)
C-003224: Lancom Corpus
The Lancom Corpus contains speech data in French by adolescent Dutch learners in Belgium and France, with corresponding transcripts in standard French orthography. Speaker tasks include role plays based on scenarios such as street encounters and babysitting.
C-003225: INL 27 Million Words Newspaper Corpus 1995
The 27 Million Words Corpus is a text corpus of over 27 million words of Dutch newspaper text, accessible through the Internet. The corpus allows users to search for single words or for word patterns, including some predefined syntactic patterns that can be changed by the user. Searches can target word form, part of speech, and head word, either separately or in combination, using Boolean operators and proximity searches. During the search, data on frequency and distribution over the texts are provided at several levels. The output is most often a list of items or a series of concordances (words in context) with a variable, user-defined textual context.
- hasVersion: C-003226: INL 5 Million Words Corpus 1994
- hasVersion: C-003227: INL 38 Million Words Corpus 1996
- isReferencedBy: [Reference] INL 27 Million Words Newspaper Corpus 1995 User manual (http://www.inl.nl/images/stories/taalbank/documentatie/engels/27mlj_handleid_eng.doc)
- isReferencedBy: [Reference] On-line Access to Linguistically Annotated Text Corpora of Dutch via Internet (Kruyt, J.G., S.A. Raaijmakers, P.H.J. van der Kamp & R.J. van Strien (1995))
- isReferencedBy: [Reference] Dutch Written Language Resources, their Users and Uses, in: A. Rubio, N. Gallardo, R. Castro & A. Tejada (eds.) (Kruyt, J.G. (1998))
- isReferencedBy: [Reference] 27 Miljoen Woorden Krantencorpus 1995 gebruikershandleiding (http://www.inl.nl/images/stories/taalbank/documentatie/27mlj_handleid.doc)
C-003226: INL 5 Million Words Corpus 1994
The 5 Million Words Corpus 1994 enables the user to consult a text corpus of 5 million words of present-day Dutch text through the Internet. It contains seventeen text sources classified along the parameters publication medium (book, newspaper, magazine, written-to-be-spoken (TV broadcast)) and topic (politics, journalism, leisure, linguistics, environment, business and employment). Searches can target word form, part of speech, and head word.
- hasVersion: C-003225: INL 27 Million Words Newspaper Corpus 1995
- hasVersion: C-003227: INL 38 Million Words Corpus 1996
- isReferencedBy: [Reference] On-line Access to Linguistically Annotated Text Corpora of Dutch via Internet (Kruyt, J.G., S.A. Raaijmakers, P.H.J. van der Kamp & R.J. van Strien (1995))
- isReferencedBy: [Reference] Dutch Written Language Resources, their Users and Uses, in: A. Rubio, N. Gallardo, R. Castro & A. Tejada (eds.) (Kruyt, J.G. (1998))
- isReferencedBy: [Reference] The 5 Million Words Corpus User manual (http://www.inl.nl/images/stories/taalbank/documentatie/engels/5mlj_handleid_eng.doc)
- isReferencedBy: [Reference] Het 5 miljoen woorden corpus 1994 (http://www.inl.nl/images/stories/taalbank/documentatie/5mlj_handleid.pdf)
C-003227: INL 38 Million Words Corpus 1996
The 38 Million Words Corpus is an online database accessible via the Internet. The corpus consists of three main components: varied materials, newspaper articles, and legal materials. The user can define subcorpora on the basis of four parameters: (1) corpus component, (2) topic, (3) publication medium/text type, and (4) period. Individual texts can be selected from text surveys on the screen, for either the whole corpus or a subcorpus. The corpus texts have been automatically annotated with lemma and two part-of-speech assignments: a global one (13 POS categories) and a fine-grained one (with subcategorization) conformant with the MECOLB standard.
- hasVersion: N-003325: Wikipedia - the free Encyclopedia
- hasVersion: C-003226: INL 5 Million Words Corpus 1994
- isReferencedBy: [Reference] A 38 million words Dutch text corpus and its users (Kruyt, J.G. & M.W.F. Dutilh (1997))
- isReferencedBy: [Reference] Dutch Written Language Resources, their Users and Uses, in: A. Rubio, N. Gallardo, R. Castro & A. Tejada (eds.) (Kruyt, J.G. (1998))
- isReferencedBy: [Reference] User manual INL 38 Million Words Corpus 1996 (http://www.inl.nl/images/stories/taalbank/documentatie/engels/38mlj_handleid_eng.doc)
- isReferencedBy: [Reference] Het 38 miljoen woorden corpus 1996 gebruikershandleiding (http://www.inl.nl/images/stories/taalbank/documentatie/38mlj_handleid.pdf)
C-003228: INL PAROLE Corpus 2004
The PAROLE corpus is a collection of modern Dutch texts amounting to 20 million tokens, mostly from newspaper or magazine articles. The texts are annotated for typographical and text-structural features, and the entire corpus is annotated with headword and part of speech. All encoding is TEI conformant. The corpus is accessible free of charge via the Internet.
- hasPart: C-000909: Dutch PAROLE Distributable Corpus
- isReferencedBy: [Reference] Tagging the Dutch PAROLE Corpus (Jesse de Does, John van der Voort van der Kleij, 2002, Instituut voor Nederlandse Lexicologie: http://parole.inl.nl/html-eng/publicaties/clinproceedings_eng.pdf)
- isReferencedBy: [Reference] CORPUS DOCUMENTATION (http://parole.inl.nl/html-eng/par_corpusdocumentatie.html#adjustments)
- isReferencedBy: [Reference] Implementation and Evaluation of PAROLE PoS in a National Context (Dutilh, M.W.F. & J.G. Kruyt (2002))
- isReferencedBy: [Reference] Elektronische woordenboeken en tekstcorpora voor Europese taaltechnologie (Kruyt, J.G. (1998))
- isReferencedBy: [Reference] Putting the Dutch PAROLE Corpus to Work (Kamp, P.H.J. van der & J.G. Kruyt (2004))
- isReferencedBy: D-003240: CGN lexicon
