Language Resource Search - SHACHI: Language Resource Metadata Database

Registered language resources: 3330. Showing items 1071 - 1080 of 2023.

C-003282: 500-People Telephone Read Speech Corpus
The 500-People TRSC is a large collection of Putonghua (Mandarin Chinese) telephone read speech. Most of the read materials are sentences taken from articles in China's People's Daily newspaper. The corpus also contains some sentences that cover connected digits and letters of the alphabet.
C-003283: Telephone Name Dialing Corpus
The TNDC is a Mandarin Chinese read speech corpus designed for name-dialing systems. About 500 Chinese names and 50 sentence templates were chosen for designing the target sentences.
C-003284: CASS Corpus
The CASS corpus contains phonetically transcribed Mandarin Chinese spontaneous speech taken from university lectures, student colloquia, and other public meetings. There were neither read prompts nor keywords, so the speech in this corpus is entirely spontaneous.
- isReferencedBy: [Reference] CASS: A Phonetically Transcribed Corpus of Mandarin Spontaneous Speech (A. Li (1), F. Zheng (2), W. Byrne (3), P. Fung (4), T. Kamm (3), Y. Liu (4), Z. Song (2), U. Ruhi (5), V. Venkataramani (3), X. Chen (1), Chinese Academy of Social Sciences (1), Tsinghua University (2), Johns Hopkins University (3), University of Science and Technology (4), and University of Toronto (5), http://www.clsp.jhu.edu/ws2000/groups/mcs/publications/CASS-ICSLP.ps)
C-003285: Wu-Dialectal Chinese Speech Corpus
The WDCS is a Wu-dialectal (Shanghainese) Chinese speech corpus that contains 5.5 hours of read speech and 5.5 hours of spontaneous speech. For the read speech, the prompting texts were chosen with an automatic sentence selection algorithm so as to cover as many of the phonetic phenomena of the language as possible. For the spontaneous speech, 5 topics (sports, politics and economy, entertainment, lifestyle, and technologies), each with corresponding subtopics, were designed. The corpus includes the corresponding manual transcriptions.
C-003286: BIT-MobileSpeech
The Mobile Phone Speech Corpus is a Mandarin Chinese read speech corpus designed for Traffic Information Query. The particular characteristics of this corpus are: 1) the communication networks cover not only PSTN but also GSM or CDMA; 2) many types of mobile phones from different manufacturers were used; and 3) the domain of the corpus is Traffic Information Query, and the data are labeled with many keywords such as traffic stations, bus lines, and locations.
- hasVersion: C-003287: BIT-MobileTalk
C-003287: BIT-MobileTalk
BIT-MobileTalk is a Mandarin Chinese telephone conversational speech corpus recorded by 30 speakers. The conversations are spontaneous, based on a given topic and keywords, and contain a variety of spoken-language phenomena. The particular characteristics of this corpus are: 1) the telephone networks cover not only PSTN but also GSM or CDMA; 2) many types of mobile phones from different manufacturers were used; 3) the conversations are goal-oriented, that is, each speaker was assigned a travel site as the topic of the conversation; and 4) the speakers were given some keywords, such as how to get there or the admission charge.
- hasVersion: C-003286: BIT-MobileSpeech
C-003288: BIT-TeleSpeech
The BIT-TeleSpeech corpus is a Mandarin Chinese telephone read speech corpus recorded by 15 speakers. Every speaker reads the same 40 sentences, each lasting about 3 seconds. The corpus can be used for speaker recognition.
C-003289: BIT-TonalName
The BIT-TonalName corpus contains 8,000 utterances of Chinese names and can be used to study Chinese name recognition and the influence of tonal information. The 100 pairs of names in the corpus cover 39 of the 45 most common Chinese surnames and all 21 initials and 38 finals of Mandarin.
C-003290: BIT-MonoSyllable
The BIT-MonoSyllable corpus is a single-syllable corpus of Mandarin Chinese recorded by 10 speakers. The corpus covers the 1259 most frequently used tonal syllables in Mandarin. Every speaker reads all of the syllables once or twice.
C-003291: CCC-VPR3C2005
The corpus was designed for voiceprint recognition (VPR), or speaker recognition, tasks and contains two subsets of Mandarin Chinese speech data: one for text-independent (TI) VPR and the other for text-dependent (TD) VPR. A syllable-level Chinese transcription is also included.
