Language Resource Search - SHACHI: Language Resource Metadata Database

Language resource #: 3330 Results 851 - 860 of 2023

C-001468: Mandarin Chinese Speech Recognition Corpus (desktop) - place name (120 people)
Desktop/Microphone
This corpus comprises 3,600 speech files uttered by 120 speakers of different dialects, ages and educational levels, recorded over 3 channels (Mic 1: SHURE Beta53; Mic 2: AKG C4000b; Mic 3: Labtec Axis 002). The database comprises 4,858 place names. Speech samples are stored as 16-bit, 48 kHz WAV files, with 6.26 hours of speech per channel. The total data size is 6.04 GB.
Text files are stored in Unicode format. All data have been proofread manually.
The corpus is intended for use in testing and in telephone natural speech recognition systems.
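The size figures in these entries can be sanity-checked from the stated duration, sample rate, bit depth and channel count. The sketch below is only an illustration: the function name is hypothetical, it assumes uncompressed PCM, ignores WAV header overhead, and assumes the listed size is expressed in binary gigabytes (2^30 bytes). Under those assumptions the estimate for C-001468 comes out close to the listed 6.04.

```python
def pcm_size_bytes(hours, sample_rate_hz, bits_per_sample, channels):
    """Estimate raw PCM payload size (hypothetical helper, headers ignored)."""
    seconds = hours * 3600
    bytes_per_sample = bits_per_sample // 8
    return seconds * sample_rate_hz * bytes_per_sample * channels


# C-001468: 6.26 hours per channel, 48 kHz, 16-bit, 3 microphone channels
size = pcm_size_bytes(6.26, 48_000, 16, 3)
print(f"{size / 2**30:.2f}")  # size in units of 2**30 bytes
```

The same calculation can be repeated for the other entries by substituting their hours, sample rate and channel count.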
C-001469: Mandarin Chinese Speech Recognition Corpus (desktop) - short message (120 people)
Desktop/Microphone
This corpus comprises 3,600 speech files uttered by 120 speakers of different dialects, ages and educational levels, recorded over 3 channels (Mic 1: SHURE Beta53; Mic 2: AKG C4000b; Mic 3: Labtec Axis 002). The database comprises 7,161 Chinese short messages (SMS) in total. Speech samples are stored as 16-bit, 48 kHz WAV files, with 5.86 hours of speech per channel. The total data size is 5.65 GB.
Text files are stored in Unicode format. All data have been proofread manually.
The corpus is intended for use in testing and in telephone natural speech recognition systems.
C-001470: Mandarin Chinese Speech Synthesis Corpus (Basic Corpus)
Desktop/Microphone
This corpus contains the recordings of 1 native Chinese speaker (female).
The corpus is composed of 20 texts with 109,227 words and has been proofread manually. The corpus contents include: phrases, digit strings, letter strings, uncommon words, neutral tone, final retroflexion, Latin alphabet, interrogative sentences, 282 English words.
The speaker has been recorded in a professional recording studio over 2 channels: microphone and glottis wave (fundamental frequency) signals for a total of 18.2 hours.
Speech samples are stored as 16-bit, 44.1 kHz PCM on two channels. The total data size is 5.67 GB across 12,679 files. The data is encoded in GB-2312 format.
The transcriptions include labels for four-class pause boundaries.
This database is intended for use in text-to-speech and speech synthesis applications.
- hasVersion: C-001472: Mandarin Chinese Speech Synthesis Corpus
- isPartOf: C-001471: Mandarin Chinese Speech Synthesis Corpus (Integrated Corpus)
C-001471: Mandarin Chinese Speech Synthesis Corpus (Integrated Corpus)
Desktop/Microphone
The Mandarin Chinese Speech Synthesis Integrated Corpus includes both the Basic and Accessory Corpora (see ELRA-S0228/01 and ELRA-S0228/02).
- hasPart: C-001470: Mandarin Chinese Speech Synthesis Corpus (Basic Corpus)
- hasPart: C-001472: Mandarin Chinese Speech Synthesis Corpus
C-001472: Mandarin Chinese Speech Synthesis Corpus
Desktop/Microphone
This corpus contains the recordings of 1 native Chinese speaker (female).
The corpus complements the Basic Corpus (ELRA-S0228/01) and aims to cover a variety of speech-context data that does not include syllables.
The corpus is composed of 28 texts with 75,841 words and has been proofread manually. The corpus contents include: text of statements, digit strings, uncommon words, letter strings, measurement units, neutral tone, final retroflexion, Latin alphabet, interrogative sentences, English words and room-ordering stimulation.
The speaker has been recorded in a professional recording studio over 2 channels: microphone and glottis wave (fundamental frequency) signals for a total of 30.2 hours.
- hasVersion: C-001470: Mandarin Chinese Speech Synthesis Corpus (Basic Corpus)
- isPartOf: C-001471: Mandarin Chinese Speech Synthesis Corpus (Integrated Corpus)
C-001473: Mandarin Chinese high clarity Speech Recognition Corpus (in recording studio) - (desktop) person name (200 people)
Desktop/Microphone
This corpus comprises 8,000 Chinese person names uttered by 200 speakers of different dialects, ages and educational levels, recorded over 4 channels. Speech samples are stored as 16-bit, 44.1 kHz WAV files, with 10 hours of speech per channel. The total data size is 12 GB.
Each speaker read 40 items. Text files are stored in Unicode format. All data have been proofread manually.
The corpus is intended for use in testing and in telephone natural speech recognition systems.
C-001474: Mandarin Chinese high clarity Speech Recognition Corpus (in recording studio) - (desktop) place name (200 people)
Desktop/Microphone
This corpus comprises 8,000 Chinese place names uttered by 200 speakers of different dialects, ages and educational levels, recorded over 4 channels. Speech samples are stored as 16-bit, 44.1 kHz WAV files, with 12.27 hours of speech per channel. The total data size is 14.45 GB.
Each speaker read 40 items. Text files are stored in Unicode format. All data have been proofread manually.
The corpus is intended for use in testing and in telephone natural speech recognition systems.
C-001475: Mandarin Chinese high clarity Speech Recognition Corpus (in recording studio) - (desktop) digit string (200 people)
Desktop/Microphone
This corpus comprises 8,000 digit strings uttered by 200 speakers of different dialects, ages and educational levels, recorded over 4 channels. Speech samples are stored as 16-bit, 44.1 kHz WAV files, with 13.3 hours of speech per channel. The total data size is 15.7 GB.
Each speaker read 40 items. Text files are stored in Unicode format. All data have been proofread manually.
The corpus is intended for use in testing and in telephone natural speech recognition systems.
C-001476: Mandarin Chinese high clarity Speech Recognition Corpus (in recording studio) - single Chinese sentence (200 people)
Desktop/Microphone
This corpus, recorded in a studio, comprises 8,000 Chinese sentences uttered by 200 speakers of different dialects, ages and educational levels, recorded over 4 channels. Speech samples are stored as 16-bit, 44.1 kHz WAV files, with 12 hours of speech per channel. The total data size is 14.22 GB.
Each speaker read 40 items. Text files are stored in Unicode format. All data have been proofread manually.
The corpus is intended for use in testing and in telephone natural speech recognition systems.
C-001478: Mixer Corpus
In order to promote the development of robust speaker recognition technologies, we have created the Mixer corpus of multilingual, cross-channel speech. This corpus adds two dimensions to the traditional Switchboard collection: language and channels. Mixer is a collection of telephone conversations targeting 600 speakers participating in up to 25 calls of at least 6 minutes duration.
