-
C-004962: SALA II US English database (2000 speakers)
The SALA II US English database collected in the United States was recorded within the scope of the SALA II project. It contains the recordings of ca. 2,000 US English speakers (equally balanced between males and females, including some speakers with Hispanic accents) recorded over the United States mobile telephone network.
The following acoustic conditions were selected as representative of a mobile user's environment (some speakers were recorded in several environments):
- Passenger in moving car, railway, bus, etc.
- Public place
- Stationary pedestrian by road side
- Home/office environment
- Passenger in moving car using a hands-free kit
The speech files are stored as uncompressed 8-bit, 8 kHz Mu-law samples, according to the SALA II specifications. Each prompted utterance is stored in a separate file and has an accompanying ASCII SAM label file.
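Because the audio is raw Mu-law rather than WAV, it must be expanded to linear PCM before most analysis tools can use it. A minimal sketch of standard G.711 Mu-law expansion (the function names are illustrative, not part of the SALA II specification):

```python
def ulaw_to_pcm16(code: int) -> int:
    """Expand one G.711 Mu-law byte to a 16-bit linear PCM sample."""
    code = ~code & 0xFF                 # Mu-law bytes are stored bit-complemented
    sign = code & 0x80
    exponent = (code >> 4) & 0x07
    mantissa = code & 0x0F
    magnitude = (((mantissa << 3) + 0x84) << exponent) - 0x84
    return -magnitude if sign else magnitude


def decode_ulaw_file(path: str) -> list[int]:
    """Decode a headerless 8 kHz Mu-law speech file into PCM samples."""
    with open(path, "rb") as f:
        return [ulaw_to_pcm16(b) for b in f.read()]
```

For example, the byte 0xFF decodes to silence (0) and 0x80 to positive full scale (32124), the extremes of the Mu-law code space.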
This speech database was validated by SPEX (the Netherlands) to assess its compliance with the SALA II format and content specifications.
Each speaker uttered the following items:
- 6 application words (out of a set of 30)
- 1 sequence of 10 isolated digits
- 4 connected digit strings (1 sheet number, 5+ digits; 1 telephone number, 9-11 digits; 1 credit card number, 14-16 digits; 1 PIN code, 6 digits)
- 3 dates (1 spontaneous date e.g. birthday, 1 word style prompted date, 1 relative and general date expression)
- 1 spotting phrase using an embedded application word
- 2 isolated digits
- 3 spelled words (1 surname, 1 directory assistance city name, 1 real/artificial name for coverage)
- 1 money amount
- 1 natural number
- 5 directory assistance names (1 spontaneous, e.g. own surname, 1 city of birth/growing up, 1 most frequent city out of a set of 500, 1 most frequent company/agency out of a set of 500, 1 “forename surname” out of a set of 150)
- 2 yes/no questions (1 predominantly “yes” question, 1 predominantly “no” question, including fuzzy questions)
- 9 phonetically rich sentences
- 2 time phrases (1 spontaneous time of day, 1 word style time phrase)
- 4 phonetically rich words
A pronunciation lexicon with a phonemic transcription in SAMPA is also included.
-
C-004963: Buckeye Corpus
The Buckeye Corpus of conversational speech contains high-quality recordings of 40 speakers in Columbus, OH conversing freely with an interviewer. The speech has been orthographically transcribed and phonetically labeled. The audio and text files, together with time-aligned phonetic labels, are stored in a format usable with speech analysis software (Xwaves and Wavesurfer). Software for searching the transcription files is currently being written. The corpus is free for noncommercial use.
-
C-004964: Annotated Speech Corpora for 3 East Indian Languages
All the informants of the corpora are professional voice-over artists. The speech was recorded in a studio environment and digitized at a sampling rate of 22,050 Hz with 16 bits/sample in PCM WAV format. Annotation has been done at both the text level and the speech level. At the text level, parts of speech (POS), phrases, and clauses have been annotated. The text files are also phonetically transcribed in the International Phonetic Alphabet (IPA). At the speech level, phonemes, syllables, and breath pauses have been annotated. The total size of the speech corpora is about 8.5 GB, the majority of which is Bangla (5.12 GB). Only the standard dialect of each language is included.
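Users can verify that downloaded files match the stated digitization parameters with Python's standard wave module. A quick conformance check might look like this (the 22,050 Hz / 16-bit values come from the description above; the function name is illustrative):

```python
import wave


def check_corpus_wav(path) -> dict:
    """Verify a WAV file matches the stated corpus spec: 22,050 Hz, 16-bit PCM."""
    with wave.open(path, "rb") as w:
        info = {
            "rate": w.getframerate(),
            "bits": w.getsampwidth() * 8,
            "channels": w.getnchannels(),
            "seconds": w.getnframes() / w.getframerate(),
        }
    if info["rate"] != 22050 or info["bits"] != 16:
        raise ValueError(f"unexpected format: {info}")
    return info
```

wave.open also accepts file-like objects, so the same check works on in-memory data.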
The content of the corpora has been designed to support various areas of speech research, such as speech synthesis, speech recognition, and speaker recognition.
-
C-004965: RML Emotion Database
The RML emotion database contains 720 audiovisual emotional expression samples collected at Ryerson Multimedia Lab. Six basic human emotions are expressed: anger, disgust, fear, happiness, sadness, and surprise. A digital video camera was used to record the samples in a quiet, bright environment with a simple background. The experimental subjects were given a list of emotional sentences and were directed to express each emotion as naturally as possible by recalling emotional events they had experienced in their lives. Ten different sentences were provided for each emotional class.
-
C-004966: Surrey Audio-Visual Expressed Emotion (SAVEE) Database
The Surrey Audio-Visual Expressed Emotion (SAVEE) database was recorded as a prerequisite for the development of an automatic emotion recognition system. The database consists of recordings from 4 male actors in 7 different emotions, 480 British English utterances in total. The sentences were chosen from the standard TIMIT corpus and are phonetically balanced for each emotion. The data were recorded in a visual media lab with high-quality audio-visual equipment, then processed and labeled. To check the quality of the performances, the recordings were evaluated by 10 subjects under audio, visual, and audio-visual conditions. Classification systems were built using standard features and classifiers for each of the audio, visual, and audio-visual modalities, achieving speaker-independent recognition rates of 61%, 65%, and 84%, respectively.
-
C-004967: Taiwanese Mandarin Multilingual Spoken Corpus
The corpus was made collaboratively by Tamkang University in Taiwan and Tokyo University of Foreign Studies in Japan. It consists of 44 conversations on general topics (TV shows, jobs, etc.) between pairs of undergraduate or graduate students at Tamkang University. Each conversation lasts about an hour.
- hasVersion: C-001319: Canada Multilingual Spoken Corpus
- hasVersion: C-001320: French (Aix) Multilingual Spoken Corpus
- hasVersion: C-001321: French (Paris) Multilingual Spoken Corpus
- hasVersion: C-001322: Malay Multilingual Spoken Corpus
- hasVersion: C-001323: Multilingual Spoken Language Corpus: Spanish
- hasVersion: C-001324: Multilingual Spoken Language Corpus: Turkish
- hasVersion: C-004968: Spanish Multilingual Spoken Corpus 2006
-
C-004968: Spanish Multilingual Spoken Corpus 2006
The corpus was made collaboratively by Tokyo University of Foreign Studies in Japan and the Autonomous University of Madrid in Spain. It consists of 40 dialogues on general topics, such as movies, shopping, and family, between native Spanish speakers. The corpus comes with transcripts and Japanese and English translations.
- hasVersion: C-001319: Canada Multilingual Spoken Corpus
- hasVersion: C-001320: French (Aix) Multilingual Spoken Corpus
- hasVersion: C-001321: French (Paris) Multilingual Spoken Corpus
- hasVersion: C-001322: Malay Multilingual Spoken Corpus
- hasVersion: C-001324: Multilingual Spoken Language Corpus: Turkish
- hasVersion: C-001323: Multilingual Spoken Language Corpus: Spanish
- hasVersion: C-004967: Taiwanese Mandarin Multilingual Spoken Corpus
-
C-004971: Glossed Audio Corpus of Ainu Folklore
This is the first fully glossed and annotated digital collection of Ainu folktales with translations into Japanese and English. It contains 10 stories (8 uepeker ‘prose folktales’ and 2 kamuy yukar ‘divine epics’) narrated by Mrs. Kimi Kimura (1900-1988, born in Penakori Village, upper district of the Saru River), with a total recording time of about 3 hours.
-
C-004972: Database of endangered languages/dialects in Japan
This database provides recordings of endangered languages and dialects in Japan. It consists of word pronunciations and natural conversations in endangered languages/dialects such as Amami, Okinawa, and Hachijo. It also includes transcribed texts and their translations into Standard Japanese.
-
C-004975: Accented English GlobalPhone
The Accented English part of the GlobalPhone resources contains 63 recording sessions in which Bulgarian, Chinese, German, and Indian native speakers each read 37 English sentences, produced in GlobalPhone style, i.e. 16 kHz PCM-encoded audio recordings of utterance-segmented read speech from the newspaper domain.