-
C-004486: OrienTel Jordan MSA (Modern Standard Arabic) database
Telephone
The OrienTel Jordan MSA (Modern Standard Arabic) database comprises 556 Jordanian speakers (288 males, 268 females) recorded over the Jordanian fixed and mobile telephone network. This database is stored on 1 DVD. The speech databases made within the OrienTel project were validated by SPEX, the Netherlands, to assess their compliance with the OrienTel format and content specifications.
Speech samples are stored as sequences of 8-bit 8 kHz A-law. Each prompted utterance is stored in a separate file. Each signal file is accompanied by an ASCII SAM label file which contains the relevant descriptive information.
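Because the signal files are headerless 8-bit A-law, converting them to linear PCM needs a G.711 A-law decoder. The sketch below is a minimal standalone Python implementation of the standard G.711 decoding steps; it is not part of the distribution, and the function names are illustrative.

```python
def alaw_to_linear(sample: int) -> int:
    """Decode one 8-bit G.711 A-law byte to a linear PCM value
    (16-bit range, as in the classic Sun alaw2linear routine)."""
    sample ^= 0x55                        # undo the even-bit inversion
    mantissa = (sample & 0x0F) << 4
    segment = (sample & 0x70) >> 4
    if segment == 0:
        mantissa += 8
    else:
        mantissa = (mantissa + 0x108) << (segment - 1)
    return mantissa if sample & 0x80 else -mantissa

def decode_alaw(raw: bytes) -> list[int]:
    """Decode a headerless A-law file body (one byte per 8 kHz sample)."""
    return [alaw_to_linear(b) for b in raw]
```

For example, the A-law byte 0xD5 (near-silence) decodes to +8 and 0x55 to -8.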
Each speaker uttered the following items:
1 isolated single digit
2 sequences of 5 isolated digits
7+1 connected digits : 1 prompt sheet number written in letters (6 digits), 6 strings of 4 digits in written format, +1 prompt sheet number written in digits
2 currency money amounts
2 natural numbers
3 dates : 1 prompted date, 1 relative or general date expression, 1 prompted date phrase (Islamic calendar)
1 time phrase
2 spelled words : strings of 4 letter sequences
3 directory assistance utterances : 1 frequent city name, 1 frequent company name, 1 personal name (first name and family name)
2 yes/no questions : 1 predominantly yes question, 1 predominantly no question
5 application keywords/keyphrases
1 word spotting phrase using embedded application words
4 phonetically rich words
9 phonetically rich sentences
4+2 spontaneous items (for control)
The following age distribution has been obtained: 297 speakers are between 16 and 30, 180 speakers are between 31 and 45, 79 speakers are between 46 and 60.
A pronunciation lexicon with a phonemic transcription in SAMPA is also included. -
C-004487: OrienTel English as spoken in Jordan database
Telephone
The OrienTel English as spoken in Jordan database comprises 578 Jordanian speakers of English (319 males, 259 females) recorded over the Jordanian fixed and mobile telephone network. This database is stored on 1 DVD. The speech databases made within the OrienTel project were validated by SPEX, the Netherlands, to assess their compliance with the OrienTel format and content specifications.
Speech samples are stored as sequences of 8-bit 8 kHz A-law. Each prompted utterance is stored in a separate file. Each signal file is accompanied by an ASCII SAM label file which contains the relevant descriptive information.
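SAM label files are line-oriented ASCII, with one "mnemonic: value" pair per line. A minimal reader might look like the sketch below (Python; written from the general SAM line layout described here, so the specific mnemonics in the usage note are only illustrative of what a given file may contain).

```python
def parse_sam_labels(text: str) -> dict[str, str]:
    """Parse a SAM label file body: one 'MNE: value' pair per line.
    Repeated mnemonics are merged into a comma-joined value."""
    fields: dict[str, str] = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or ":" not in line:
            continue                      # skip blank or malformed lines
        key, _, value = line.partition(":")
        key, value = key.strip(), value.strip()
        fields[key] = f"{fields[key]}, {value}" if key in fields else value
    return fields
```

Usage: `parse_sam_labels(open("A0001S01.ala.txt").read())["SAM"]` would return the sampling-rate field if the file carries a `SAM:` line (file name and mnemonic assumed for illustration).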
Each speaker uttered the following items:
1 isolated single digit
1 sequence of 10 isolated digits
5 connected digits : 1 prompt sheet number (6 digits), 1 telephone number (6-15 digits), 1 credit card number (14-16 digits), 1 PIN code (6 digits), 1 spontaneous phone number
1 currency money amount
2 natural numbers
3 dates : 1 prompted date, 1 relative or general date expression, 1 prompted date phrase
2 time phrases : 1 time of day (spontaneous), 1 time phrase (word style)
3 spelled words : 1 personal first name, 1 city name, 1 real word for coverage
5 directory assistance utterances : 1 spontaneous, own forename, 1 city of childhood (spontaneous), 1 frequent city name, 1 frequent company name, 1 common forename and surname
2 yes/no questions : 1 predominantly yes question, 1 predominantly no question
6 application keywords/keyphrases
1 word spotting phrase using embedded application words
4 phonetically rich words
9 phonetically rich sentences
2+3 spontaneous items (for control)
The following age distribution has been obtained: 317 speakers are between 16 and 30, 188 speakers are between 31 and 45, 72 speakers are between 46 and 60, 1 speaker is over 60.
A pronunciation lexicon with a phonemic transcription in SAMPA is also included. -
C-004488: Danish EUROM1
Desktop/Microphone
EUROM1 is the first truly multilingual speech database produced in Europe. Equivalent corpora for each language were collected with the same number of speakers, selected in the same way and recorded under the same conditions with common file formats. Initially, eight European countries made recordings: Italy, United Kingdom, Germany, Netherlands, Denmark, Sweden, Norway and France. Additional recordings were later completed (within the EEC Esprit project SAM-A) in Greece, Spain and Portugal. More than sixty speakers were recorded per language.
The content consists of:
1) Continuous speech:
- 40 passages, each made of five task-related sentences.
- 30 patching sentences, designed to compensate for uneven phoneme distribution in the passage material.
2) Numbers:
The numbers were divided into five blocks, each containing twenty numbers. Each block was recorded as one single take.
3) CVC words:
The CVC word lists contain sixteen list types and also carrier phrases of the suggested type. 114 isolated words were used.
- hasVersion: C-000061: EUROM1g German
- hasVersion: C-000915: EUROM1f French
- hasVersion: C-000916: EUROM1i
- hasVersion: C-001403: EUROM1e English
- hasVersion: C-004471: Swedish EUROM1
- hasVersion: C-004509: Norwegian EUROM1
-
C-004490: CHIEDE Corpus: a spontaneous child language corpus of Spanish
Desktop/Microphone
The spontaneous child language corpus CHIEDE consists of 58,163 words in 30 texts, with 7 hours and 53 minutes of recordings and 59 child participants. About a third of the corpus is child language; the remaining two thirds are adult speech. The main feature of CHIEDE is the spontaneity of the interactions: the texts are recordings of communicative situations in their natural context. The resource is provided in several formats: an orthographic transcription, an automatic phonological transcription, an XML-tagged version and the text-sound alignment. Statistical results obtained from the annotated texts are also provided.
The final corpus design comprises two kinds of interactions: spontaneous collective conversations, recorded during daily classroom activities, and personal interviews, conducted by an adult with a single child, in which the conversation loses some spontaneity as it is guided by questions. -
C-004491: LILA Korean database
Telephone
The LILA Korean database collected in South Korea was recorded within the scope of the LILA project. It contains the recordings of 1,000 Korean speakers (500 males and 500 females) recorded over the Korean mobile telephone network.
The following acoustic conditions were selected as representative of a mobile user's environment (some speakers were recorded in several environments):
- Passenger in moving car, railway, bus, etc. (152 speakers)
- Public place (148 speakers)
- Stationary pedestrian by road side (150 speakers)
- Home/office environment (350 speakers)
- Passenger in moving car using a hands-free kit (200 speakers)
This database is distributed as 1 DVD-ROM. The speech files are stored as sequences of 8-bit, 8kHz A-law speech files and are not compressed, according to the specifications of LILA. Each prompt utterance is stored within a separate file and has an accompanying ASCII SAM label file.
This speech database was validated by SPEX (the Netherlands) to assess its compliance with the LILA format and content specifications.
Each speaker uttered the following items:
- 4 isolated digits
- 1 sequence of 9 isolated digits
- 1 sequence of 11 isolated digits
- 6 connected digits (1 sheet number, 5+ digits; 2 read telephone numbers, 9/11 digits; 1 credit card number, 14/16 digits; 1 PIN code, 6 digits; 1 spontaneous telephone number)
- 1 natural number
- 1 currency money amount
- 2 yes/no questions (1 predominantly yes question, 1 predominantly no question)
- 3 dates (1 spontaneous date e.g. birthday, 1 word style prompted date, 1 relative and general date expression)
- 2 time phrases (1 spontaneous time of day, 1 word style time phrase)
- 6 application words (out of a set of 30)
- 1 spotting phrase using an embedded application word
- 5 directory assistance names (1 spontaneous, e.g. own surname, 1 city of birth/growing up, 1 most frequent city out of a set of 500, 1 most frequent company/agency out of a set of 500, 1 forename surname out of a set of 150)
- 3 spelled words (1 surname, 1 directory assistance city name, 1 real/artificial name for coverage)
- 1 silence word
- 4 phonetically rich words
- 13 phonetically rich sentences
- 6 spontaneous items for control
The following age distribution has been obtained: 361 speakers are between 16 and 30, 360 speakers are between 31 and 45, and 279 speakers are between 46 and 60.
A pronunciation lexicon with a phonemic transcription in SAMPA is also included. -
C-004492: CHIL 2007 Evaluation Package
Multimodal/Multimedia Resources
The CHIL 2007 Evaluation Package was produced within the CHIL project (Computers in the Human Interaction Loop), an Integrated Project (IP 506909) under the European Commission's Sixth Framework Programme. The objective of the project is to create environments in which computers serve humans who can focus on interacting with other humans, as opposed to having to attend to and be preoccupied with the machines themselves. Instead of computers operating in an isolated manner, with humans thrust into the loop of computers, the project puts computers in the human interaction loop (CHIL).
In this context, the CHIL project produced the CHIL Seminars: scientific presentations given by students, faculty members or invited speakers in the field of multimodal interfaces and speech processing. During the talks, videos of the speaker and the audience from 4 fixed cameras, frontal close-ups of the speaker, and close-talking and far-field microphone recordings of the speaker's voice and ambient sounds were captured.
The CHIL Seminars have been compiled in four different packages, according to the evaluations for which they have been created and used:
- CHIL 2004 Evaluation Package (catalogue reference ELRA-E0009)
- CHIL 2005 Evaluation Package (catalogue reference ELRA-E0010)
- CHIL 2006 Evaluation Package (catalogue reference ELRA-E0017)
- CHIL 2007 Evaluation Package (catalogue reference ELRA-E0033)
The CHIL 2007 Evaluation Package consists of the following contents:
1) A set of audiovisual recordings of interactive seminars. The number of people present in the recording was fixed to be between 3 and 7. The recordings were done between June and September 2006 according to the CHIL Room Setup specification.
2) Video annotations.
3) Orthographic transcriptions. -
C-004493: FBK-Irst database of isolated meeting-room acoustic events
Desktop/Microphone
This database was produced within the CHIL Project (Computers in the Human Interaction Loop), in the framework of an Integrated Project (IP 506909) under the European Commission's Sixth Framework Programme. It contains a set of isolated acoustic events that occur in a meeting room environment and that were recorded for the CHIL Acoustic Event Detection (AED) task. The recorded sounds do not have temporal overlapping. The database can be used as training material for AED algorithms in quiet environments without temporal sound overlapping.
The database contains 16 semantic classes of acoustic events: door knock; door open; door slam; steps; chair moving; cough; paper wrapping; falling object; laugh; keyboard clicking; key jingle; spoon/cup jingle; phone ring; phone vibration; MIMIO pen buzz; applause.
Nine people participated in the recordings. Three experiments were recorded on different days, each composed of 4 sessions and carried out by 4 persons. During each session, every person produced a complete set of acoustic events. After every session, the participants swapped positions.
There are 3 DVDs containing 3 sessions each. Every session comprises 32 audio files plus one text file containing the segmentation.
The database is made available freely via FTP. -
C-004494: Hungarian Speecon database
Desktop/Microphone
The Hungarian Speecon database is divided into 2 sets:
1) The first set comprises the recordings of 555 adult Hungarian speakers (280 males, 275 females), recorded over 4 microphone channels in 4 recording environments (office, entertainment, car, public place).
2) The second set comprises the recordings of 50 child Hungarian speakers (32 boys, 18 girls), recorded over 4 microphone channels in 1 recording environment (children room).
This database is partitioned into 29 DVDs (first set) and 3 DVDs (second set).
Each of the four speech channels is recorded at 16 kHz, 16 bit, uncompressed unsigned integers in Intel format (lo-hi byte order). To each signal file corresponds an ASCII SAM label file which contains the relevant descriptive information.
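Since the signal files are headerless 16-bit samples in Intel (lo-hi) byte order, reading them on any platform only requires fixing the byte order. A small Python sketch under those assumptions (the function name is illustrative; the entry describes the samples as unsigned, so a `signed` switch is provided in case a given file set turns out to be two's-complement PCM):

```python
import array
import sys

def read_speecon_samples(raw: bytes, signed: bool = False) -> array.array:
    """Interpret a headerless Speecon signal file body as 16-bit
    little-endian samples. The catalogue entry states unsigned
    integers; pass signed=True for two's-complement PCM instead."""
    samples = array.array("h" if signed else "H", raw)
    if sys.byteorder == "big":      # files are Intel (lo-hi) byte order
        samples.byteswap()
    return samples
```

For example, the byte string `01 00 00 01` reads as the two unsigned samples 1 and 256.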
Each speaker uttered the following items (over 290 items for adults and over 210 items for children):
Calibration data:
6 noise recordings
The silence word recording
Free spontaneous items (adults only):
5 minutes (session time) of free spontaneous, rich context items (story telling) (an open number of spontaneous topics out of a set of 30 topics)
17 elicited spontaneous items (adults only):
3 dates, 2 times, 3 proper names, 2 city names, 1 letter sequence, 2 answers to questions, 3 telephone numbers, 1 language
Read speech:
30 phonetically rich sentences uttered by adults and 60 uttered by children
5 phonetically rich words (adults only)
4 isolated digits
1 isolated digit sequence
4 connected digit sequences
1 telephone number
3 natural numbers
1 money amount
2 time phrases (T1 : analogue, T2 : digital)
3 dates (D1 : analogue, D2 : relative and general date, D3 : digital)
3 letter sequences
1 proper name
2 city or street names
2 questions
2 special keyboard characters
1 Web address
1 email address
208 application specific words and phrases per session (adults)
74 toy commands, 14 phone commands and 34 general commands (children)
The following age distribution has been obtained:
Adults: 273 speakers are between 15 and 30, 141 speakers are between 31 and 45, 141 speakers are over 46.
Children: 22 speakers are between 8 and 10, and 28 speakers are between 11 and 15.
A pronunciation lexicon with a phonemic transcription in SAMPA is also included.
- hasVersion: C-000095: Mandarin Chinese Speecon database
- hasVersion: C-000120: Portuguese Speecon database
- hasVersion: C-000136: Spanish Speecon database
- hasVersion: C-000415: German Speecon database
- hasVersion: C-000936: Finnish Speecon database
- hasVersion: C-000941: French Speecon database
- hasVersion: C-000946: Hebrew Speecon database
- hasVersion: C-000952: Italian Speecon database
- hasVersion: C-000955: Korean Speecon database
- hasVersion: C-000974: Polish Speecon database
- hasVersion: C-000977: Russian Speecon database
- hasVersion: C-000995: Swedish Speecon database
- hasVersion: C-001000: Turkish Speecon database
- hasVersion: C-001002: UK English Speecon database
- hasVersion: C-001237: Taiwan Mandarin Speecon database
- hasVersion: C-001530: Swiss-German Speecon database
- hasVersion: C-001553: US English Speecon database
- hasVersion: C-001554: US Spanish Speecon database
- hasVersion: C-003376: Japanese Speecon database
- hasVersion: C-003377: Danish Speecon Database
- hasVersion: C-003378: Dutch from the Netherlands Speecon Database
- hasVersion: C-003379: Dutch from Belgium Speecon Database
- hasVersion: C-003380: French-Canadian Speecon database
- hasVersion: C-004483: Cantonese Speecon database
- hasVersion: C-004484: Thai Speecon database
- hasVersion: C-004495: Czech Speecon database
- hasVersion: C-004517: Egyptian Arabic Speecon database
- hasVersion: C-004539: Catalan Speecon database
-
C-004495: Czech Speecon database
Desktop/Microphone
The Czech Speecon database is divided into 2 sets:
1) The first set comprises the recordings of 550 adult Czech speakers (275 males, 275 females), recorded over 4 microphone channels in 4 recording environments (office, entertainment, car, public place).
2) The second set comprises the recordings of 50 child Czech speakers (18 boys, 32 girls), recorded over 4 microphone channels in 1 recording environment (children room).
This database is partitioned into 20 DVDs (first set) and 3 DVDs (second set).
The speech databases made within the Speecon project were validated by SPEX, the Netherlands, to assess their compliance with the Speecon format and content specifications. Each of the four speech channels is recorded at 16 kHz, 16 bit, uncompressed unsigned integers in Intel format (lo-hi byte order). To each signal file corresponds an ASCII SAM label file which contains the relevant descriptive information.
Each speaker uttered the following items (over 290 items for adults and over 210 items for children):
Calibration data:
6 noise recordings
The silence word recording
Free spontaneous items (adults only):
5 minutes (session time) of free spontaneous, rich context items (story telling) (an open number of spontaneous topics out of a set of 30 topics)
17 elicited spontaneous items (adults only):
3 dates, 2 times, 3 proper names, 2 city names, 1 letter sequence, 2 answers to questions, 3 telephone numbers, 1 language
Read speech:
30 phonetically rich sentences uttered by adults and 60 uttered by children
5 phonetically rich words (adults only)
4 isolated digits
2 isolated digit sequences
4 connected digit sequences
1 telephone number
3 natural numbers
1 money amount
2 time phrases (T1 : analogue, T2 : digital)
3 dates (D1 : analogue, D2 : relative and general date, D3 : digital)
3 letter sequences
1 proper name
2 city or street names
2 questions
2 special keyboard characters
1 Web address
1 email address
208 application specific words and phrases per session (adults)
74 toy commands, 14 phone commands and 34 general commands (children)
The following age distribution has been obtained:
Adults: 263 speakers are between 15 and 30, 189 speakers are between 31 and 45, 98 speakers are over 46.
Children: 22 speakers are between 8 and 10, and 28 speakers are between 11 and 15.
A pronunciation lexicon with a phonemic transcription in SAMPA is also included.
- hasVersion: C-000095: Mandarin Chinese Speecon database
- hasVersion: C-000120: Portuguese Speecon database
- hasVersion: C-000136: Spanish Speecon database
- hasVersion: C-000415: German Speecon database
- hasVersion: C-000936: Finnish Speecon database
- hasVersion: C-000941: French Speecon database
- hasVersion: C-000946: Hebrew Speecon database
- hasVersion: C-000952: Italian Speecon database
- hasVersion: C-000955: Korean Speecon database
- hasVersion: C-000974: Polish Speecon database
- hasVersion: C-000977: Russian Speecon database
- hasVersion: C-000995: Swedish Speecon database
- hasVersion: C-001000: Turkish Speecon database
- hasVersion: C-001002: UK English Speecon database
- hasVersion: C-001237: Taiwan Mandarin Speecon database
- hasVersion: C-001530: Swiss-German Speecon database
- hasVersion: C-001553: US English Speecon database
- hasVersion: C-001554: US Spanish Speecon database
- hasVersion: C-003376: Japanese Speecon database
- hasVersion: C-003377: Danish Speecon Database
- hasVersion: C-003378: Dutch from the Netherlands Speecon Database
- hasVersion: C-003379: Dutch from Belgium Speecon Database
- hasVersion: C-003380: French-Canadian Speecon database
- hasVersion: C-004483: Cantonese Speecon database
- hasVersion: C-004484: Thai Speecon database
- hasVersion: C-004494: Hungarian Speecon database
- hasVersion: C-004517: Egyptian Arabic Speecon database
- hasVersion: C-004539: Catalan Speecon database
-
C-004497: Alcohol Language Corpus (BAS ALC)
Desktop/Microphone
ALC contains recordings of German speakers who are either intoxicated or sober. The type of speech ranges from read single digits to full conversational style. Recordings were made during drinking tests in which speakers drank beer or wine to reach a self-chosen level of alcoholic intoxication. The actual level of intoxication was measured from breath alcohol and blood samples taken immediately before the speech recording. Recordings were performed in two stationary automobiles to ensure a constant acoustic environment across the different recording locations; both the intoxicated and the sober condition recordings were made in the same car and supervised by the same investigator (dialogue partner). In the intoxicated state, 30 items were sampled from each speaker (set A), while in the sober state 60 items were recorded (set NA; set A being a subset of set NA).
Preliminary version of 25/03/2009:
number of recorded speakers: 88 (final: 150)
number of recordings: 8586
number of phonetic segments: 709220
file formats:
o headset Beyerdynamics Opus 54: WAV, 44.1 kHz, 16 bit
o mouse microphone AKG 400: WAV, 44.1 kHz, 16 bit
o meta data: speaker and recording protocol (SpeechDat)
o lexicon: 7-bit ASCII
o Emu database: *.hlb, *.phonetic
segmentation: manual segmentation of initial and final silence interval; automatic phonemic segmentation by MAUS
distribution: DVD-R
The final version will be made available free of charge to purchasers of the preliminary version after completion (planned for the end of 2009).