Registered language resources: 3330. Showing items 1721 - 1730 of 2023.
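  • C-004397: Online Game Voice Chat Corpus with Emotion Ratings
    This is an emotional speech corpus containing both spontaneous and acted speech. Spontaneous dialogue speech: players of online games used voice chat while playing, and speech in which emotion was expressed naturally was collected, for a total of 9,114 utterances across 6 dialogues. Of these, 6,578 utterances were labeled with one of 10 emotion categories based on Plutchik's three-dimensional structural model, and the 2,847 utterances on which two or more labelers agreed were additionally labeled for emotion intensity on a 5-point scale. Acted speech: 17 dialogue sequences were excerpted from the transcripts of the spontaneous dialogue speech and performed in dialogue form by professional actors. The emotion to be expressed (8 types) was specified for each utterance, and each was performed in a neutral state and at 3 levels of intensity, for a total of 2,656 utterances.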
  • C-004398: ANITA (Audio eNhancement In Telecom Applications)
    Desktop/Microphone
    ANITA (Audio eNhancement In secured Telecommunication Applications) is a European project launched on the initiative of EADS TELECOM with the objective of reducing audio acoustics noise in secured communications in adverse environments (sirens, alarms, engines, water pumps, stress situations, etc.). ANITA is a R&T project promoted within the 5th RTD framework program of the European Commission, Information Society Technology priority (IST). In Europe, the secured digital telecommunications are based on two digital standards: TETRA (Trans European Trunk Radio) and TETRAPOL. ANITA addresses these two standards.
    This acoustic database consists of 41 recordings (17 males and 24 females) stored on 13 CDs:
    1. Voice recordings in 4 languages (English, French, German and Spanish)
    2. Noise recordings (sirens, engines, roadworks, crowds, trains, etc.)
    3. Real condition recordings (voices and mixed noises), in English
    Voice recordings took place in 2 types of conditions: normal conditions, and conditions of stress and panic. Recording scenarios are public place (closed and open), moving vehicle, vehicle standing still and closed laboratory, recorded through an omnidirectional B&K 4190 microphone for voice, an Audio Technica ATM10A microphone for noise and a linear uniformly-spaced 8-element array.
    Each language consists of:
    In normal conditions:
    - 60 phonetically rich sentences
    - letters and numerals
    - 10-minute text (Spanish text: extract of Rinconete y Cortadillo by Miguel de Cervantes Saavedra, French text: extract of Eugénie Grandet by Balzac, German text: extract of Die Leiden des jungen Werther by Goethe, English text: extract of The Woman in White)
    In conditions of stress and panic:
    - same 60 phonetically rich sentences
    - same letters and numerals
    One part of recordings was produced by EADS TELECOM and the other part by Friedrich-Alexander Universität Erlangen, Nürnberg. ELDA provided the 60 phonetically rich sentences.

    For more information on the project: http://fp5-anita.org
  • C-004399: NetDC Arabic BNSC (Broadcast News Speech Corpus)
    Broadcast Resources
    The NetDC Arabic BNSC (Broadcast News Speech Corpus) is a corpus developed by ELDA in the framework of the European-funded project Network of Data Centres (NetDC). The project was carried out in collaboration with the LDC (Linguistic Data Consortium), which has produced a similar corpus from the news broadcast by Voice of America Arabic in the United States. The database contains ca. 22.5 hours of broadcast news speech recorded from Radio Orient (France) during a 3-month period between November 2001 and January 2002 (37 news broadcasts, including 32 from the 5.55 pm news and 5 from the 10.55 pm news). The language is Standard Arabic from the Middle East region. The database is stored on 1 DVD-ROM. The database was validated by SPEX, the Netherlands, to assess its compliance with NetDC specifications.

    Recordings were made through a Sangean ATS 909 radio receiver connected to a desktop PC. Encoding is 16 kHz, 16 bits, single channel. Format is raw PCM (.wav) with header information.

    The corpus was segmented, labelled and transcribed manually using the “Transcriber” software, developed by DGA (Délégation Générale pour l'Armement, France) and LDC (Linguistic Data Consortium, USA) (with an additional patch for Arabic). The transcriptions were done in Arabic characters and the software automatically generated the transliterations. Transcriptions include speaker turns, topics, channel information.

    Each speech file (extension .wav) has an accompanying ASCII SAM label file with recording information (extension .sam), and an accompanying file with the transcription in xml format (extension .trs) and channel information. A phonetic lexicon in Arabic SAMPA has also been included.
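    As a rough illustration of how the .trs transcription files might be read, the sketch below extracts speaker turns and time-stamped segments with Python's standard XML parser. The element and attribute names (Turn, Sync, speaker, time) follow the usual Transcriber layout and are an assumption here, as is the tiny sample document; the actual files in this corpus may carry more structure and a different encoding.

```python
import xml.etree.ElementTree as ET

# Hypothetical, minimal .trs fragment used only for illustration.
SAMPLE_TRS = """<Trans>
  <Episode>
    <Section type="report" startTime="0.0" endTime="5.0">
      <Turn speaker="spk1" startTime="0.0" endTime="5.0">
        <Sync time="0.0"/>
        first segment
        <Sync time="2.5"/>
        second segment
      </Turn>
    </Section>
  </Episode>
</Trans>"""


def iter_segments(trs_text):
    """Yield (speaker, start_time, text) for each Sync-delimited segment."""
    root = ET.fromstring(trs_text)
    for turn in root.iter("Turn"):
        speaker = turn.get("speaker", "")
        for sync in turn.iter("Sync"):
            # The transcribed text follows the empty Sync element as its tail.
            text = (sync.tail or "").strip()
            if text:
                yield speaker, float(sync.get("time")), text


segments = list(iter_segments(SAMPLE_TRS))
for speaker, start, text in segments:
    print(speaker, start, text)
```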
  • C-004401: The "SIVA" Speech Database for Speaker Verification and Identification
    Telephone
    The Italian speech database SIVA ("Speaker Identification and Verification Archives: SIVA") is a database comprising more than two thousand calls, collected over the public switched telephone network, and available very soon via ELRA.
    The SIVA database consists of four speaker categories: male users, female users, male impostors, female impostors. Speakers were contacted via mail before the test, and they were asked to read the information and the instructions provided carefully before making the call. About 500 speakers were recruited using a company specialized in selection of population samples. The others were volunteers contacted by the institute concerned.
    Speakers access the recording system by calling a toll free number. An automatic answering system guides them through the three sessions that make up a recording. In the first session, a list of 28 words (including digits and some commands) is recorded using a standard enumerated prompt. The second session is a simple unidirectional dialogue (the caller answers prompted questions) where personal information is asked (name, age, etc.). In the third session, the speaker is asked to read a continuous passage of phonetically balanced text that resembles a short curriculum vitae.
    The signal is a standard 8kHz sampled signal, coded using 8 bits mu-law format. The data collected so far consists of:
    · MU: male users: 18 speakers, 20 repetitions
    · FU: female users: 16 speakers, 26 repetitions
    · MI: male impostors: 189 speakers, 2 repetitions, and 128 speakers, 1 repetition
    · FI: female impostors: 213 speakers, 2 repetitions, and 107 speakers, 1 repetition.
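    The 8-bit mu-law coding mentioned above can be expanded to 16-bit linear samples with the standard G.711 decoding rule. The following is a minimal sketch, assuming the signal is available as a plain byte string with one mu-law byte per sample; the function names are illustrative and not part of the database.

```python
def ulaw_to_linear(byte):
    """Decode one G.711 mu-law byte into a signed 16-bit linear sample."""
    u = ~byte & 0xFF                 # mu-law bytes are stored bit-inverted
    sign = u & 0x80
    exponent = (u >> 4) & 0x07
    mantissa = u & 0x0F
    magnitude = (((mantissa << 3) + 0x84) << exponent) - 0x84
    return -magnitude if sign else magnitude


def decode_ulaw(data):
    """Decode a bytes object of mu-law samples into a list of integers."""
    return [ulaw_to_linear(b) for b in data]


# At 8 kHz, 8000 bytes of mu-law data correspond to one second of speech.
print(decode_ulaw(bytes([0xFF, 0x80, 0x00])))
```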
  • C-004422: SPINA Corpus ("Robots Commands")
    Desktop/Microphone
    This German corpus contains read speech of 22 different speakers (6 male, 16 female). The corpus consists of 10 robot command sentences and 62 robot command words. Each speaker reads the whole corpus 5 times, except one speaker who reads the sentence corpus 16 times and the word corpus 51 times. The speakers were recorded at two different sites in Germany (University of Goettingen, University of Bochum). The corpus contains a total of 10,810 recorded utterances.
    All speakers are between 25 and 30 years of age. Two speakers are non-native speakers. One file gives information about the speakers (speaker ID, recording site, sex). The task for the speaker was to read carefully but fluently. If an error occurred, the recording was interrupted by the supervisor and the sentence was repeated. The signal files are raw files without any header, 16 bit per sample, linear, most significant byte first, 16 kHz sample frequency.
    The orthography of the corpus is given in two distinct files which contain the prompted words and the prompted sentences as an ordered list. The recording conditions are as follows:
    Microphone: AKG Acoustics C414B-TL, omnidirectional condenser microphone, built-in attenuator and high pass filter switched off, distance to mouth 50 cm.
    Environment: Studio Quality, echo cancelled room, about 121 qqm
    Preamplifier: John Hardy, M-1
    Sampling rate: 48 kHz to DAT recorder, filtered to 16 kHz
    Resolution: 16 Bit, most significant byte first
    The speech data were digitally filtered to 8 kHz cut-off frequency and downsampled to 16 kHz.
    The corpus consists of 1 volume, total size 266,361 KB uncompressed data. The signal of each utterance is stored in a separate file. Symbolic information such as segmentation or labelling (e.g. phonological segmentation of words or word segmentation of sentences) is stored in files with the same prefix but with different extensions.
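    Because the signal files are described as headerless, 16-bit linear, most significant byte first, at 16 kHz, they cannot be opened as ordinary WAV files; the byte order and sample rate have to be supplied by the reader. A minimal sketch of such a reader follows. The function names are illustrative, and the file path passed to load_raw_file is whatever signal file the user chooses.

```python
import struct

SAMPLE_RATE_HZ = 16000  # sample frequency stated in the corpus description


def read_raw_pcm16_be(data):
    """Interpret bytes as big-endian signed 16-bit samples (no header)."""
    count = len(data) // 2          # ignore a trailing odd byte, if any
    return list(struct.unpack(">%dh" % count, data[:count * 2]))


def load_raw_file(path):
    """Read a whole headerless signal file into a list of samples."""
    with open(path, "rb") as handle:
        return read_raw_pcm16_be(handle.read())


def duration_seconds(samples):
    """Length of the utterance in seconds at the stated sample rate."""
    return len(samples) / SAMPLE_RATE_HZ


print(read_raw_pcm16_be(b"\x00\x01\xff\xff\x80\x00"))
```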
  • C-004423: Danish SpeechDat(II) FDB-1000
    Telephone
    The Danish SpeechDat(II) FDB-1000 contains the recordings of 1,000 Danish speakers (1940 males, 2060 females) recorded over the Danish fixed telephone network.

    This speech database was validated by SPEX (the Netherlands) to assess its compliance with the SpeechDat format and content specifications.

    Speech samples are stored as sequences of 8-bit 8 kHz A-law. Each prompted utterance is stored in a separate file. Each signal file is accompanied by an ASCII SAM label file which contains the relevant descriptive information.

    Each speaker uttered the following items:

    * 3 application words
    * 1 sequence of 10 isolated digits
    * 4 connected digits (1 sheet number (5/10 digits), 1 telephone number (9/11 digits), 1 credit card number (16 digits), 1 PIN code (6 digits))
    * 3 dates (1 spontaneous e.g. birthday, 1 word style prompted date, 1 relative and general date expression)
    * 1 word spotting phrase using an embedded application word
    * 1 isolated digit
    * 3 spelled words (1 spontaneous e.g. own forename, 1 spelling of directory city name, 1 real word for coverage)
    * 1 currency money amount
    * 1 natural number
    * 5 directory assistance names (1 spontaneous e.g. own forename, 1 city of school at 7 years, 1 most frequent city out of a set of 500, 1 most frequent company/agency out of a set of 500 names, 1 "forename surname" out of a set of 500 names)
    * 2 yes/no questions (1 predominantly "yes" question, 1 predominantly "no" question)
    * 9 phonetically rich sentences
    * 2 time phrases (1 spontaneous time of day, 1 word style time phrase)
    * 4 phonetically rich words

    A pronunciation lexicon with a phonemic transcription in SAMPA is also included.
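    The 8-bit A-law samples mentioned above differ from mu-law in their bit masking and segment layout. A minimal decoding sketch following the standard G.711 A-law expansion is given below, assuming the signal file is a plain sequence of A-law bytes; the function names are illustrative only.

```python
def alaw_to_linear(byte):
    """Decode one G.711 A-law byte into a signed 16-bit linear sample."""
    a = byte ^ 0x55                  # undo the even-bit inversion
    exponent = (a >> 4) & 0x07
    mantissa = a & 0x0F
    if exponent == 0:
        magnitude = (mantissa << 4) + 8
    else:
        magnitude = ((mantissa << 4) + 0x108) << (exponent - 1)
    # In A-law a set sign bit marks a positive sample.
    return magnitude if a & 0x80 else -magnitude


def decode_alaw(data):
    """Decode a bytes object of A-law samples into a list of integers."""
    return [alaw_to_linear(b) for b in data]


print(decode_alaw(bytes([0xD5, 0x55, 0xAA, 0x2A])))
```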
  • C-004430: FASiL English unimodal “fasil-uk” corpus
    Desktop/Microphone
    The corpus was collected in the context of the FASiL project, EU FP5 IST-2001-38685 (http://www.fasil.co.uk), as a wizard-of-oz experiment. Therefore, there are sound recordings of subject and wizard. A total of 70 subjects were recorded.
    The corpus is formatted as .wav files (u-law) for audio, plain ASCII text (.txt) for transcriptions, and a masterfile which binds .txt and .wav together. The masterfile is a “lattice” of the interaction in time, and contains the exact order of the interaction plus timings. The masterfile is loosely related to the HTK-SLF lattice format.
    The WOZ experiment concerns voice interaction with a Virtual Personal Assistant (VPA) for an email, calendar and contacts task. Hesitations are marked as “UH”, noise as “NOISE” and other irrelevant material as “IRRELEVANT”. All annotations are in lower case, except for these three markers.
    The experiment is documented in detail in FASiL deliverable D.2.2.
    The interactions contain mostly sentences but also spelled names, email addresses, telephone numbers, yes/no questions.
    See also S0174-02, S0174-03, S0174-04, and S0174-05.
  • C-004431: FASiL Portuguese unimodal “fasil-pt” corpus
    Desktop/Microphone
    The corpus was collected in the context of the FASiL project, EU FP5 IST-2001-38685 (http://www.fasil.co.uk), as a wizard-of-oz experiment. Therefore, there are sound recordings of subject and wizard. A total of 70 subjects were recorded.
    The corpus is formatted as .wav files (u-law) for audio, plain ASCII text (.txt) for transcriptions, and a masterfile which binds .txt and .wav together. The masterfile is a “lattice” of the interaction in time, and contains the exact order of the interaction plus timings. The masterfile is loosely related to the HTK-SLF lattice format.
    The WOZ experiment concerns voice interaction with a Virtual Personal Assistant (VPA) for an email, calendar and contacts task. Hesitations are marked as “UH”, noise as “NOISE” and other irrelevant material as “IRRELEVANT”. All annotations are in lower case, except for these three markers.
    The experiment is documented in detail in FASiL deliverable D.2.2.
    The interactions contain mostly sentences but also spelled names, email addresses, telephone numbers, yes/no questions.
    See also S0174-01, S0174-03, S0174-04, and S0174-05.
  • C-004432: FASiL Swedish unimodal “fasil-sv” corpus
    Desktop/Microphone
    The corpus was collected in the context of the FASiL project, EU FP5 IST-2001-38685 (http://www.fasil.co.uk), as a wizard-of-oz experiment. Therefore, there are sound recordings of subject and wizard. A total of 70 subjects were recorded.
    The corpus is formatted as .wav files (u-law) for audio, plain ASCII text (.txt) for transcriptions, and a masterfile which binds .txt and .wav together. The masterfile is a “lattice” of the interaction in time, and contains the exact order of the interaction plus timings. The masterfile is loosely related to the HTK-SLF lattice format.
    The original recordings were 16-bit PCM, which were converted to 8-bit u-law.
    The WOZ experiment concerns voice interaction with a Virtual Personal Assistant (VPA) for an email, calendar and contacts task. Hesitations are marked as “UH”, noise as “NOISE” and other irrelevant material as “IRRELEVANT”. All annotations are in lower case, except for these three markers.
    The experiment is documented in detail in FASiL deliverable D.2.2.
    The interactions contain mostly sentences but also spelled names, email addresses, telephone numbers, yes/no questions.
    See also S0174-01, S0174-02, S0174-04, and S0174-05.
  • C-004433: FASiL combined unimodal “fasil-all” corpus
    Desktop/Microphone
    The corpus was collected in the context of the FASiL project, EU FP5 IST-2001-38685 (http://www.fasil.co.uk), as a wizard-of-oz experiment. Therefore, there are sound recordings of subject and wizard. A total of 210 subjects were recorded in the three project languages Swedish, Portuguese and English, all data for the same application.
    The corpus is formatted as .wav files (u-law) for audio, plain ASCII text (.txt) for transcriptions, and a masterfile which binds .txt and .wav together. The masterfile is a “lattice” of the interaction in time, and contains the exact order of the interaction plus timings. The masterfile is loosely related to the HTK-SLF lattice format.
    The WOZ experiment concerns voice interaction with a Virtual Personal Assistant (VPA) for an email, calendar and contacts task. Hesitations are marked as “UH”, noise as “NOISE” and other irrelevant material as “IRRELEVANT”. All annotations are in lower case, except for these three markers.
    The experiment is documented in detail in FASiL deliverable D.2.2.
    The interactions contain mostly sentences but also spelled names, email addresses, telephone numbers, yes/no questions.
    See also S0174-01, S0174-02, S0174-03, and S0174-05.