What's New!

  • Feb. 20. 2016 Detailed information is now available for all resources in SHACHI.
  • Jan. 31. 2016 The SHACHI Web site has been redesigned.
  • Apr. 24. 2008 Clicking the name of a language resource in SHACHI’s search results should display its detailed information, but some of this information cannot be displayed at the moment. In that case, please refer to the related URLs listed under the “identifier” field of the language resource. The detailed information is still in preparation and will be fully provided by July 2008.


To support the efficient development of language resources (LRs), the National Institute of Information and Communications Technology (NICT) and Nagoya University have been jointly constructing SHACHI, a large-scale metadata database that collects detailed meta information on LRs from Western and Asian countries. This research project aims to extensively collect metadata, such as the tag sets, formats, and recorded contents of LRs in Japan and abroad, and to store them systematically.
SHACHI contains more than 2,000 compiled language resources, such as corpora, dictionaries, thesauruses, and lexicons, forming a large-scale archive of language resource metadata. Its metadata, which follow an extended version of the OLAC metadata set conforming to Dublin Core and contain detailed meta information, have been collected semi-automatically. To continue this work, it is indispensable for us to cooperate with language resource consortia in Japan and abroad and to take the initiative in contributing to Asian language resources.