ORTOLANG Deposit and sharing

Speech and Language Data Repository (SLDR/ORTOLANG)

Publications (2)

Seediq Ontolex

Secondary data (resource) ortolang-000907

ID Bibliographical reference Abstract
127 Hsieh, S. K., Su, I. L., Huang, C. R., Pei-Yi, H., Tzu-Yi, K., & Prévot, L. (2007). Basic lexicon and shared ontology for multilingual resources: A SUMO+MILO hybrid approach. In Proceedings of OntoLex Workshop in the 6th International Semantic Web Conference.
Secondary data (resource): Seediq Ontolex (ortolang-000907)
A common conceptual infrastructure is crucial for multilingual language processing and documentation. Global Wordnet (GWN) was proposed as the common infrastructure for linguistically motivated conceptual representations for all languages. Two critical issues in this line of research are: the scarcity of lexical semantic information (especially from endangered languages), and the lack of a shared conceptual core as the basis of multilingual conceptual representation.
In this paper, we elaborate and formalize the proposal to build a shared core common ontology based on the Swadesh list as a solution to these two critical issues. Comparing Swadesh lists from different languages allowed us to build a small shared ontology that reflects direct human experience and can serve as the cross-lingual conceptual core. These micro-ontologized lexicons can be used as seeds for developing a full-grown and more comprehensive documentation of a linguistically motivated ontology for each language. In terms of formalization, we propose that SUMO+MILO has the appropriate level of abstractness and coverage for mapping from basic lexicon to formal ontology.
126 Huang, C. R., Prévot, L., Su, I. L., & Hong, J. F. (2007). Towards a conceptual core for multicultural processing: A multilingual ontology based on the Swadesh list. In Intercultural Collaboration (pp. 17-30). Springer Berlin Heidelberg.
Secondary data (resource): Seediq Ontolex (ortolang-000907)
The work presented here is situated in the broader project of creating multilingual lexical resources with a focus on Asian languages. In the paper, we describe the design of the upper level we are creating for our multilingual lexical resources. Among the current efforts devoted to this issue, our work puts the focus on (i) language diversity, aiming at massively multilingual resources, and (ii) the ontological design of the upper level.