ORTOLANG Deposit and sharing

Speech and Language Data Repository (SLDR/ORTOLANG)


Open archives (OAI-PMH)

Lost in migration? This document may help you: infoMigration-en_0.2.pdf

New deposits and registration of new users must now be done on the ORTOLANG platform.

Visible items: 342
Documents: 790861
Downloads: 21033
Members: 625 (53 countries)
Spoken languages: 175

Deposit and sharing of oral/multimodal linguistic data

As of 01/12/2015, deposit of data on the SLDR website will be suspended to allow the public opening of the ORTOLANG platform.

Now engaged with the CNRTL and Nanterre Orléans Centre in building ORTOLANG, SLDR continues its work of gathering and sharing language data. All services currently offered will remain part of the new platform.

Thus, SLDR/ORTOLANG allows you to browse data already collected in the area of speech/multimodal linguistics. Resources are grouped into four main types of items: primary data, secondary data, tools and collections. Downloading is possible on the basis of various provisions of archival law and intellectual property rights.

After authenticating on the site, you can also deposit your data and create its descriptive record via a dedicated interface. In particular, this lets you restrict access to the data according to its status, its nature, or the professional category of the user.

Descriptive records created are consistent with the core metadata standards: they are eligible for harvesting by international directories and search engines. Meanwhile, items and their contents are assigned persistent identifiers to facilitate their retrieval regardless of physical location.
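
As a rough illustration of how the persistent identifiers and harvestable records described above can be used, the sketch below turns an identifier of the form hdl:11041/... (as shown in the listings on this page) into a URL on the public Handle proxy, and builds a standard OAI-PMH ListRecords request. The function names are hypothetical, and the OAI-PMH base URL is a placeholder: the actual endpoint of this repository is not stated here.

```python
from urllib.parse import quote, urlencode

# Public proxy of the Handle System; resolves "prefix/suffix" handles.
HANDLE_PROXY = "https://hdl.handle.net/"


def handle_to_url(identifier: str) -> str:
    """Turn an identifier such as 'hdl:11041/ortolang-000939' into a proxy URL."""
    prefix = "hdl:"
    if not identifier.startswith(prefix):
        raise ValueError(f"not a handle identifier: {identifier!r}")
    handle = identifier[len(prefix):]
    # Keep the slash between handle prefix and suffix; escape anything unusual.
    return HANDLE_PROXY + quote(handle, safe="/")


def oai_list_records_url(base_url: str, metadata_prefix: str = "oai_dc") -> str:
    """Build an OAI-PMH ListRecords request (oai_dc is the mandatory Dublin Core format)."""
    query = urlencode({"verb": "ListRecords", "metadataPrefix": metadata_prefix})
    return f"{base_url}?{query}"


if __name__ == "__main__":
    print(handle_to_url("hdl:11041/ortolang-000939"))
    # Placeholder endpoint, used only to show the shape of the request.
    print(oai_list_records_url("https://repository.example.org/oai"))
```

The URLs are only constructed, not fetched, so the sketch runs offline. A harvester would send the second URL as an HTTP GET and page through the results using the resumption tokens defined by the protocol.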

These steps ultimately prepare items for long-term preservation by CINES, an institutional archive.

Do not hesitate to contact us for additional information, or read our guidelines for the sharing and archiving of linguistic resources.

Detailed query

The latest deposits (169)
Primary data (corpus) ortolang-000939
SITAF (tandems anglais/français) (Céline HORGUES)
Individual contribution

Video-recorded corpus of tandem interactions between French-speaking and English-speaking students at the University Sorbonne Nouvelle - Paris 3. Each of the 21 tandem pairs performs the collaborative speaking tasks (story-telling, debating, reading) on 2 occasions (2 recording sessions) 3 months apart. The corpus also comprises L1-L1 control interactions for the participants.
Metadata is [...]


(applied_linguistics, language_acquisition, phonetics)
French; English
hdl:11041/ortolang-000939
2015-09-30
Version 1
source data
(Misc publications)
Primary data (corpus) blri-000940
Projet Noms Verbes (MEG) (ALARIO François-Xavier, LPC, BADIER Jean-Michel, INS, STRIJKERS Kristof, LPC, CHANOINE Valérie, BLRI)
Laboratoire de Psychologie Cognitive - UMR7290 (LPC, Marseille FR)
Institut de Neurosciences des Systèmes (INS, Marseille FR)

Response preparation and encoding in word production. Testing recent models (2011-2012-2013) of word production based on psycholinguistics, motor control and neuroscience.


(cognitive_science, neurolinguistics, psycholinguistics)
French (français)

>> Collection BLRI blri-000835
hdl:11041/blri-000940
2015-09-30
Version 1
source data
• BLRI
Primary data (corpus) blri-000941
Projet Neuroling (IRMf) (FRENCK-MESTRE Cheryl, LPL, ANTON Jean-Luc, INT, BARKAT Mélissa, Praxiling, DJOURI Rym)
Laboratoire parole et langage - UMR 7309 (LPL, Aix-en-Provence FR)
Institut des Neurosciences de la Timone (INT, Marseille FR)
Praxiling - UMR 5267 (Montpellier FR)

Recognition of the Arabic language.


(cognitive_science, neurolinguistics)
French (français)

>> Collection BLRI blri-000835
hdl:11041/blri-000941
2015-09-30
Version 1
source data
• BLRI
Primary data (corpus) blri-000942
Projet Intermod (IRMf) (Pattamadilok Chotiga, LPL, ZIEGLER Johannes, LPC, BELIN Pascal, INT, CHANOINE Valérie, BLRI)
Laboratoire parole et langage - UMR 7309 (LPL, Aix-en-Provence FR)
Laboratoire de Psychologie Cognitive - UMR7290 (LPC, Marseille FR)
Institut des Neurosciences de la Timone (INT, Marseille FR)

The role of consciousness and top-down processes on the contribution of the auditory cortex during reading.


(neurolinguistics, cognitive_science)
French (français)

>> Collection BLRI blri-000835
hdl:11041/blri-000942
2015-09-30
Version 1
source data
• BLRI
Primary data (corpus) blri-000943
Projet anti-tabac (IRMf) (SORIANO Alice, LPC, OULLIER Olivier, LPC)
Laboratoire de Psychologie Cognitive - UMR7290 (LPC, Marseille FR)

Effect of combined anti-smoking health warnings on brain activity evoked by tobacco-related stimuli, or "smoking cue reactivity".


(cognitive_science)
French (français)

>> Collection BLRI blri-000835
hdl:11041/blri-000943
2015-09-30
Version 1
source data
• BLRI
Primary data (corpus) ortolang-000937
Methodology and software for Semi-Automatic multi-domain annotations: annotating, exploring and sharing data (Grégoire MOREAU DE MONTCHEUIL)
Individual contribution

A tutorial that presents how to collect a large set of annotations of various domains: phonetic (words, syllables, phonemes), prosody (Momel and INTSINT), syntax (categories, groups), self-repetitions and gestures. It shows that, with good practices, it is possible to merge such annotations in a single representation for sharing and exploring in an efficient way.


(applied_linguistics, computational_linguistics, text_and_corpus_linguistics)
English
hdl:11041/ortolang-000937
2015-09-14
Version 1
source data

This material is Open Data
Primary data (corpus) ortolang-000934
Corpus ese’eja (Marine VUILLERMET)
Dynamique du langage - UMR 5596 (DDL, Lyon FR)

I recorded the data between 2005 and 2013, with the financial support of the following institutions: Endangered Language Documentation Programme (2007), Afrique Amérique Latine Langues En Danger (ANR AALLED, 2008-2009) and Endangered Language Fund (2008). Ese’eja is an endangered language spoken by about 1,700 people in the Bolivian and Peruvian Amazon. The recordings mostly come from speakers from [...]


(language_documentation, sociolinguistics, anthropological_linguistics)
Ese ejja
hdl:11041/ortolang-000934
2015-07-09
Version 1
source data
• Endangered Language Fund
• Endangered Language Documentation Programme
• ANR AALLED Afrique Amérique Latine Langues En Danger
Secondary data (resource) prax000929
Corpus St-Maur (documents) (Praxiling - UMR 5267 (Montpellier FR))
Praxiling - UMR 5267 (Montpellier FR)

Documents and studies relating to the “Corpus St-Maur” corpus


(applied_linguistics, anthropological_linguistics)
French
hdl:11041/prax000929
2015-06-29
Version 1
source data
Collection prax000930
GypsyLang
Praxiling - UMR 5267 (Montpellier FR)

This corpus, « Paroles de locuteurs gitans et non gitans sur trois générations à Perpignan » (speech of Gypsy and non-Gypsy speakers across three generations in Perpignan), was collected between September 2013 and June 2014. It stems from research funded by the European Social Fund (ESF) in partnership with the Education Nationale and the Direction de l’Action Éducative et de l’Enfance of the City of Perpignan, in order to better understand the mechanisms [...]


(sociolinguistics, language_acquisition, anthropological_linguistics)
hdl:11041/prax000930
2015-06-29
Version 1
source data
Primary data (corpus) prax000931
GypsyLang (corpus) (Praxiling - UMR 5267 (Montpellier FR))
Praxiling - UMR 5267 (Montpellier FR)

Primary data of the “GypsyLang” corpus


(anthropological_linguistics, sociolinguistics, language_acquisition)
Northern Catalan
hdl:11041/prax000931
2015-06-29
Version 1
source data
Secondary data (resource) prax000932
GypsyLang (documents) (Praxiling - UMR 5267 (Montpellier FR))
Praxiling - UMR 5267 (Montpellier FR)

Documents and studies relating to the “GypsyLang” corpus


(sociolinguistics, language_acquisition, anthropological_linguistics)
hdl:11041/prax000932
2015-06-29
Version 1
source data
Secondary data (resource) prax000933
GypsyLang (transcriptions) (Praxiling - UMR 5267 (Montpellier FR))
Praxiling - UMR 5267 (Montpellier FR)

Transcriptions of the “GypsyLang” corpus


(sociolinguistics, language_acquisition, anthropological_linguistics)
Catalan -> Northern Catalan
hdl:11041/prax000933
2015-06-29
Version 1
source data
Collection prax000928
Corpus St-Maur
Praxiling - UMR 5267 (Montpellier FR)

The “Corpus Saint-Maur” was collected from May 2008 to June 2009 as part of the doctoral work « Interactions et pratiques d’un processus d’innovation pédagogique en environnement carcéral » (Alidières 2013).

It consists of audiovisual recordings covering, on the one hand, activities in the computer room of the Maison d’Arrêt (remand prison) and, on the other [...]


(applied_linguistics, anthropological_linguistics)
French
hdl:11041/prax000928
2015-06-25
Version 1
source data
Primary data (corpus) ortolang-000926
CREAGEST - Dialogue entre adultes sourds (Structures formelles du langage - UMR 7023 (SFL, Paris FR), Savoirs, textes et langage - UMR 8163 (STL, Lille FR), Groupe d’imagerie neurofonctionnelle - UMR 6194, CNRS/CEA/Université de Caen (Caen FR))
Structures formelles du langage - UMR 7023 (SFL, Paris FR)
Savoirs, textes et langage - UMR 8163 (STL, Lille FR)
Groupe d’imagerie neurofonctionnelle - UMR 6194, CNRS/CEA/Université de Caen (Caen FR)

This is a corpus of dialogues between deaf adults (106 hours): 51 interviews conducted by four deaf interviewers from four different regions of France (semi-structured interviews, 3 cameras).

The videos made available here are a subset of the full corpus: 7 excerpts filmed with three cameras (21 files in total).


(discourse_analysis, sociolinguistics, language_acquisition)
(Unclassified)

>> Collection CREAGEST ortolang-000912
hdl:11041/ortolang-000926
2015-06-11
Version 1
source data
• ANR CREAGEST (2008-2012)

This material is Open Data
Collection ortolang-000922
ACORFORMed

Corpus for training physicians in announcing serious adverse events.
Two sources for the corpus: the Institut Paoli-Calmettes in Marseille and the CHU d’Angers (Angers University Hospital).


(discourse_analysis)
hdl:11041/ortolang-000922
2015-05-21
Version 1
source data
• CHU - Angers
• Aquarelle Santé - Angers
• Institut Paoli-Calmettes - Marseille
Primary data (corpus) ortolang-000923
Corpus Aquarel Santé (Jorane SAUBESTY)
Individual contribution

Corpus from a training programme for physicians in announcing serious adverse events, set up at the CHU d’Angers (Angers University Hospital) by the organization Aquarel Santé.

Video corpus.


(discourse_analysis)
French (français)

>> Collection ACORFORMed ortolang-000922
hdl:11041/ortolang-000923
2015-05-21
Version 1
source data
• Aquarel santé - Angers
• CHU - Angers
Secondary data (resource) ortolang-000921
Happy Birthday Corpus - Data (Pauline LARROUY-MAESTRI)
Individual contribution

Acoustical analyses (pitch accuracy, tempo, vocal quality) and ratings (18 music experts and 18 non experts matched in gender and age) of the 166 performances of the Happy Birthday Corpus.

hdl:11041/ortolang-000921
2015-05-15
Version 1
source data
Secondary data (resource) ortolang-000920
Lexical functions of Spanish verb-noun collocations (Olga KOLESNIKOVA)
Individual contribution

The resource includes the 1000 most frequent verb-noun pairs extracted automatically from the Spanish Web Corpus (located at http://www.sketchengine.co.uk). Each pair is classified as a free word combination (FWC), a collocation, or a parsing error. In FWCs and collocations, verbs and nouns are annotated with the Spanish WordNet senses. Each collocation is annotated with its lexical function. [...]


Spanish
hdl:11041/ortolang-000920
2015-05-14
Version 1
source data
(Misc publications)

This material is Open Data
Primary data (corpus) ortolang-000919
Corpus IPC (Jorane SAUBESTY)
Individual contribution

Corpus for training physicians in announcing serious adverse events.

Source of the corpus: the Institut Paoli-Calmettes in Marseille.

Video/audio corpus with transcriptions and annotations.


(discourse_analysis)
French (français)

>> Collection BLRI blri-000835
>> Collection ACORFORMed ortolang-000922
hdl:11041/ortolang-000919
2015-05-12
Version 1
source data
• Institut Paoli-Calmettes - Marseille
• Brain and Language Research Institute - Aix-en-Provence
Secondary data (resource) ortolang-000918
CID-DISP (Roxane Bertrand, Laboratoire parole et langage - UMR 7309, Laurent Prévot, Laboratoire parole et langage - UMR 7309, Stéphane Rauzy, Laboratoire parole et langage - UMR 7309)
Laboratoire parole et langage - UMR 7309 (LPL, Aix-en-Provence FR)

Syntactic, Prosodic, Discourse and Disfluencies annotation for the CID corpus


(general_linguistics, phonetics, phonology, speech_prosody, computational_linguistics)
hdl:11041/ortolang-000918
2015-04-30
Version 1
source data

The 8 most frequent downloads under SLDR licence
Secondary data (resource) Transcriptions of CID - Roxane BERTRAND (downloaded 103 times)
Primary data (corpus) Videos of CID - Laboratoire parole et langage - UMR 7309 (LPL, Aix-en-Provence FR) (downloaded 96 times)
Primary data (corpus) Aix-MARSEC database - Cyril Auran, Savoirs, textes et langage, Caroline Bouzon, Savoirs, textes et langage, Céline De Looze, Savoirs, textes et langage, Daniel Hirst, Laboratoire parole et langage (downloaded 82 times)
Tool MarsaTag - Stéphane Rauzy, CNRS - LPL (downloaded 40 times)
Secondary data (resource) VfrLPL - Stéphane RAUZY (downloaded 32 times)
Primary data (corpus) ANGLISH - Daniel Hirst, Anne Tortel (downloaded 25 times)
Primary data (corpus) MAPTASK-AIX - Ellen Bard, Corine Astésano, Cheryl Frenck-Mestre, Mariapaola D'imperio, Alice Turk, Noël Nguyen (downloaded 25 times)
Secondary data (resource) Grammar of French language (GP) - Marie-Laure GUÉNOT (downloaded 21 times)

The 8 most frequent open-access downloads (since 4/4/2013)
Secondary data (resource) Spoken language reference materials - Dafydd Gibbon, Roger Moore, Richard Winski (downloaded 1483 times)
Tool Phonedit SIGNAIX - Robert Espesser, Alain Ghio, Tatsuya Watanabe (downloaded 1082 times)
Secondary data (resource) Intonation and Emphasis - Daniel HIRST (downloaded 978 times)
Secondary data (resource) INTSINT - Daniel HIRST (downloaded 919 times)
Tool MELISM - Geneviève CAELEN-HAUMONT (downloaded 872 times)
Tool IPA-Sampa - Daniel HIRST (downloaded 719 times)
Secondary data (resource) Carnet de J.D. CHAUPIN - J.D. Chaupin (downloaded 718 times)
Secondary data (resource) Articulatory Phonology Lexicon - Alain Marchal, Laboratoire parole et langage - UMR 7309 (downloaded 717 times)

This site has been declared to the Commission Nationale de l’Informatique et des Libertés (CNIL) under agreement Nr.1222972 on 26 March 2008. Under French law, any person cited by name has the right to access, modify, correct and delete data relating to him/her (art. 34 of the « Informatique et Libertés » act of 6 January 1978). To exercise this right, send a message to webmaster(at)sldr.org.

This site is optimized for Firefox or any browser with the 'tabs' option enabled.