ORTOLANG Deposit and sharing

Speech and Language Data Repository (SLDR/ORTOLANG)

Investissements d'avenir  Huma-Num  CLARIN

Open archives (OAI-PMH)

Lost in migration? This document may help you: infoMigration-en_0.2.pdf

New deposits and registration of new users must now be done on the ORTOLANG platform.

Visible items: 341
Documents: 790856
Downloads: 32859
Members: 701 (57 countries)
Spoken languages: 175
As of 01/12/2015, data deposit on the SLDR website will be suspended to allow the public opening of the ORTOLANG platform.



The latest deposits (341)
Primary data (corpus) ortolang-000939
SITAF (tandems anglais/français) (Céline HORGUES)
2015-09-30 - version 1 - source data
Individual contribution

Video-recorded corpus of tandem interactions between French-speaking and English-speaking students at the Université Sorbonne Nouvelle - Paris 3. Each of the 21 tandem pairs performs collaborative speaking tasks (storytelling, debating, reading) on two occasions (2 recording sessions), 3 months apart. The corpus also comprises L1-L1 control interactions for the participants.
Metadata is [...]


(applied_linguistics, language_acquisition, phonetics)
French; English
hdl:11041/ortolang-000939
description http://sldr.org/publi/146/en
Primary data (corpus) blri-000940
Projet Noms Verbes (MEG) (ALARIO François-Xavier, LPC, BADIER Jean-Michel, INS, STRIJKERS Kristof, LPC, CHANOINE Valérie, BLRI)
2015-09-30 - version 1 - source data
Laboratoire de Psychologie Cognitive - UMR7290 (LPC, Marseille FR)
Institut de Neurosciences des Systèmes (INS, Marseille FR)

Response preparation and encoding in word production. Testing recent models (2011-2012-2013) of word production based on psycholinguistics, motor control and neuroscience.


(cognitive_science, neurolinguistics, psycholinguistics)
French (français)

>> Collection BLRI blri-000835
(see members)

• BLRI

hdl:11041/blri-000940
isPartOf collection blri-000835 BLRI
Primary data (corpus) blri-000941
Projet Neuroling (IRMf) (FRENCK-MESTRE Cheryl, LPL, ANTON Jean-Luc, INT, BARKAT Mélissa, Praxiling, DJOURI Rym)
2015-09-30 - version 1 - source data
Laboratoire parole et langage - UMR 7309 (LPL, Aix-en-Provence FR)
Institut des Neurosciences de la Timone (INT, Marseille FR)
Praxiling - UMR 5267 (Montpellier FR)

Recognition of the Arabic language.


(cognitive_science, neurolinguistics)
French (français)

>> Collection BLRI blri-000835
(see members)

• BLRI

hdl:11041/blri-000941
isPartOf collection blri-000835 BLRI
Primary data (corpus) blri-000942
Projet Intermod (IRMf) (Pattamadilok Chotiga, LPL, ZIEGLER Johannes, LPC, BELIN Pascal, INT, CHANOINE Valérie, BLRI)
2015-09-30 - version 1 - source data
Laboratoire parole et langage - UMR 7309 (LPL, Aix-en-Provence FR)
Laboratoire de Psychologie Cognitive - UMR7290 (LPC, Marseille FR)
Institut des Neurosciences de la Timone (INT, Marseille FR)

The role of consciousness and top-down processes on the contribution of the auditory cortex during reading.


(neurolinguistics, cognitive_science)
French (français)

>> Collection BLRI blri-000835
(see members)

• BLRI

hdl:11041/blri-000942
isPartOf collection blri-000835 BLRI
Primary data (corpus) blri-000943
Projet anti-tabac (IRMf) (SORIANO Alice, LPC, OULLIER Olivier, LPC)
2015-09-30 - version 1 - source data
Laboratoire de Psychologie Cognitive - UMR7290 (LPC, Marseille FR)

Effect of combined anti-smoking health warnings on the brain activity evoked by tobacco-related stimuli, or "smoking cue reactivity".


(cognitive_science)
French (français)

>> Collection BLRI blri-000835
(see members)

• BLRI

hdl:11041/blri-000943
isPartOf collection blri-000835 BLRI
Primary data (corpus) ortolang-000937
Methodology and software for Semi-Automatic multi-domain annotations: annotating, exploring and sharing data (Grégoire MOREAU DE MONTCHEUIL)
2015-09-14 - version 1 - source data
Individual contribution

A tutorial that presents how to collect a large set of annotations of various domains: phonetic (words, syllables, phonemes), prosody (Momel and INTSINT), syntax (categories, groups), self-repetitions and gestures. It shows that, with good practices, it is possible to merge such annotations in a single representation for sharing and exploring in an efficient way.


(applied_linguistics, computational_linguistics, text_and_corpus_linguistics)
English

(see members)
hdl:11041/ortolang-000937
This material is Open Data
Primary data (corpus) ortolang-000934
Corpus ese’eja (Marine VUILLERMET)
2015-07-09 - version 1 - source data
Dynamique du langage - UMR 5596 (DDL, Lyon FR)

I recorded the data between 2005 and 2013, with the financial support of the following institutions: Endangered Language Documentation Programme (2007), Afrique Amérique Latine Langues En Danger (ANR AALLED, 2008-2009) and Endangered Language Fund (2008). Ese’eja is an endangered language spoken by about 1,700 people in the Bolivian and Peruvian Amazon. The recordings mostly come from speakers from [...]


(language_documentation, sociolinguistics, anthropological_linguistics)
Ese ejja

(see members)

• Endangered Language Fund
• Endangered Language Documentation Programme
• ANR AALLED Afrique Amérique Latine Langues En Danger

hdl:11041/ortolang-000934
Secondary data (resource) prax000929
Corpus St-Maur (documents) (Praxiling - UMR 5267 (Montpellier FR))
2015-06-29 - version 1 - source data
Praxiling - UMR 5267 (Montpellier FR)

Documents and studies relating to the “Corpus St-Maur” corpus.


(applied_linguistics, anthropological_linguistics)
French
hdl:11041/prax000929
requires primary data (corpus) prax000925 Corpus St-Maur (corpus)
(isPartOf collection prax000928 Corpus St-Maur)
Collection prax000930
GypsyLang
2015-06-29 - version 1 - source data
Praxiling - UMR 5267 (Montpellier FR)

This corpus, « Paroles de locuteurs gitans et non gitans sur trois générations à Perpignan » (speech of Gitano and non-Gitano speakers across three generations in Perpignan), was collected between September 2013 and June 2014. It stems from research funded by a European Social Fund (ESF) grant in partnership with the Education Nationale and the Direction de l’Action Éducative et de l’Enfance of the City of Perpignan, in order to better understand the mechanisms [...]


(sociolinguistics, language_acquisition, anthropological_linguistics)
hdl:11041/prax000930
hasPart secondary data (resource) prax000932 GypsyLang (documents)
hasPart primary data (corpus) prax000931 GypsyLang (corpus)
hasPart secondary data (resource) prax000933 GypsyLang (transcriptions)
Primary data (corpus) prax000931
GypsyLang (corpus) (Praxiling - UMR 5267 (Montpellier FR))
2015-06-29 - version 1 - source data
Praxiling - UMR 5267 (Montpellier FR)

Primary data of the “GypsyLang” corpus.


(anthropological_linguistics, sociolinguistics, language_acquisition)
Northern Catalan

(see members)
hdl:11041/prax000931
isRequiredBy secondary data (resource) prax000933 GypsyLang (transcriptions)
(isPartOf collection prax000930 GypsyLang)
isRequiredBy secondary data (resource) prax000932 GypsyLang (documents)
Secondary data (resource) prax000932
GypsyLang (documents) (Praxiling - UMR 5267 (Montpellier FR))
2015-06-29 - version 1 - source data
Praxiling - UMR 5267 (Montpellier FR)

Documents and studies relating to the “GypsyLang” corpus.


(sociolinguistics, language_acquisition, anthropological_linguistics)

(see members)
hdl:11041/prax000932
(isPartOf collection prax000930 GypsyLang)
(requires primary data (corpus) prax000931 GypsyLang (corpus))
Secondary data (resource) prax000933
GypsyLang (transcriptions) (Praxiling - UMR 5267 (Montpellier FR))
2015-06-29 - version 1 - source data
Praxiling - UMR 5267 (Montpellier FR)

Transcriptions of the “GypsyLang” corpus.


(sociolinguistics, language_acquisition, anthropological_linguistics)
Catalan -> Northern Catalan

(see members)
hdl:11041/prax000933
(requires primary data (corpus) prax000931 GypsyLang (corpus))
(isPartOf collection prax000930 GypsyLang)
Collection prax000928
Corpus St-Maur
2015-06-25 - version 1 - source data
Praxiling - UMR 5267 (Montpellier FR)

The “Corpus Saint-Maur” was collected from May 2008 to June 2009 as part of the doctoral work « Interactions et pratiques d’un processus d’innovation pédagogique en environnement carcéral » (Alidières 2013).

It consists of audiovisual recordings covering, on the one hand, the activities in the computer room of the remand prison (Maison d’Arrêt) and, on the other [...]


(applied_linguistics, anthropological_linguistics)
French
hdl:11041/prax000928
hasPart primary data (corpus) prax000925 Corpus St-Maur (corpus)
hasPart secondary data (resource) prax000929 Corpus St-Maur (documents)
Primary data (corpus) ortolang-000926
CREAGEST - Dialogue entre adultes sourds (Structures formelles du langage - UMR 7023 (SFL, Paris FR), Savoirs, textes et langage - UMR 8163 (STL, Lille FR), Groupe d’imagerie neurofonctionnelle - UMR 6194, CNRS/CEA/Université de Caen (Caen FR))
2015-06-11 - version 1 - source data
Structures formelles du langage - UMR 7023 (SFL, Paris FR)
Savoirs, textes et langage - UMR 8163 (STL, Lille FR)
Groupe d’imagerie neurofonctionnelle - UMR 6194, CNRS/CEA/Université de Caen (Caen FR)

This is a corpus of dialogues between deaf adults (106 hours): 51 interviews conducted by four deaf interviewers from four different regions of France (semi-structured interviews, 3 cameras).

The videos made available here are a subset of the full corpus: 7 excerpts filmed with three cameras (21 files in total).


(discourse_analysis, sociolinguistics, language_acquisition)
(Unclassified)

>> Collection CREAGEST ortolang-000912
(see members)

• ANR CREAGEST (2008-2012)

hdl:11041/ortolang-000926
This material is Open Data
isPartOf collection ortolang-000912 CREAGEST
Primary data (corpus) prax000925
Corpus St-Maur (corpus) (Praxiling - UMR 5267 (Montpellier FR))
2015-06-09 - version 1 - source data
Praxiling - UMR 5267 (Montpellier FR)

The “Corpus Saint-Maur” was collected from May 2008 to June 2009 as part of the doctoral work « Interactions et pratiques d’un processus d’innovation pédagogique en environnement carcéral » (Alidières 2013).

It consists of audiovisual recordings covering, on the one hand, the activities in the computer room of the remand prison (Maison d’Arrêt) and, on the other [...]


(anthropological_linguistics, applied_linguistics)
French (français)

(see members)
hdl:11041/prax000925
isRequiredBy secondary data (resource) prax000929 Corpus St-Maur (documents)
(isPartOf collection prax000928 Corpus St-Maur)
Collection ortolang-000922
ACORFORMed
2015-05-21 - version 1 - source data

Corpus for training doctors in announcing serious adverse events.
The corpus has two sources: the Institut Paoli-Calmettes in Marseille and the CHU d’Angers (Angers University Hospital).


(discourse_analysis)

• CHU - Angers
• Aquarel Santé - Angers
• Institut Paoli-Calmettes - Marseille

hdl:11041/ortolang-000922
(hasPart primary data (corpus) ortolang-000919 Corpus IPC)
(hasPart primary data (corpus) ortolang-000923 Corpus Aquarel Santé)
Primary data (corpus) ortolang-000923
Corpus Aquarel Santé (Jorane SAUBESTY)
2015-05-21 - version 1 - source data
Individual contribution

Corpus for training doctors in announcing serious adverse events, set up at the CHU d’Angers (Angers University Hospital) by the organization Aquarel Santé.

Video corpus.


(discourse_analysis)
French (français)

>> Collection ACORFORMed ortolang-000922
(see members)

• Aquarel santé - Angers
• CHU - Angers

hdl:11041/ortolang-000923
isPartOf collection ortolang-000922 ACORFORMed
Secondary data (resource) ortolang-000921
Happy Birthday Corpus - Data (Pauline LARROUY-MAESTRI)
2015-05-15 - version 1 - source data
Individual contribution

Acoustical analyses (pitch accuracy, tempo, vocal quality) and ratings (18 music experts and 18 non experts matched in gender and age) of the 166 performances of the Happy Birthday Corpus.


(see members)
hdl:11041/ortolang-000921
isRequiredBy primary data (corpus) sldr000774 Happy Birthday corpus
Secondary data (resource) ortolang-000920
Lexical functions of Spanish verb-noun collocations (Olga KOLESNIKOVA)
2015-05-14 - version 1 - source data
Individual contribution

The resource includes the 1000 most frequent verb-noun pairs extracted automatically from the Spanish Web Corpus (located at http://www.sketchengine.co.uk). Each pair is classified as a free word combination (FWC), a collocation, or a parsing error. In FWCs and collocations, verbs and nouns are annotated with their Spanish WordNet senses. Each collocation is annotated with its lexical function. [...]


Spanish

(see members)
hdl:11041/ortolang-000920
This material is Open Data
isReferencedBy http://sldr.org/publi/141/en
Primary data (corpus) ortolang-000919
Corpus IPC (Jorane SAUBESTY)
2015-05-12 - version 1 - source data
Individual contribution

Corpus for training doctors in announcing serious adverse events.

Source of the corpus: the Institut Paoli-Calmettes in Marseille.

Video/audio corpus with transcriptions and annotations.


(discourse_analysis)
French (français)

>> Collection BLRI blri-000835
>> Collection ACORFORMed ortolang-000922
(see members)

• Institut Paoli-Calmettes - Marseille
• Brain and Language Research Institute - Aix-en-Provence

hdl:11041/ortolang-000919
isPartOf collection blri-000835 BLRI
isPartOf collection ortolang-000922 ACORFORMed
Secondary data (resource) ortolang-000918
CID-DISP (Roxane Bertrand, Laboratoire parole et langage - UMR 7309, Laurent Prévot, Laboratoire parole et langage - UMR 7309, Stéphane Rauzy, Laboratoire parole et langage - UMR 7309)
2015-04-30 - version 1 - source data
Laboratoire parole et langage - UMR 7309 (LPL, Aix-en-Provence FR)

Syntactic, prosodic, discourse and disfluency annotations for the CID corpus.


(general_linguistics, phonetics, phonology, speech_prosody, computational_linguistics)

(see members)
hdl:11041/ortolang-000918
isRequiredBy secondary data (resource) sldr000720 Transcriptions of CID
Tool ortolang-000917
MarsaGram (Grégoire MOREAU DE MONTCHEUIL)
2015-04-09 - version 1 - source data
Laboratoire parole et langage - UMR 7309 (LPL, Aix-en-Provence FR)

Tool to explore treebanks.
Extracts a context-free grammar (CFG) and properties over the CFG rules.
Generates HTML pages to explore the rules and their properties.
Versatile: works with constituency or dependency treebanks.


(applied_linguistics, cognitive_science, language_documentation, computational_linguistics, text_and_corpus_linguistics)

>> Collection LPL tools lpl-000763
(see members)

• ORTOLANG

hdl:11041/ortolang-000917
This material is Open Data
isPartOf collection lpl-000763 LPL tools
description http://sldr.org/publi/140/en
Primary data (corpus) ortolang-000916
CREAGEST - Acquisition (Structures formelles du langage - UMR 7023 (SFL, Paris FR), Savoirs, textes et langage - UMR 8163 (STL, Lille FR), Groupe d’imagerie neurofonctionnelle - UMR 6194, CNRS/CEA/Université de Caen (Caen FR))
2015-04-08 - version 1 - source data
Structures formelles du langage - UMR 7023 (SFL, Paris FR)
Savoirs, textes et langage - UMR 8163 (STL, Lille FR)
Groupe d’imagerie neurofonctionnelle - UMR 6194, CNRS/CEA/Université de Caen (Caen FR)

This is a corpus of children's LSF collected from 65 deaf children (LSF-French bilinguals) and 17 deaf adults (50 hours), led by four deaf interviewers from four different regions of France (4 stimuli, 2 cameras).

The videos made available here are a subset of the full corpus: 10 excerpts filmed with two cameras (that is, 20 [...]


(discourse_analysis, sociolinguistics, language_acquisition)
(Unclassified)

>> Collection CREAGEST ortolang-000912
(see members)

• ANR CREAGEST (2008-2012)

hdl:11041/ortolang-000916
This material is Open Data
isPartOf collection ortolang-000912 CREAGEST
Primary data (corpus) ortolang-000915
VILLA : Varieties of Initial Learners in Language Acquisition (Sarra EL AYARI)
2015-03-30 - version 1 - source data
Structures formelles du langage - UMR 7023 (SFL, Paris FR)

VILLA (ANR, DFG, NWO) investigates the very first phases of foreign language acquisition under controlled input conditions. Our learners, total beginners with five different native languages (French, German, Dutch, English, Italian) were exposed to 14 hours of target language Polish. Regular tasks in Polish documented the acquisitional path of all learners.


(language_acquisition, applied_linguistics, psycholinguistics)
Polish (język polski)

• ANR ORA (2011-2013)

hdl:11041/ortolang-000915
isReferencedBy https://benjamins.com/#catalog/journals/euros...
Primary data (corpus) ortolang-000913
CrowdED_english (Andrew CAINES)
2015-03-26 - version 1 - source data
Individual contribution

Crowdsourced corpus of English native speakers answering business-topic questions of the type found in language learning oral exams. Contains soundfiles and annotated transcriptions. Reported in special session on ’Advanced Crowdsourcing for Speech and Beyond’ at the INTERSPEECH Conference 2015. Funded by Crowdee and CrowdFlower.


(text_and_corpus_linguistics, applied_linguistics, language_acquisition)
English

(see members)

• Crowdee
• Crowdflower

hdl:11041/ortolang-000913
Primary data (corpus) ortolang-000914
CrowdED_bilingual (Andrew CAINES)
2015-03-26 - version 1 - source data
Individual contribution

Crowdsourced corpus of German/English bilingual speakers answering business-topic questions of the type found in language learning oral exams, in both German and English. Contains soundfiles and annotated transcriptions. Reported in special session on ’Advanced Crowdsourcing for Speech and Beyond’ at the INTERSPEECH Conference 2015. Funded by Crowdee and CrowdFlower.


German (Deutsch)

(see members)

• Crowdee
• Crowdflower

hdl:11041/ortolang-000914
Primary data (corpus) sldr000717
Grindmill songs of Maharashtra (Guy Poitevin, Hema Rairkar)
2015-02-21 - version 1 - source data
Centre for Cooperative Research in Social Sciences (CCRSS, Pune IN)

Grindmill songs of Maharashtra (India): the complete collection of the Centre for Cooperative Research in Social Sciences (CCRSS, Pune).
Original DAT cassettes (UVS-01 to UVS-55) are stored at the Centre for Cooperative Research in Social Sciences (CCRSS) in Pune (Maharashtra).
http://ccrss.org
Cassettes UVS-01 to UVS-30 have been copied and deposited at the Archive and Research Center [...]
महाराष्ट्रातील(भारत) जात्यावरील गाणी : समाजशास्त्रीय सहकारी संशोधन केंद्राने(पुणे) केलेला समग्र संग्रह


(sociolinguistics, anthropological_linguistics)
Marathi -> Marathi, rural (ग्रामीण मराठी)

>> Collection Popular cultural productions in Marathi language ccrss-000749
(see members)

• Netherlands Ministry for Development Cooperation (1993-1998)
• Fondation Charles-Léopold Mayer pour le Progrès de l'Homme
• International Fund for the Promotion of Culture (UNESCO) (1993-1998)

hdl:11041/sldr000717
This material is Open Data
requires secondary data (resource) sldr000735 Annotations of grindmill songs
project http://lpl-aix.fr/projet/67
isPartOf collection ccrss-000749 Popular cultural productions in Marathi language
description http://sldr.org/publi/35/en
isReferencedBy http://hal.archives-ouvertes.fr/hal-00256388
Collection ortolang-000912
CREAGEST
2015-02-17 - version 1 - source data
Structures formelles du langage - UMR 7023 (SFL, Paris FR)
Savoirs, textes et langage - UMR 8163 (STL, Lille FR)
Groupe d’imagerie neurofonctionnelle - UMR 6194, CNRS/CEA/Université de Caen (Caen FR)

CREAGEST brings together French Sign Language (LSF) discourse. It comprises two corpora:
1) “CREAGEST - Acquisition” is a corpus of children's LSF collected from 65 deaf children (LSF-French bilinguals) and 17 deaf adults (50 hours), led by four deaf interviewers from four different regions of France (4 stimuli, 2 cameras);
2) “CREAGEST - Dialogue [...]


(discourse_analysis, sociolinguistics, language_acquisition)
French Sign Language

• ANR CREAGEST (2008-2012)

hdl:11041/ortolang-000912
(hasPart primary data (corpus) ortolang-000916 CREAGEST - Acquisition)
(hasPart primary data (corpus) ortolang-000926 CREAGEST - Dialogue entre adultes sourds)
description http://sldr.org/publi/137/en
description http://sldr.org/publi/138/en
Collection ortolang-000911
CoFee - CoFee
2015-02-15 - version 1 - source data
Laboratoire parole et langage - UMR 7309 (LPL, Aix-en-Provence FR)

Four corpora used as the primary dataset for the ANR JCJC “Conversational Feedback” project. Some of the corpora already existed, while others were created within this project. In addition to primary data, annotations (ELAN files) and resulting datasets (CSV files) related to conversational feedback are also included.

hdl:11041/ortolang-000911
hasPart primary data (corpus) sldr000027 Videos of CID
hasPart secondary data (resource) sldr000720 Transcriptions of CID
hasPart primary data (corpus) sldr000732 MAPTASK-AIX
hasPart primary data (corpus) sldr000875 Audio-visual condition of Aix Map Task
hasPart primary data (corpus) sldr000891 Aix-DVD
project http://cofee.hypotheses.org/
Secondary data (resource) sldr000735
Annotations of grindmill songs (Guy Poitevin, Hema Rairkar)
2015-02-15 - version 1 - source data
Centre for Cooperative Research in Social Sciences (CCRSS, Pune IN)

Annotations of grindmill songs
• Transcriptions in Devanagari and Roman Devanagari scripts
• Conversion of all databases to Unicode UTF-8 (XML, TAB, CSV)
• English translations (to be completed)
• French translations (to be completed)
• Information about performers
• Information about recording places
• Melodic classification (incomplete)


(sociolinguistics, anthropological_linguistics)
Marathi; Marathi -> Marathi, rural

>> Collection Popular cultural productions in Marathi language ccrss-000749
(see members)

• International Fund for the Promotion of Culture (UNESCO) (1993-1998)
• Netherlands Ministry for Development Cooperation (1993-1998)
• Fondation Charles-Léopold Mayer pour le Progrès de l'Homme

hdl:11041/sldr000735
This material is Open Data
(isRequiredBy primary data (corpus) sldr000717 Grindmill songs of Maharashtra)
isPartOf collection ccrss-000749 Popular cultural productions in Marathi language

This site has been declared to the Commission Nationale de l’Informatique et des Libertés (CNIL) under agreement No. 1222972 on 26 March 2008. Under French law, any person cited by name has the right to access, modify, correct and delete data relating to him or her (art. 34 of the « Informatique et Libertés » act of 6 January 1978). To exercise this right, send a message to webmaster(at)sldr.org.
