List by period of time

CORPUS LOG
sldr000865
(Aurélie GOUJON)
Primary data (corpus)
Created 2014-01-01
Modified 2014-01-01
Département de sciences du langage, Université d'Aix-Marseille (Aix-en-Provence FR)

The LOG corpus consists of 2 videos and their audio recordings. The first video, in French, lasts 2.58 seconds. The second, in Italian, lasts 1.29 seconds. The speaker's task is to retell an excerpt from a cartoon, first in French to a French interlocutor, then in Italian to an Italian interlocutor. The speaker is Italian [...]


(computational_linguistics, psycholinguistics)
French (français)
isPartOf collection sldr000804 Travaux d'étudiants Master LEX
335 Mb
24 files
Largest file: 191.31 Mb
Corpus-e/ɛ
sldr000866
(Ouissam BAIDADA)
Primary data (corpus)
Created 2014-01-06
Modified 2014-01-06
Département de sciences du langage, Université d'Aix-Marseille (Aix-en-Provence FR)

This corpus consists of a recording of four word lists containing /e/ and /ɛ/ in different positions, produced by speakers from southern France. The aim of the corpus is to show how difficult the /e/-/ɛ/ contrast is to maintain in production for southern French speakers.


(computational_linguistics, phonetics, text_and_corpus_linguistics, phonology)
French (français)
isPartOf collection sldr000804 Travaux d'étudiants Master LEX
12 Mb
17 files
Largest file: 4.62 Mb
Code d’alternance dans un contact de langue chez un locuteur berbère
sldr000867
(Smail TOUMERT)
Primary data (corpus)
Created 2014-01-08
Modified 2014-01-08
Département de sciences du langage, Université d'Aix-Marseille (Aix-en-Provence FR)

This recording was made as spontaneous speech during a telephone conversation between two Algerian students living in France: myself as speaker 1, and speaker 2, whose voice is not heard. The languages spoken are French and Kabyle, a variety of Berber.


(computational_linguistics, sociolinguistics)
Kabyle (Taqbaylit)
isPartOf collection sldr000804 Travaux d'étudiants Master LEX
19 Mb
20 files
Largest file: 11.46 Mb
Francique et français se mélangent
sldr000863
(Amélie SCHNEIDER)
Primary data (corpus)
Created 2013-12-30
Modified 2014-01-09
Département de sciences du langage, Université d'Aix-Marseille (Aix-en-Provence FR)

Telephone conversation in spontaneous speech between a speaker from Moselle (aged 50, born in Bining; mother tongue: dialect; other languages: French, German, English) and her sister, also from Moselle (aged 49, born in Bining; mother tongue: dialect; other languages: French, German), recorded in MP3 format on my mobile phone, then converted to the format [...]


(computational_linguistics, sociolinguistics)
Old High German -> Rhine Franconian (Rheinfränkisch)
isPartOf collection sldr000804 Travaux d'étudiants Master LEX
25 Mb
23 files
Largest file: 13.61 Mb
JURISDICT
sldr000868
(Adam Mickiewicz University Foundation (Poznan PL))
Primary data (corpus)
Created 2014-01-09
Modified 2014-01-09
Adam Mickiewicz University Foundation (Poznan PL)

The JURISDICT speech database is a large continuous speech database originally designed for dictated speech recognition.
The database includes more than 1500 annotated sessions of speakers from 16 regions of Poland, plus another 500 experimental recordings.
The JURISDICT database is intended to provide material for both training and testing of speech dictation of common and legal texts, [...]


(applied_linguistics, phonetics, phonology)
Polish (język polski)
535 Mb
480164 files
Largest file: 269.45 Mb
Espagnol L1 dans un contexte francophone - Spanish L1 in a French context
sldr000856
(Adrià OLTRA)
Primary data (corpus)
Created 2013-12-20
Modified 2014-01-10
Département de sciences du langage, Université d'Aix-Marseille (Aix-en-Provence FR)

A study of Spanish spoken as L1 by a French-born woman from a Spanish family. The aim of the study is to evaluate the development of the mother tongue without a sociolinguistic context. The corpus contains a random excerpt of spontaneous speech and two random excerpts of a press text read aloud.


(computational_linguistics, language_acquisition, sociolinguistics, general_linguistics)
Spanish (Español)
isPartOf collection sldr000804 Travaux d'étudiants Master LEX
101 Mb
26 files
Largest file: 54.18 Mb
Français parlé dans le nord du Gabon
sldr000869
(Magali ITALIA)
Primary data (corpus)
Created 2014-01-14
Modified 2014-01-14
Laboratoire parole et langage - UMR 7309 (LPL, Aix-en-Provence FR)

French spoken by Gabonese people from the north of the country: elderly speakers with little or no schooling and young adults with a moderate level of schooling.


(syntax, morphology, discourse_analysis)
French (français)
4151 Mb
34 files
Largest file: 498.24 Mb
Corpus audio de teko (émérillon)
sldr000870
(Françoise ROSE)
Primary data (corpus)
Created 2014-01-21
Modified 2014-01-21
This material is Open Data
Dynamique du langage - UMR 5596 (DDL, Lyon FR)

Collection of recordings of the Teko (Emerillon) language, a Tupi-Guarani language spoken in French Guiana. Recordings are linked to a transcription, a French translation, and morphological parsing with part-of-speech information (data exported from Toolbox).


(language_documentation, text_and_corpus_linguistics)
Emerillon (teko)
35 Mb
14 files
Largest file: 24.59 Mb
Anonymise sound files
[ARK] sldr000526
(Daniel HIRST)
Tool
Created 2010-07-28
Modified 2014-01-24
Laboratoire parole et langage - UMR 7309 (LPL, Aix-en-Provence FR)

PRAAT script.
Purpose: replace the portions of a long sound that are labelled with a keyword in the accompanying TextGrid with a hum that has the same prosodic characteristics as the original sound.
The original long sound can be mono or stereo; the anonymised sound will be the same.
The TextGrid may be constructed from a simple table using Table2textgrid.


(applied_linguistics, cognitive_science, language_documentation, speech_prosody, computational_linguistics)
isPartOf collection lpl-000763 LPL tools
requires http://www.fon.hum.uva.nl/praat/
requires tool sldr000811 Table2textgrid
18935 bytes
8 files
Largest file: 7277 bytes
SEC POS annotations
sldr000871
(Norwegian Computing Centre for the Humanities (NCCH, Bergen NO), Lancaster University (Lancaster UK))
Secondary data (resource)
Created 2014-02-03
Modified 2014-02-03
Norwegian Computing Centre for the Humanities (NCCH, Bergen NO)
Lancaster University (Lancaster UK)

Aix-MARSEC annotations


(speech_prosody, phonology, phonetics, computational_linguistics)
isRequiredBy primary data (corpus) sldr000033 Aix-MARSEC database
4 Mb
278 files
Largest file: 411217 bytes
ProZed. A prosody editor for linguists
sldr000778
(Daniel HIRST)
Tool
Created 2011-11-26
Modified 2014-02-24
Laboratoire parole et langage - UMR 7309 (LPL, Aix-en-Provence FR)

ProZed is a tool designed to allow linguists to manipulate the prosody of an utterance via a symbolic representation in order to evaluate linguistic models.
Prosody is manipulated via a Praat TextGrid which allows the user to modify the rhythm and melody.
Rhythm is manipulated by factoring segmental duration into three components: (i) intrinsic duration determined by phonemic identity (ii) [...]

1 Mb
45 files
Largest file: 642606 bytes
Raw data for the CATSEM project (MEG)
blri-000872
(Jean-Michel Badier, INS, Catherine Liégeois-Chauvel, INS, Johannes Ziegler, LPC)
Primary data (corpus)
Created 2014-03-07
Modified 2014-03-07
Laboratoire de Psychologie Cognitive - UMR7290 (LPC, Marseille FR)
Institut de Neurosciences des Systèmes (INS, Marseille FR)

Study of the cerebral correlates of spoken and written language using magnetoencephalography.

The objective is to study the cerebral mechanisms that underlie word perception and reading by means of magnetoencephalography (MEG). The purpose is to specify the role and the activation properties of the various cerebral structures involved [...]


(cognitive_science, neurolinguistics)
French (français)
isPartOf collection blri-000835 BLRI
48716 Mb
1531 files
Largest file: 17307.81 Mb
Transcription Orthographique Enrichie
sldr000873
(Laboratoire parole et langage - UMR 7309 (LPL, Aix-en-Provence FR))
Secondary data (resource)
Created 2014-03-12
Modified 2014-03-12
Laboratoire parole et langage - UMR 7309 (LPL, Aix-en-Provence FR)

Enriched Orthographic Transcription (TOE)


(general_linguistics, phonetics, phonology, speech_prosody, computational_linguistics, applied_linguistics)
project http://lpl-aix.fr/projet/170
isRequiredBy secondary data (resource) sldr000720 Transcriptions of CID
project http://lpl-aix.fr/projet/114
101465 bytes
6 files
Largest file: 62976 bytes
Articulatory Phonology Lexicon
sldr000874
(Alain Marchal, Laboratoire parole et langage - UMR 7309)
Secondary data (resource)
Created 2014-03-13
Modified 2014-03-13
This material is Open Data
Laboratoire parole et langage - UMR 7309 (LPL, Aix-en-Provence FR)

Articulatory Phonology Lexicon (in French)


(phonology, phonetics)
185 Mb
54 files
Largest file: 61.67 Mb
RARHYF
sldr000023
(Antoine Giovanni, Centre Hospitalier Universitaire de la Timone)
Primary data (corpus)
Created 2014-04-08
Modified 2014-04-08
Laboratoire parole et langage - UMR 7309 (LPL, Aix-en-Provence FR)

Repeated Acoustic Records in Healthy Young Females


(phonetics, applied_linguistics, speech_prosody)
French (français)
745 Mb
1244 files
Largest file: 2.53 Mb
Oursel/Corpus Cours alpha
sldr000876
(Elodie OURSEL)
Primary data (corpus)
Created 2014-05-04
Modified 2014-05-04
SYstèmes Linguistiques, Enonciation et Discours - EA2290 (SYLED, Paris FR)

Private reading lessons given by a non-professional to an Algerian woman, a mother of children, and self-confrontation of the teacher with her teaching practice.


(sociolinguistics, language_acquisition)
French (français)
(isPartOf collection ortolang-000898 Interactions FLE (Oursel))
1618 Mb
24 files
Largest file: 650.64 Mb
Oursel/Accompagnement vers l’emploi
sldr000877
(Elodie OURSEL)
Primary data (corpus)
Created 2014-05-04
Modified 2014-05-04
SYstèmes Linguistiques, Enonciation et Discours - EA2290 (SYLED, Paris FR)
Analyse et traitement informatique de la langue française - UMR 7118 (ATILF, Nancy FR)

Meeting between a non-native French speaker and a non-professional social worker who helps with finding work.


(sociolinguistics, language_acquisition)
French (français)
(isPartOf collection ortolang-000898 Interactions FLE (Oursel))
611 Mb
16 files
Largest file: 605.56 Mb
Oursel/Assistante sociale
sldr000878
(Elodie OURSEL)
Primary data (corpus)
Created 2014-05-04
Modified 2014-05-04
SYstèmes Linguistiques, Enonciation et Discours - EA2290 (SYLED, Paris FR)
Analyse et traitement informatique de la langue française - UMR 7118 (ATILF, Nancy FR)

Meeting between people from a social center and a social worker. The people speak French as a second language. The social worker is French.


(sociolinguistics, language_acquisition)
French (français)
(isPartOf collection ortolang-000898 Interactions FLE (Oursel))
7607 Mb
29 files
Largest file: 921.43 Mb
Oursel/Ateliers socio-linguistiques
sldr000879
(Elodie OURSEL)
Primary data (corpus)
Created 2014-05-04
Modified 2014-05-04
SYstèmes Linguistiques, Enonciation et Discours - EA2290 (SYLED, Paris FR)
Analyse et traitement informatique de la langue française - UMR 7118 (ATILF, Nancy FR)

Practical courses in French as a foreign language. A dozen participants, non-native French speakers, and a non-professional teacher.


(sociolinguistics, language_acquisition)
French (français)
(isPartOf collection ortolang-000898 Interactions FLE (Oursel))
2212 Mb
17 files
Largest file: 1110.47 Mb
Oursel/Ecole doctorale
sldr000880
(Elodie OURSEL)
Primary data (corpus)
Created 2014-05-04
Modified 2014-05-04
SYstèmes Linguistiques, Enonciation et Discours - EA2290 (SYLED, Paris FR)
Analyse et traitement informatique de la langue française - UMR 7118 (ATILF, Nancy FR)

Meeting between an Italian PhD student and a Romanian doctoral school secretary. Administrative problem.


(sociolinguistics, language_acquisition)
French (français)
(isPartOf collection ortolang-000898 Interactions FLE (Oursel))
340 Mb
22 files
Largest file: 335.06 Mb
Oursel/Ecrivain public
sldr000881
(Elodie OURSEL)
Primary data (corpus)
Created 2014-05-04
Modified 2014-05-04
SYstèmes Linguistiques, Enonciation et Discours - EA2290 (SYLED, Paris FR)
Analyse et traitement informatique de la langue française - UMR 7118 (ATILF, Nancy FR)

Meetings between users and a public writer. Non-native French-speaking users, native French-speaking public writer. Miscellaneous administrative problems.


(sociolinguistics, language_acquisition)
French (français)
(isPartOf collection ortolang-000898 Interactions FLE (Oursel))
5479 Mb
60 files
Largest file: 674.35 Mb
Oursel/Entretiens compréhension orale
sldr000882
(Elodie OURSEL)
Primary data (corpus)
Created 2014-05-04
Modified 2014-05-04
SYstèmes Linguistiques, Enonciation et Discours - EA2290 (SYLED, Paris FR)
Analyse et traitement informatique de la langue française - UMR 7118 (ATILF, Nancy FR)

Interviews between the researcher and foreign students about their learning of French, their sojourn in France and the evolution of their ease in understanding French speakers.
Some recordings also contain a test of recognition of implicits.


(sociolinguistics, language_acquisition)
French (français)
(isPartOf collection ortolang-000898 Interactions FLE (Oursel))
34376 Mb
50 files
Largest file: 2461.35 Mb
Oursel/Entretiens conversation
sldr000883
(Elodie OURSEL)
Primary data (corpus)
Created 2014-05-04
Modified 2014-05-04
SYstèmes Linguistiques, Enonciation et Discours - EA2290 (SYLED, Paris FR)
Analyse et traitement informatique de la langue française - UMR 7118 (ATILF, Nancy FR)

Conversations between the researcher and foreign students. Various themes.


(sociolinguistics, language_acquisition)
French (français)
(isPartOf collection ortolang-000898 Interactions FLE (Oursel))
2667 Mb
21 files
Largest file: 779.17 Mb
Oursel/Institut de traducteurs et d’interprètes
sldr000884
(Elodie OURSEL)
Primary data (corpus)
Created 2014-05-04
Modified 2014-05-04
SYstèmes Linguistiques, Enonciation et Discours - EA2290 (SYLED, Paris FR)
Analyse et traitement informatique de la langue française - UMR 7118 (ATILF, Nancy FR)

Meetings between students or future students in translation or interpretation and the director of the programme. Director native English speaker, students native French speakers.


(sociolinguistics, language_acquisition)
French (français)
(isPartOf collection ortolang-000898 Interactions FLE (Oursel))
1018 Mb
53 files
Largest file: 317.24 Mb
Oursel/Office de l’immigration - Accueil
sldr000885
(Elodie OURSEL)
Primary data (corpus)
Created 2014-05-04
Modified 2014-05-04
SYstèmes Linguistiques, Enonciation et Discours - EA2290 (SYLED, Paris FR)
Analyse et traitement informatique de la langue française - UMR 7118 (ATILF, Nancy FR)

Meetings between users and administrative agents at the office of immigration. Administrative questions about the procedure to obtain the long-stay (1 year) visa.


(sociolinguistics, language_acquisition)
French (français)
(isPartOf collection ortolang-000898 Interactions FLE (Oursel))
2510 Mb
72 files
Largest file: 395.63 Mb
Oursel/Office de l’immigration - Audit
sldr000886
(Elodie OURSEL)
Primary data (corpus)
Created 2014-05-04
Modified 2014-05-04
SYstèmes Linguistiques, Enonciation et Discours - EA2290 (SYLED, Paris FR)
Analyse et traitement informatique de la langue française - UMR 7118 (ATILF, Nancy FR)

Assessment of the training needs of immigrants who obtain a long-stay visa.


(sociolinguistics, language_acquisition)
French (français)
(isPartOf collection ortolang-000898 Interactions FLE (Oursel))
2337 Mb
41 files
Largest file: 315.69 Mb
Oursel/Préfecture
sldr000887
(Elodie OURSEL)
Primary data (corpus)
Created 2014-05-04
Modified 2014-05-04
SYstèmes Linguistiques, Enonciation et Discours - EA2290 (SYLED, Paris FR)
Analyse et traitement informatique de la langue française - UMR 7118 (ATILF, Nancy FR)

Administrative conversations between agents of the prefecture (bureau of visas) and foreign users.


(sociolinguistics, language_acquisition)
French (français)
(isPartOf collection ortolang-000898 Interactions FLE (Oursel))
1109 Mb
89 files
Largest file: 86.12 Mb
Oursel/Secrétariat de FLE
sldr000888
(Elodie OURSEL)
Primary data (corpus)
Created 2014-05-04
Modified 2014-05-04
SYstèmes Linguistiques, Enonciation et Discours - EA2290 (SYLED, Paris FR)
Analyse et traitement informatique de la langue française - UMR 7118 (ATILF, Nancy FR)

Meeting between a student of French as a foreign language and a department secretary to collect her diploma.


(sociolinguistics, language_acquisition)
French (français)
(isPartOf collection ortolang-000898 Interactions FLE (Oursel))
51 Mb
22 files
Largest file: 45.68 Mb
Oursel/Service des relations internationales
sldr000889
(Elodie OURSEL)
Primary data (corpus)
Created 2014-05-04
Modified 2014-05-04
SYstèmes Linguistiques, Enonciation et Discours - EA2290 (SYLED, Paris FR)
Analyse et traitement informatique de la langue française - UMR 7118 (ATILF, Nancy FR)

Meetings between foreign students at the end of their sojourn in France and the secretary of the service of international relations in a university. Administrative questions.


(sociolinguistics, language_acquisition)
French (français)
(isPartOf collection ortolang-000898 Interactions FLE (Oursel))
749 Mb
43 files
Largest file: 173.52 Mb
MARC-Fr
sldr000786
(Brigitte Bigi, Pauline Péri)
Primary data (corpus)
Created 2012-04-06
Modified 2014-05-05
This material is Open Data
Laboratoire parole et langage - UMR 7309 (LPL, Aix-en-Provence FR)

French corpus, manually phonetized and aligned, with a duration of 7 minutes. Made up of 3 sub-corpora: CID, AixOx and Grenelle.


(phonology, phonetics, speech_prosody, general_linguistics, computational_linguistics, text_and_corpus_linguistics)
French (français)
isPartOf primary data (corpus) sldr000027 Videos of CID
isPartOf collection sldr000729 Multimodalité et débats à l'Assemblée nationale - Multimodality and debates in the National Assembly
isPartOf primary data (corpus) sldr000784 AixOx
(isPartOf collection sldr000725 OMProDat)
45 Mb
77 files
Largest file: 23.81 Mb
Transcriptions of CID
sldr000720
(Roxane BERTRAND)
Secondary data (resource)
Created 2008-09-16
Modified 2014-05-06
Laboratoire parole et langage - UMR 7309 (LPL, Aix-en-Provence FR)

8 French dialogues involving 2 participants, with the following data:
• a wav file for each speaker
• inter-pausal unit (IPU) annotation aligned with the audio signal
• enriched orthographic transcription (TOE) aligned with the audio signal
• phones aligned with the audio signal
• syllables aligned with the audio signal
• tokens aligned with the audio signal
• [...]


(phonology, phonetics, speech_prosody, general_linguistics, computational_linguistics)
French
isRequiredBy primary data (corpus) sldr000027 Videos of CID
project http://lpl-aix.fr/projet/114
project http://lpl-aix.fr/projet/170
project http://lpl-aix.fr/projet/71
(isReferencedBy secondary data (resource) sldr000803 ORCHID.fr)
(requires secondary data (resource) sldr000873 Transcription Orthographique Enrichie)
(isPartOf collection ortolang-000901 VariAMU)
(isRequiredBy secondary data (resource) ortolang-000722 Annotations of CID)
(isPartOf collection ortolang-000911 CoFee)
(requires secondary data (resource) ortolang-000918 CID-DISP)
description http://portal.acm.org/citation.cfm?id=1868749...
isReferencedBy http://www.lrec-conf.org/proceedings/lrec2010...
isReferencedBy http://sldr.org/publi/104/en
3821 Mb
162 files
Largest file: 1289.26 Mb
Aix-DVD
sldr000891
(Jan Gorisch, Laurent Prévot)
Primary data (corpus)
Created 2014-05-13
Modified 2014-05-13
This material is Open Data
Laboratoire parole et langage - UMR 7309 (LPL, Aix-en-Provence FR)

The recording situation involves two participants sitting opposite each other with a table between them. 8 DVD boxes with labels were placed on the table. The instruction is to have a conversation, and at the end (after 30 minutes) each participant may leave the recording booth with two DVDs. This setting therefore involved some negotiation and discussion.


French (français)
(isPartOf collection ortolang-000911 CoFee)
31443 Mb
89 files
Largest file: 10357.22 Mb
ALIPE (Acquisition de la Liaison et Interactions Parents Enfants)
alipe-000853
(Damien Chabanal, Thierry Chanier, Loïc Liégeois)
Primary data (corpus)
Created 2013-12-18
Modified 2014-05-19
Laboratoire de Recherche sur le Langage - EA 999 (LRL, Clermont-Ferrand FR)

Annotated parent-child interactions for the study of liaison.
This corpus contains the transcription of about 15 hours of informal conversations between child and parents.
The data in this corpus were structured as part of the ALIPE project (Acquisition de la Liaison et Interactions Parents Enfants) of the Laboratoire de Recherche sur le Langage.


(language_acquisition, phonology)
French (français)
21957 Mb
1836 files
Largest file: 833.41 Mb
La compétence pragmatique en négociation (Français Langue Etrangère par des Singapouriens, niveaux A2 et B1)
sldr000892
(Isabel REPISO)
Primary data (corpus)
Created 2014-05-26
Modified 2014-07-01
Individual contribution

Two role plays were proposed to the speakers, in both formal and casual contexts.


(language_acquisition, discourse_analysis, syntax)
French (français)
3224 Mb
174 files
Largest file: 137.12 Mb
Script praat pour recueil infos acoustiques
ortolang-000895
(Sandra CORNAZ)
Tool
Created 2014-07-18
Modified 2014-07-14
This material is Open Data
Grenoble Images Parole Signal Automatique - UMR 5216 (Gipsa, Grenoble FR)

Script to be run on a TextGrid (in particular one generated with SPPAS). For each segmented portion it retrieves the following information: label, duration, F0, F1, F2, F3, F4, and for each of these the std and the bandw. Formant measurements are computed from the median of the values collected every 10 ms over the segmented portion. This script was modified to [...]


(phonetics, computational_linguistics)
33055 bytes
8 files
Largest file: 9814 bytes
Read speech in Italian L1
ortolang-000894
(Sandra CORNAZ)
Primary data (corpus)
Created 2014-07-17
Modified 2014-07-17
Grenoble Images Parole Signal Automatique - UMR 5216 (Gipsa, Grenoble FR)
Laboratorio di Fonetica Sperimentale ’Arturo Genere’ dell’Università di Torino (LFSAG, Torino IT)

About 50 native speakers of Italian. Recordings of three types of reading: sentences of the type 'bene, Iris e Ugo, mangiamo una bella pesca'; the text of “Il Vento di tramontana e il sole”; meaningless sentences: 'V, hai detto V, V come in CV'.


(phonetics, phonology, text_and_corpus_linguistics)
Italian (Italiano)
1069 Mb
127 files
Largest file: 57.69 Mb
Valjouffrey-Valbonnais 2012-2014
sldr000787
(Médéric GASQUET-CYRUS)
Primary data (corpus)
Created 2012-04-22
Modified 2014-07-31
Laboratoire parole et langage - UMR 7309 (LPL, Aix-en-Provence FR)
Département de linguistique et phonétique générales, Université d'Aix-Marseille (Aix-en-Provence FR)

Audio & video recordings in the context of project “Mémoires et pratiques linguistiques en zone de transition entre francoprovençal et occitan : Valjouffrey et Valbonnais”


(language_documentation, lexicography, sociolinguistics)
Occitan (post 1500) -> Provençal; Occitan (post 1500) -> Alpine Provençal or North-Occitan; Occitan (post 1500) -> Valbonnais dialect; Franco-Provençal; French
isPartOf collection valjouffrey-000007 Valjouffrey-Valbonnais
135043 Mb
7261 files
Largest file: 3655.85 Mb
Interactions FLE (Oursel)
ortolang-000898
(Elodie OURSEL)
Collection
Created 2014-09-17
Modified 2014-09-17
SYstèmes Linguistiques, Enonciation et Discours - EA2290 (SYLED, Paris FR)
Analyse et traitement informatique de la langue française - UMR 7118 (ATILF, Nancy FR)

Corpus of interactions in French as a foreign language (FLE)


(sociolinguistics, language_acquisition)
hasPart primary data (corpus) sldr000884 Oursel/Institut de traducteurs et d’interprètes
hasPart primary data (corpus) sldr000883 Oursel/Entretiens conversation
hasPart primary data (corpus) sldr000882 Oursel/Entretiens compréhension orale
hasPart primary data (corpus) sldr000881 Oursel/Ecrivain public
hasPart primary data (corpus) sldr000880 Oursel/Ecole doctorale
hasPart primary data (corpus) sldr000879 Oursel/Ateliers socio-linguistiques
hasPart primary data (corpus) sldr000878 Oursel/Assistante sociale
hasPart primary data (corpus) sldr000877 Oursel/Accompagnement vers l’emploi
hasPart primary data (corpus) sldr000876 Oursel/Corpus Cours alpha
hasPart primary data (corpus) sldr000885 Oursel/Office de l’immigration - Accueil
hasPart primary data (corpus) sldr000886 Oursel/Office de l’immigration - Audit
hasPart primary data (corpus) sldr000887 Oursel/Préfecture
hasPart primary data (corpus) sldr000888 Oursel/Secrétariat de FLE
hasPart primary data (corpus) sldr000889 Oursel/Service des relations internationales
17922 bytes
4 files
Largest file: 6207 bytes
Faetano : francoprovençal d’Italie
ortolang-000899
(Priscilla TOLENTINO)
Primary data (corpus)
Created 2014-10-02
Modified 2014-10-02
Laboratoire parole et langage - UMR 7309 (LPL, Aix-en-Provence FR)

Audio and video recordings made as part of the doctoral thesis project “Contact de langues en France et en Italie : étude de deux variétés de gallo-roman”


(typology, sociolinguistics)
Franco-Provençal -> Faetar (Faitare)
18871 bytes
4 files
Largest file: 7543 bytes
Werewolf
ortolang-000900
(Laboratoire parole et langage - UMR 7309 (LPL, Aix-en-Provence FR), Département de linguistique et phonétique générales, Université d'Aix-Marseille (Aix-en-Provence FR))
Primary data (corpus)
Created 2014-10-07
Modified 2014-10-07
This material is Open Data
Laboratoire parole et langage - UMR 7309 (LPL, Aix-en-Provence FR)
Département de linguistique et phonétique générales, Université d'Aix-Marseille (Aix-en-Provence FR)

Multitrack recordings of interactions between players of the ’Werewolf’ game. Students of the LEX Master course (SCLQ41) on 7 October 2014, room A003 of Laboratoire Parole et Langage.


(computational_linguistics, applied_linguistics, phonetics, sociolinguistics)
French (français)
isPartOf collection sldr000804 Travaux d'étudiants Master LEX
(isRequiredBy secondary data (resource) ortolang-000908 Loup garou - annotations)
16357 Mb
249 files
Largest file: 936.73 Mb
Audio-visual condition of Aix Map Task
sldr000875
(Jan Gorisch, Laurent Prévot)
Primary data (corpus)
Created 2014-03-20
Modified 2014-10-13
This material is Open Data
Laboratoire parole et langage - UMR 7309 (LPL, Aix-en-Provence FR)

This is the audio-visual condition of the Aix Map Task corpus. Two participants sit face-to-face and complete the Map Task. They are native speakers of French. The audio was recorded on head-mounted microphones. The video was recorded individually for each participant on MiniDV cameras. Due to image loss, the audio and video are synchronised using a method described in a paper submitted to the IEEE [...]


(discourse_analysis, computational_linguistics, phonetics)
French (français)
conformsTo primary data (corpus) sldr000732 MAPTASK-AIX
(isPartOf collection ortolang-000911 CoFee)
isReferencedBy http://www.lrec-conf.org/proceedings/lrec2014...
4021 Mb
210 files
Largest file: 1047.76 Mb
VariAMU
ortolang-000901
(Julie Abbou, LPL)
Collection
Created 2014-10-17
Modified 2014-10-17

The VariAMU collection is a repository of data and tools.
The main goal of VariAMU is to provide comparable datasets for cross-linguistic comparison.


(general_linguistics, computational_linguistics)
English; French; Yue Chinese; Chinese; Portuguese
hasPart primary data (corpus) sldr000027 Videos of CID
hasPart secondary data (resource) sldr000720 Transcriptions of CID
hasPart tool sldr000841 MarsaTag
hasPart secondary data (resource) sldr000850 MarsaLex
hasPart tool sldr000800 SPPAS - Automatic Annotation of Speech
Corpus ANCOR Centre
ortolang-000903
(Jean-Yves Antoine, LI)
Secondary data (resource)
Created 2014-10-30
Modified 2014-10-26
Laboratoire d’Informatique (LI, Tours FR)
Laboratoire Ligérien de Linguistique (LLL, Orléans FR)
Langues, textes, traitements informatiques, cognition - UMR 8094 (LaTTiCe, Paris FR)

ANCOR Centre is a French spoken corpus annotated for coreference. Its size (488,000 words) is sufficient for investigating data-driven approaches to coreference resolution. The annotation was carried out on three different corpora of conversational speech (Accueil_UBS, OTG, ESLO). It is freely available under a Creative Commons CC-BY-SA or CC-BY-SA-NC licence.


(computational_linguistics, text_and_corpus_linguistics, speech_prosody, general_linguistics)
French
isRequiredBy http://eslo.huma-num.fr
isRequiredBy primary data (corpus) sldr000890 Accueil_UBS
isRequiredBy primary data (corpus) sldr000831 OTG
description http://hal.archives-ouvertes.fr/hal-01075679
description http://www.taln2013.org/actes/www/TALN-2013/a...
description https://hal.archives-ouvertes.fr/hal-01016562...
290 Mb
1409 files
Largest file: 12.29 Mb
Berlin
ortolang-000893
(Cyrille Granget)
Primary data (corpus)
Created 2014-07-09
Modified 2014-10-29
Didactique des langues, des textes et des cultures - EA2288 (DILTEC, Paris FR)

Pseudo-longitudinal corpus of oral narratives of a silent film sequence. The narratives are produced in L2 French by Berlin teenagers, face to face with a speaker whose L1 is French. A vocabulary sheet is available to the narrators.
The film viewed is an excerpt from Modern Times by Charlie Chaplin, more precisely the experimental version designed [...]


(applied_linguistics, language_acquisition, psycholinguistics)
French (français)
isReferencedBy http://sldr.org/publi/113/en
1725 Mb
46 files
Largest file: 109.68 Mb
Annotations of CID
ortolang-000722
(Berthille Pallaud, Laboratoire parole et langage)
Secondary data (resource)
Created 2014-11-01
Modified 2014-11-01
Laboratoire parole et langage - UMR 7309 (LPL, Aix-en-Provence FR)

Self-interruption annotations: in speech, speakers' utterances differ in their self-interruptions and in the greater or lesser number of “foreign” elements inserted into the utterance: silent or filled pauses and parenthetical insertions (discourse markers and interjections) suspend the syntagmatic flow. All self-interruptions [...]


(phonology, phonetics, speech_prosody, general_linguistics, computational_linguistics)
French
requires secondary data (resource) sldr000720 Transcriptions of CID
isRequiredBy primary data (corpus) sldr000027 Videos of CID
40 Mb
31 files
Largest file: 4.15 Mb
Projet ADys (MEG)
blri-000904
(Jean-Michel Badier, INS, Eddy Cavalli, LPC, Pascale Colé, LPC, Catherine Liégeois-Chauvel, INS, Chotiga Pattamadilok, LPL, Florence Poracchia-George, INS, Johannes Ziegler, LPC)
Primary data (corpus)
Created 2014-12-15
Modified 2014-12-15
Laboratoire de Psychologie Cognitive - UMR7290 (LPC, Marseille FR)
Laboratoire parole et langage - UMR 7309 (LPL, Aix-en-Provence FR)
Institut de Neurosciences des Systèmes (INS, Marseille FR)

Morphological, orthographic and semantic processing in reading of dyslexic university students revealed by MEG


(neurolinguistics, linguistic_theories)
French (français)
isPartOf collection blri-000835 BLRI
202786 Mb
1427 files
Largest file: 48478.93 Mb
Projet IDIOME (EEG)
blri-000905
(Philippe Blache, LPL, Deirdre Bolger, BLRI, Sophie Dufour, LPL, Chotiga Pattamadilok, LPL, Carlos Ramisch, LPL, Stéphane Rauzy, LPL)
Primary data (corpus)
Created 2014-12-15
Modified 2014-12-15
Laboratoire de Psychologie Cognitive - UMR7290 (LPC, Marseille FR)

Interaction between complexity and facilitation in language processing: a study of constraint violations in idioms


(cognitive_science, neurolinguistics)
French (français)
isPartOf collection blri-000835 BLRI
58036 Mb
201 files
Largest file: 22664.85 Mb
Kavalan OntoLex
ortolang-000906
(Chu-Ren Huang, The Hong Kong Polytechnic University)
Secondary data (resource)
Created 2014-12-20
Modified 2014-12-20
Laboratoire parole et langage - UMR 7309 (LPL, Aix-en-Provence FR)

Linked Data for Endangered Languages: linking basic lexicon to shared ontology


(anthropological_linguistics, computational_linguistics, language_documentation)
Kavalan
Seediq Ontolex
ortolang-000907
(Chu-Ren Huang, The Hong Kong Polytechnic University)
Secondary data (resource)
Created 2014-12-20
Modified 2014-12-20
Laboratoire parole et langage - UMR 7309 (LPL, Aix-en-Provence FR)

Linked Data for Endangered Languages: linking basic lexicon to shared ontology


(anthropological_linguistics, computational_linguistics, language_documentation)
Seediq
description http://sldr.org/publi/126/en
description http://sldr.org/publi/127/en
20978 bytes
4 files
Largest file: 8018 bytes
Loup garou - annotations
ortolang-000908
(Laboratoire parole et langage - UMR 7309 (LPL, Aix-en-Provence FR), Département de linguistique et phonétique générales, Université d'Aix-Marseille (Aix-en-Provence FR))
Secondary data (resource)
Created 2015-01-03
Modified 2015-01-03
This material is Open Data
Laboratoire parole et langage - UMR 7309 (LPL, Aix-en-Provence FR)
Département de linguistique et phonétique générales, Université d'Aix-Marseille (Aix-en-Provence FR)

Annotations of interactions between players of the ’Werewolf’ game, played by students of the LEX Master course (SCLQ41) on 7 October 2014.


(applied_linguistics, phonetics, sociolinguistics)
French
requires primary data (corpus) ortolang-000900 Werewolf
37 Mb
127 files
Largest file: 3.10 Mb
MarsaTag
sldr000841
(Stéphane Rauzy, CNRS - LPL)
Tool
Created 2013-10-17
Modified 2015-01-21
This material is Open Data
Laboratoire parole et langage - UMR 7309 (LPL, Aix-en-Provence FR)

Tools for textual data enrichment (written text and speech transcription): tokenizer, morphosyntactic and POS tagger, ...


(general_linguistics, computational_linguistics)
French
(hasPart secondary data (resource) sldr000850 MarsaLex)
(isPartOf collection ortolang-000901 VariAMU)
description http://hal.archives-ouvertes.fr/hal-00433879
description http://sldr.org/publi/110/en
description http://sldr.org/publi/112/en
71 Mb
14 files
Largest file: 35.83 Mb
Europe
ortolang-000909
(Cristel PORTES)
Primary data (corpus)
Created 2015-01-27
Modified 2015-01-27
Laboratoire parole et langage - UMR 7309 (LPL, Aix-en-Provence FR)

Recording from France Culture of the programme «La Suite dans les idées», produced by Sylvain Bourmeau and broadcast on 25 February 2000, made with the tape recorder of a private compact stereo system (single track). The programme lasts about 45 minutes. Its title is «L’Europe et son élargissement, question taboue ?» (“Europe and its enlargement: a taboo question?”). Two journalists (a man and a woman) lead the debate [...]


(general_linguistics)
French (français)
183 Mb
20 files
Largest file: 89.18 Mb
CoFee
ortolang-000911
(Laboratoire parole et langage - UMR 7309 (LPL, Aix-en-Provence FR))
Collection
Created 2015-02-15
Modified 2015-02-15
Laboratoire parole et langage - UMR 7309 (LPL, Aix-en-Provence FR)

Four corpora used as the primary dataset for the ANR JCJC project “Conversational Feedback“. Some of the corpora already existed, while others were created within this project. In addition to the primary data, annotations (ELAN files) and resulting datasets (CSV files) related to conversational feedback are also included.


hasPart primary data (corpus) sldr000027 Videos of CID
hasPart secondary data (resource) sldr000720 Transcriptions of CID
hasPart primary data (corpus) sldr000732 MAPTASK-AIX
hasPart primary data (corpus) sldr000875 Audio-visual condition of Aix Map Task
hasPart primary data (corpus) sldr000891 Aix-DVD
project http://cofee.hypotheses.org/
15785 bytes
4 files
Largest file: 5324 bytes
Annotations of grindmill songs
sldr000735
(Guy Poitevin, Hema Rairkar)
Secondary data (resource)
Created 2008-11-29
Modified 2015-02-15
This material is Open Data
Centre for Cooperative Research in Social Sciences (CCRSS, Pune IN)

Annotations of grindmill songs
• Transcriptions in Devanagari and Roman Devanagari scripts
• Conversion of all databases to Unicode UTF-8 (XML, TAB, CSV)
• English translations (to be completed)
• French translations (to be completed)
• Information about performers
• Information about recording places
• Melodic classification (incomplete)


(sociolinguistics, anthropological_linguistics)
Marathi; Marathi -> Marathi, rural
(isRequiredBy primary data (corpus) sldr000717 Grindmill songs of Maharashtra)
isPartOf collection ccrss-000749 Popular cultural productions in Marathi language
979 Mb
146 files
Largest file: 481.56 Mb
CREAGEST
ortolang-000912
(Structures formelles du langage - UMR 7023 (SFL, Paris FR), Savoirs, textes et langage - UMR 8163 (STL, Lille FR), Groupe d’imagerie neurofonctionnelle - UMR 6194, CNRS/CEA/Université de Caen (Caen FR))
Collection
Created 2015-02-17
Modified 2015-02-17
Structures formelles du langage - UMR 7023 (SFL, Paris FR)
Savoirs, textes et langage - UMR 8163 (STL, Lille FR)
Groupe d’imagerie neurofonctionnelle - UMR 6194, CNRS/CEA/Université de Caen (Caen FR)

CREAGEST brings together discourse in French Sign Language (LSF). It comprises two corpora:
1) “CREAGEST - Acquisition“ is a corpus of child LSF collected from 65 deaf children (LSF-French bilinguals) and 17 deaf adults (50 hours), led by four deaf female interviewers from four different regions of France (4 stimuli, 2 cameras);
2) “CREAGEST - Dialogue [...]


(discourse_analysis, sociolinguistics, language_acquisition)
French Sign Language
(hasPart primary data (corpus) ortolang-000916 CREAGEST - Acquisition)
(hasPart primary data (corpus) ortolang-000926 CREAGEST - Dialogue entre adultes sourds)
description http://sldr.org/publi/137/en
description http://sldr.org/publi/138/en
1 Mb
6 files
Largest file: 923183 bytes
Grindmill songs of Maharashtra
sldr000717
(Guy Poitevin, Hema Rairkar)
Primary data (corpus)
Created 2008-04-26
Modified 2015-02-21
This material is Open Data
Centre for Cooperative Research in Social Sciences (CCRSS, Pune IN)

Grindmill songs of Maharashtra (India): the complete collection of the Centre for Cooperative Research in Social Sciences (CCRSS, Pune).
Original DAT cassettes (UVS-01 to UVS-55) are stored at the Centre for Cooperative Research in Social Sciences (CCRSS) in Pune (Maharashtra).
http://ccrss.org
Cassettes UVS-01 to UVS-30 have been copied and deposited at the Archive and Research Center [...]
महाराष्ट्रातील(भारत) जात्यावरील गाणी : समाजशास्त्रीय सहकारी संशोधन केंद्राने(पुणे) केलेला समग्र संग्रह


(sociolinguistics, anthropological_linguistics)
Marathi -> Marathi, rural (ग्रामीण मराठी)
requires secondary data (resource) sldr000735 Annotations of grindmill songs
project http://lpl-aix.fr/projet/67
isPartOf collection ccrss-000749 Popular cultural productions in Marathi language
description http://sldr.org/publi/35/en
isReferencedBy http://hal.archives-ouvertes.fr/hal-00256388
113336 Mb
1215 files
Largest file: 9362.79 Mb
CrowdED_english
ortolang-000913
(Andrew CAINES)
Primary data (corpus)
Created 2015-03-26
Modified 2015-03-26
Individual contribution

Crowdsourced corpus of English native speakers answering business-topic questions of the type found in language learning oral exams. Contains soundfiles and annotated transcriptions. Reported in special session on ’Advanced Crowdsourcing for Speech and Beyond’ at the INTERSPEECH Conference 2015. Funded by Crowdee and CrowdFlower.


(text_and_corpus_linguistics, applied_linguistics, language_acquisition)
English
CrowdED_bilingual
ortolang-000914
(Andrew CAINES)
Primary data (corpus)
Created 2015-03-26
Modified 2015-03-26
Individual contribution

Crowdsourced corpus of German/English bilingual speakers answering business-topic questions of the type found in language learning oral exams, in both German and English. Contains soundfiles and annotated transcriptions. Reported in special session on ’Advanced Crowdsourcing for Speech and Beyond’ at the INTERSPEECH Conference 2015. Funded by Crowdee and CrowdFlower.


German (Deutsch)
VILLA : Varieties of Initial Learners in Language Acquisition
ortolang-000915
(Sarra EL AYARI)
Primary data (corpus)
Created 2015-03-30
Modified 2015-03-30
Structures formelles du langage - UMR 7023 (SFL, Paris FR)

VILLA (ANR, DFG, NWO) investigates the very first phases of foreign language acquisition under controlled input conditions. Our learners, total beginners with five different native languages (French, German, Dutch, English, Italian) were exposed to 14 hours of target language Polish. Regular tasks in Polish documented the acquisitional path of all learners.


(language_acquisition, applied_linguistics, psycholinguistics)
Polish (język polski)
isReferencedBy https://benjamins.com/#catalog/journals/euros...
CREAGEST - Acquisition
ortolang-000916
(Structures formelles du langage - UMR 7023 (SFL, Paris FR), Savoirs, textes et langage - UMR 8163 (STL, Lille FR), Groupe d’imagerie neurofonctionnelle - UMR 6194, CNRS/CEA/Université de Caen (Caen FR))
Primary data (corpus)
Created 2015-04-08
Modified 2015-04-08
This material is Open Data
Structures formelles du langage - UMR 7023 (SFL, Paris FR)
Savoirs, textes et langage - UMR 8163 (STL, Lille FR)
Groupe d’imagerie neurofonctionnelle - UMR 6194, CNRS/CEA/Université de Caen (Caen FR)

This is a corpus of child LSF collected from 65 deaf children (LSF-French bilinguals) and 17 deaf adults (50 hours), led by four deaf female interviewers from four different regions of France (4 stimuli, 2 cameras).

The videos made available here are a subset of the full corpus: 10 excerpts filmed with two cameras (i.e. 20 [...]


(discourse_analysis, sociolinguistics, language_acquisition)
(Unclassified)
isPartOf collection ortolang-000912 CREAGEST
273 Mb
27 files
Largest file: 18.87 Mb
MarsaGram
ortolang-000917
(Grégoire MOREAU DE MONTCHEUIL)
Tool
Created 2015-04-09
Modified 2015-04-09
This material is Open Data
Laboratoire parole et langage - UMR 7309 (LPL, Aix-en-Provence FR)

Tool to explore treebanks.
Extract a context-free grammar (CFG) and properties over the CFG rules.
Generate HTML pages to explore the rules and their properties.
Versatile: works with both constituency and dependency treebanks.


(applied_linguistics, cognitive_science, language_documentation, computational_linguistics, text_and_corpus_linguistics)
isPartOf collection lpl-000763 LPL tools
description http://sldr.org/publi/140/en
187532 bytes
6 files
Largest file: 154808 bytes
CID-DISP
ortolang-000918
(Roxane Bertrand, Laboratoire parole et langage - UMR 7309, Laurent Prévot, Laboratoire parole et langage - UMR 7309, Stéphane Rauzy, Laboratoire parole et langage - UMR 7309)
Secondary data (resource)
Created 2015-04-30
Modified 2015-04-30
Laboratoire parole et langage - UMR 7309 (LPL, Aix-en-Provence FR)

Syntactic, Prosodic, Discourse and Disfluencies annotation for the CID corpus


(general_linguistics, phonetics, phonology, speech_prosody, computational_linguistics)
isRequiredBy secondary data (resource) sldr000720 Transcriptions of CID
76 Mb
130 files
Largest file: 1.76 Mb
Corpus IPC
ortolang-000919
(Jorane SAUBESTY)
Primary data (corpus)
Created 2015-05-12
Modified 2015-05-12
Individual contribution

Corpus for training physicians to disclose serious adverse events.

Source of the corpus: the Institut Paoli-Calmettes in Marseille.

Video/audio corpus with transcriptions and annotations.


(discourse_analysis)
French (français)
isPartOf collection blri-000835 BLRI
isPartOf collection ortolang-000922 ACORFORMed
6048 Mb
39 files
Largest file: 3746.39 Mb
Lexical functions of Spanish verb-noun collocations
ortolang-000920
(Olga KOLESNIKOVA)
Secondary data (resource)
Created 2015-05-14
Modified 2015-05-14
This material is Open Data
Individual contribution

The resource includes the 1000 most frequent verb-noun pairs extracted automatically from the Spanish Web Corpus (located at http://www.sketchengine.co.uk). Each pair is classified as a free word combination (FWC), a collocation, or a parsing error. In FWCs and collocations, verbs and nouns are annotated with their Spanish WordNet senses. Each collocation is annotated with its lexical function. [...]


Spanish
isReferencedBy http://sldr.org/publi/141/en
374106 bytes
7 files
Largest file: 290304 bytes
Happy Birthday Corpus - Data
ortolang-000921
(Pauline LARROUY-MAESTRI)
Secondary data (resource)
Created 2015-05-15
Modified 2015-05-15
Individual contribution

Acoustical analyses (pitch accuracy, tempo, vocal quality) and ratings (18 music experts and 18 non experts matched in gender and age) of the 166 performances of the Happy Birthday Corpus.


isRequiredBy primary data (corpus) sldr000774 Happy Birthday corpus
664991 bytes
10 files
Largest file: 348156 bytes
ACORFORMed
ortolang-000922
(Jorane SAUBESTY)
Collection
Created 2015-05-21
Modified 2015-05-21

Corpus for training physicians to disclose serious adverse events.
Two sources for the corpus: the Institut Paoli-Calmettes in Marseille and the CHU d’Angers.


(discourse_analysis)
(hasPart primary data (corpus) ortolang-000919 Corpus IPC)
(hasPart primary data (corpus) ortolang-000923 Corpus Aquarel Santé)
10441 bytes
5 files
Largest file: 3176 bytes
Corpus Aquarel Santé
ortolang-000923
(Jorane SAUBESTY)
Primary data (corpus)
Created 2015-05-21
Modified 2015-05-21
Individual contribution

Corpus for training physicians to disclose serious adverse events, from training set up at the CHU d’Angers by the organisation Aquarel Santé.

Video corpus.


(discourse_analysis)
French (français)
isPartOf collection ortolang-000922 ACORFORMed
16933 bytes
5 files
Largest file: 6641 bytes
Corpus St-Maur (corpus)
prax000925
(Praxiling - UMR 5267 (Montpellier FR))
Primary data (corpus)
Created 2015-06-09
Modified 2015-06-09
Praxiling - UMR 5267 (Montpellier FR)

The “Corpus Saint-Maur“ was collected from May 2008 to June 2009 as part of the doctoral work « Interactions et pratiques d’un processus d’innovation pédagogique en environnement carcéral » (Alidières 2013).

It consists of audiovisual recordings covering, on the one hand, activities in the computer room of the Maison d’Arrêt and, on the other [...]


(anthropological_linguistics, applied_linguistics)
French (français)
isRequiredBy secondary data (resource) prax000929 Corpus St-Maur (documents)
(isPartOf collection prax000928 Corpus St-Maur)
19608 bytes
4 files
Largest file: 7626 bytes
CREAGEST - Dialogue entre adultes sourds
ortolang-000926
(Structures formelles du langage - UMR 7023 (SFL, Paris FR), Savoirs, textes et langage - UMR 8163 (STL, Lille FR), Groupe d’imagerie neurofonctionnelle - UMR 6194, CNRS/CEA/Université de Caen (Caen FR))
Primary data (corpus)
Created 2015-06-11
Modified 2015-06-11
This material is Open Data
Structures formelles du langage - UMR 7023 (SFL, Paris FR)
Savoirs, textes et langage - UMR 8163 (STL, Lille FR)
Groupe d’imagerie neurofonctionnelle - UMR 6194, CNRS/CEA/Université de Caen (Caen FR)

This is a corpus of dialogues between deaf adults (106 hours): 51 interviews, led by four deaf interviewers from four different regions of France (semi-structured interviews, 3 cameras).

The videos made available here are a subset of the full corpus: 7 excerpts filmed with three cameras (i.e. 21 files).


(discourse_analysis, sociolinguistics, language_acquisition)
(Unclassified)
isPartOf collection ortolang-000912 CREAGEST
327 Mb
28 files
Largest file: 21.40 Mb
Corpus St-Maur
prax000928
(Praxiling - UMR 5267 (Montpellier FR))
Collection
Created 2015-06-25
Modified 2015-06-25
Praxiling - UMR 5267 (Montpellier FR)

The “Corpus Saint-Maur“ was collected from May 2008 to June 2009 as part of the doctoral work « Interactions et pratiques d’un processus d’innovation pédagogique en environnement carcéral » (Alidières 2013).

It consists of audiovisual recordings covering, on the one hand, activities in the computer room of the Maison d’Arrêt and, on the other [...]


(applied_linguistics, anthropological_linguistics)
French
hasPart primary data (corpus) prax000925 Corpus St-Maur (corpus)
hasPart secondary data (resource) prax000929 Corpus St-Maur (documents)
Corpus St-Maur (documents)
prax000929
(Praxiling - UMR 5267 (Montpellier FR))
Secondary data (resource)
Created 2015-06-29
Modified 2015-06-29
Praxiling - UMR 5267 (Montpellier FR)

Documents and studies relating to the “Corpus St-Maur“ corpus


(applied_linguistics, anthropological_linguistics)
French
requires primary data (corpus) prax000925 Corpus St-Maur (corpus)
(isPartOf collection prax000928 Corpus St-Maur)
GypsyLang
prax000930
(Praxiling - UMR 5267 (Montpellier FR))
Collection
Created 2015-06-29
Modified 2015-06-29
Praxiling - UMR 5267 (Montpellier FR)

The present corpus, « Paroles de locuteurs gitans et non gitans sur trois générations à Perpignan », was collected between September 2013 and June 2014. It stems from research funded by a European Social Fund (ESF) grant in partnership with the Education Nationale and the Direction de l’Action Éducative et de l’Enfance de la Ville de Perpignan, in order to better understand the mechanisms [...]


(sociolinguistics, language_acquisition, anthropological_linguistics)
hasPart secondary data (resource) prax000932 GypsyLang (documents)
hasPart primary data (corpus) prax000931 GypsyLang (corpus)
hasPart secondary data (resource) prax000933 GypsyLang (transcriptions)
GypsyLang (corpus)
prax000931
(Praxiling - UMR 5267 (Montpellier FR))
Primary data (corpus)
Created 2015-06-29
Modified 2015-06-29
Praxiling - UMR 5267 (Montpellier FR)

Primary data of the “GypsyLang“ corpus


(anthropological_linguistics, sociolinguistics, language_acquisition)
Northern Catalan
isRequiredBy secondary data (resource) prax000933 GypsyLang (transcriptions)
(isPartOf collection prax000930 GypsyLang)
isRequiredBy secondary data (resource) prax000932 GypsyLang (documents)
GypsyLang (documents)
prax000932
(Praxiling - UMR 5267 (Montpellier FR))
Secondary data (resource)
Created 2015-06-29
Modified 2015-06-29
Praxiling - UMR 5267 (Montpellier FR)

Documents and studies relating to the “GypsyLang“ corpus


(sociolinguistics, language_acquisition, anthropological_linguistics)
(isPartOf collection prax000930 GypsyLang)
(requires primary data (corpus) prax000931 GypsyLang (corpus))
GypsyLang (transcriptions)
prax000933
(Praxiling - UMR 5267 (Montpellier FR))
Secondary data (resource)
Created 2015-06-29
Modified 2015-06-29
Praxiling - UMR 5267 (Montpellier FR)

Transcriptions of the “GypsyLang“ corpus


(sociolinguistics, language_acquisition, anthropological_linguistics)
Catalan -> Northern Catalan
(requires primary data (corpus) prax000931 GypsyLang (corpus))
(isPartOf collection prax000930 GypsyLang)
Corpus ese’eja
ortolang-000934
(Marine VUILLERMET)
Primary data (corpus)
Created 2015-07-03
Modified 2015-07-09
Dynamique du langage - UMR 5596 (DDL, Lyon FR)

I recorded the data between 2005 and 2013, with the financial support of the following institutions: Endangered Language Documentation Programme (2007), Afrique Amérique Latine Langues En Danger (ANR AALLED, 2008-2009) and Endangered Language Fund (2008). Ese’eja is an endangered language spoken by about 1,700 people in the Bolivian and Peruvian Amazon. The recordings mostly come from speakers from [...]


(language_documentation, sociolinguistics, anthropological_linguistics)
Ese ejja
356626 Mb
29206 files
Largest file: 80948.86 Mb
Methodology and software for Semi-Automatic multi-domain annotations: annotating, exploring and sharing data
ortolang-000937
(Grégoire MOREAU DE MONTCHEUIL)
Primary data (corpus)
Created 2015-09-21
Modified 2015-09-14
This material is Open Data
Individual contribution

A tutorial that presents how to collect a large set of annotations of various domains: phonetic (words, syllables, phonemes), prosody (Momel and INTSINT), syntax (categories, groups), self-repetitions and gestures. It shows that, with good practices, it is possible to merge such annotations in a single representation for sharing and exploring in an efficient way.


(applied_linguistics, computational_linguistics, text_and_corpus_linguistics)
English
28 Mb
110 files
Largest file: 14.36 Mb
SITAF (tandems anglais/français)
ortolang-000939
(Céline HORGUES)
Primary data (corpus)
Created 2015-09-23
Modified 2015-09-30
Individual contribution

Video-recorded corpus of tandem interactions between French-speaking and English-speaking students at the University Sorbonne Nouvelle - Paris 3. Each of the 21 tandem pairs performs the collaborative speaking tasks (story-telling, debating, reading) on 2 occasions (2 recording sessions) 3 months apart. The corpus also comprises L1-L1 control interactions for the participants.
Metadata is [...]


(applied_linguistics, language_acquisition, phonetics)
French; English
description http://sldr.org/publi/146/en
Projet Noms Verbes (MEG)
blri-000940
(ALARIO François-Xavier, LPC, BADIER Jean-Michel, INS, STRIJKERS Kristof, LPC, CHANOINE Valérie, BLRI)
Primary data (corpus)
Created 2015-09-30
Modified 2015-09-30
Laboratoire de Psychologie Cognitive - UMR7290 (LPC, Marseille FR)
Institut de Neurosciences des Systèmes (INS, Marseille FR)

Response preparation and encoding in word production. Testing recent models (2011-2012-2013) of word production based on psycholinguistics, motor control and neuroscience.


(cognitive_science, neurolinguistics, psycholinguistics)
French (français)
isPartOf collection blri-000835 BLRI
131208 Mb
2656 files
Largest file: 1338.93 Mb
Projet Neuroling (IRMf)
blri-000941
(FRENCK-MESTRE Cheryl, LPL, ANTON Jean-Luc, INT, BARKAT Mélissa, Praxiling, DJOURI Rym)
Primary data (corpus)
Created 2015-09-30
Modified 2015-09-30
Laboratoire parole et langage - UMR 7309 (LPL, Aix-en-Provence FR)
Institut des Neurosciences de la Timone (INT, Marseille FR)
Praxiling - UMR 5267 (Montpellier FR)

Recognition of the Arabic language.


(cognitive_science, neurolinguistics)
French (français)
isPartOf collection blri-000835 BLRI
23967 bytes
4 files
Largest file: 9442 bytes
Projet Intermod (IRMf)
blri-000942
(Pattamadilok Chotiga, LPL, ZIEGLER Johannes, LPC, BELIN Pascal, INT, CHANOINE Valérie, BLRI)
Primary data (corpus)
Created 2015-09-30
Modified 2015-09-30
Laboratoire parole et langage - UMR 7309 (LPL, Aix-en-Provence FR)
Laboratoire de Psychologie Cognitive - UMR7290 (LPC, Marseille FR)
Institut des Neurosciences de la Timone (INT, Marseille FR)

The role of consciousness and top-down processes on the contribution of the auditory cortex during reading.


(neurolinguistics, cognitive_science)
French (français)
isPartOf collection blri-000835 BLRI
52777 bytes
6 files
Largest file: 26199 bytes
Projet anti-tabac (IRMf)
blri-000943
(SORIANO Alice, LPC, OULLIER Olivier, LPC)
Primary data (corpus)
Created 2015-09-30
Modified 2015-09-30
Laboratoire de Psychologie Cognitive - UMR7290 (LPC, Marseille FR)

Effect of combined anti-smoking health warnings on brain activity evoked by smoking-related stimuli (“smoking cue reactivity”).


(cognitive_science)
French (français)
isPartOf collection blri-000835 BLRI
16813 bytes
4 files
Largest file: 6438 bytes

TOTAL

82 items
1206626 Mb
531145 files
Largest file: 80948.86 Mb
