Speech and Language Data Repository (SLDR/ORTOLANG)

Publications (9)

SPPAS - Automatic Annotation of Speech

Tool sldr000800

ID Bibliographical reference Abstract
129 Brigitte Bigi (2012).
SPPAS: a tool for the phonetic segmentations of Speech,
Language Resources and Evaluation Conference, Istanbul (Turkey), pages 1748-1755, ISBN 978-2-9517408-7-7.
http://www.lpl-aix.fr/~bigi/Doc/bigi2012lrecsppas.pdf
Tool
SPPAS - Automatic Annotation of Speech (sldr000800)
SPPAS is a tool to produce automatic annotations which include utterance, word, syllabic and phonemic segmentations from a recorded speech sound and its transcription. SPPAS is distributed under the terms of the GNU Public License. It was successfully applied during the Evalita 2011 campaign, on Italian map-task dialogues. It can also deal with French, English and Chinese and there is an easy way to add other languages. The paper describes the development of resources and free tools, consisting of acoustic models, phonetic dictionaries, and libraries and programs to deal with these data. All of them are publicly available.
131 Brigitte Bigi (2013).
A phonetization approach for the forced-alignment task.
3rd Less-Resourced Languages workshop, 6th Language & Technology Conference, Poznan (Poland).
http://www.lpl-aix.fr/~bigi/Doc/bigi2013ltc.pdf
Tool
SPPAS - Automatic Annotation of Speech (sldr000800)
The phonetization of text corpora requires a sequence of processing steps and resources in order to convert a normalized text into its constituent phones and then to directly exploit it in a given application. This paper presents a generic approach for text phonetization and concentrates on the aspects of phonetizing unknown words, which serve to develop a phonetizer in the context of a forced-alignment application. It is a dictionary-based approach, which is as language-independent as possible: this approach is applied to French, English, Vietnamese, Khmer and Pinyin for Chinese. The tool and its linked resources are distributed under the terms of the GPL license.
132 Brigitte Bigi (2014).
Automatic Speech Segmentation of French: Corpus Adaptation.
2nd Asian Pacific Corpus Linguistics Conference, p. 32, Hong Kong.
http://www.lpl-aix.fr/~bigi/Doc/bigi2014apclc-slides.pdf
Tool
SPPAS - Automatic Annotation of Speech (sldr000800)
135 Brigitte Bigi (2014).
The SPPAS participation to Evalita 2014.
Proceedings of the First Italian Conference on Computational Linguistics CLiC-it 2014 and the Fourth International Workshop EVALITA 2014. Pisa (Italy). Editors R. Basili, A. Lenci, B. Magnini. ISBN 978-886741-472-7. Volume 2. Pages 127-130.
http://www.lpl-aix.fr/~bigi/Doc/bigi_EVALITA2014.pdf
Tool
SPPAS - Automatic Annotation of Speech (sldr000800)
SPPAS is a tool to automatically produce annotations which include utterance, word, syllabic and phonemic segmentation from a recorded speech sound and its transcription. This paper describes the participation of SPPAS in evaluations related to the “Forced Alignment on Children Speech” task of Evalita 2014. SPPAS is “user-friendly”, open-source software mainly intended for linguists.
136 Brigitte Bigi, Caterina Petrone (2014).
A generic tool for the automatic syllabification of Italian.
Proceedings of the First Italian Conference on Computational Linguistics CLiC-it 2014 and the Fourth International Workshop EVALITA 2014. Pisa (Italy). Editors R. Basili, A. Lenci, B. Magnini. ISBN 978-886741-472-7. Volume 1. Pages 73-77.
http://www.lpl-aix.fr/~bigi/Doc/bigi_CLIC2014.pdf
Tool
SPPAS - Automatic Annotation of Speech (sldr000800)
This paper presents a rule-based automatic syllabification for Italian. Unlike previously proposed syllabifiers, our approach is more user-friendly since the Python algorithm includes both a command-line and a graphical user interface. Moreover, phonemes, classes and rules are listed in an external configuration file of the tool, which can be easily modified by any user. Syllabification performance is consistent with manual annotation. This algorithm is included in SPPAS, software for automatic speech segmentation, and distributed under the terms of the GPL license.
128 Brigitte Bigi, Daniel Hirst (2012).
SPeech Phonetization Alignment and Syllabification (SPPAS): a tool for the automatic analysis of speech prosody.
Speech Prosody, Tongji University Press, ISBN 978-7-5608-4869-3, pages 19-22, Shanghai (China).
http://lpl-aix.fr/~bigi/Doc/bigi2012speechprosody.pdf
Tool
SPPAS - Automatic Annotation of Speech (sldr000800)
SPPAS, SPeech Phonetization Alignment and Syllabification, is a tool to automatically produce annotations which include utterance, word, syllable and phoneme segmentations from a recorded speech sound and its transcription. SPPAS is currently implemented for French, English, Italian and Chinese and there is a very simple procedure to add other languages. The tool is developed for Unix-based platforms (Linux, MacOS and Cygwin on Windows) and is specifically designed to be used directly by linguists in conjunction with other tools for the automatic analysis of speech prosody. The tools will all be distributed under a GPL license.
130 Brigitte Bigi, Daniel Hirst (2013).
What's new in SPPAS 1.5?
Proceedings of Tools and Resources for the Analysis of Speech Prosody, Aix-en-Provence, France, Eds B. Bigi and D. Hirst, ISBN: 978-2-7466-6443-2, pp. 62-65.
http://www.lpl-aix.fr/~trasp/Proceedings/20354-trasp2013.pdf
Tool
SPPAS - Automatic Annotation of Speech (sldr000800)
During Speech Prosody 2012, we presented SPPAS, SPeech Phonetization Alignment and Syllabification, a tool to automatically produce annotations which include utterance, word, syllabic and phonemic segmentations from a recorded speech sound and its transcription. SPPAS is open source software issued under the GNU Public License. SPPAS is multi-platform (Linux, MacOS and Windows) and it is specifically designed to be used directly by linguists in conjunction with other tools for the automatic analysis of speech prosody. This paper presents various improvements implemented since the previously described version.
134 Brigitte Bigi, Roxane Bertrand, Mathilde Guardiola (2014).
Automatic detection of other-repetition occurrences: application to French conversational speech.
9th International conference on Language Resources and Evaluation (LREC), Reykjavik (Iceland), pages 836-842. ISBN: 978-2-9517408-8-4.
http://www.lpl-aix.fr/~bigi/Doc/bigi_LREC2014_71.pdf
Tool
SPPAS - Automatic Annotation of Speech (sldr000800)
This paper investigates the discursive phenomenon called other-repetitions (OR), particularly in the context of spontaneous French dialogues. It focuses on their automatic detection and characterization. A method is proposed to automatically retrieve OR: this detection is based on rules that are applied to the lexical material only. This automatic detection process has been used to label other-repetitions in 8 dialogues of CID - Corpus of Interactional Data. Evaluations performed on one speaker are good, with an F1-measure of 0.85. Retrieved OR occurrences are then statistically described: number of words, distance, etc.
133 Brigitte Bigi, Tatsuya Watanabe, Laurent Prévot (2014).
Representing Multimodal Linguistics Annotated Data.
9th International conference on Language Resources and Evaluation (LREC), Reykjavik (Iceland), pages 3386-3392. ISBN: 978-2-9517408-8-4.
http://www.lpl-aix.fr/~bigi/Doc/bigi_LREC2014_51.pdf
Tool
SPPAS - Automatic Annotation of Speech (sldr000800)
The question of interoperability for linguistically annotated resources requires covering several aspects. First, it requires a representation framework making it possible to compare, and potentially merge, different annotation schemas. In this paper, a general description level representing the multimodal linguistic annotations is proposed. It focuses on time and data content representation. This paper reconsiders and enhances the current and generalized representation of annotations. An XML schema of such annotations is proposed. A Python API is also proposed. This framework is implemented in multi-platform software and distributed under the terms of the GNU Public License.