Publications
2011
Christian Raymond & Vincent Claveau
Participation de l'IRISA à DEFT 2011: expériences avec des approches d'apprentissage supervisé et non-supervisé
Défi Fouille de Texte (DEFT2011)  
July, Montpellier
Abstract: This article presents the participation of the IRISA TexMex team at DEFT 2011. We participated in both proposed tasks and in all tracks, exploring several approaches. We employed specific learning techniques based on boosting over decision trees and lazy learning, together with weights from the information retrieval field. These approaches yielded good results: we ranked first on the dating task and obtained accuracies of 99% and 99.5% on the pairing task.
BibTeX:

@inproceedings{Raymond.Claveau_2011,
  author = {Christian Raymond and Vincent Claveau},
  title = {Participation de l'IRISA à DEFT 2011: expériences avec des approches d'apprentissage supervisé et non-supervisé},
  booktitle = {Défi Fouille de Texte (DEFT2011)},
  year = {2011},
  month = {July},
  address = {Montpellier}
}

2010
Frédéric Béchet, Christian Raymond, Frédéric Duvert & Renato De Mori
Frame Based Interpretation Of Conversational Speech
Spoken Language Technologies Workshop  
December, Berkeley, California, U.S.A
Abstract: Two approaches to Spoken Language Understanding based on frames describing chunked knowledge are described. They are applied to the MEDIA corpus annotated in terms of concepts expressing chunks of spoken sentences. General rules of knowledge composition and inference appear to be adequate for effectively applying the application ontology to obtain frame-based representations of dialogue turns. The main difficulty appears to be the characterization of the syntactic knowledge expressing semantic links between knowledge chunks. This knowledge can be hand-crafted or automatically learned from examples. It is shown that the latter approach outperforms the former when applied to error-prone ASR transcriptions.
BibTeX:

@inproceedings{Bechet.Raymond.ea_2010,
  author = {Frédéric Béchet and Christian Raymond and Frédéric Duvert and De Mori, Renato},
  title = {Frame Based Interpretation Of Conversational Speech},
  booktitle = {Spoken Language Technologies Workshop},
  year = {2010},
  month = {December},
  address = {Berkeley, California, U.S.A}
}

Julien Fayolle, Fabienne Moreau, Christian Raymond, Guillaume Gravier & Patrick Gros
CRF-based Combination of Contextual Features to Improve A Posteriori Word-level Confidence Measures
International Conference on Speech Communication and Technologies  
September, Makuhari, Japan
Abstract: This paper addresses the issue of confidence measure reliability provided by automatic speech recognition systems for use in various spoken language processing applications. In this context, we propose a conditional random field (CRF)-based combination of contextual features to improve word-level confidence measures. More precisely, the method consists in combining phonetic, lexical, linguistic and semantic features to enhance confidence measures, explicitly exploiting context information. The combination is performed using CRFs, whose selected patterns make it possible to establish a precise diagnosis of the value of individual and contextual features. Experiments, conducted on the large French broadcast news corpus ESTER, demonstrate the added value of the proposed CRF-based combination of contextual features, with a significant improvement of the normalized cross entropy and of the equal error rate.
BibTeX:

@inproceedings{Fayolle.Moreau.ea_2010,
  author = {Julien Fayolle and Fabienne Moreau and Christian Raymond and Guillaume Gravier and Patrick Gros},
  title = {CRF-based Combination of Contextual Features to Improve A Posteriori Word-level Confidence Measures},
  booktitle = {International Conference on Speech Communication and Technologies},
  year = {2010},
  month = {September},
  address = {Makuhari, Japan}
}

Julien Fayolle, Fabienne Moreau, Christian Raymond & Guillaume Gravier
Reshaping Automatic Speech Transcripts for Robust High-level Spoken Document Analysis
Analytics for Noisy Unstructured Text Data  
October, Toronto, Canada
Best Student Paper Award
Abstract: High-level spoken document analysis is required in many applications seeking access to the semantic content of audio data, such as information retrieval, machine translation or automatic summarization. It is nevertheless a difficult task that is generally based on transcripts provided by an automatic speech recognition system. Unlike standard texts, transcripts belong to the category of highly noisy data because of word recognition errors that affect, in particular, very significant words such as named entities (e.g. person names, locations, organizations). Transcripts also contain specificities of spoken language that make natural language processing tools designed for texts ineffective on them. To overcome these issues, this paper proposes a method to reshape automatic speech transcripts for robust high-level spoken document analysis. The method consists in designing a new word-level confidence measure that can efficiently ensure the reliability of transcribed words, focusing on words that are relevant for high-level spoken document analysis such as named entities. The approach combines different features collected from various sources of knowledge using a machine learning method based on conditional random fields. In addition to standard features (morphosyntactic, linguistic and phonetic), we introduce new semantic features based on the decisions of three robust named entity recognition systems to better estimate the reliability of named entities. Experiments, conducted on the French broadcast news corpus ESTER, demonstrate the added value of the proposed word-level confidence measure for error detection and named entity recognition, with respect to the basic confidence measure provided by an automatic speech recognition system.
BibTeX:

@inproceedings{Fayolle.Moreau.ea_2010a,
  author = {Julien Fayolle and Fabienne Moreau and Christian Raymond and Guillaume Gravier},
  title = {Reshaping Automatic Speech Transcripts for Robust High-level Spoken Document Analysis},
  booktitle = {Analytics for Noisy Unstructured Text Data},
  year = {2010},
  month = {October},
  address = {Toronto, Canada},
  note = {Best Student Paper Award}
}

Stefan Hahn, Marco Dinarelli, Christian Raymond, Fabrice Lefèvre, Patrick Lehnen, Renato De Mori, Alessandro Moschitti, Hermann Ney & Giuseppe Riccardi
Comparing Stochastic Approaches to Spoken Language Understanding in Multiple Languages
IEEE Transactions on Audio, Speech and Language Processing
vol. PP (99), pages 1,
Abstract: One of the first steps in building a spoken language understanding (SLU) module for dialogue systems is the extraction of flat concepts out of a given word sequence, usually provided by an automatic speech recognition (ASR) system. In this paper, six different modeling approaches are investigated to tackle the task of concept tagging. These methods include classical, well-known generative and discriminative methods like Finite State Transducers (FST), Statistical Machine Translation (SMT), Maximum Entropy Markov Models (MEMM), or Support Vector Machines (SVM) as well as techniques recently applied to natural language processing such as Conditional Random Fields (CRF) or Dynamic Bayesian Networks (DBN). Following a detailed description of the models, experimental and comparative results are presented on three corpora in different languages and with different complexity. The French MEDIA corpus has already been exploited during an evaluation campaign and so a direct comparison with existing benchmarks is possible. Recently collected Italian and Polish corpora are used to test the robustness and portability of the modeling approaches. For all tasks, manual transcriptions as well as ASR inputs are considered. In addition to single systems, methods for system combination are investigated. The best performing model on all tasks is based on conditional random fields. On the MEDIA evaluation corpus, a concept error rate (CER) of 12.6% could be achieved. Here, in addition to attribute names, attribute values have been extracted using a combination of a rule-based and a statistical approach. Applying system combination using weighted ROVER with all six systems, the CER drops to 12.0%.
BibTeX:

@article{Hahn.Dinarelli.ea_2010,
  author = {Stefan Hahn and Marco Dinarelli and Christian Raymond and Fabrice Lefèvre and Patrick Lehnen and De Mori, Renato and Alessandro Moschitti and Hermann Ney and Giuseppe Riccardi},
  title = {Comparing Stochastic Approaches to Spoken Language Understanding in Multiple Languages},
  journal = {{IEEE} {T}ransactions on {A}udio, {S}peech and {L}anguage {P}rocessing},
  year = {2010},
  volume = {PP},
  number = {99},
  pages = {1},
  doi = {10.1109/TASL.2010.2093520}

}

Christian Raymond & Julien Fayolle
Reconnaissance robuste d'entités nommées sur de la parole transcrite automatiquement
Traitement Automatique des Langues Naturelles  
July, Montréal, Canada
Abstract: Automatic speech transcripts are an important, but noisy, resource to index spoken multimedia documents (e.g. broadcast news). In order to improve both indexing and information retrieval, extracting semantic information from these erroneous transcripts is an interesting challenge. Among these meaningful contents, there are named entities (e.g. names of persons) which are the subject of this work. Traditional named entity taggers are based on manual and formal grammars. They obtain correct performance on text or clean manual speech transcripts, but they lack robustness when applied to automatic transcripts. In this work, we introduce three methods for named entity recognition based on machine learning algorithms, namely conditional random fields, support vector machines, and finite state transducers. We also introduce a method to make the training data consistent when they are annotated with slightly different conventions. We show that our tagger systems are among the most robust when applied to the evaluation data of the French ESTER 2 campaign in the most difficult conditions where transcripts are particularly noisy.
BibTeX:

@inproceedings{Raymond.Fayolle_2010,
  author = {Christian Raymond and Julien Fayolle},
  title = {Reconnaissance robuste d'entités nommées sur de la parole transcrite automatiquement},
  booktitle = {Traitement Automatique des Langues Naturelles},
  year = {2010},
  month = {July},
  address = {Montréal, Canada}
}

Christophe Servan, Nathalie Camelin, Christian Raymond, Frédéric Béchet & Renato De Mori
On The Use Of Machine Translation For Spoken Language Understanding Portability
Proceedings of the International Conference on Acoustics, Speech, and Signal Processing
pages 5330-5333, Dallas, Texas, USA
Abstract: Cross-language portability of a spoken language understanding (SLU) system deals with the possibility of reusing, with moderate effort, knowledge and data acquired for one language in a new language.
The approach proposed in this paper is motivated by the availability of the fairly large MEDIA corpus carefully transcribed in French and semantically annotated in terms of constituents. A method is proposed for manually translating a portion of the training set for training an automatic machine translation (MT) system to be used for translating the remaining data. As the source language is annotated in terms of concept tags, a solution is presented for automatically transferring these tags to the translated corpus. Experimental results are presented on the accuracy of the translation expressed with the BLEU score as a function of the size of the training corpus. It is shown that the process leads to comparable concept error rates in the two languages, making the proposed approach suitable for SLU portability across languages.
BibTeX:

@inproceedings{Servan.Camelin.ea_2010,
  author = {Christophe Servan and Nathalie Camelin and Christian Raymond and Frédéric Béchet and De Mori, Renato},
  title = {On The Use Of Machine Translation For Spoken Language Understanding Portability},
  booktitle = {{P}roceedings of the {I}nternational {C}onference on {A}coustics, {S}peech, and {S}ignal {P}rocessing},
  year = {2010},
  pages = {5330-5333},
  address = {Dallas, Texas, USA},
  doi = {10.1109/ICASSP.2010.5494960}

}

2008
Stefan Hahn, Patrick Lehnen, Christian Raymond & Hermann Ney
A Comparison of Various Methods for Concept Tagging for Spoken Language Understanding
Proceedings of the Language Resources and Evaluation Conference  
May, Marrakech, Morocco
Abstract: The extraction of flat concepts out of a given word sequence is usually one of the first steps in building a spoken language understanding (SLU) or dialogue system. This paper explores five different modelling approaches for this task and presents results on a state-of-the-art corpus. Additionally, two log-linear modelling approaches could be further improved by adding morphological knowledge. This paper goes beyond what has been reported in the literature, e.g. in (Raymond & Riccardi 07). We applied the models on the same training and testing data and used the NIST scoring toolkit to evaluate the experimental results to ensure identical conditions for each of the experiments and the comparability of the results.
BibTeX:

@inproceedings{Hahn.Lehnen.ea_2008,
  author = {Stefan Hahn and Patrick Lehnen and Christian Raymond and Hermann Ney},
  title = {A Comparison of Various Methods for Concept Tagging for Spoken Language Understanding},
  booktitle = {Proceedings of the Language Resources and Evaluation Conference},
  year = {2008},
  month = {May},
  address = {Marrakech, Morocco}
}

Christian Raymond & Giuseppe Riccardi
Learning with Noisy Supervision for Spoken Language Understanding
Proceedings of the International Conference on Acoustics, Speech, and Signal Processing
pages 4989-4992, Las Vegas, USA
Abstract: Data-driven Spoken Language Understanding (SLU) systems need semantically annotated data, which are expensive and time-consuming to produce and prone to human errors. Active learning has been successfully applied to automatic speech recognition and utterance classification. In general, corpora annotation for SLU involves such tasks as sentence segmentation, chunking or frame labeling and predicate-argument annotation. In such cases human annotations are subject to errors that increase with the annotation complexity. We investigate two alternative noise-robust active learning strategies that are either data-intensive or supervision-intensive. The strategies detect likely erroneous examples and significantly improve SLU performance for a given labeling cost. We apply uncertainty-based active learning with conditional random fields on the concept segmentation task for SLU. We perform annotation experiments on two databases, namely ATIS (English) and Media (French). We show that our noise-robust algorithm could improve the accuracy by up to 6% (absolute) depending on the noise level and the labeling cost.
BibTeX:

@inproceedings{Raymond.Riccardi_2008,
  author = {Christian Raymond and Giuseppe Riccardi},
  title = {Learning with Noisy Supervision for Spoken Language Understanding},
  booktitle = {{P}roceedings of the {I}nternational {C}onference on {A}coustics, {S}peech, and {S}ignal {P}rocessing},
  year = {2008},
  pages = {4989-4992},
  address = {Las Vegas, USA},
  doi = {10.1109/ICASSP.2008.4518778}

}

Christian Raymond, Kepa Joseba Rodriguez & Giuseppe Riccardi
Active Annotation in the LUNA Italian Corpus of Spontaneous Dialogues
Proceedings of the Language Resources and Evaluation Conference  
May, Marrakech, Morocco
Abstract: In this paper we present an active approach to annotate with lexical and semantic labels an Italian corpus of conversational human-human and Wizard-of-Oz dialogues. This procedure consists in the use of a machine learner to assist human annotators in the labeling task. The computer-assisted process engages human annotators to check and correct the automatic annotation rather than starting the annotation from un-annotated data. The active learning procedure is combined with annotation error detection to control the reliability of the annotation. With the goal of converging as fast as possible to reliable automatic annotations while minimizing the human effort, we follow the active learning paradigm, which selects for annotation the most informative training examples required to achieve a better level of performance. We show that this procedure converges quickly on correct annotations and thus minimizes the cost of human supervision.
BibTeX:

@inproceedings{Raymond.Rodriguez.ea_2008,
  author = {Christian Raymond and Kepa Joseba Rodriguez and Giuseppe Riccardi},
  title = {Active Annotation in the LUNA Italian Corpus of Spontaneous Dialogues},
  booktitle = {Proceedings of the Language Resources and Evaluation Conference},
  year = {2008},
  month = {May},
  address = {Marrakech, Morocco}
}

Christian Raymond & Kepa Joseba Rodriguez
Annotation dynamique dans le corpus italien de dialogues spontanés LUNA
Journées d'Études sur la Parole  
June, Avignon, France
Abstract: In the context of the LUNA project, this paper presents the semantic annotation procedure we are following on an Italian corpus. This corpus consists of human-human spontaneous dialogues recorded in the call center of the help desk facility of the Consortium for Information Systems of the Piedmont. The aim of our semantic annotation procedure is to speed up the manual annotation of the corpus and make it more reliable. This procedure consists in using a statistical learner to automatically annotate transcribed files at the semantic level and to generate automatically annotated files in the input format of the annotation tool: human annotators only have to check and correct these annotations instead of starting from scratch. In order to converge as fast as possible to reliable automatic annotations and thus minimize the human effort, this procedure follows the active learning paradigm. The active learning procedure is coupled with annotation error detection to ensure more reliable annotation.
BibTeX:

@inproceedings{Raymond.Rodriguez_2008,
  author = {Christian Raymond and Kepa Joseba Rodriguez},
  title = {Annotation dynamique dans le corpus italien de dialogues spontanés LUNA},
  booktitle = {Journées d'Études sur la Parole},
  year = {2008},
  month = {June},
  address = {Avignon, France}
}

2007
Alessandro Moschitti, Giuseppe Riccardi & Christian Raymond
Spoken Language Understanding With Kernels For Syntactic/Semantic Structures
Proceedings IEEE Workshop on Automatic Speech Recognition and Understanding  
Kyoto, Japan
Abstract: Automatic concept segmentation and labeling are the fundamental problems of Spoken Language Understanding in dialog systems. Such tasks are usually approached by using generative or discriminative models based on n-grams. As the uncertainty or ambiguity of the spoken input to dialog systems increases, we expect to need dependencies beyond n-gram statistics. In this paper, a general purpose statistical syntactic parser is used to detect syntactic/semantic dependencies between concepts in order to increase the accuracy of sentence segmentation and concept labeling. The main novelty of the approach is the use of new tree kernel functions which encode syntactic/semantic structures in discriminative learning models. We experimented with Support Vector Machines and the above kernels on the standard ATIS dataset. The proposed algorithm automatically parses natural language text with an off-the-shelf statistical parser and labels the syntactic (sub)trees with concept labels. The results show that the proposed model is very accurate and competitive with respect to state-of-the-art models when combined with n-gram based models.
BibTeX:

@inproceedings{Moschitti.Riccardi.ea_2007,
  author = {Alessandro Moschitti and Giuseppe Riccardi and Christian Raymond},
  title = {Spoken Language Understanding With Kernels For Syntactic/Semantic Structures},
  booktitle = {{P}roceedings {IEEE} {W}orkshop on {A}utomatic {S}peech {R}ecognition and {U}nderstanding},
  year = {2007},
  address = {Kyoto, Japan},
  doi = {10.1109/ASRU.2007.4430106}

}

Christian Raymond, Frédéric Béchet, Nathalie Camelin, Renato De Mori & Géraldine Damnati
Sequential decision strategies for machine interpretation of speech
IEEE Transactions on Audio, Speech and Language Processing
vol. 15 pages 162-171,
Abstract: Recognition errors made by Automatic Speech Recognition (ASR) systems may not prevent the development of useful dialogue applications if the interpretation strategy has an introspection capability for evaluating the reliability of the results. This paper proposes an interpretation strategy which is particularly effective when applications are developed with a training corpus of moderate size. From the lattice of word hypotheses generated by an ASR system, a short list of conceptual structures is obtained with a set of Finite State Machines (FSM). Interpretation or a rejection decision is then performed by a tree-based strategy. The nodes of the tree correspond to elaboration decision units containing a redundant set of classifiers. A decision-tree-based classifier and two large-margin classifiers are trained with a development set to become interpretation knowledge sources. Discriminative training of the classifiers selects linguistic and confidence-based features for contributing to a cooperative assessment of the reliability of an interpretation. Such an assessment leads to the definition of a limited number of reliability states. The probability that a proposed interpretation is correct is provided by its reliability state and transmitted to the dialogue manager. Experimental results are presented for a telephone service application.
BibTeX:

@article{Raymond.Bechet.ea_2007,
  author = {Christian Raymond and Frédéric Béchet and Nathalie Camelin and De Mori, Renato and Géraldine Damnati},
  title = {Sequential decision strategies for machine interpretation of speech},
  journal = {{IEEE} {T}ransactions on {A}udio, {S}peech and {L}anguage {P}rocessing},
  year = {2007},
  volume = {15},
  pages = {162-171},
  doi = {10.1109/TASL.2006.876862}

}

Christian Raymond, Giuseppe Riccardi, Kepa Joseba Rodriguez & Joanna Wiśniewska
LUNA Corpus: an Annotation Scheme for a Multi-domain Multi-lingual Dialogue Corpus
Workshop on the Semantics and Pragmatics of Dialogue, DECALOG'2007  
May, Rovereto, Italy
Abstract: The LUNA corpus is a multi-domain multilingual dialogue corpus currently under development. The corpus will be annotated at multiple levels to include annotations of syntactic, semantic and discourse information and used to develop a robust natural spoken language understanding toolkit for multilingual dialogue services.
BibTeX:

@inproceedings{Raymond.Riccardi.ea_2007,
  author = {Christian Raymond and Giuseppe Riccardi and Kepa Joseba Rodriguez and Joanna Wi\'{s}niewska},
  title = {LUNA Corpus: an Annotation Scheme for a Multi-domain Multi-lingual Dialogue Corpus},
  booktitle = {Workshop on the Semantics and Pragmatics of Dialogue, DECALOG'2007},
  year = {2007},
  month = {May},
  address = {Rovereto, Italy}
}

Christian Raymond & Giuseppe Riccardi
Generative and Discriminative Algorithms for Spoken Language Understanding
International Conference on Speech Communication and Technologies  
August, Antwerp, Belgium
Abstract: Spoken Language Understanding (SLU) for conversational systems (SDS) aims at extracting concepts and their relations from spontaneous speech. Previous approaches to SLU have modeled concept relations as stochastic semantic networks, ranging from generative to discriminative approaches. As the complexity of spoken dialog systems increases, SLU needs to perform understanding based on a richer set of features ranging from a-priori knowledge, long dependency, dialog history, system belief, etc. This paper studies generative and discriminative approaches to modeling the sentence segmentation and concept labeling. We evaluate algorithms based on Finite State Transducers (FST) as well as discriminative algorithms based on a Support Vector Machine sequence classifier and Conditional Random Fields (CRF). We compare them in terms of concept accuracy, generalization and robustness to annotation ambiguities. We also show how non-local non-lexical features (e.g. a-priori knowledge) can be modeled with CRF, which is the best performing algorithm across tasks. The evaluation is carried out on two SLU tasks of different complexity, namely the ATIS and MEDIA corpora.
BibTeX:

@inproceedings{Raymond.Riccardi_2007,
  author = {Christian Raymond and Giuseppe Riccardi},
  title = {Generative and Discriminative Algorithms for Spoken Language Understanding},
  booktitle = {International Conference on Speech Communication and Technologies},
  year = {2007},
  month = {August},
  address = {Antwerp, Belgium}
}

Kepa Joseba Rodriguez, Stefanie Dipper, Michael Götze, Massimo Poesio, Giuseppe Riccardi, Christian Raymond & Joanna Rabiega-Wiśniewska
Standoff Coordination for Multi-Tool Annotation in a Dialogue Corpus
Linguistic Annotation Workshop  
Prague
Abstract: The LUNA corpus is a multi-lingual, multidomain spoken dialogue corpus currently under development that will be used to develop a robust natural spoken language understanding toolkit for multilingual dialogue services. The LUNA corpus will be annotated at multiple levels to include annotations of syntactic, semantic, and discourse information; specialized annotation tools will be used for the annotation at each of these levels. In order to synchronize these multiple layers of annotation, the PAULA standoff exchange format will be used. In this paper, we present the corpus and its PAULA-based architecture.
BibTeX:

@inproceedings{Rodriguez.Dipper.ea_2007,
  author = {Kepa Joseba Rodriguez and Stefanie Dipper and Michael Götze and Massimo Poesio and Giuseppe Riccardi and Christian Raymond and Joanna Rabiega-Wi\'{s}niewska},
  title = {Standoff Coordination for Multi-Tool Annotation in a Dialogue Corpus},
  booktitle = {Linguistic Annotation Workshop},
  year = {2007},
  address = {Prague}
}

2006
Christian Raymond, Frédéric Béchet, Renato De Mori & Géraldine Damnati
On the use of finite state transducers for semantic interpretation
Speech Communication
vol. 48 (3-4), pages 288-304, March-April
Abstract: A Spoken Language Understanding (SLU) system is described. It generates hypotheses of conceptual constituents with a translation process. This process is performed by Finite State Transducers (FST) which accept word patterns from a lattice of word hypotheses generated by an Automatic Speech Recognition (ASR) system. FSTs operate in parallel and may share word hypotheses at their input. Semantic hypotheses are obtained by composition of compatible translations under the control of composition rules. Interpretation hypotheses are scored by the sum of the posterior probabilities of paths in the lattice of word hypotheses supporting the interpretation. A compact structured n-best list of interpretations is obtained and used by the SLU interpretation strategy.
BibTeX:

@article{Raymond.Bechet.ea_2006,
  author = {Christian Raymond and Frédéric Béchet and De Mori, Renato and Géraldine Damnati},
  title = {On the use of finite state transducers for semantic interpretation},
  journal = {Speech Communication},
  year = {2006},
  volume = {48},
  number = {3-4},
  pages = {288-304},
  month = {March-April},
  doi = {10.1016/j.specom.2005.06.012}

}

Christophe Servan, Christian Raymond, Frédéric Béchet & Pascal Nocéra
Conceptual decoding from word lattices: application to the spoken dialogue corpus MEDIA
Proceedings of the International Conference on Spoken Language Processing  
Pittsburgh, USA
Abstract: Within the framework of the French evaluation program MEDIA on spoken dialogue systems, this paper presents the methods proposed at the LIA for the robust extraction of basic conceptual constituents (or concepts) from an audio message. The conceptual decoding model proposed follows a stochastic paradigm and is directly integrated into the Automatic Speech Recognition (ASR) process. This approach allows us to keep the probabilistic search space on sequences of words produced by the ASR module and to project it to a probabilistic search space of sequences of concepts. This paper presents the first ASR results on the French spoken dialogue corpus MEDIA, available through ELDA. The experiments made on this corpus show that the performance reached by our approach is better than the traditional sequential approach that looks first for the best sequence of words before looking for the best sequence of concepts.
BibTeX:

@inproceedings{Servan.Raymond.ea_2006,
  author = {Christophe Servan and Christian Raymond and Frédéric Béchet and Pascal Nocéra},
  title = {Conceptual decoding from word lattices: application to the spoken dialogue corpus MEDIA},
  booktitle = {{P}roceedings of the {I}nternational {C}onference on {S}poken {L}anguage {P}rocessing},
  year = {2006},
  address = {Pittsburgh, USA}
}

Christophe Servan, Christian Raymond, Frédéric Béchet & Pascal Nocéra
Décodage conceptuel à partir de graphes de mots sur le corpus de dialogue Homme-Machine MEDIA
Journées d'Études sur la Parole (JEP)  
June, Dinard, France
Abstract: Within the framework of the French evaluation program MEDIA on spoken dialogue systems, this paper presents the methods proposed at the LIA for the robust extraction of basic conceptual constituents (or concepts) from an audio message. The conceptual decoding model proposed follows a stochastic paradigm and is directly integrated into the Automatic Speech Recognition (ASR) process. This approach allows us to keep the probabilistic search space on sequences of words produced by the ASR module and to project it to a probabilistic search space of sequences of concepts. The experiments carried out on the MEDIA corpus show that the performance reached by our approach is better than the traditional sequential approach that looks first for the best sequence of words before looking for the best sequence of concepts.
BibTeX:

@inproceedings{Servan.Raymond.ea_2006a,
  author = {Christophe Servan and Christian Raymond and Frédéric Béchet and Pascal Nocéra},
  title = {Décodage conceptuel à partir de graphes de mots sur le corpus de dialogue Homme-Machine MEDIA},
  booktitle = {Journées d'Études sur la Parole (JEP)},
  year = {2006},
  month = {June},
  address = {Dinard, France}
}

2005
Christian Raymond, Frédéric Béchet, Nathalie Camelin, Renato De Mori & Géraldine Damnati
Semantic Interpretation With Error Correction
Proceedings of the International Conference on Acoustics, Speech, and Signal Processing
vol. 1, pages 29-32, Philadelphia, USA
Abstract: This paper presents a semantic interpretation strategy, for Spoken Dialogue Systems, including an error correction process. Semantic interpretations output by the Spoken Understanding module may be incorrect, but some semantic components may be correct. A set of situations will be introduced, describing semantic confidence based on the agreement of semantic interpretations proposed by different classification methods. The interpretation strategy considers, with the highest priority, the validation of the interpretation arising from the most likely sequence of words. If the probability, given by our confidence score model, that this interpretation is not correct is high, then possible corrections of it are considered using the other sequences in the N-best lists of possible interpretations. This strategy is evaluated on a dialogue corpus provided by France Telecom R&D and collected for a tourism telephone service. Significant reductions in understanding error rate are obtained, as well as powerful new confidence measures.
BibTeX:

@inproceedings{Raymond.Bechet.ea_2005,
  author = {Christian Raymond and Frédéric Béchet and Nathalie Camelin and De Mori, Renato and Géraldine Damnati},
  title = {Semantic Interpretation With Error Correction},
  booktitle = {{P}roceedings of the {I}nternational {C}onference on {A}coustic {S}peech and {S}ignal {P}rocessing},
  year = {2005},
  volume = {1},
  pages = {29-32},
  address = {Philadelphia, USA},
  doi = {10.1109/ICASSP.2005.1415042}
}

Christian Raymond
Décodage conceptuel: co-articulation des processus de transcription et compréhension dans les systèmes de dialogue
School: Université d'Avignon et des Pays du Vaucluse  
Décembre, Avignon
Abstract: In spoken dialog systems, the process of understanding consists in building a semantic representation from elementary semantic units called concepts. We propose in this document an SLU (Spoken Language Understanding) module. First, we introduce a conceptual language model for the detection and extraction of basic semantic concepts from a speech signal. A decoding process is described with a simple example. This decoding process extracts, from a word lattice generated by an Automatic Speech Recognition (ASR) module, a structured n-best list of interpretations (sets of concepts). This list contains all the interpretations that can be found in the word lattice, with their posterior probabilities, and the n-best values for each interpretation.
Then we introduce some confidence measures used to estimate the quality of the result of the previous decoding process. Finally, we describe the integration of the proposed SLU module in a dialogue application, involving a decision strategy based on the confidence measures introduced before.
BibTeX:

@phdthesis{Raymond_2005,
  author = {Christian Raymond},
  title = {Décodage conceptuel: co-articulation des processus de transcription et compréhension dans les systèmes de dialogue},
  school = {Université d'Avignon et des Pays du Vaucluse},
  year = {2005},
  month = {Décembre},
  address = {Avignon}
}

2004
Christian Raymond, Frédéric Béchet, Renato De Mori & Géraldine Damnati
Stratégie de décodage conceptuel pour les applications de dialogue oral
XXVième Journées d'Études sur la parole (JEP)  
19-22 Avril, Fès, Maroc
Abstract: The approach proposed in this paper is an alternative to the traditional sequential architecture of Spoken Dialogue Systems, where transcribing and understanding a speech signal are two separate processes. By representing all the conceptual structures handled by the Dialogue Manager as Finite State Machines and by building a conceptual model that contains all the possible interpretations at a given dialogue state, we propose a decoding architecture that searches first for the best conceptual interpretations before looking for the best strings of words. The output of this process is a structured n-best list of hypotheses, at the concept and word levels. Several confidence measures are then used to rescore and select a candidate from this list. This paper reports a significant reduction in understanding error rate on a tourist inquiry application developed by France Telecom R&D.
BibTeX:

@inproceedings{Raymond.Bechet.ea_2004,
  author = {Christian Raymond and Frédéric Béchet and De Mori, Renato and Géraldine Damnati},
  title = {Stratégie de décodage conceptuel pour les applications de dialogue oral},
  booktitle = {XXVième Journées d'Études sur la parole (JEP)},
  year = {2004},
  month = {19-22 Avril},
  address = {Fès, Maroc}
}

Christian Raymond, Frédéric Béchet, Renato De Mori & Géraldine Damnati
On the Use of Confidence for Statistical Decision in Dialogue Strategies
Proceedings of the 5th SIGdial Workshop on Discourse and Dialogue  
pages 102-107, April 30 - May 1, Cambridge, Massachusetts, USA
Abstract: This paper describes an interpretation and decision strategy that minimizes interpretation errors and performs dialogue actions that may depend not only on the hypothesized concepts, but also on the confidence in what has been recognized. The concepts introduced here are applied in a system which integrates language and interpretation models into Stochastic Finite State Transducers (SFST). Furthermore, acoustic, linguistic and semantic confidence measures on the hypothesized word sequences are made available to the dialogue strategy. By evaluating predicates related to these confidence measures, a decision tree automatically learns a decision strategy for rescoring an n-best list of candidates representing a user's utterance. The different actions that can then be performed are chosen according to the confidence scores given by the tree.
BibTeX:

@inproceedings{Raymond.Bechet.ea_2004a,
  author = {Christian Raymond and Frédéric Béchet and De Mori, Renato and Géraldine Damnati},
  title = {On the Use of Confidence for Statistical Decision in Dialogue Strategies},
  booktitle = {{P}roceedings of the 5th SIGdial Workshop on Discourse and Dialogue},
  publisher = {Association for Computational Linguistics},
  year = {2004},
  pages = {102--107},
  month = {April 30 - May 1},
  address = {Cambridge, Massachusetts, USA}
}

Christian Raymond, Frédéric Béchet, Renato De Mori, Géraldine Damnati & Yannick Estève
Automatic learning of interpretation strategies for spoken dialogue systems
Proceedings of the International Conference on Acoustic Speech and Signal Processing  
vol.1, pages 425-428, Montreal, Canada
Abstract: This paper proposes a new application of automatically trained decision trees to derive the interpretation of a spoken sentence. A new strategy for building structured cohorts of candidates is also described. By evaluating predicates related to the acoustic confidence of the words expressing a concept, the linguistic and semantic consistency of candidates in the cohort, and the rank of a candidate within a cohort, the decision tree automatically learns a decision strategy for rescoring or rejecting an n-best list of candidates representing a user's utterance. A relative reduction of 18.6% in the Understanding Error Rate is obtained by our rescoring strategy with no utterance rejection, and a relative reduction of 43.1% in the same error rate is achieved with a rejection rate of only 8% of the utterances.
BibTeX:

@inproceedings{Raymond.Bechet.ea_2004b,
  author = {Christian Raymond and Frédéric Béchet and De Mori, Renato and Géraldine Damnati and Yannick Estève},
  title = {Automatic learning of interpretation strategies for spoken dialogue systems},
  booktitle = {{P}roceedings of the {I}nternational {C}onference on {A}coustic {S}peech and {S}ignal {P}rocessing},
  year = {2004},
  volume = {1},
  pages = {425-428},
  address = {Montreal, Canada},
  doi = {10.1109/ICASSP.2004.1326013}
}

2003
Yannick Estève, Christian Raymond, Frédéric Béchet & Renato De Mori
Conceptual Decoding for Spoken Dialog Systems
Proceedings of European Conference on Speech Communication and Technology  
pages 3033-3036, Geneva, Switzerland
Abstract: A search methodology is proposed for performing a conceptual decoding process. Such a process provides the best sequence of word hypotheses according to a set of conceptual interpretations. The resulting models are combined in a network of Stochastic Finite State Transducers. This approach is a framework that tries to bridge the gap between speech recognition and speech understanding processes. Indeed, conceptual interpretations are generated according to both a semantic representation of the task and a system belief which evolves according to the dialogue states. Preliminary experiments on the detection of semantic entities (mainly named entities) in a dialog application have shown that interesting results can be obtained even if the Word Error Rate is fairly high.
BibTeX:

@inproceedings{Esteve.Raymond.ea_2003,
  author = {Yannick Estève and Christian Raymond and Frédéric Béchet and De Mori, Renato},
  title = {Conceptual Decoding for Spoken Dialog Systems},
  booktitle = {{P}roceedings of {E}uropean {C}onference on {S}peech {C}ommunication and {T}echnology},
  year = {2003},
  pages = {3033-3036},
  address = {Geneva, Switzerland}
}

Yannick Estève, Christian Raymond, Renato De Mori & David Janiszek
On the use of linguistic consistency in systems for human-computer dialogs
IEEE Transactions on Speech and Audio Processing
vol. 11 (6), pages 746-756, November
Abstract: This paper introduces new recognition strategies based on reasoning about results obtained with different Language Models (LMs). Strategies are built following the conjecture that the consensus among the results obtained with different models gives rise to different situations in which hypothesized sentences have different word error rates (WER) and may be further processed with other LMs. New LMs are built by data augmentation using ideas from latent semantic analysis and trigram analogy. Situations are defined by expressing the consensus among the recognition results produced with different LMs and by the number of unobserved trigrams in the hypothesized sentence. The diagnostic power of the use of observed trigrams or their corresponding class trigrams is compared with that of situations based on values of sentence posterior probabilities. In order to avoid or correct errors due to syntactic inconsistency of the recognized sentence, automata, obtained by explanation-based learning, are introduced and used in certain conditions. Semantic Classification Trees are introduced to provide sentence patterns expressing constraints of long-distance syntactic coherence. Results on a dialogue corpus provided by France Telecom R&D have shown that, starting with a WER of 21.87% on a test set of 1422 sentences, it is possible to subdivide the sentences into three sets characterized by automatically recognized situations. The first one has a coverage of 68% with a WER of 7.44%. The second one has various types of sentences with a WER around 20%. The third one contains 13% of the sentences, which should be rejected, with a WER around 49%. The second set characterizes sentences that should be processed with particular care by the dialogue interpreter, with the possibility of asking the user for confirmation.
BibTeX:

@article{Esteve.Raymond.ea_2003a,
  author = {Yannick Estève and Christian Raymond and De Mori, Renato and David Janiszek},
  title = {On the use of linguistic consistency in systems for human-computer dialogs},
  journal = {{IEEE} {T}ransactions on {S}peech and {A}udio {P}rocessing},
  year = {2003},
  volume = {11},
  number = {6},
  pages = {746-756},
  month = {November},
  doi = {10.1109/TSA.2003.818318}
}

Christian Raymond, Yannick Estève, Frédéric Béchet, Renato De Mori & Géraldine Damnati
Belief confirmation in Spoken Dialogue Systems using confidence measures
Proceedings IEEE Workshop on Automatic Speech Recognition and Understanding  
St. Thomas, US-Virgin Islands
Abstract: The approach proposed is an alternative to the traditional architecture of Spoken Dialogue Systems, where the system belief is either not taken into account during the Automatic Speech Recognition process or included in the decoding process but never challenged. By representing all the conceptual structures handled by the Dialogue Manager as Finite State Machines and by building a conceptual model that contains all the possible interpretations of a given word graph, we propose a decoding architecture that searches first for the best conceptual interpretation before looking for the best string of words. Once both N-best sets (at the concept level and at the word level) are generated, a verification process is performed on each N-best set using acoustic and linguistic confidence measures. A first selection strategy that does not yet include the dialogue context is proposed, and a significant error reduction on the understanding measures is obtained.
BibTeX:

@inproceedings{Raymond.Esteve.ea_2003,
  author = {Christian Raymond and Yannick Estève and Frédéric Béchet and De Mori, Renato and Géraldine Damnati},
  title = {Belief confirmation in Spoken Dialogue Systems using confidence measures},
  booktitle = {{P}roceedings {IEEE} {W}orkshop on {A}utomatic {S}peech {R}ecognition and {U}nderstanding},
  year = {2003},
  address = {St. Thomas, US-Virgin Islands},
  doi = {10.1109/ASRU.2003.1318420}
}

Christian Raymond
Mesures de confiance pour la reconnaissance de la parole dans des applications de dialogue homme-machine
Majecstic  
Octobre, Marseille, France
Abstract: In human-machine dialogue applications, the semantic interpretation of the sentence uttered by a user is performed on the transcription generated by the speech recognition engine. This recognition is not optimal, and the quality of the semantic interpretation of the sentence naturally depends heavily on the quality of the recognition. This paper introduces confidence measures on speech recognition hypotheses in order to predict or estimate the quality of the recognition and to inform the dialogue management module, which can then choose the type of strategy to apply: if the sentence has an adequate confidence score, it can be passed to the interpretation module; if it has a very low confidence score, the dialogue module can choose to reject it and ask the user to repeat; in the other cases, specific methods may be applied to try to correct the sentence.
BibTeX:

@inproceedings{Raymond_2003,
  author = {Christian Raymond},
  title = {Mesures de confiance pour la reconnaissance de la parole dans des applications de dialogue homme-machine},
  booktitle = {Majecstic},
  year = {2003},
  month = {Octobre},
  address = {Marseille, France}
}

2002
Renato De Mori, Yannick Estève & Christian Raymond
On the use of structures in language models for dialogue
Proceedings of the International Conference on Spoken Language Processing  
pages 929-932, Denver, Colorado, USA
Abstract: The paper describes the combined use of three new language modelling paradigms. They are: generation of plausible trigrams by analogy, explanation-based generation of error-correcting automata, and disambiguation using Semantic Classification Trees. Tangible word error rate reduction is observed by the combined use of these paradigms.
BibTeX:

@inproceedings{DeMori.Esteve.ea_2002,
  author = {De Mori, Renato and Yannick Estève and Christian Raymond},
  title = {On the use of structures in language models for dialogue},
  booktitle = {{P}roceedings of the {I}nternational {C}onference on {S}poken {L}anguage {P}rocessing},
  year = {2002},
  pages = {929-932},
  address = {Denver, Colorado, USA}
}

Yannick Estève, Christian Raymond & Renato De Mori
On the use of structure in language models for dialogue : specific solutions for specific problems
ISCA Tutorial and Research Workshop on Multi-Modal Dialogue in Mobile Environments  
June, Kloster Irsee, Germany
Abstract: Large corpora for training language models to develop dialogue systems are rarely available. Fortunately, for a specific dialogue application, many sentences follow a limited number of typical patterns. In a language like French, frequent errors are due to homophones. Three paradigms are proposed in this paper to rescore a trellis of hypothesized words. They are based on sentence patterns detected in the most likely sentence hypothesized in a first recognition phase.
BibTeX:

@inproceedings{Esteve.Raymond.ea_2002,
  author = {Yannick Estève and Christian Raymond and De Mori, Renato},
  title = {On the use of structure in language models for dialogue : specific solutions for specific problems},
  booktitle = {{ISCA} {T}utorial and {R}esearch {W}orkshop on {M}ulti-{M}odal {D}ialogue in {M}obile {E}nvironments},
  year = {2002},
  month = {June},
  address = {Kloster Irsee, Germany}
}

Christian Raymond, Patrice Bellot & Marc El-Bèze
Enrichissement de requêtes pour la recherche documentaire selon une classification non-supervisée
13ème Congrès Francophone AFRIF-AFIA de Reconnaissance des Formes et d'Intelligence Artificielle (RFIA'2002)  
pages 625-632, Janvier, Angers, France
Abstract: Natural language query formulation is a crucial task in the information retrieval (IR) process. Automatic expansion and refinement of queries can be realized in different ways: extracting some words from top retrieved documents (retrieval feedback) or from thesauri, computing new query term weights according to top retrieved documents, and so on. In this paper, the information retrieval system SIAC is employed to obtain an initial set of documents from a query. Then, a classification method employing unsupervised decision trees (UDTs) is applied to classify the sentences of the retrieved documents according to some words extracted automatically from these documents (some sentences contain the chosen words, some do not). A boolean expression composed of these selected words is directly associated with each decision tree node. This paper shows that expanding queries with the words connected with the best nodes significantly improves retrieval precision.
BibTeX:

@inproceedings{Raymond.Bellot.ea_2002,
  author = {Christian Raymond and Patrice Bellot and Marc El-Bèze},
  title = {Enrichissement de requêtes pour la recherche documentaire selon une classification non-supervisée},
  booktitle = {13ème Congrès Francophone AFRIF-AFIA de Reconnaissance des Formes et d'Intelligence Artificielle (RFIA'2002)},
  year = {2002},
  pages = {625--632},
  month = {Janvier},
  address = {Angers, France}
}

2001
Christian Raymond
Réécriture de requêtes pour la recherche documentaire selon une methode de classification à base d'arbres de décision non-supervisé
School: University of Avignon  
Juin, Marseille, Luminy
Abstract: A major difficulty in using a document retrieval system is choosing the vocabulary with which to express a query. Query expansion can take several forms: adding words extracted automatically from the retrieved documents, re-estimating the weights assigned to each word of the initial query, etc. The document retrieval system SIAC is used to extract an initial set of documents from a query. An unsupervised classification method based on decision trees is then used to classify the sentences of the retrieved documents, and the documents themselves, according to whether or not they contain certain words extracted automatically from the set of retrieved documents. Each node of the tree can be associated with a boolean expression involving the words selected during classification. Using data from the second Amaryllis evaluation campaign, we show that rewriting the query according to the boolean expressions corresponding to the best leaves improves retrieval precision. Rewriting the query in light of the structure of the decision trees leads us to examine the handling of negation in retrieval systems, as well as a reformulation of the weighting criteria commonly used.
BibTeX:

@mastersthesis{Raymond_2001,
  author = {Christian Raymond},
  title = {Réécriture de requêtes pour la recherche documentaire selon une methode de classification à base d'arbres de décision non-supervisé},
  school = {University of Avignon},
  year = {2001},
  month = {Juin},
  address = {Marseille, Luminy}
}