Deep neural natural language style transfer

Team and supervisors
Department / Team: 
Team Web Site: 
www-expression.irisa.fr/en
PhD Director
Damien Lolive
Co-director(s), co-supervisor(s)
John Kelleher
Gwénolé Lecorvé
Contact(s)
Name / Email address
Damien Lolive
damien.lolive@irisa.fr
John Kelleher
john.d.kelleher@dit.ie
Gwénolé Lecorvé
gwenole.lecorve@irisa.fr
PhD subject
Abstract

Context

For decades, natural language processing has mainly focused on extracting information from the propositional content of data (grammar, semantics, topic, opinion, etc.), but in recent years interest in language generation has been growing. Important advances have, for instance, been presented in machine translation [1], automatic captioning [2], question answering [3] and chatbots [4]. Such progress is driven by major recent developments in recurrent neural networks (LSTMs, GRUs) and adversarial architectures (GANs), leading to efficient sequence-to-sequence models and realistic outputs.

Current natural language generation methods still offer little control over the style of the generated texts (emotion, linguistic register, degree of technicality, etc.). However, this problem of style has recently been successfully tackled in image and video processing, again using neural machine learning. In particular, style transfer, i.e., mapping a given target style onto a given picture (e.g., making a photograph look like a Van Gogh painting), has become a very active research topic [5, 6].

Injecting these style transfer mechanisms into natural language generation would meet growing industrial needs, for instance in the domains of customer/public relations, smart advertising or entertainment.

Description of the thesis

The objective of the PhD thesis is to propose, develop and validate style transfer methods for natural language generation using state-of-the-art neural approaches (LSTMs, embeddings, GANs). Several styles of increasing complexity will be studied over the course of the PhD: insertion of disfluencies and sentence simplification will be the first two tasks under study, while extensions to lexical, morphosyntactic and syntactic variations are expected at a later stage. As a byproduct, the work should provide a better understanding of the way neural networks can embed, translate and reconstruct linguistic traits.

Industrial outlets

The results of this work would benefit companies involved in human-computer interaction by making it possible to deliver textual or spoken messages whose style is adapted to their recipients, thus improving their probability of acceptance. As such, the PhD is fully in line with the "digital society technologies" axis of the regional research strategy. Regional companies likely to be interested in this research are technology providers such as Orange, Vocagen, Solocal, etc.

Thesis environment

The thesis would take place jointly in the Expression team of the IRISA lab (Lannion site) and at the ADAPT Centre (Dublin).

Expression focuses on expressivity in human languages. It has strong skills in natural language processing, text-to-speech, and machine learning. The research topics of Gwénolé Lecorvé and Damien Lolive currently focus on prosody control [7, 8], pronunciation adaptation [9, 10], disfluency generation [9, 11], linguistic register transformation [12], and feature embedding using neural networks. These activities are conducted through projects led by the team (ANR SynPaFlex and ANR TREMoLo) and ongoing PhDs.

The ADAPT Centre focuses its research on digital data analysis, personalisation, and delivery. In particular, John D. Kelleher has strong expertise in neural machine learning [13], machine translation [14], and dialogue [15]. He also recently co-authored the book "Fundamentals of Machine Learning for Predictive Data Analytics: Algorithms, Worked Examples, and Case Studies" (MIT Press).

Required skills

  • Master of science or equivalent in Computer Science

  • Knowledge in artificial intelligence, natural language, speech, machine learning and/or human-machine interaction

  • Experience in Python, Linux and shell scripting

  • Good level of English

  • Good communication skills

Bibliography

[1] D. Bahdanau, K. Cho, Y. Bengio (2015). Neural machine translation by jointly learning to align and translate. In Proceedings of International Conference on Learning Representations.

[2] V. Ordonez, G. Kulkarni, T. L. Berg (2011). Im2text: Describing images using 1 million captioned photographs. In Advances in Neural Information Processing Systems.

[3] C. Xiong, V. Zhong, R. Socher (2017). Dynamic coattention networks for question answering. In Proceedings of International Conference on Learning Representations.

[4] J. Li, W. Monroe, A. Ritter, M. Galley, J. Gao, D. Jurafsky (2016). Deep Reinforcement Learning for Dialogue Generation. In Proceedings of Conference on Empirical Methods in Natural Language Processing.

[5] L. A. Gatys, A. S. Ecker, M. Bethge (2016). Image style transfer using convolutional neural networks. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition.

[6] J. Y. Zhu, T. Park, P. Isola, A. A. Efros (2017). Unpaired image-to-image translation using cycle-consistent adversarial networks. In Proceedings of the IEEE International Conference on Computer Vision.

[7] M. Avanzi, G. Christodoulides, D. Lolive, E. Delais-Roussarie, N. Barbot (2014). Towards the adaptation of prosodic models for expressive text-to-speech synthesis. In Proceedings of Interspeech.

[8] E. Delais-Roussarie, D. Lolive, H. Yoo, D. Guennec (2016). Rhythmic Patterns and Literary Genres in Synthesized Speech. In Speech Prosody.

[9] R. Qader (2017). Pronunciation and disfluency modelling for spontaneous speech synthesis. Ph.D. dissertation, University of Rennes 1.

[10] M. Tahon, R. Qader, G. Lecorvé, D. Lolive (2016). Improving TTS with corpus-specific pronunciation adaptation. In Proceedings of Interspeech.

[11] R. Qader, G. Lecorvé, D. Lolive, P. Sébillot (2017). Ajout automatique de disfluences pour la synthèse de la parole spontanée : formalisation et preuve de concept. In Actes de TALN.

[12] J. Mekki, D. Battistelli, N. Béchet, G. Lecorvé (2017). "Nous nous arrachâmes promptement avec ma caisse" : quels descripteurs linguistiques caractérisent les registres de langue ? Technical report.

[13] G. Salton, R. J. Ross, J. D. Kelleher (2016). Idiom Token Classification using Sentential Distributed Semantics. In ACL.

[14] G. Salton, R. J. Ross, J. D. Kelleher (2014). An Empirical Study of the Impact of Idioms on Phrase Based Statistical Machine Translation of English to Brazilian-Portuguese. In Proceedings of the Workshop on Hybrid Approaches to Translation (HyTra) at Conference of the European Chapter of the Association for Computational Linguistics.

[15] S. Dobnik, J. D. Kelleher (2017). A Model for Attention-Driven Judgements in Type Theory with Records. In Proceedings of JerSem: The 20th Workshop on the Semantics and Pragmatics of Dialogue.

Work start date: 
2nd semester, 2018
Keywords: 
Natural language processing, natural language generation, sequence-to-sequence neural networks, deep learning
Place: 
IRISA, Lannion / ADAPT research centre, Dublin, Ireland