Automatic Verification of Structural and Relational Properties

Publié le
Equipe
Date de début de thèse (si connue)
Septembre 2021
Lieu
Rennes
Unité de recherche
IRISA - UMR 6074
Description du sujet de la thèse

Blockchain systems enable the distributed management of virtual assets---such as property rights, proofs of payments, or execution of  smart contracts--- and play a growing, critical role in our societies. Any failure in the safety, availability, or security of a blockchain-based system could have dramatic consequences on industries, on public infrastructures, and eventually on people.

A promising approach to developing safe code that can serve as the foundations of such system is using high-level programming languages. One such language is the  OCaml functional language which has been used for programming the blockchain called Tezos [5]. The Tezos core software is expected to attain the highest levels of clarity and quality, and to get as close as possible to zero defects. This is where formal methods can help. But, since the code rapidly evolves and validation is done by Tezos' developers, formal verification has to be automatic, directly performed on the code, and fast. This is why static analysis is Tezos's preferred choice for formal verification of their large code base.

Recently, we have proposed a framework for static analysis of functional programming languages like OCaml [1,2]. This framework builds upon tree automata which finitely represent the infinite set of data structures manipulated by a functional program. For instance, analyses built in this framework can prove that lists or trees manipulated by a program remains non-empty, well-sorted, etc. Automatically proving such well-formedness properties provide valuable safety invariants on the program and thus improves confidence in the code base.

Nevertheless, a current limitation of the methods presented in [1,2] is that they cannot verify *relational* properties that describe how input and output of a function are related. More precisely, our analysis can prove that the output of a function satisfies some structural property but it cannot compute a description of how this output relates with the input of the function. Inferring such relations permits to modularize, and thus scale, inter-procedural analysis on large programs. To represent relations on structures, there exists an extension of tree automata, called tuple automata (or automatic relations, automatic structures), which can represent regular languages of relations [3].

The objective of this PhD thesis is to lift structural static analysis from [1,2] to structural **and** relational analysis using tuple automata. A first step is to reformulate analyses from [1,2] in the generic "Implication Counter-Example" (ICE) invariant learning framework [4]. Applying ICE principles on [1,2] should help to split the analysis into two parts: program semantics on one side and a generic tree automaton invariant learning procedure on the other side. Then, it should be possible to generalize the latter learning procedure so that it learns a tuple automaton instead of a unique automaton. Having an analysis which is both structural and relational should significantly broaden the family of properties automatically provable on a functional program. The final objective is to implement those results in a static analyzer for OCaml, and evaluate their utility on the Tezos code base. Combining relational and non-relational analyses in the same tool should produce a powerful yet automatic verification tool, usable by Tezos' engineers in their day-to-day development and validation process.

Bibliographie

[1] T. Haudebourg, T. Genet and T. Jensen. Regular Language Type Inference with Term Rewriting. In Proceedings of ICFP'20. ACM, 2020.

[2] T. Genet, T. Haudebourg and T. Jensen. Verifying Higher-Order Functions with Tree Automata. In Proceedings of FoSSaCS'18. Volume 10803 Lecture Notes in Computer Science. Springer Verlag, 2018.

[3] H. Comon, F. Jacquemard, M. Dauchet, R. Gilleron, D. Lugiez, C. Löding, S. Tison, M. Tommasi. Tree Automata Techniques and Applications. http://tata.gforge.inria.fr/

[4] A. Champion, T. Chiba, N. Kobayashi, R. Sato. ICE-Based Refinement Type Discovery for Higher-Order Functional Programs. In Proceedings of TACAS'18. 2018.

[5] Tezos. https://tezos.com/

Liste des encadrants et encadrantes de thèse

Nom, Prénom
Thomas GENET
Type d'encadrement
Directeur.trice de thèse
Unité de recherche
UMR 6074

Nom, Prénom
Thomas JENSEN
Type d'encadrement
2e co-directeur.trice (facultatif)
Unité de recherche
UMR 6074
Contact·s
Mots-clés
Verification, static analysis, tree automata, security critical code.