Automatic Verification of Structural and Relational Properties

Submitted by Thomas GENET on
Team
Date of the beginning of the PhD (if already known)
Septembre 2021
Place
Rennes
Laboratory
IRISA - UMR 6074
Description of the subject

Blockchain systems enable the distributed management of virtual assets---such as property rights, proofs of payments, or execution of  smart contracts--- and play a growing, critical role in our societies. Any failure in the safety, availability, or security of a blockchain-based system could have dramatic consequences on industries, on public infrastructures, and eventually on people.

A promising approach to developing safe code that can serve as the foundations of such system is using high-level programming languages. One such language is the  OCaml functional language which has been used for programming the blockchain called Tezos [5]. The Tezos core software is expected to attain the highest levels of clarity and quality, and to get as close as possible to zero defects. This is where formal methods can help. But, since the code rapidly evolves and validation is done by Tezos' developers, formal verification has to be automatic, directly performed on the code, and fast. This is why static analysis is Tezos's preferred choice for formal verification of their large code base.

Recently, we have proposed a framework for static analysis of functional programming languages like OCaml [1,2]. This framework builds upon tree automata which finitely represent the infinite set of data structures manipulated by a functional program. For instance, analyses built in this framework can prove that lists or trees manipulated by a program remains non-empty, well-sorted, etc. Automatically proving such well-formedness properties provide valuable safety invariants on the program and thus improves confidence in the code base.

Nevertheless, a current limitation of the methods presented in [1,2] is that they cannot verify *relational* properties that describe how input and output of a function are related. More precisely, our analysis can prove that the output of a function satisfies some structural property but it cannot compute a description of how this output relates with the input of the function. Inferring such relations permits to modularize, and thus scale, inter-procedural analysis on large programs. To represent relations on structures, there exists an extension of tree automata, called tuple automata (or automatic relations, automatic structures), which can represent regular languages of relations [3].

The objective of this PhD thesis is to lift structural static analysis from [1,2] to structural **and** relational analysis using tuple automata. A first step is to reformulate analyses from [1,2] in the generic "Implication Counter-Example" (ICE) invariant learning framework [4]. Applying ICE principles on [1,2] should help to split the analysis into two parts: program semantics on one side and a generic tree automaton invariant learning procedure on the other side. Then, it should be possible to generalize the latter learning procedure so that it learns a tuple automaton instead of a unique automaton. Having an analysis which is both structural and relational should significantly broaden the family of properties automatically provable on a functional program. The final objective is to implement those results in a static analyzer for OCaml, and evaluate their utility on the Tezos code base. Combining relational and non-relational analyses in the same tool should produce a powerful yet automatic verification tool, usable by Tezos' engineers in their day-to-day development and validation process.

Bibliography

[1] T. Haudebourg, T. Genet and T. Jensen. Regular Language Type Inference with Term Rewriting. In Proceedings of ICFP'20. ACM, 2020.

[2] T. Genet, T. Haudebourg and T. Jensen. Verifying Higher-Order Functions with Tree Automata. In Proceedings of FoSSaCS'18. Volume 10803 Lecture Notes in Computer Science. Springer Verlag, 2018.

[3] H. Comon, F. Jacquemard, M. Dauchet, R. Gilleron, D. Lugiez, C. Löding, S. Tison, M. Tommasi. Tree Automata Techniques and Applications. http://tata.gforge.inria.fr/

[4] A. Champion, T. Chiba, N. Kobayashi, R. Sato. ICE-Based Refinement Type Discovery for Higher-Order Functional Programs. In Proceedings of TACAS'18. 2018.

[5] Tezos. https://tezos.com/

Researchers

Lastname, Firstname
Thomas GENET
Type of supervision
Director
Laboratory
UMR 6074

Lastname, Firstname
Thomas JENSEN
Type of supervision
Co-director (optional)
Laboratory
UMR 6074
Contact·s
Keywords
Verification, static analysis, tree automata, security critical code.