Information backdooring in graphs: countermeasures

Publié le ven 21/01/2022 - 15:38

Equipe

WIDE

Site web de l'équipe

https://team.inria.fr/wide/

Date de début de thèse (si connue)

septembre 2022

Lieu

Rennes

Unité de recherche

IRISA - UMR 6074

Description du sujet de la thèse

Graphs are probably the most ubiquitous data structure in most scientific domains such as physics, biology, computer science, and importantly in computer science in sensitive applications [10]. A graph typically connects vertices (representing any object) with edges to model their pair-wise relation. Such a structure is then analyzed to extract information and knowledge [6], by using graph mining algorithms [9].

Yet, a critical precedent has been proven possible recently by research works in [12, 4, 2]: it is possible to watermark–that is to embed arbitrary information in– the very structure of a given graph. This is possible by for instance by adding/removing a few edges/nodes, and then relates to information backdooring. Paper [4] shows that a ratio as little as one edge added/deleted out of 10,000 of the original graph is enough to embed a mark; this does not modify significantly the structure: the graph properties remain similar.

This has major consequences: there are plenty of platforms freely distributing graphs (e.g., Stanford SNAP or the KONECT projet to cite a few), that are leveraged by scientists or by companies to train or develop algorithms. There are thus no guarantee that the graphs shared are not already watermarked or backdoored, so that ownership or other claims may not be later filed against a user or entity. This thus question the sharing of graphs when users do not have a full tracability of these.

Related Works: Watermarking Graphs

The concept of watermarking was introduced in the signal processing community decades ago [11, 3]. It has since then seen countless developments, and is now a well established field [8]. Watermarking embeds information into a target digital object. There has been a recent interest in finding new fields where the concept of watermarking can apply: most recently, it has found a tremendous research attention for the watermarking of neural-network models [7]. As other digital objects, graphs have been also considered in three research works: two works are focusing on structural properties of graphs so that a mark can be embedded [12, 4]. Eppstein et al. [4] propose some formal guarantees on two graph models: Erdos-Renyi and power law graphs. They aim at embedding a mark by flipping edges from some sets of nodes in the high degree and medium degree nodes. Zhao et al. [12] use a computational puzzle to prevent attackers from finding the mark: they rely on subgraph isomorphism for mark extraction, a well know NP-complete problem (thus very inefficient). For doing so, a subgraph of the size of the key to embed is selected in the graph; a XOR operation is then performed between the key and the subgraph, modifying edges in the graph. More generally, we see that the complexity required to solve NP-hard problems is core to applications in the field of security [5]. The third work [2] is converting graphs in the spectral domain, to use techniques from signal processing to mark it.

While the three research papers interested in watermarking graphs are also considering the resilience of their marks, they do so with with basic attacks trying to remove them through simple edge flips. Continuing the parallel with the field of machine learning and watermarked neural networks, researchers have shown more efficient attacks [1], as far as they are carefully adapted to the watermarked object itself.

Thesis Objectives

In alignment with the motivation to efficiently remove marks and backdoored information:

• Finding disruptive attacks against graph watermarks, that have a high probability to remove these watermarks (i.e., prevent their extraction from the graph structure). We believe options for watermark removal that preserve the structure are against the 1) watermarking protocol itself (e.g., maximum-likelihood estimation, pattern matching) or 2) outliers in the graph structure (to be removed e.g., via graph sparsification or coarsening).

• The scientific lock is to perform efficient removal without damaging the graph structure to preserve its value. Another crucial objective is then to find metrics to showcase that the value or correctness or graphs with regards to certain analysis are maintained.

• The attacks will be first assessed analytically, and then evaluated again real graphs taken from the SNAP or KONECT repositories.

Bibliographie

[1] William Aiken, Hyoungshick Kim, Simon Woo, and Jungwoo Ryoo. Neural network laundering: Removing black-box backdoor watermarks from deep neural networks. Computers & Security, 106:102277, 2021.

[2] H. Al-Khafaji and C. Abhayaratne. Graph spectral domain blind watermarking. In ICASSP, 2019.

[3] I. J. Cox, J. Kilian, F. T. Leighton, and T. Shamoon. Secure spread spectrum watermarking for multimedia. IEEE Transactions on Image Processing, 6:1673–1687, 1997.

[4] David Eppstein, Michael T. Goodrich, Jenny Lam, Nil Mamano, Michael Mitzenmacher, and Manuel Torres. Models and algorithms for graph watermarking. In Matt Bishop and Anderson C A Nascimento, editors, Information Security, 2016.

[5] Laurent Evain. Knapsack cryptosystems built on np-hard instance. CoRR, abs/0803.0845, 2008.

[6] Aidan Hogan, Eva Blomqvist, Michael Cochez, Claudia d’Amato, Gerard de Melo, Claudio Gutierrez, Sabrina Kirrane, Jose Emilio Labra Gayo, Roberto Navigli, Sebastian Neumaier, et al. Knowledge graphs. Synthesis Lectures on Data, Semantics, and Knowledge, 12(2):1–257, 2021.

[7] Erwan Le Merrer, Patrick Perez, and Gilles Tredan. Adversarial frontier stitching for remote neural network watermarking. Neural Computing and Applications, Aug 2019.

[8] V. M. Potdar, S. Han, and E. Chang. A survey of digital image watermarking techniques. In INDIN, 2005.

[9] Sabri Skhiri and Salim Jouili. Large graph mining: recent developments, challenges and potential solutions. In European Business Intelligence Summer School, pages 103–124. Springer, 2012.

[10] Anurag K. Srivastava, Timothy A. Ernster, Ren Liu, and Vignesh G. Krishnan. Graph-theoretic algorithms for cyber-physical vulnerability analysis of power grid with incomplete information. Journal of Modern Power Systems and Clean Energy, 6(5):887–899, 2018.

[11] R. B. Wolfgang and E. J. Delp. A watermark for digital images. In ICIP, 1996.

[12] Xiaohan Zhao, Qingyun Liu, Haitao Zheng, and Ben Y. Zhao. Towards graph watermarks. In COSN, 2015

Liste des encadrants et encadrantes de thèse

Le Merrer, Erwan

Type d'encadrement

Directeur.trice de thèse

Unité de recherche

Inria

Département

D1 - Systèmes sécurisés et large échelle

Equipe

WIDE

Taïani, François

Type d'encadrement

2e co-directeur.trice (facultatif)

Unité de recherche

UMR 6074

Contact·s

Mots-clés

Graph algorithms, security: watermarking, backdooring