Local search for combinatorial bandits

Published on

Team

Thesis start date (if known)
September 2024
Location
Rennes
Research unit
IRISA - UMR 6074
Thesis topic description

The goal of the project is to solve combinatorial optimization problems in the presence of bandit feedback using local search approaches.

Multi-armed bandits address optimization problems in which some parameters must be inferred through repeated interaction with the environment. This area has been studied in machine learning since the seminal work of P. Auer [A02], which led to a class of algorithms applicable to real-world problems. That pioneering work was limited to settings where the possible configurations of the optimization problem (called arms) are independent. Subsequently, several works have extended the classes of problems covered, notably to combinatorial problems [C15, A21], including optimization on matroids and polymatroids [K14, P19]. However, these approaches rely on global optimization, which comes at a cost in both regret and computation time. Recently, we adapted an algorithm based on local search [C14] to two matching problems, thereby reducing both costs [G21, G22]. However, the approaches we proposed rely on properties specific to these two matching problems, in particular the existence of a small set of parameters sufficient to identify the best matching.
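As an illustration of the independent-arms setting of [A02], the UCB1 index policy can be sketched in a few lines. This is a minimal sketch only: the `ucb1` function name, the fixed `horizon`, and the Bernoulli arms in the usage note are illustrative assumptions, not part of the project description.

```python
import math
import random

def ucb1(arms, horizon, seed=0):
    """Minimal UCB1 sketch [A02]: play each arm once, then repeatedly pull
    the arm maximising (empirical mean + exploration bonus).
    `arms` is a list of callables returning a reward in [0, 1]."""
    random.seed(seed)
    n_arms = len(arms)
    counts = [0] * n_arms    # number of pulls per arm
    sums = [0.0] * n_arms    # cumulated reward per arm
    for t in range(horizon):
        if t < n_arms:
            arm = t  # initialisation: play each arm once
        else:
            arm = max(
                range(n_arms),
                key=lambda a: sums[a] / counts[a]
                + math.sqrt(2 * math.log(t + 1) / counts[a]),
            )
        reward = arms[arm]()  # bandit feedback: only the chosen arm is observed
        counts[arm] += 1
        sums[arm] += reward
    return counts
```

For instance, with three Bernoulli arms of means 0.2, 0.5, 0.8, the pull counts concentrate on the last arm as the horizon grows, which is the logarithmic-regret behaviour analysed in [A02].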

The PhD project will extend the idea of local search to a larger class of combinatorial optimization problems. It will first focus on optimization problems over weighted matroids (e.g., task scheduling) and over intersections of weighted matroids (e.g., the assignment problem). It will then study other classes of problems, such as optimization problems over greedoids.
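To make the local-search idea concrete on the simplest matroid, here is a hypothetical hill-climbing sketch over bases of the uniform matroid U(n, k) (all k-element subsets of n items), where neighbouring bases differ by a single element exchange; for an additive objective on a matroid, such a local optimum is also globally optimal. The function name and the uniform-matroid choice are illustrative assumptions; under bandit feedback, the weights below would be estimates rather than known values.

```python
import itertools

def local_search_basis(weights, k):
    """Hill-climbing over bases of the uniform matroid U(n, k).
    Neighbours differ by exchanging one chosen item for one unchosen item,
    i.e. the basis-exchange moves that local search exploits on matroids."""
    n = len(weights)
    basis = set(range(k))  # arbitrary initial basis
    improved = True
    while improved:
        improved = False
        for out_i, in_j in itertools.product(list(basis), range(n)):
            if in_j in basis:
                continue
            if weights[in_j] > weights[out_i]:  # improving one-exchange move
                basis.remove(out_i)
                basis.add(in_j)
                improved = True
                break
    return sorted(basis)
```

For example, `local_search_basis([3, 1, 4, 1, 5], 2)` climbs to the basis `[2, 4]`, the two heaviest items, via a sequence of single exchanges.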

Bibliography

[A02] Auer, P., Cesa-Bianchi, N., & Fischer, P. (2002). Finite-time analysis of the multiarmed bandit problem. Machine Learning, 47(2-3), 235–256.

[A21] Agarwal, M., Aggarwal, V., Umrawal, A. K., & Quinn, C. (2021). DART: Adaptive Accept Reject Algorithm for Non-Linear Combinatorial Bandits. Proceedings of the AAAI Conference on Artificial Intelligence (AAAI'21), 35(8), 6557–6565.

[C14] Combes, R., & Proutiere, A. (2014). Unimodal bandits: Regret lower bounds and optimal algorithms. Proceedings of the 31st International Conference on Machine Learning (ICML'14).

[C15] Combes, R., Talebi, M. S., Proutiere, A., & Lelarge, M. (2015). Combinatorial Bandits Revisited. Advances in Neural Information Processing Systems 28 (NeurIPS'15).

[K14] Kveton, B., Wen, Z., Ashkan, A., Eydgahi, H., & Eriksson, B. (2014). Matroid bandits: Fast combinatorial optimization with learning. Proceedings of the Thirtieth Conference on Uncertainty in Artificial Intelligence (UAI'14), 420–429.

[G21] Gauthier, C.-S., Gaudel, R., Fromont, E., & Lompo, B. A. (2021). Parametric Graph for Unimodal Ranking Bandit. Proceedings of the 38th International Conference on Machine Learning (ICML'21), 3630–3639.

[G22] Gauthier, C.-S., Gaudel, R., & Fromont, E. (2022). UniRank: Unimodal Bandit Algorithms for Online Ranking. Proceedings of the 39th International Conference on Machine Learning (ICML'22), 7279–7309.

[P19] Perrault, P., Perchet, V., & Valko, M. (2019). Exploiting Structure of Uncertainty for Efficient Matroid Semi-Bandits. Proceedings of the 36th International Conference on Machine Learning (ICML'19).

Thesis supervisors

Last name, First name
Gaudel, Romaric
Supervision role
Thesis supervisor
Research unit
IRISA (UMR 6074) and Inria
Team

Last name, First name
Fromont, Élisa
Supervision role
Second co-supervisor (optional)
Research unit
IRISA (UMR 6074) and Inria
Team
Contact(s)
Name
Gaudel, Romaric
Email
Romaric.gaudel@irisa.fr
Name
Fromont, Élisa
Email
elisa.fromont@irisa.fr
Keywords
Combinatorial Multi-armed Bandits, Unimodal Bandits, Learning to Rank