  Team SemLIS  

[PhD] Geopalis: Exploration of patterns extracted with data mining techniques from relational and geographical data.

Supervisors: Peggy Cellier, Olivier Ridoux, Sébastien Ferré.

Funding: University of Rennes 1, allocated October 1st, 2012; no longer available.

    Data mining techniques are used to discover emerging knowledge (patterns) in databases [1].
The problem with such techniques is that they generally produce too many patterns for a user to
explore them all by hand. Some methods try to reduce the number of patterns without a priori pruning,
for example condensed representations [2,3] or constraints [4]. Nevertheless, the number of patterns
remains high. Other approaches, based on a total ranking, show the user only the top-k
patterns with respect to a specific measure. However, those methods take into account neither the
user's knowledge nor the dependencies that exist between patterns. In recent work [5], we
proposed an application of Logical Concept Analysis (LCA) [6] to build a generic framework for exploring
patterns extracted by data mining techniques. The framework is based on a data structure that
organizes the set of patterns and provides operations on that structure, namely navigation in the set of
patterns, selection of patterns of interest, and pruning of patterns of no interest. The data structure
exploits the fact that patterns are naturally partially ordered. Users can thus draw on their
background knowledge to navigate through the patterns until they reach their goal(s), without
a priori pruning.
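
The navigation and pruning operations mentioned above can be illustrated with a minimal sketch. It assumes, for illustration only, that patterns are itemsets ordered by set inclusion; it is not the LCA-based framework of [5], and the names `specializations` and `prune` are hypothetical.

```python
from typing import FrozenSet, Set

# Assumption: a pattern is an itemset; the partial order is set inclusion.
Pattern = FrozenSet[str]


def specializations(patterns: Set[Pattern], current: Pattern) -> Set[Pattern]:
    """Navigation step: immediate successors of `current` in the inclusion order."""
    larger = {p for p in patterns if current < p}
    # Keep only the minimal ones: no other strictly larger pattern lies in between.
    return {p for p in larger if not any(q < p for q in larger)}


def prune(patterns: Set[Pattern], unwanted: Pattern) -> Set[Pattern]:
    """Pruning step: drop `unwanted` and every pattern that specializes it."""
    return {p for p in patterns if not unwanted <= p}


patterns = {
    frozenset(s)
    for s in [{"a"}, {"b"}, {"a", "b"}, {"a", "c"}, {"a", "b", "c"}]
}

# From {"a"}, the user can move to {"a", "b"} or {"a", "c"},
# but not directly to {"a", "b", "c"}.
next_steps = specializations(patterns, frozenset({"a"}))

# A user who is not interested in "b" removes every pattern containing it.
remaining = prune(patterns, frozenset({"b"}))
```

In this sketch nothing is discarded before the user asks for it: pruning is a user decision made during navigation, which is the sense in which the exploration avoids a priori pruning.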
    The subject of the PhD is the exploration of patterns extracted with data mining techniques from data
that are represented in a semantic web format. An expected goal, close to the problems addressed by
Inductive Databases (IDB) [7], is the definition of a method that helps a user explore relevant patterns
directly from relational data by computing patterns on demand. The approach will be adapted to handle
geographical data, for example GPS track logs, in order to discover behavior patterns. The work will be
conducted in collaboration with Erwan Quesseveur, a geography researcher at the University of Rennes 2.
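
One possible reading of "computing patterns on demand" is sketched below. It assumes a deliberately simplified setting in which each record is a set of items (for instance, places visited during one GPS track) and a pattern is an itemset; the function name `refinements_on_demand`, the sample data, and the support threshold are hypothetical and not part of the proposal.

```python
from typing import Dict, FrozenSet, List

Pattern = FrozenSet[str]


def refinements_on_demand(
    transactions: List[Pattern], current: Pattern, min_support: int
) -> Dict[Pattern, int]:
    """Compute only the one-item extensions of `current`, with their support.

    Nothing is mined in advance: each call scans the records that contain
    the pattern the user is currently looking at.
    """
    covering = [t for t in transactions if current <= t]
    counts: Dict[str, int] = {}
    for t in covering:
        for item in t - current:
            counts[item] = counts.get(item, 0) + 1
    return {
        current | {item}: n for item, n in counts.items() if n >= min_support
    }


# Hypothetical records: the set of places and transport modes seen in one track.
transactions = [
    frozenset({"home", "bus", "work"}),
    frozenset({"home", "bike", "work"}),
    frozenset({"home", "bus", "shop"}),
]

# The user is at pattern {"home"} and asks which refinements occur at least twice.
result = refinements_on_demand(transactions, frozenset({"home"}), min_support=2)
# result maps {"home", "bus"} and {"home", "work"} to a support of 2 each.
```

The design choice illustrated here is that the cost of mining is paid per navigation step rather than up front, so patterns the user never visits are never computed.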

[1] U. M. Fayyad, G. Piatetsky-Shapiro, and P. Smyth. From data mining to knowledge discovery: an
overview. In Advances in knowledge discovery and data mining. American Association for Artificial
Intelligence, 1996.
[2] N. Pasquier, Y. Bastide, R. Taouil, and L. Lakhal. Discovering frequent closed itemsets for
association rules. In Int. Conf. on Database Theory, pages 398–416. Springer-Verlag, 1999.
[3] M. Plantevit and B. Crémilleux. Condensed representation of sequential patterns according to
frequency-based measures. In Int. Symp. on Advances in Intelligent Data Analysis, LNCS(5772).
Springer, 2009.
[4] J. Pei, J. Han, and L. V. S. Lakshmanan. Mining frequent itemsets with convertible constraints. In
Int. Conf. on Data Engineering. IEEE Computer Society, 2001.
[5] P. Cellier, S. Ferré, M. Ducassé, and T. Charnois. Partial orders and logical concept analysis to
explore patterns extracted by data mining. In Int. Conf. on Conceptual Structures, LNCS. Springer,
2011.
[6] S. Ferré and O. Ridoux. An introduction to logical information systems. Information Processing &
Management, 40(3):383–419, Elsevier, 2004.
[7] T. Imielinski and H. Mannila. A database perspective on knowledge discovery. Communications of
the ACM, 39:58–64, 1996.