TEXMEX Research Team
Efficient Exploitation of Multimedia Documents
Exploration, Indexing, Navigation, and Access to Very Large Databases

PhD thesis subject proposed for fall 2010

Image Classification

Key words

Content Based Image Retrieval, Classification, SIFT, SVM, Clustering

Description

The amount of stored data is constantly increasing. Among these data, more and more images are stored on photo sharing web sites (FlickR, for example), and it becomes increasingly difficult to find interesting information (here, images). We are interested in content-based image retrieval, approached here as an image classification task. Different methods can perform this task. One of the most frequently used today is based on SIFT descriptors (Scale Invariant Feature Transform) (Lowe, 1999), which are local vectors invariant to translation, scaling and rotation, and partially invariant to affine transformation or projection. The number of SIFT descriptors obtained can be very large, especially with large image databases and high-resolution photos. The next step is to cluster the descriptors so that similar ones are grouped together, and then to compute, for each image, the distribution of its SIFT descriptors over the clusters. These distributions are the input data of the supervised classification algorithm. Many high-performance supervised classification algorithms can be used, such as SVM (Support Vector Machine, Vapnik, 1995), Random Forests (Breiman, 2001) or ensemble methods (Bagging, Boosting, ...).
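
The clustering and distribution steps described above can be sketched as follows. This is only a minimal illustration in plain Python, not the method the thesis will necessarily adopt: the function names (kmeans, bow_histogram, and the helpers) are hypothetical, the toy 2-D vectors stand in for real SIFT descriptors, and k-means is used simply because it is the usual choice. A real pipeline would rely on an optimized library and a SIFT extractor, both omitted here.

```python
import random


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest_centroid(descriptor, centroids):
    """Index of the centroid closest to the descriptor."""
    return min(range(len(centroids)),
               key=lambda j: squared_distance(descriptor, centroids[j]))


def kmeans(descriptors, k, iterations=20, seed=0):
    """Plain Lloyd's k-means; returns k centroids (the visual vocabulary)."""
    rng = random.Random(seed)
    centroids = [list(d) for d in rng.sample(descriptors, k)]
    for _ in range(iterations):
        groups = [[] for _ in range(k)]
        for d in descriptors:
            groups[nearest_centroid(d, centroids)].append(d)
        for j, group in enumerate(groups):
            if group:  # keep the previous centroid if a cluster is empty
                centroids[j] = [sum(col) / len(group) for col in zip(*group)]
    return centroids


def bow_histogram(image_descriptors, centroids):
    """Normalized distribution of one image's descriptors over the clusters."""
    counts = [0] * len(centroids)
    for d in image_descriptors:
        counts[nearest_centroid(d, centroids)] += 1
    total = sum(counts)
    return [c / total for c in counts] if total else [0.0] * len(centroids)


# Toy 2-D vectors stand in for SIFT descriptors (assumption for brevity).
image_a = [[0.0, 0.1], [0.1, 0.0], [5.0, 5.1]]
image_b = [[5.1, 5.0], [4.9, 5.0], [0.0, 0.0]]

# The vocabulary is learned on the descriptors of all images pooled together.
vocabulary = kmeans(image_a + image_b, k=2)

# One fixed-length vector per image, whatever its number of descriptors.
hist_a = bow_histogram(image_a, vocabulary)
hist_b = bow_histogram(image_b, vocabulary)
```

Each image is thus reduced to a vector of length k, which is what makes it usable as input to a supervised classifier such as an SVM or a Random Forest, independently of how many descriptors the image produced.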

Each step of the process involves choices, and these choices influence the quality of the results. They include: the supervised classification algorithm (and its input parameters); the clustering algorithm, for which a simple k-means is usually used, although it could be interesting to investigate more recent algorithms (e.g. OPTICS or subspace clustering algorithms); and the simultaneous use of different kinds of descriptors.

Another point is the computational cost of the process, in memory and run time: is it scalable to very large (real-world) datasets? We may be interested in parallelizing the whole process, or some of its steps, for example on massively parallel architectures such as the new GPUs (Graphics Processing Units) now available at low cost for GPGPU (General Purpose Computation using Graphics Hardware, cf. http://www.gpgpu.org/sc2007/).
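
To give an order of magnitude for this cost, the following back-of-the-envelope sketch estimates the storage needed for the descriptors and the work done by one k-means iteration. All figures are illustrative assumptions, not measurements: the function name bow_cost_estimate is hypothetical, the descriptors are taken to be the standard 128-dimensional SIFT vectors stored as 4-byte floats, and the numbers of images, descriptors per image and clusters are arbitrary.

```python
def bow_cost_estimate(n_images, descriptors_per_image, k,
                      dims=128, bytes_per_value=4):
    """Rough storage and per-iteration k-means cost for a descriptor set."""
    n_descriptors = n_images * descriptors_per_image
    storage_bytes = n_descriptors * dims * bytes_per_value
    # Each Lloyd iteration compares every descriptor with every centroid.
    distance_evaluations = n_descriptors * k
    # One distance costs about one multiply-add per dimension.
    multiply_adds = distance_evaluations * dims
    return {
        "descriptors": n_descriptors,
        "storage_gb": storage_bytes / 1e9,
        "distance_evaluations_per_iteration": distance_evaluations,
        "multiply_adds_per_iteration": multiply_adds,
    }


# Hypothetical collection: one million images, 1000 descriptors each,
# clustered into 1000 groups.
estimate = bow_cost_estimate(n_images=1_000_000,
                             descriptors_per_image=1_000,
                             k=1_000)
```

Under these assumptions the descriptors alone occupy about 512 GB, and every k-means iteration performs on the order of 10^14 multiply-adds. Since each descriptor-to-centroid distance is independent of the others, this assignment step is a natural candidate for massively parallel hardware.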

The experimental part will be performed on large datasets such as the Dataset of 3D object categories (Savarese & Fei-Fei, 2007) or ImageNet (http://www.image-net.org/).

References