Bilkent University
Insightful Corporation
Seattle, WA
Content-based image retrieval (CBIR) has become a popular research
area in recent years, opening new application areas and posing new
challenges to image processing, machine learning, computer vision, and
pattern recognition. In this talk, I will give an overview of two
probabilistic approaches to CBIR. In the system developed at the University
of Washington, we pose the retrieval problem in a classification framework
where the goal is to minimize the classification error in a setting of two
classes: the relevance and irrelevance classes of the query. We propose
effective solutions to different levels of the retrieval process within
this framework. Feature extraction and normalization are performed by
maximizing class separability. Similarity is measured using the likelihood
of two images being similar or dissimilar. A key aspect of our framework is a two-level
modeling of probability. The first level maps high-dimensional feature
spaces to two-dimensional probability spaces using parametric density
models for features. The second level uses combinations of classifiers
trained in multiple probability spaces and corresponds to a modeling of
"probability of probability" to compensate for errors in modeling
probabilities in feature spaces. The Bayesian formulation also provides
a unified framework for fusion of information from different features
as well as support for relevance feedback.
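The two-level idea above can be illustrated with a minimal sketch. This is not the speakers' implementation; it assumes Gaussian class-conditional densities and equal priors, and uses a simple weighted vote as the second-level combiner. The first level maps a high-dimensional feature vector to a two-dimensional probability space (P(relevant), P(irrelevant)); the second level combines such classifiers across multiple features.

```python
import numpy as np

def fit_gaussian(samples):
    """First level: a parametric (Gaussian) density model for one feature."""
    mu = samples.mean(axis=0)
    cov = np.cov(samples, rowvar=False) + 1e-6 * np.eye(samples.shape[1])
    inv, logdet = np.linalg.inv(cov), np.linalg.slogdet(cov)[1]
    d = samples.shape[1]
    def logpdf(x):
        diff = x - mu
        return -0.5 * (diff @ inv @ diff + logdet + d * np.log(2 * np.pi))
    return logpdf

def to_probability_space(x, log_rel, log_irr):
    """Map a high-dimensional feature vector to the 2-D probability space
    (P(relevant | x), P(irrelevant | x)) via Bayes' rule, equal priors."""
    lr, li = log_rel(x), log_irr(x)
    m = max(lr, li)                      # subtract max for numerical stability
    pr, pi = np.exp(lr - m), np.exp(li - m)
    return np.array([pr, pi]) / (pr + pi)

def combine(prob_vectors, weights=None):
    """Second level: fuse classifiers trained in multiple probability
    spaces (one per feature) -- here a simple weighted average."""
    P = np.asarray(prob_vectors)
    w = np.ones(len(P)) / len(P) if weights is None else np.asarray(weights)
    return w @ P                         # fused[0] = combined P(relevant)
```

For retrieval, database images would be ranked by the fused P(relevant) given the query's relevance and irrelevance models; relevance feedback refines those models with the user's labeled examples.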
In the second part of the talk, I will describe our work on interactive
classification and retrieval of remote sensing images at Insightful
Corporation. We developed a hierarchical representation that models
image content at the pixel, region, and scene levels to reduce the gap between
low-level features and high-level user semantics. Pixel level representations
are learned through automatic fusion of spectral, textural, and ancillary
features using decision-tree, rule-based, and naive Bayes classifiers.
Region level representations consist of shape characteristics and statistics
of groups of pixels. Scene level representations include a visual grammar
that models spatial interactions of region groups using attributed relational
graph structures. The visual grammar can be used to summarize image content
and to classify images into high-level categories using Bayesian classifiers
that automatically learn groups of regions that can distinguish particular
classes of scenes from others. Quantitative results and example queries
will be presented for both systems.
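The scene-level Bayesian classification described above can be sketched as follows. This is a simplified illustration, not the actual visual-grammar system: spatial relations between regions are ignored, and scenes are reduced to histograms of region types, classified with a naive Bayes model using Laplace smoothing.

```python
import numpy as np

class SceneNaiveBayes:
    """Scene-level Bayesian classifier over histograms of region types.
    A sketch of the idea of learning which region groups distinguish
    scene classes; the attributed-relational-graph structure is omitted."""

    def fit(self, histograms, labels):
        X, y = np.asarray(histograms, float), np.asarray(labels)
        self.classes_ = np.unique(y)
        self.log_prior_ = np.log([(y == c).mean() for c in self.classes_])
        # Per-class region-type frequencies with Laplace (add-one) smoothing.
        counts = np.stack([X[y == c].sum(axis=0) + 1 for c in self.classes_])
        self.log_lik_ = np.log(counts / counts.sum(axis=1, keepdims=True))
        return self

    def predict(self, histograms):
        X = np.asarray(histograms, float)
        scores = X @ self.log_lik_.T + self.log_prior_  # log-posterior per class
        return self.classes_[scores.argmax(axis=1)]
```

In use, pixel-level classifiers would first label pixels, connected groups of pixels would form regions, and each scene's region-type histogram would be fed to a classifier like this one to assign a high-level category.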
DATE: September 2, 2003, Tuesday @ 13:40
PLACE: EA-502