Abstract: Within the ever-growing video archives is a vast amount of interesting information regarding human action/activities. In this thesis, we approach the problem of extracting this information and understanding human motion from a computer vision perspective. We propose solutions for two distinct scenarios, ordered from simple to complex. In the first scenario, we deal with the problem of single action recognition in relatively simple settings. We believe that human pose encapsulates many useful clues for recognizing the ongoing action, and we can represent this shape information for 2D single actions in very compact forms, before going into details of complex modeling. We show that high-accuracy single human action recognition is possible 1) using spatial oriented histograms of rectangular regions when the silhouette is extractable, 2) using the distribution of boundary-fitted lines when the silhouette information is missing. We demonstrate that, inside videos, we can further improve recognition accuracy by means of adding local and global motion information. We also show that within a discriminative framework, shape information is quite useful even in the case of human action recognition in still images.
Our second scenario involves recognition and retrieval of complex human activities within more complicated settings, like the presence of changing background and viewpoints. We describe a method of representing human activities in 3D that allows a collection of motions to be queried without examples, using a simple and effective query language. Our approach is based on units of activity at segments of the body, that can be composed across time and across the body to produce complex queries. The presence of search units is inferred automatically by tracking the body, lifting the tracks to 3D and comparing to models trained using motion capture data. Our models of short time scale limb behaviour are built using labelled motion capture set. Our query language makes use of finite state automata and requires simple text encoding and no visual examples. We show results for a large range of queries applied to a collection of complex motion and activity. We compare with discriminative methods applied to tracker data; our method offers significantly improved performance. We show experimental evidence that our method is robust to view direction and is unaffected by some important changes of clothing.
Keywords:
Human motion, action recognition, activity recognition, activity
retrieval, image and video processing, classification.
(PDF copy)
Abstract: The peer-to-peer (P2P) network paradigm has attracted a significant amount of interest as a popular and successful alternative to traditional client-server model for resource sharing and content distribution. However, researchers have observed the existence of high degrees of free riding in P2P networks which poses a serious threat to effectiveness and efficient operation of these networks, and hence to their future. Therefore, eliminating or reducing the impact of free riding on P2P networks has become an important issue to investigate and a considerable amount of research has been conducted on it.
In this thesis, we propose two novel solutions to reduce the adverse effects of free riding on P2P networks and to motivate peers to contribute to P2P networks. These solutions are also intended to lead to performance gains for contributing peers and to penalize free riders. As the first solution, we propose a distributed and localized scheme, called Detect and Punish Method (DPM), which depends on detection and punishment of free riders. Our second solution to the free riding problem is a connection-time protocol, called P2P Connection Management Protocol (PCMP), which is based on controlling and managing link establishments among peers according to their contributions.
To evaluate the proposed solutions and compare them with other alternatives, we developed a new P2P network simulator and conducted extensive simulation experiments. Our simulation results show that employing our solutions in a P2P network considerably reduces the adverse effects of free riding and improves the overall performance of the network. Furthermore, we observed that P2P networks utilizing the proposed solutions become more robust and scalable.
Keywords:
Free riding, Peer-to-Peer networks, distributed computing, performance
evaluation.
(PDF copy)
Abstract: Modeling and visualization of large geometric environments is a popular research area in computer graphics. In this dissertation, a framework for modeling and stereoscopic visualization of large and complex urban environments is presented. The occlusion culling and view-frustum culling is performed to eliminate most of the geometry that do not contribute to the user.s final view. For the occlusion culling process, the shrinking method is employed but performed using a novel Minkowski-difference-based approach. In order to represent partial visibility, a novel building representation method, called the slice-wise representation is developed. This method is able to represent the preprocessed partial visibility with huge reductions in the storage requirement. The resultant visibility list is rendered using a graphics-processing-unit-based algorithm, which perfectly fits into the proposed slice-wise representation. The stereoscopic visualization depends on the calculated eye positions during walkthrough and the visibility lists for both eyes are determined using the preprocessed occlusion information. The view-frustum culling operation is performed once instead of two for both eyes. The proposed algorithms were implemented on personal computers. Performance experiments show that, the proposed occlusion culling method and the usage of the slice-wise representation increase the frame rate performance by 81%; the graphics-processing-unit-based display algorithm increases it by an additional 315% and decrease the storage requirement by 97% as compared to occlusion culling using building-level granularity and not using the graphics hardware. We show that, a smooth and real-time visualization of large and complex urban environments can be achieved by using the proposed framework.
Keywords:
Stereoscopic visualization, slice-wise representation, space
subdivision, octree, occlusion culling, occluder shrinking, Minkowski
difference, fromregion visibility, urban visualization, visibility
processing.
(PDF copy)
Abstract: Cellular processes form the hardware layer of living organisms. Malfunctions in cellular processes are responsible for most of the currently incurable diseases. Not surprisingly, knowledge about cellular processes are growing at an enormous rate. However, today's molecular biology suffers from lack of a formal representation system for cellular processes. Most of the knowledge is locked in literature, that are not accessible to computational analysis and modeling. Given the complexity of the system we are attacking, the need for a representation system and modeling tools for cellular processes are clear.
In this dissertation, we describe an ontology for modeling processes. Our ontology possesses several unique features, including ability to represent abstractions and multiple levels of detail, cellular compartments and molecular states. Furthermore, it was designed to meet several user and system requirements, including ease of integration, querying, analysis and visualization.
Based on this ontology we also implemented a set of software tools within the PATIKA project. Primary use cases of PATIKA are integration, querying and visualization, and we have obtained satisfactory results proving the feasibility of our ontology.
Compared with existing alternative methods of representing and querying information about cellular processes, PATIKA provides several advantages, including a regular representation system, powerful querying options, an advanced visualization. Moreover PATIKA models can be analyzed by computational methods such as flux analysis or pathway activity inference. Although it has a more steep learning curve compared to existing ad hoc representation systems, we believe that tools like PATIKA will be essential for molecular biology research in the future.
Keywords: Bioinformatics, ontology, pathway.
(PDF copy)
Abstract: Sparse matrix-vector multiply (SpMxV) operations are in the kernel of many scientific computing applications. Therefore, efficient parallelization of SpMxV operations is of prime importance to scientific computing community. Previous works on parallelizing SpMxV operations consider maintaining the load balance among processors and minimizing the total message volume. We show that the total message latency (start-up time) may be more important than the total message volume. We also stress that the maximum message volume and latency handled by a single processor are important communication cost metrics that should be minimized. We propose hypergraph models and hypergraph partitioning methods to minimize these four communication cost metrics in one dimensional and two dimensional partitioning of sparse matrices. Iterative methods used for solving linear systems appear to be the most common context in which SpMxV operations arise. Usually, these iterative methods apply a technique called preconditioning. Approximate inverse preconditioning—which can be applied to a large class of unsymmetric and symmetric matrices—replaces an SpMxV operation by a series of SpMxV operations. That is, a single SpMxV operation is only a piece of a larger computation in the iterative methods that use approximate inverse preconditioning. In these methods, there are interactions in the form of dependencies between the successive SpMxV operations. These interactions necessitate partitioning the matrices simultaneously in order to parallelize a full step of the subject class of iterative methods efficiently. We show that the simultaneous partitioning requirement gives rise to various matrix partitioning models depending on the iterative method used. We list the partitioning models for a number of widely used iterative methods. We propose operations to build a composite hypergraph by combining the previously proposed hypergraph models and show that partitioning the composite hypergraph models addresses the simultaneous matrix partitioning problem. We strove to demonstrate how the proposed partitioning methods—both the one that addresses multiple communication cost metrics and the other that addresses the simultaneous partitioning problem—help in practice. We implemented a library and investigated the performances of the partitioning methods. These practical investigations revealed a problem that we call message ordering problem. The problem asks how to organize the send operations to minimize the completion time of a certain class of parallel programs. We show how to solve the message ordering problem optimally under reasonable assumptions.
Keywords:
Sparse matrices, parallel matrix-vector multiplication, iterative
methods, preconditioning, approximate inverse preconditioner,
hypergraph partitioning.
(PDF copy)
Abstract: The advent of the Web has raised new searching and querying problems. Keyword matching based querying techniques that have been widely used by search engines, return thousands of Web documents for a single query, and most of these documents are generally unrelated to the users' information needs. Towards the goal of improving the information search needs of Web users, a recent promising approach is to index the Web by using metadata and annotations.
In this thesis, we model and query Web-based information resources using metadata for improved Web searching capabilities. Employing metadata for querying the Web increases the precision of the query outputs by returning semantically more meaningful results. Our Web data model, named "Web information space model", consists of Web-based information resources (HTML/XML documents on the Web), expert advice repositories (domain-expert-specified metadata for information resources), and personalized information about users (captured as user profiles that indicate users' preferences about experts as well as users' knowledge about topics). Expert advice is specified using topics and relationships among topics (i.e., metalinks), along the lines of recently proposed topic maps standard. Topics and metalinks constitute metadata that describe the contents of the underlying Web information resources. Experts assign scores to topics, metalinks, and information resources to represent the "importance" of them. User profiles store users' preferences and navigational history information about the information resources that the user visits. User preferences, knowledge level on topics, and history information are used for personalizing the Web search, and improving the precision of the results returned to the user.
We store expert advices and user profiles in an object relational database management system, and extend the SQL for efficient querying of Web-based information resources through the Web information space model. SQL extensions include the clauses for propagating input importance scores to output tuples, the clause that specifies query stopping condition, and new operators (i.e., text similarity based selection, text similarity based join, and topic closure). Importance score propagation and query stopping condition allow ranking of query outputs, and limiting the output size. Text similarity based operators and topic closure operator support sophisticated querying facilities. We develop a new algebra called Sideway Value generating Algebra (SVA) to process these SQL extensions.
We also propose evaluation algorithms for the text similarity based SVA directional join operator, and report experimental results on the performance of the operator. We demonstrate experimentally the effectiveness of metadata-based personalized Web search through SQL extensions over the Web information space model against keyword matching based Web search techniques.
Keywords: Metadata based Web querying, topic maps, user
profile, personalized Web querying, Sideway Value generating Algebra,
score management, text similarity based join.
(PDF copy)
Abstract: With the advances in information technology, the amount of multimedia data captured, produced and stored is increasing rapidly. As a consequence, multimedia content is widely used for many applications in today's world, and hence, a need for organizing this data and accessing it from repositories with vast amount of information has been a driving stimulus both commercially and academically. In compliance with this inevitable trend, first image and especially later video database management systems have attracted a great deal of attention since traditional database systems are not suitable to be used for multimedia data.
In this thesis, a novel architecture for a video database system is proposed. The architecture is original in that it provides full support for spatio-temporal queries that contain any combination of spatial, temporal, object-appearance, external-predicate, trajectory-projection and similarity-based object-trajectory conditions by a rule-based system built on a knowledge-base, while utilizing an object-relational database to respond to semantic (keyword, event/activity and category-based) and low-level (color, shape and texture) video queries. Research results obtained from this thesis work have been realized by a prototype video database management system, which we call BilVideo. Its tools, Fact-Extractor and Video-Annotator, its Web-based visual query interface and its SQL-like textual query language are presented. Moreover, the query processor of BilVideo and our spatio-temporal query processing strategy are also discussed.
Keywords:
Video databases, multimedia databases, information
systems, video data modeling, content-based retrieval, spatio-temporal
relations, spatio-temporal query processing, video query languages.
(PDF copy)
Abstract: An event history is a collection of events that have occurred in an event-based system over a period of time. There can be various types of events, among which are the temperature changes and power demands in a power management system, client requests for data items in a broadcast system, price increase of a stock in stock market, and so on. There is a lot of interesting information that can be extracted from an event history via data mining techniques. Our purpose in this thesis is to propose methods for extracting this useful information in the form of event sequences and event associations from a single or two correlated event histories. We also aim to show how the results of the mining process can be used for active and mobile data management. The results of the mining process demonstrates the relationships among the events which are generally captured as associations or sequences. The relationships among the events are shown to be a useful tool to enhance an event-based system via event organization, predictive event detection, and proactive rule execution.
We consider the mining of both a single event history and two correlated event histories. We first propose a method for mining binary relationships from a single event history. The binary relationships among events are used to organize the events into related groups of event. This organization is important because the number of events in an event-driven system may become very high and unmanageable. The groups of related events and the relationships among the events are exploited for predictive event detection and proactive rule execution in active database systems. We also consider the mining of two correlated event histories which are disjoint and the events in one history are related to the events in the other history. We describe how we can efficiently extract associations among the events spanning different event histories, which we call cross associations.
We have chosen data broadcast in mobile computing environments as a
case study for active data management. One of the important facts in
mobile computing environments with wireless communication medium is
that the server-to-client (downlink) communication bandwidth is much
higher than the client-to-server (uplink) communication
bandwidth. This asymmetry makes the dissemination of data to client
machines a desirable approach. However, the dissemination of data by
broadcasting may induce high access latency in case the number of
broadcast data items is large. Our purpose is to show how active data
management can be used to improve mobile data management through
broadcast data organization and prefetching from the broadcast
medium. In order to achieve this, the client requests of data items at
the server are considered as events and the chronological sequence of
items that have been requested by clients is considered as an event
history. We call the event history in broadcast medium a broadcast
history. The first step in our work is to analyze the broadcast
history to discover sequential patterns describing the client access
behavior. The sequential patterns are used to organize the data items
in the broadcast disk in such a way that the items requested
subsequently are placed close to each other. Then we utilize the
predictive event detection techniques to improve the cache hit ratio
to be able to decrease the access latency. Prediction of future
client access behavior enables clients to prefetch the data from the
broadcast disk based on the rules extracted from the broadcast
history.
(PDF copy)
Abstract: Punctuation marks have special importance in bringing out the meaning of a text. Geoffrey Nunberg's 1990 monograph bridged the gap between descriptive treatments of punctuation and prescriptive accounts, by spelling out the features of a text-grammar for the orthographic sentence. His research inspired most of the recent work concentrating on punctuation marks in Natural Language Processing (NLP). Several grammars incorporating punctuation were then shown to reduce failures and ambiguities in parsing. Nunberg's approach to punctuation (and other formatting devices) was partially incorporated into natural language generation systems. However, little has been done concerning how punctuation marks bring semantic and discourse cues to the text and whether these can be exploited computationally.
The aim of this thesis is to analyse the semantic and discourse
aspects of punctuation marks, within the framework of Hans Kamp and
Uwe Reyle's Discourse Representation Theory (DRT) (and its extension
by Nicholas Asher, Segmented Discourse Representation Theory (SDRT)),
drawing implications for NLP systems. The method used is the
extraction of patterns for four common punctuation marks (dashes,
semicolons, colons, and parentheses) from corpora, followed by formal
modeling and a modest computational prototype. Our observations and
results have revealed interesting occurrences of linguistic phenomena,
such as anaphora resolution and presupposition, in conjunction with
punctuation marks. Within the framework of SDRT such occurrences are
then tied with the overall discourse structure. The proposed model
can be taken as a template for NLP software developers for making
use of the punctuation marks more effectively. Overall, the thesis
describes the contribution of punctuation at the orthographic
sentence level to the information passed on to the reader of a text.
(Postscript copy)
Abstract:
Signature file approach is a well-known indexing technique. The Bit
Sliced Signature File (BSSF) method provides an efficient retrieval
environment by only accessing on-bits of query signatures. However,
its response time increases for increasing number of query terms. For
BSSF we define a stopping condition that tries to avoid the processing
of all on-bits of query signatures through partial evaluation. The aim
of the stopping condition is to reduce the expected number of false
drops to the level that also provides the lowest response time within
the framework of BSSF. We propose the Partially evaluated Bit- Sliced
Signature File (P-BSSF) method that employs the partial evaluation
strategy and minimizes the response time in a multi-term query
environment by considering the submission probabilities of the queries
with different number of terms. Experiments show that P-BSSF provides
85% improvement in response time over BSSF depending on space overhead
and the number of query terms. To provide better optimization of the
signature file parameters in multi-term query environments, we propose
Multi-Fragmented Signature File (MFSF) method as an extension of
P-BSSF. In MFSF, a signature file is divided into variable sized
vertical fragments with different on-bit densities to optimize the
response time using a similar query evaluation methodology. In query
evaluation the query signature on-bits of the lower on-bit density
fragments are used first. As the number of query terms increases, the
number of query signature on-bits in the lower on-bit density
fragments increases and the query stopping condition is reached in
fewer bit slice evaluation steps. Therefore, in MFSF, the response
time decreases for an increasing number of query terms. The analysis
shows that, with no space overhead, MFSF outperforms the P-BSSF and
generalized frame-sliced signature file organizations. Due to hashing
and superimposition operations used in obtaining signatures, the
signature of an irrelevant record may match the query signature, i.e.,
it is possible to have false drops. In signature file processing the
accurate estimation of the number of false drops is essential to
obtain system parameters that will yield a desirable response time.
We propose a more accurate false drop estimation method for databases
with varying record lengths instead of using an average number of
distinct terms per record. In this method, the records of a database
are conceptually partitioned according to the number of distinct terms
they contain and the number of false drops of each group is estimated
separately. Experiments with real data show that depending on the
space overhead, the proposed method obtains up to 33%, 25%, and 20%
response time improvements for the sequential, generalized
frame-sliced, and MFSF methods, respectively. For very large
databases even one bit slice of MFSF may occupy several disk blocks.
We propose the Compressed Multi-Fragmented Signature File (C-MFSF)
method that stores the bit slices of MFSF in a compact form which
provides a better response time. To minimize the number of disk
accesses, the signature size and the disk block size can be adjusted
such that most bit slices fit into a single disk block after
compression. In such environments, C-MFSF evaluates the queries with
more than two terms with only one disk access per query term rather
than two disk accesses of the inverted file method which are
respectively for the pointer of the query term posting list and the
list itself. Our projection based on real data shows that for a
database of one million records C-MFSF provides a response time of
0.85 seconds.
(Postscript copy)