Technical Reports 2004

Technical Reports Published in 2004

BU-CE-0401: PDF

Interactive Rule Interestingness Learning
Tolga Aydın and H. Altay Güvenir
Data mining is the efficient discovery of patterns in large databases, and classification rules are perhaps the most important type of patterns in data mining applications. However, the number of such classification rules is generally too huge that selection of interesting ones among all discovered rules becomes an important task. In this paper, factors related to the interestingness of a rule are investigated and some new factors are proposed. Following this, an interactive rule interestingness-learning algorithm (IRIL) is developed to automatically label the classification rules either as "interesting" or "uninteresting" with limited user participation. In our study, VFPI (Voting Feature Point Intervals), a feature projection based concept description learning and classification algorithm, is also developed in the framework of IRIL. Being specific to our concerns, VFPI takes the rule interestingness factors as features and is used to learn the rule interestingness concept and to classify the unlabeled classification rules. The success of the previously developed feature projection based learning and classification techniques encouraged us to develop VFPI. The empirical results based on TCMB (Central Bank of Turkish Republic) data set give promising results.
Keywords: Rule interestingness, voting, feature projection based classification
BU-CE-0402: PDF

PHR: A Parallel Hierarchical Radiosity System with Dynamic Load Balancing
Ali Kemal Sinop, Tolga Abacı, Ümit Akkuş, Attila Gürsoy, and Uğur Güdükbay
In this paper, we present a parallel system called PHR for computing hierarchical radiosity solutions of complex scenes. The system is targeted for multi-processor architectures with distributed memory. The system evaluates and subdivides the interactions level by level in a breadth first fashion, and the interactions are redistributed at the end of each level to keep load balanced. In order to allow interactions freely travel across processors, all the patch data is replicated on all the processors. Hence, the system favors load balancing at the expense of increased communication volume. However, the results show that the overhead of communication is negligible compared with total execution time. We obtained a speed-up of 25 for 32 processors in our test scenes.
Keywords: Hierarchical radiosity, distributed memory architectures, load balancing.
BU-CE-0403: PDF

Reputation-Based Trust Management for P2P Networks
Ali Aydın Selçuk, Ersin Uzun and Mark Reşat Pariente
The open and anonymous nature of a P2P network makes it an ideal medium for attackers to spread mali- cious content. In this paper, we describe a reputation- based trust management protocol for P2P networks where users rate the reliability of parties they deal with, and share this information with their peers. The protocol helps establishing trust among good peers as well as identifying the malicious ones.
Results of various simulation experiments show that the proposed system can be highly effective in prevent- ing the spread of malicious content in P2P networks.
BU-CE-0404: PDF

A Message Ordering Problem in Parallel Programs
Bora Uçar and Cevdet Aykanat
We consider certain classes of parallel program segments in which the order of messages sent affects the completion time. We give characterization of the subject class of parallel programs and propose a solution to minimize the completion time. With a sample parallel program, we experimentally evaluate the effect of the solution on a PC cluster.
BU-CE-0405: PDF

A Distributed and Measurement-Based Framework Against Free Riding in Peer-To-Peer Networks
Murat Karakaya, İbrahim Körpeoğlu and Özgür Ulusoy
Peer-to-peer networks have attracted a significant amount of interest as a popular and successful alternative to traditional client-server networks for resource sharing and content distribution. However, the existence of high degrees of free riding may be an important threat against P2P networks. In this report, we propose a distributed method with the aim of reducing the degree of free riding in P2P networks. We primarily focus on locating free riders and taking actions against them. We propose a model in which each peer monitors its neighbors, makes decisions and takes appropriate actions. We specify three different free riding types and their symptoms observable from neighboring peers. We employ simple formulas to determine if a peer exhibits any kind of free riding. The counter actions to be applied to free riders are defined. We combine the mechanisms proposed to detect free riders and take appropriate actions in an Event-Condition-Action rule and a state diagram. Furthermore, we describe possible attacks to the proposed mechanisms and show how the system can handle them. By reducing the amount of free riding in a P2P network, we expect to increase quality of service, availability of content and services, robustness of the system, network load balancing, and scalability of the network.
BU-CE-0406: PDF

Incremental Classification Learning by Using Feature Projections
Tolga Aydın and H. Altay Güvenir
Classification learning is an important research topic in machine learning and data mining disciplines. In our study, CUFP (Classification by Using Feature Projections), a feature projection-based incremental classification-learning algorithm, was developed and tested on real world data sets, giving promising results. The training phase of CUFP constructs points and determines the counts of the training instances of each class at each point in the case of nominal feature projections. For linear feature projections, gaussian probability density functions are constructed for each class. In the classification phase, each feature projection distributes its vote among possible classes. The vote vectors of features are used according to some vote evaluation types and the query instance's class is predicted.
BU-CE-0407: PDF

A Performance Analysis of Bluetooth Broadcasting Scheme
Kaan Doğan, Güray Gürel, A. Kerim Kamcı and İbrahim Körpeoğlu
This paper studies the performance of current Bluetooth broadcasting scheme. Current Bluetooth broadcasting scheme may repeat transmitting (broadcasting) the same broadcasted baseband ACL packet several times to increase the reliability of broadcast over an unreliable Bluetooth radio channel. We have analyzed the effects of different Bluetooth baseband ACL packet types, each of which has a different size and error protection scheme, on the broadcast performance in terms of reliability and effective throughput that can be achieved over a given radio channel characteristics (i.e. a given bit error rate). As the result of our analysis, we determined the optimal packet type and re-transmission count combinations that can provide the highest effective throughput values for various practical BER ranges. These results can be used at Bluetooth baseband layer to dynamically adapt to varying channel conditions and to achieve a good broadcast performance.
BU-CE-0408: PDF

Aspect-Oriented Evolution of Legacy Information Systems (M. Sc. Thesis)
Yasemin Şatıroğlu
A legacy information system is an old system that typically has been developed several years ago, and remains in operation within an organization. Since the software requirements change, legacy systems must be evolved accordingly. Various approaches such as wrapping, migration and redevelopment have been proposed to maintain legacy information systems. Unfortunately, these approaches have not explicitly considered the concerns that are dicult to capture in single components, and tend to crosscut many components. Examples of such crosscutting concerns include distribution, synchronization, persistence, security, logging and real-time behavior. The crosscutting property of concerns seriously complicates the maintenance of legacy systems because the code of the system needs to be changed at multiple places, and conventional maintenance techniques fall short to do this effectively.
Aspect-Oriented Software Development (AOSD) provides explicit mechanisms for coping with these crosscutting concerns. However, current AOSD approaches have primarily focused on coping with crosscutting concerns in software systems that are developed from scratch. Hereby, the crosscutting concerns are implemented as aspects at the beginning, hence localized in single modules. In this way the implementation and maintenance of crosscutting concerns can be prepared to a large extent so that the maintenance of these systems will be easier later on. Unfortunately, legacy systems impose harsher requirements, because crosscutting concerns in legacy systems are neither explicitly identifyinged nor have been prepared before.
We provide a systematic process for analyzing the impact of crosscutting concerns on legacy systems. The process, which is called Aspectual Legacy Analysis Process (ALAP), consists of three sub-processes, Feasibility Analysis, Aspectual Analysis and Maintenance Analysis. All the three sub-processes consist of a set of heuristic rules and the corresponding control. Feasibility Analysis, which consists of two phases, describes rules for categorizing legacy systems, in the correspondingrst phase; and describes the rules for evaluating legacy systems with respect to the ability to implement static crosscutting and ability to implement dynamic crosscutting, in the second phase. The rules of the secondrst phase are based on the categories of legacy systems that we have describesned after a thorough study to legacy information systems, and the rules of the second phase are based on our discussion of these categories with respect to crosscutting implementation. Once the legacy system has been categorized and evaluated with respect to crosscutting implementation, the Aspectual Analysis sub-process describes rules for identifying and specifying aspects in legacy systems. Based on the results of the Feasibility Analysis and Aspectual Analysis sub-processes, the Maintenance Analysis describes the rules for the selection of the appropriate legacy maintenance approach.
ALAP has been implemented in the Aspectual Legacy Analysis Tool (ALAT), which implements the rules of the three sub-processes and as such helps to support the legacy maintainer in analyzing the legacy system and identifying the appropriate maintenance approach.
Keywords: Legacy Information Systems, Aspect-Oriented Software Development, Heuristic Rule Modelling.
BU-CE-0409: PDF

3D Garment Design and Simulation System (M. Sc. Thesis)
Funda Durupınar
In this thesis study, a 3D graphics environment for virtual garment design and simulation is presented. The proposed system enables the three dimensional construction of a garment from its two dimensional cloth panels, for which the underlying structure is a mass-spring model. Construction of the garment is performed through cutting, boundary smoothing , seaming and scaling. Afterwards, it is possible to do fitting on virtual mannequins like in the real life as if in a tailor's workshop. The behavior of cloth under different environmental conditions is implemented applying a physically-based approach. As well as the simulation of the draping of garments, efficient and realistic visualization of garments is an important issue in cloth modelling. There are various material types and reflectance properties for fabrics. We have implemented a number of material and rendering options such as knitwear, woven cloth and standard shading methods such as Gouraud shading. Performance results of the system are presented at the end.
Keywords: Garment design, Garment simulation, Fabric rendering, Physically-based modeling.
BU-CE-0410: PDF

Metadata-Based and Personalized Web Querying (Ph. D. Thesis)
Selma Ayşe Özel
The advent of the Web has raised new searching and querying problems. Keyword matching based querying techniques that have been widely used by search engines, return thousands of Web documents for a single query, and most of these documents are generally unrelated to the users' information needs. Towards the goal of improving the information search needs of Web users, a recent promising approach is to index the Web by using metadata and annotations.
In this thesis, we model and query Web-based information resources using metadata for improved Web searching capabilities. Employing metadata for querying the Web increases the precision of the query outputs by returning semantically more meaningful results. Our Web data model, named "Web information space model", consists of Web-based information resources (HTML/XML documents on the Web), expert advice repositories (domain-expert-specified metadata for information resources), and personalized information about users (captured as user profiles that indicate users' preferences about experts as well as users' knowledge about topics). Expert advice is specified using topics and relationships among topics (i.e., metalinks), along the lines of recently proposed topic maps standard. Topics and metalinks constitute metadata that describe the contents of the underlying Web information resources. Experts assign scores to topics, metalinks, and information resources to represent the "importance" of them. User profiles store users' preferences and navigational history information about the information resources that the user visits. User preferences, knowledge level on topics, and history information are used for personalizing the Web search, and improving the precision of the results returned to the user.
We store expert advices and user profiles in an object relational database management system, and extend the SQL for efficient querying of Web-based information resources through the Web information space model. SQL extensions include the clauses for propagating input importance scores to output tuples, the clause that specifies query stopping condition, and new operators (i.e., text similarity based selection, text similarity based join, and topic closure). Importance score propagation and query stopping condition allow ranking of query outputs, and limiting the output size. Text similarity based operators and topic closure operator support sophisticated querying facilities. We develop a new algebra called Sideway Value generating Algebra (SVA) to process these SQL extensions.
We also propose evaluation algorithms for the text similarity based SVA directional join operator, and report experimental results on the performance of the operator. We demonstrate experimentally the effectiveness of metadata-based personalized Web search through SQL extensions over the Web information space model against keyword matching based Web search techniques.
Keywords: Metadata based Web querying, topic maps, user profile, personalized Web querying, Sideway Value generating Algebra, score management, text similarity based join.
BU-CE-0411: PDF

An Energy-Efficient Scatternet Formation Algorithm for Bluetooth-Based Sensor Networks
Sain Saginbekov and İbrahim Körpeoğlu
In this paper, we propose an energy-efficient scatternet formation algorithm for Bluetooth based sensor networks. The algorithm is based on first computing a shortest path tree from the base station to all sensor nodes and then solving the degree constraint problem so that the degree of each node in the network is not greater than seven, which is a Bluetooth constaint. In this way, less amount of energy is spent in each round of communication in the sensor network. The algorithm also tries to balance the load evenly on the high-energy consuming nodes which are the nodes that are close to the base station. In this way, the lifetime of the first dying node is also prolonged. We obtained promising results in the simulations.
BU-CE-0412: PDF

Moving Object Detection, Tracking and Classification for Smart Video Surveillance (M. Sc. Thesis)
Yiğithan Dedeoğlu
Video surveillance has long been in use to monitor security sensitive areas such as banks, department stores, highways, crowded public places and borders. The advance in computing power, availability of large-capacity storage devices and high speed network infrastructure paved the way for cheaper, multi sensor video surveillance systems. Traditionally, the video outputs are processed online by human operators and are usually saved to tapes for later use only after a forensic event. The increase in the number of cameras in ordinary surveillance systems overloaded both the human operators and the storage devices with high volumes of data and made it infeasible to ensure proper monitoring of sensitive areas for long times. In order to filter out redundant information generated by an array of cameras, and increase the response time to forensic events, assisting the human operators with identification of important events in video by the use of "smart" video surveillance systems has become a critical requirement. The making of video surveillance systems "smart" requires fast, reliable and robust algorithms for moving object detection, classification, tracking and activity analysis.
In this thesis, a smart visual surveillance system with real-time moving object detection, classification and tracking capabilities is presented. The system operates on both color and gray scale video imagery from a stationary camera. It can handle object detection in indoor and outdoor environments and under changing illumination conditions. The classification algorithm makes use of the shape of the detected objects and temporal tracking results to successfully categorize objects into pre-defined classes like human, human group and vehicle. The system is also able to detect the natural phenomenon fire in various scenes reliably. The proposed tracking algorithm successfully tracks video objects even in full occlusion cases. In addition to these, some important needs of a robust smart video surveillance system such as removing shadows, detecting sudden illumination changes and distinguishing left/removed objects are met.
Keywords: Video-Based Smart Surveillance, Moving Object Detection, Background Subtraction, Object Tracking, Silhouette-Based Object Classification, Fire Detection.
BU-CE-0413: PDF

Experiments on Hypergraph Models for Parallelizing Preconditioned Iterative Methods
Bora Uçar and Cevdet Aykanat
We have developed hypergraph models to efficiently parallelize the preconditioned iterative methods that use explicit preconditioners. The models are discussed elsewhere. Here, we report our experiments.
Keywords: matrix partitioning, preconditioning, iterative method, parallel computing.
BU-CE-0414: PDF

Computing Moments of First Passage Times to a Subset of States in Markov Chains
Tuğrul Dayar and Nail Akar
This paper presents a relatively efficient and accurate method to compute the moments of first passage times to a subset of states in finite ergodic Markov chains. With the proposed method, the moment computation problem is reduced to the solution of a linear system of equations with the right-hand side governed by a novel recurrence for computing the higher order moments. We propose to use a form of the Grassmann-Taksar-Heyman (GTH) algorithm to solve these linear equations. Due to the form of the linear systems involved, the proposed method does not suffer from the drawbacks associated with GTH in a row-wise sparse implementation.
Keywords: Markov chains, unsafe states, first passage times, mean, variance, moments, GTH algorithm.
BU-CE-0415: PDF

Tight Conservative Occlusion Culling for Urban Visualization
Türker Yılmaz and Uğur Güdükbay
In this paper, we propose a framework for the visualization of urban environments. In this context, we also propose a novel tight conservative occlusion culling algorithm, which brings a solution to the partial occlusion problem. A conservative visibility culling algorithm accepts an object as completely visible even only a small portion of it is unoccluded. In our algorithm, we tried to avoid this by decomposing the objects into slices to achieve tight conservativeness. This sliced structure is queried for visibility at predefined grid locations in the scene. Since the slices of objects are checked for visibility instead of complete objects, only a very small portion of polygons of an object are sent to the graphics pipeline unnecessarily if a small portion of the object is unoccluded. The grid cells with approximately the same visibility lists are clustered to decrease the size of the visibility information that will be used during the visualization process. Empirical results show that an average speed up of 41% can be achieved using our tight conservative occlusion culling algorithm as compared to the classical conservative occlusion culling approach.
Keywords: 3D navigation, clustering, conservative visibility, occlusion culling, urban visualization, visibility preprocessing.
BU-CE-0416: PDF

Learning Interestingness of Streaming Classification Rules
Tolga Aydın and H. Altay Güvenir
Inducing classification rules on domains from which information is gathered at regular periods lead the number of such classification rules to be generally so huge that selection of interesting ones among all discovered rules becomes an important task. At each period, using the newly gathered information from the domain, the new classification rules are induced. Therefore, these rules stream through time and are so called streaming classification rules. In this paper, an interactive rule interestingness-learning algorithm (IRIL) is developed to automatically label the classification rules either as "interesting" or "uninteresting" with limited user interaction. In our study, VFP (Voting Feature Projections), a feature projection based incremental classification learning algorithm, is also developed in the framework of IRIL. The concept description learned by the VFP algorithm constitutes a novel approach for interestingness analysis of streaming classification rules.