All technical reports files are in compressed (gzip) postscript format.
The file names are of the sort BU-CEIS-YYXX.ps.z
BU-CEIS-9501
TITLE: Vertical Fragmentation of Superimposed Signature Files
Using Partial Evaluation of Queries
AUTHOR(S): Seyit Kocberber and Fazli Can
ABSTRACT: Our research extends the bit-sliced signature organization by
introducing a partial evaluation approach for queries and dividing the
signatures into variable sized vertical fragments. This approach, MFSF,
relaxes the optimality condition used in superimposed signature
generation. The analysis, which incorporates multi-term query schemes,
show that, with no space overhead, MFSF provides a query processing
time improvement of more than 70% with respect to the optimal bit-sliced
organization using partial query evaluation for both cases. Under the
sequentiality assumption of disk blocks, MFSF provides a desirable
response time of 1.5 seconds for a database size of one million records.
Due to partial evaluation, the desirable response time is guaranteed
for queries with several terms.
BU-CEIS-9502
TITLE: Logic Program Schemata: Synthesis and Analysis
AUTHOR(S): Pierre Flener
ABSTRACT: A program schema is a template program with a fixed control
and data flow, but without specific indications about the actual
parameters or the actual computations. A program schema and the
constraints on its place-holders encode a programming methodology, and
they are thus an interesting means of guiding many software engineering
tasks. First, I propose a notation and terminology for schemata and
their manipulations. Second, from a series of divide-and-conquer
programs, I synthesize four increasingly powerful divide-and-conquer
schemata and their constraints. Third, I outline a vision of
schema-guided program synthesis and report on particular synthesis
strategies that are guided by various schemata. Finally, I discuss the
related work and outline directions for future work.
BU-CEIS-9503
TITLE: A Network Access Protocol for Hard Real-Time Communication Systems
AUTHOR(S): Ozgur Ulusoy
ABSTRACT: Distributed hard real-time systems are characterized by
communication messages associated with timing constraints, typically
in the form of deadlines. A message should be received at the
destination before its deadline expires. Carrier sense multiple
access with collision detection CSMA/CD appears to be one of the most
common communication network access schemes that can be used in
distributed hard real-time systems. In this paper, we propose a new
real-time network access protocol which is based on the CSMA/CD
scheme. The protocol classifies the messages into two classes as
`critical' and `noncritical' messages. The messages close to their
deadlines are considered to be critical. A critical message is given
the right to access the network by preempting a noncritical message in
transmission. It is shown that the protocol can provide considerable
improvement over the virtual time CSMA/CD protocol proposed for hard
real-time communication by Zhao et al.
BU-CEIS-9504
TITLE: A Genetic Algorithm for Multicriteria Inventory Classification
AUTHOR(S): H. Altay Guvenir
ABSTRACT: One of application areas of the genetic algorithms is
parameter optimization. This paper addresses the problem of optimizing
a set of parameters that represent the weights of criteria, where the
sum of all weights is 1. A chromosome represents the values of the
weights, possibly along with some cut-off points. A new crossover
operation, called continuous uniform crossover, is proposed, such that
it produces valid chromosomes given that the parent chromosomes are
valid. The new crossover technique is applied to the problem of
multicriteria inventory classification. The results are compared with
the classical inventory classification technique using Analytical
Hierarchy Process.
BU-CEIS-9505
TITLE: Exploring States of Sparse, Large Markov Chains
AUTHOR(S): Tugrul Dayar and William J. Stewart
ABSTRACT: Dynamic state exploration is an effective technique to explore states
of sparse, large Markov chains having highly unbalanced stationary probabilities
and to compute related performance measures for them. States of interest are
explored starting from a particular state until a predetermined stopping
criterion is met. In this paper, the dynamic state exploration idea is extended
to the case where more than one starting state is considered simultaneously.
This approach easily lends itself to parallel implementation and reduces the
number of operations to be carried out for each starting state. Furthermore, it
is shown that this approach gives quite an accurate feeling for the nature of
the states without having to solve the complete Markov chain. Experiments are
carried out on a Markov chain arising from an Asynchronous Transfer Mode (ATM)
traffic queue.
BU-CEIS-9506
TITLE: Situations and Computation: An Overview of Recent Research
AUTHOR(S): Erkan Tin and Varol Akman
ABSTRACT: Situation theory has been developed over the last decade and
various versions of the theory have been applied to a number of
linguistic issues. However, not much work has been done in regard to
its computational aspects. In this paper, we review the existing
approaches towards `computational situation theory' with considerable
emphasis on our own research.
NOTE: This report will also appear in the University of Tuebingen
Seminer fuer Sprachwissenschaft's 1995 technical report series.
BU-CEIS-9507
TITLE: Modeling Context with Situations
AUTHOR(S): Mehmet Surav and Varol Akman
ABSTRACT: The issue of context arises in assorted areas of Artificial
Intelligence. Although its importance is realized by various
researchers, there is not much work towards a useful formalization.
In this paper, we will present a preliminary model (based on Situation
Theory) and give examples to show the use of context in various
fields, and the advantages gained by the acceptance of our proposal.
NOTE: This work will be presented in IJCAI-95, Workshop on "Modeling
Context in Knowledge Representation and Reasoning," Montreal, Canada
(August 20, 1995).
BU-CEIS-9508
TITLE: CLASSIFICATION WITH OVERLAPPING FEATURE INTERVALS
AUTHOR(S): Hakime Unsal Koc
ABSTRACT: This thesis presents a new form of exemplar-based learning
method, based on overlapping feature intervals. Classification with
Overlapping Feature Intervals (COFI) is a particular implementation
of this technique. In this incremental, inductive and supervised
learning method, the basic unit of the representation is an interval.
The COFI algorithm learns the projections of the intervals in each
class dimension for each feature. An interval is initially a point on
a class dimension, then it can be expanded through generalization.
No specialization of intervals is done on class dimensions by this
algorithm. Classification in the COFI algorithm is based on a majority
voting among the local predictions that are made individually by each
feature.
NOTE: MSc. Thesis.
BU-CEIS-9509
TITLE: Predicate Invention in Inductive Program Synthesis
AUTHOR(S): Pierre Flener
ABSTRACT: In Inductive Logic Programming, predicate invention is the
process of introducing a hitherto unknown predicate, and its description,
into the description of the currently learned predicate. This is only
necessary when a finite axiomatization of the current predicate is
otherwise impossible, in which case the description of the invented
predicate is recursive. So necessary predicate invention is a program
synthesis task, and had thus best be done by a synthesizer rather than
by a general purpose learner, because one can then re-use the vast body
of knowledge about recursive algorithm design. Taking a schema-guided
approach, I show how predicate invention can be performed (if not
avoided) by intelligent re-use from structured background knowledge,
by necessarily successful intelligent dialogue with the user, and by
structural or computational generalization of the original problem.
BU-CEIS-9510
TITLE: Contexts, Oracles, and Relevance
AUTHOR(S): Mehmet Surav and Varol Akman
ABSTRACT: We focus on how we should define the relevance of
information to a context for information processing agents, such as
oracles. We build our formalization of relevance upon works in
pragmatics which refer to contextual information without giving any
explicit representation of context. We use a formalization of context
(due to us) in Situation Theory, and demonstrate its power in this
task. We also discuss some computational aspects of this formalization.
NOTE: A revised version of this work will be presented in the AAAI-95
Fall Symposium on "Formalizing Context", MIT, Cambridge, MA (Nov. 1995).
BU-CEIS-9511
TITLE: A CONSTRAINT-BASED CASE FRAME LEXICON ARCITECTURE
AUTHOR(S): Kemal Oflazer and Okan Yilmaz
ABSTRACT: Recent advances in theoretical and implementational aspects
of feature and constraint-based formalisms for representing linguistic
information have fostered research on the use of such formalisms in
the design and implementation of computational lexicons. Case-frame
approach has been the representation of choice especially for
languages with free constituent order, explicit case marking of noun
phrases and embedded clauses filling nominal syntactic roles. The
semantics of such syntactic role fillers are usually determined by
their lexical semantic and morphosyntactic properties, instead of
position in the sentence \cite{onto}. In this paper we present our
approach to building a constraint-based case frame lexicon for use in
natural language processing in Turkish.
A number of observations that we have made on Turkish have indicated
that we have to go beyond the traditional transitive and intransitive
distinction, and utilize a framework where verb valence is considered
as the obligatory co-existence of an arbitrary subset of possible
arguments along with the obligatory exclusion of certain others,
relative to a verb sense. Additional morphosyntactic, lexical and
semantic constraints are utilized to map a given syntactic structure
to a specific verb sense.
NOTE: To appear in Proceedings of the Workshop on the Computational
Lexicon, in European Summer School on Logic, Language and Information,
Barcelona, Spain, August 1995.
BU-CEIS-9512
TITLE: ADAPTIVE SOURCE ROUTING AND ROUTE GENERATION FOR MULTICOMPUTERS
(M. Sc. Thesis)
AUTHOR: Yucel Aydogan
ABSTRACT: Scalable multicomputers are based upon interconnection
networks that typically provide multiple communication routes between
any given pair of processor nodes. In such networks, the selection of
the routes is an important problem because of its impact on the
communication performance. We propose the adaptive source routing
(ASR) scheme which combines adaptive routing and source routing into
one which has the advantages of both schemes. In ASR, the degree of
adaptivity of each packet is determined at the source processor. Every
packet can be routed in a fully adaptive or partially adaptive or
non-adaptive manner, all within the same network at the same time. The
ASR scheme permits any network topology to be used provided that
deadlock constraints are satisfied. We evaluate and compare
performance of the adaptive source routing and non-adaptive
randomized routing by simulations. Also we propose an algorithm to
generate adaptive routes for all pairs of processors in any multistage
interconnection network. Adaptive routes are stored in a route table
in each processor's memory and provide high bandwidth and reliable
interprocessor communication. We evaluate the performance of the
algorithm on IBM SP2 networks in terms of obtained bandwidth, time to
fill in the route tables, and efficiency exploited by the parallel
execution of the algorithm.
BU-CEIS-9513
TITLE: Generalized Vertical Partitioning of Signature Files
AUTHOR(S): Seyit Kocberber and Fazli Can
ABSTRACT: A new signature file method, called Generalized Multi-Fragmented
Signature File (GMFSF), is presented. The new method provides a unified
framework for other vertical partitioning signature files. The performance
of GMFSF is measured with a prototype information retrieval system using a
database of 152,850 MARC records. The experimental results agree with the
theoretical analysis. The response time of GMFSF decreases with an increasing
number of query terms and is independent of the number of hits to the query.
These features make the method competitive with inverted files, and the
experiments further show that GMFSF performs better than inverted files
for the non-zero hit conjunctive queries with more than three terms.
BU-CEIS-9514
TITLE: Analysis of Concurrency Control Protocols for Real-Time
Database Systems
AUTHOR: Ozgur Ulusoy
ABSTRACT: This paper provides an approximate analytic solution method
for evaluating the performance of concurrency control protocols
developed for real-time database systems (RTDBSs).
Transactions processed in a RTDBS are associated with timing
constraints typically in the form of deadlines. The primary
consideration in developing a RTDBS concurrency control protocol
is the fact that satisfaction of the timing constraints of transactions
is as important as maintaining the consistency of the underlying
database. The proposed solution method provides the evaluation
of the performance of concurrency control protocols
in terms of the satisfaction rate of timing constraints.
As a case study, a RTDBS concurrency control protocol,
called High Priority, is analyzed using the proposed method.
The accuracy of the performance results obtained is ascertained
via simulation. The solution method is also used to investigate
the real-time performance benefits of the High Priority over the
ordinary Two-Phase Locking.
BU-CEIS-9515
TITLE: Error-tolerant Tree Matching
AUTHOR: Kemal Oflazer
ABSTRACT: This paper presents an efficient algorithm for
retrieving from a database of trees, all trees that match a given
query tree approximately, that is, within a certain error tolerance. It
has natural language processing applications in searching for matches in
example-based translation systems, and retrieval from lexical
databases containing entries of complex feature structures. The
algorithm has been implemented on SPARCStations, and for large
randomly generated synthetic tree databases (some having tens of
thousands of trees) it can associatively search for trees with a
small error, in a matter of tenths of a second to few seconds.}