All technical reports files are in compressed (gzip) postscript format.
The file names are of the sort BU-CEIS-YYXX.ps.z
BU-CEIS-9401
TITLE: Spelling Correction in Agglutinative Languages
AUTHOR(S): Kemal Oflazer and Cemaleddin Guzey
ABSTRACT: This paper presents an approach to spelling
correction in agglutinative languages that is based on two-level
morphology and a dynamic programming based search algorithm. Spelling
correction in agglutinative languages is significantly different than in
languages like English, because the concept of a word in such
languages is much wider that the entries found in a dictionary, owing to
productive word formation by derivational and inflectional
affixations. After an overview of certain issues and relevant
mathematical preliminaries, we formally present the problem and the
our proposed solution. We then present results from our experiments
with spelling correction in Turkish, a Ural--Altaic agglutinative
language.
BU-CEIS-9402
TITLE: Parsing Turkish with the Lexical Functional Grammar Formalism
AUTHOR(S): Zelal Gungordu and Kemal Oflazer
(REVISED Oct 26 1994)
ABSTRACT: This paper describes our work on parsing Turkish using
the {\em lexical-functional grammar} formalism. This work represents the
first effort for parsing Turkish. Our implementation is
based on Tomita's parser developed at Carnegie-Mellon University
Center for Machine Translation. The grammar covers a substantial
subset of Turkish including simple and complex sentences, and deals
with a reasonable amount of word order freeness. The complex agglutinative
morphology of Turkish lexical structures is handled using a separate
two-level morphological analyzer. After a discussion of key relevant
issues regarding Turkish grammar, we discuss aspects of our system and
present results from our implementation. Our initial results suggest
that our system can parse about 82\% of the sentences directly and
almost all the remaining with very minor pre-editing.
BU-CEIS-9403
TITLE: EFFICIENCY OF RANDOMLY ASSIGNING TASKS TO PROCESSORS
AUTHOR(S): Wm. Randolph Franklin, Chandrasekhar Narayanaswami, Mohan
S. Kankanhalli, and Varol Akman
ABSTRACT: We have implemented the uniform grid technique - due to the first
author - on several parallel machines. It is complicated to globally
equalize the load across the processors to correct for the fact that
some grid cells have more data than others. Therefore, in our
implementations we randomly assigned grid cells to different
processors. The actual performance observed was very good and we
obtained a near-linear speedup for many different algorithms. In this
paper, we analyze the efficiency of this random assignment method of
load balancing for parallel machines. The problem of analyzing the
efficiency of leaving the loads uneven may be abstracted as follows.
Suppose that we have N independent tasks of unit weight and P
processors to execute them. Assume that we are randomly, uniformly,
and independently assigning each job to one of the processors. Since
some processors will have more tasks than others, the efficiency, or
the optimal time, N/P, divided by the expected time until the last
processor finishes will be less than unity. This paper analyzes this
and presents finite numbers and asymptotic equations. For example, N
= 1000, P = 100 gives 53% efficiency. N = 1000, P = 1000 gives 18%
efficiency. These efficiency figures are surprisingly high and they
explain the good performance of our implementations.
BU-CEIS-9404
TITLE: REVIEW OF ACTION SEMANTICS BY PETER D. MOSSES
AUTHOR(S): Varol Akman
ABSTRACT: This is a review of Peter D. Mosses' recent book ACTION SEMANTICS
(Cambridge Tracts in Theoretical Computer Science 26, Cambridge
University Press, U.K., 372 + xx pages, 1992).
BU-CEIS-9405
TITLE: THE LOGIC OF COUNTERACTIONS
AUTHOR(S): Erkan Tin and Varol Akman
ABSTRACT: We extend causal theories and study actions in domains involving
multiple agents. Causal theories, invented by Yoav Shoham, are based
on a temporal nonmonotonic logic and have computationally tractable
aspects. Since Shoham's formalism does not provide an adequate
mechanism for representing simultaneous actions and specifying their
consequences, we introduce the notion of counteractions while
preserving the efficiency and model-theoretic properties of causal
theories.
BU-CEIS-9406
TITLE: ISSUES IN COMMONSENSE SET THEORY
AUTHOR(S): Mujdat Pakkan and Varol Akman
ABSTRACT: The success of set theory as a foundation for mathematics inspires
its use in artificial intelligence, particularly in commonsense reasoning.
In this survey, we briefly review classical set theory from an AI
perspective, and then consider alternative set theories. Desirable
properties of a possible commonsense set theory are investigated,
treating different aspects like cumulative hierarchy, self-reference,
cardinality, etc. Assorted examples from the ground-breaking research
on the subject are also given.
BU-CEIS-9407
TITLE: COMMONSENSE ASPECTS OF BUYING AND SELLING
AUTHOR(S): Murat Ersan, Ebru Ersan, and Varol Akman
ABSTRACT: We describe an experimental program that implements a commonsense
micro-theory for buying and selling. Our system characterizes how
intelligent agents hold items and money, how they buy and sell items,
and the way money and items are transferred. The ontology of the
system includes money (cash, check, credit card), agents (people or
organizations), items (movable, real estate, service), barter, and the
notions of transfer, loan, buying by installments, profit, and
loss. Our work has been motivated by the Cyc project of Douglas Lenat.
BU-CEIS-9408
TITLE: REPRESENTING EMOTIONS IN TERMS OF OBJECT DIRECTEDNESS
AUTHOR(S): Varol Akman and Hakime G. Unsal
ABSTRACT: A logical formalization of emotions is considered to be tricky because
they appear to have no strict types, reasons, and consequences. On
the other hand, such a formalization is crucial for commonsense
reasoning. Here, the so-called ``object directedness'' of emotions is
studied by using Helen Nissenbaum's influential ideas.
BU-CEIS-9409
TITLE: NONSTANDARD SET THEORIES AND INFORMATION MANAGEMENT
AUTHOR(S): Varol Akman and Mujdat Pakkan
ABSTRACT: The merits of set theory as a foundational tool in mathematics
stimulate its use in various areas of artificial intelligence, in
particular intelligent information systems. In this paper, a study of
various nonstandard treatments of set theory from this perspective is
offered. Applications of these alternative set theories to information
or knowledge management are surveyed.
BU-CEIS-9410
TITLE: SETS IN CONTEXT AND RELATED ISSUES
AUTHOR(S): Varol Akman and Mehmet Surav
ABSTRACT: We formalize the notion of ``context'' for sets. We employ two
predefined constructs for this - an extension predicate and a context
sensitive membership relation. Ext returns the members of its first
argument, with the context C specified as its second argument. The new
context sensitive membership relation holds when the left hand side of
the relation is in the extension of the right hand side. We show that
the new membership relation is useful for a ``commonsense set theory''
a la Perlis and Zadrozny, and discuss related issues.
BU-CEIS-9411
TITLE: AXIOMATIC SET THEORIES AND COMMON SENSE
AUTHOR(S): Mehmet Surav and Varol Akman
ABSTRACT: Various axiomatic set theories (ZF, NBG, NF, and KPU) are studied
with a critical eye. The basic mathematical and philosophical reasons
behind their axioms are given, as well as their review from the
commonsense point of view. An introduction to a ``commonsense set
theory'' is attempted at the end.
BU-CEIS-9412
TITLE: REAL-TIME TRANSACTION SCHEDULING IN DATABASE SYSTEMS
AUTHOR(S): Ozgur Ulusoy and Geneva G. Belford
ABSTRACS: A database system supporting a real-time application, which can
be called `a real-time database system (RTDBS)', has to provide real-time
information to the executing transactions.Each RTDB transaction is
associated with a timing constraint, usually in the form of a
deadline. Efficient resource scheduling algorithms and concurrency
control protocols are required to schedule the transactions so as to
satisfy both timing constraints and data consistency requirements. In
this paper we concentrate on the concurrency control problem in
RTDBS's. Our work has two basic goals: real-time performance
evaluation of existing concurrency control approaches in RTDBS's,
and proposing new concurrency control protocols with improved
performance. One of the new protocols is locking-based, and
it prevents the priority inversion problem by scheduling the data lock
requests based on prioritizing data items. The second new protocol
extends the basic timestamp-ordering method by involving real-time
priorities of transactions in the timestamp assignment procedure.
Performance of the protocols is evaluated through simulations by using a
detailed model of a single-site RTDBS. The relative performance
of the protocols is examined as a function of transaction load,
data contention (which is determined by a number of system parameters),
and resource contention. The protocols are also tested under various
real-time transaction processing environments. The performance of the
proposed protocols appears to be good, especially under conditions of
high transaction load and high data contention.
Keywords: Real-time database systems, transaction scheduling, concurrency
control, performance evaluation.
BU-CEIS-9413
TITLE: PROCESSING REAL-TIME TRANSACTIONS IN A REPLICATED DATABASE SYSTEM
AUTHOR(S): Ozgur Ulusoy
ABSTRACT: A database system supporting a real-time application
has to provide real-time information to the executing transactions.
Each real-time transaction is associated with a timing constraint,
typically in the form of a deadline. It is difficult to satisfy all
timing constraints due to the consistency requirements of the
underlying database. In scheduling the transactions it is aimed to
process as many transactions as possible within their deadlines.
Replicated database systems possess desirable features for real-time
applications, such as a high level of data availability, and
potentially improved response time for queries. On the other hand,
multiple copy updates lead to a considerable overhead due to the
communication required among the data sites holding the copies.
In this paper, we investigate the impact of storing multiple copies of data
on satisfying the timing constraints of real-time transactions.
A detailed performance model of a distributed database system is employed
in evaluating the effects of various workload parameters and design
alternatives on the system performance. The performance is expressed
in terms of the fraction of satisfied transaction deadlines. A
comparison of several real-time concurrency control protocols, which
are based on different approaches in involving timing constraints of
transactions in scheduling, is also provided in performance
experiments.
Index Terms -- Real-time database systems, data replication,
transaction scheduling, concurrency control, performance evaluation.
BU-CEIS-9414
TITLE: EXCURSIONS TO A TOPOLOGICAL PICTUREBOOK
AUTHOR(S): Ahmet Arslan and Varol Akman
ABSTRACT: George K. Francis' A TOPOLOGICAL PICTUREBOOK [Springer-Verlag,
New York (1987)] is full of complicated topological figures, mostly
drawn manually. Our goal is to automate the production of such
illustrations and to obtain publication-quality hardcopy using
assorted techniques of computer graphics. To that end, sweeping is
discussed as a major surface modelling tool. Some interactive methods
are given to produce interesting topological surfaces. The program Tb
(which stands for `Topologybook') is described and various pictures
generated by this software are presented. Tb is a free-form surface
modeller and produces topological shapes with little effort. Central
to the implementation of Tb is a paradigm of solid modelling in which
computation of a shape is regarded as sweeping with some parametric
variations, viz. shape = sweep + control.
CAVEAT: The final copy of this manuscript includes some color pictures
which are absent from this version. Please note that the postscript
file is large (approx. 14 Mbytes).
BU-CEIS-9415
TITLE: BABY-SIT: A COMPUTATIONAL MEDIUM BASED ON SITUATIONS
AUTHOR(S): Erkan Tin and Varol Akman
ABSTRACT: While situation theory and situation semantics provide an
appropriate framework for a realistic model-theoretic treatment of
natural language, serious thinking on their `computational' aspects
has just started. Existing proposals mainly offer a Prolog- or
Lisp-like programming environment with varying degrees of divergence
from the ontology of situation theory. In this paper, we introduce a
computational medium (called BABY-SIT) based on situations. The
primary motivation underlying BABY-SIT is to facilitate the
development and testing of programs in domains ranging from
linguistics to artificial intelligence in a unified framework built
upon situation-theoretic constructs.
CAVEAT: The final copy of this manuscript includes Figure 4 (a screen
dump) which is absent from this version.
BU-CEIS-9416
TITLE: A Tool for Tagging Turkish Text
AUTHOR(S): Kemal Oflazer and Ilker Kuruoz
ABSTRACT:Automatic text tagging is an important component in higher
level analysis of text corpora, and its functionality output can also
be used in many natural language processing applications. This paper
describes a part-of-speech (POS) tagger for Turkish text. It is based
on a full scale two-level morphological specification of Turkish
implemented on the PC-KIMMO environment, augmented with statistical
information compilation and use, multi-word construct recognition and
constraint- and heuristics-based morphological, and POS ambiguity
resolution. The tagger also has additional functionality for fine
tuning of the morphological analyzer, such as logging erroneous
parses, commonly used roots etc. The output of the tagger can be used
in further syntactic and semantic analysis.
BU-CEIS-9417
TITLE: SITUATED MODELING OF EPISTEMIC PUZZLES
AUTHOR(S): Murat Ersan and Varol Akman
ABSTRACT: Situation theory is a mathematical theory of meaning
introduced by Jon Barwise and John Perry. It has evoked great
theoretical and practical interest and motivated the framework of a
few `computational' systems. Unfortunately, there is a lack of `real
life' applications on these systems and this paper is a preliminary
attempt to remedy this situation. Here, we solve a class of epistemic
puzzles, introduced by Raymond Smullyan, using the situation-theoretic
programming language PROSIT.
BU-CEIS-9418
TITLE: Symbols, Signs and Cognition
AUTHOR(S): David Davenport
Abstract: Cognition is inextricably linked with the concepts of signs
and symbols. The classical, computational, view of cognition,
explicitly states this in its "physical symbol system hypothesis." Its
major competitor, connnectionism, while claiming to be subsymbolic,
implicitly acknowledges both signs and symbols. Unfortunately, both
theories have their problems. The computational paradigm fails to
explain where symbols come from and how they acquire meaning. The
connectionist paradigm fails to provide sufficient representational
power and anyway is based on the discredited associationalist
philosophy. While the problems are different, the precise nature of
the two approaches and the essential difference between them, remains
something of an enigma. This paper outlines a theory of "symbols"
which not only appears to provide a clear basis for distinguishing
between the two paradigms, but also points the way to a new
architecture which may help solve the mystery of cognition.
Keywords: Artificial intelligence, symbol systems,
connectionism, cognition
BU-CEIS-9419
TITLE: SITUATED PROCESSING OF PRONOMINAL ANAPHORA
AUTHOR(S): Erkan Tin and Varol Akman
ABSTRACT: We describe a novel approach to the analysis of pronominal
anaphora in Turkish. A computational medium which is based on
situation theory is used as our implementation tool. The task of
resolving pronominal anaphora is demonstrated in this environment
which employs situation-theoretic constructs for processing.
BU-CEIS-9420
TITLE: EFFECTS OF STEMMING ON TURKISH TEXT RETRIEVAL
AUTHOR(S): Aysin Solak and Fazli Can
ABSTRACT: A stemming algorithm developed for the Turkish language is presented.
The retrieval performance of the algorithm is then measured in terms
of its effect on retrieval performance of a best-match retrieval system.
The results show that the stemming algorithm reduces the index dictionary
size significantly and also improves the retrieval effectiveness.
Keywords: text retrieval
BU-CEIS-9421
TITLE: A Performance Evaluation Model for Distributed Real-Time Databases
AUTHOR(S) OZGUR ULUSOY, GENEVA G. BELFORD
ABSTRACT: A real-time database system (RTDBS) is designed to provide timely
response to the transactions of data-intensive applications. Each transaction
processed in a RTDBS is associated with a timing constraint in the form of
a deadline.
In this paper, a performance evaluation model is provided to enable distributed
RTDBS designers to analyze transaction scheduling algorithms.
The model is developed progressing from a simple mathematical analysis to
complicated simulations. The performance is expressed in terms of the
fraction of satisfied transaction deadlines.
The paper also provides an example simulation experiment implemented
using the model presented.
BU-CEIS-9422
TITLE: A Study of Two Transaction Processing Architectures for Distributed
Real-Time Database Systems
AUTHOR(S): Ozgur Ulusoy
ABSTRACT: A real-time database system (RTDBS) is designed to provide timely
response to the transactions of data-intensive applications.
Processing a transaction in a distributed RTDBS environment presents the design
choice of how to provide access to remote data referenced by the transaction.
Satisfaction of the timing constraints of transactions should be the primary
factor to be considered in scheduling accesses to remote data. In this paper,
we describe and analyze two different alternatives to this fundamental design
decision. With the first alternative, transaction operations are executed at
the sites where required data pages reside. The other alternative is based on
transmitting data pages wherever they are needed. Although the latter approach
is characterized by large message volumes carrying data pages, it is shown
in our experiments to perform better than the other approach under most of
the workloads and system configurations tested. The performance metric used
in the evaluations is the fraction of transactions that satisfy their timing
constraints.
Keywords: Real-time database systems, distributed transaction processing,
performance evaluation.
BU-CEIS-9423
TITLE:USING A CORPUS FOR TEACHING TURKISH MORPHOLOGY
AUTHOR(S): Altay Guvenir and Kemal Oflazer
ABSTRACT:This paper reports on the preliminary phase of our ongoing research
towards developing an intelligent tutoring environment for Turkish grammar.
One of the components of this environment is a corpus search tool
which, among other aspects of the language, will be used to present
the learner sample sentences along with their morphological analyses.
Following a brief introduction to the Turkish language and its
morphology, the paper describes the morphological analysis and ambiguity
resolution used to construct the corpus used in the search tool.
Finally, implementation issues and details involving the user interface of the
tool are discussed.
BU-CEIS-9424
TITLE: Object-Space Parallel Polygon Rendering on Hypercube-connected
Multicomputers
AUTHOR(S): Tahsin M. Kurc
, Cevdet Aykanat, and Bulent Ozguc
ABSTRACT: This paper presents algorithms developed for object-space
parallelism for polygon rendering on hypercube-connected multicomputers.
In object-space parallelism, each processor is given the responsibility to
render a portion of the input polygons. Each processor shades and z-buffers
the locally generated pixels. After this local rendering, the remaining
pixels in each processor should be globally z-buffered. Efficient
parallelization of this pixel merging operation is important. This is
the only place where an overhead is incurred due to parallelization.
In this paper, efficient algorithms are presented to perform local
rendering and global pixel merging operations. A modified scanline based
z-buffer algorithm is proposed. This algorithm reduces the number of pixels
sent and received in the pixel merging operation and avoids the
re-initialization of the z-buffer for each scanline. Various algorithms are
proposed for pixel merging phase. These algorithms use different
communication characteristics of the hypercube multicomputers. Efficient
algorithms for load balancing in the pixel merging operation is also
proposed and presented. Experimental results obtained on a 16-processor
Intel's iPSC/2 hypercube multicomputer are also presented.
BU-CEIS-9425
TITLE: An Evaluation of the Validity of the Dexter
Hypertext Reference Model: A Case Study
AUTHOR(S): Bilge (Karal) Say and Fazli Can
ABSTRACT: Various formal models have been proposed for defining hypertext
systems. Among them, the Dexter Hypertext Reference Model is gaining
wide acceptance as a base for the design of future hypertext systems
and interoperability tools. However, this model has yet to be extended
and validated against the hypertext systems it took as its basis. This
paper is an attempt in the latter aspect, namely considering a widely
used hypertext system and validating it against the original model.
For this purpose, HyperCard, which was indeed one of the Dexter
Hypertext Reference Model's ``reference'' systems, is chosen. For
validation, we use the concept of refinement with Z, a formal
specification language.
BU-CEIS-9426
TITLE: An Analysis of Frame Sliced Signature Files
AUTHOR(S): Bulent Tezcan, Gokhan Tur and Fazli Can
ABSTRACT: Signature files provide an efficient access tool for
information retrieval. They reflect the contents of data objects
in terms of bit patterns and act as a filter during query
processing. In this study, an exact formula for the false drop
probability estimation of multiterm queries for Frame Sliced
Signature Files (FSSF) is given. The experimental results, which
are obtained using the INSPEC database of 12,684 documents, are
provided and compared with the theoretical result
BU-CEIS-9427
TITLE: Incremental Query Evaluation for Vertically Partitioned Signature
Files in Very Large Databases
AUTHOR(S): Seyit KOCBERBER and Fazli CAN
ABSTRACT: Signature files provide efficient retrieval of formatted and
unformatted data with a small space overhead. The main idea of all
signature based schemes is to reflect the essence of the data objects
into bit patterns. Vertical partitioning of signature files provides fast
filtering of unqualifying objects by accessing only those bits of the object
signatures that are set in the query signature. For very large databases
even one slice of database signatures may occupy several disk blocks.
An incremental query evaluation method, called INCBIT, is proposed which
dynamically avoids the unnecessary blocks of bit slices which need to be
accessed by utilizing the conjunctive nature of the queries. Three methods
to enhance the performance of INCBIT are introduced. In the first method,
bit slices are divided into vertical fragments in which the density of
on-bits vary. For high weight queries, after reducing the false drop
probability to a negligible level, the remaining on-bits of the query are
not used due to the incremental nature of INCBIT. Based on this observation,
first using the query bits in the fragments with low on-bit density
increases the performance. The second method originates from the fact that
clustering the signatures of the records containing discriminating terms
obtains a few on-blocks at the end of query processing. This prevents the
deterioration of the performance for the majority of the non-zero hit user
queries and improves the performance. In the third method, a bit-map for
signature blocks is kept in memory for the discriminating terms used in
signature clustering. The block bit-map of a term shows the positions of
the signature blocks which contain at least one record in which the term
exists. This data structure eliminates the block accesses produced due to
high false drops at the beginning of query evaluation. The terms with lower
database occurrence frequency are specified more frequently in the queries.
Such discriminating terms constitute about 20% of all terms; therefore,
the bit-map space overhead is small.
BU-CEIS-9428
TITLE: Parallelization of an Interior Point Algorithm
for Linear Programming (M.Sc. Thesis)
AUTHOR: Huseyin Simitci
ABSTRACT: In this study, we present the parallelization of Mehrotra's
predictor-corrector interior point algorithm, which is a
Karmarkar-type optimization method for linear programming.
Computation types needed by the algorithm are identified
and parallel algorithms for each type are presented.
The repeated solution of large symmetric sets of linear equations,
which constitutes the major computational effort in
Karmarkar-type algorithms, is studied in detail. Several forward and
backward solution algorithms are tested, and buffered backward
solution algorithm is developed. Heurustic bin packing algorithms are
used to schedule sparse matrix-vector product and factorization
operations. Algorithms having the best performance results are used
to implement a system to solve linear programs in parallel on
multicomputers. Design considerations and implementation details of
the system are discussed, and some performance results are presented
from a number of real problems.
BU-CEIS-9429
TITLE: FOCUSING FOR PRONOUN RESOLUTION IN ENGLISH DISCOURSE:
AN IMPLEMENTATION (*)
AUTHOR(S): Ebru Ersan and Varol
Akman
ABSTRACT: Anaphora resolution is one of the most active research areas
in natural language processing. This study examines focusing as a tool
for the resolution of pronouns which are a kind of anaphora. Focusing
is a discourse phenomenon like anaphora. Candy Sidner formalized
focusing in her 1979 MIT PhD thesis and devised several algorithms to
resolve definite anaphora including pronouns. She presented her theory
in a computational framework but did not generally implement the
algorithms. Her algorithms related to focusing and pronoun resolution
are implemented in this thesis. This implementation provides a better
comprehension of the theory both from a conceptual and a computational
point of view. The resulting program is tested on different discourse
segments, and evaluation and analysis of the experiments are presented
together with the statistical results.
(*) Revised version of the first author's M.S. thesis.
BU-CEIS-9430
TITLE: SITUATED MODELING OF EPISTEMIC PUZZLES (*)
AUTHOR(S): Murat Ersan and Varol Akman
ABSTRACT: Situation theory is a mathematical theory of meaning
introduced by Jon Barwise and John Perry. It has evoked great
theoretical and practical interest and motivated the framework of a
few `computational' systems. PROSIT is the pioneering work in this
direction. Unfortunately, there is a lack of real-life applications
on these systems and this study is a preliminary attempt to remedy
this deficiency. Here, we examine how much PROSIT reflects
situation-theoretic concepts and solve a group of epistemic puzzles
using the constructs provided by this programming language.
(*) Revised version of the first author's M.S. thesis
The report supercedes BU-CEIS-9417
BU-CEIS-9431
TITLE: TAGGING AND MORPHOLOGICAL DISAMBIGUATION OF TURKISH TEXT
(M.Sc. Thesis)
AUTHOR(S): Ilker Kuruoz
ABSTRACT: A part-of-speech (POS) tagger is a system that uses various
sources information to assign possibly unique POS to words. Automatic
text tagging is an important component in higher level analysis of
text corpora. Its output can also be used in many natural language
processing applications. In languages like Turkish or Finnish, with
agglutinative morphology, morphological disambiguation is a very
crucial process in tagging as the structures of many lexical forms are
morphologically ambiguous. This thesis presents a POS tagger for
Turkish text based on a full-scale two-level specification of Turkish
morphology. The tagger is augmented with a multi-word and idiomatic
construct recognizer, and most importantly morphological disambiguator
based on local lexical neighborhood constraints, heuristics and
limited amount of statistical information. The tagger also has
additional functionality for statistics compilation and fine tuning of
the morphological analyzer, such as logging erroneous morphological
parses, commonly used roots, etc. Test results indicate that the
tagger can tag about 97% to 99% of the texts accurately with very
minimal user intervention. Furthermore for sentences morphologically
disambiguated with the tagger, an LFG parser developed for Turkish,
on the average, generates 50% less ambiguous parses and parses almost
2.5 times faster.
BU-CEIS-9432
TITLE: Research Issues in Real-Time Database Systems
AUTHOR(S): Ozgur Ulusoy
ABSTRACT: Today's real-time systems are characterized by managing large
volumes of data. Efficient database management algorithms
for accessing and manipulating data are required to satisfy timing
constraints of supported applications. Real-time database systems is a new
research area investigating possible ways of applying database systems
technology to real-time systems. Management of real-time information through a
database systems requires the integration of concepts from both real-time
systems and database systems.
Some new criteria need to be developed to involve timing
constraints of real-time applications in many database systems design issues,
such as transaction/query processing, data buffering, CPU and IO scheduling.
In this paper, a basic understanding of the issues in real-time database
systems is provided and the
research efforts in this area are introduced. Different approaches to various
problems of real-time database systems are briefly described, and possible
future research directions are discussed.
-
BU-CEIS-9433
TITLE: CONTEXTS AND SITUATIONS (*)
AUTHOR(S): Mehmet Surav and Varol Akma
n
ABSTRACT: The issue of context arises in assorted areas of Artificial
Intelligence, Mathematical Logic, and Natural Language Semantics.
Although its importance is realized by various researchers, there is
not much work towards a useful formalization. In this report, we will
try to identify the problem, and decide what we need for an acceptable
(formal) account of the notion of context. We will present a
preliminary model (based on Situation Theory) and give examples to
show the use of context in various fields, and the advantages gained
by the acceptance of our proposal.
(*) Revised version of the first author's M.S. thesis
BU-CEIS-9434
TITLE: AN EFFICIENT MAPPING HEURISTIC FOR MESH-CONNECTED PARALLEL
ARCHITECTURES BASED ON MEAN FIELD ANNEALING
AUTH
OR(S) : Ismail Haritaoglu , Cevdet Ayakanat
ABSTRACT: A new Mean Field Annealing (MFA) formulation is proposed for
the mapping problem for mesh-connected architectures.
The proposed MFA heuristic exploits the conventional routing scheme
used in mesh interconnection topologies to introduce an efficient
encoding scheme. An efficient implementation scheme which decreases
the complexity of the proposed algorithm by asymptotical factors is
also developed.Experimental results also show that the proposed MFA
heuristic approaches the speed performance of the fast Kernighan-Lin
heuristic while approaching the solution quality of the powerful
simulated annealing heuristic.
BU-CEIS-9435
TITLE: A GLOBAL ROUTING HEURISTIC FOR FPGAS BASED ON
MEAN FIELD ANNEALING
AUTH
OR(S) : Ismail Haritaoglu , Cevdet Aykanat
ABSTRACT: In this work, we propose an order-independent global
routing algorithm for SRAM type FPGAs based on Mean Field Annealing.
The performance of the proposed global routing algorithm is evaluated in
comparison with LocusRoute global router on ACM/SIGDA Design Automation
benchmarks. Experimental results indicate that the proposed
MFA heuristic performs better than the LocusRoute in terms of
the distribution of the channel densities.
BU-CEIS-9436
TITLE: MAPPING AND FPGA GLOBAL ROUTING USING MEAN FIELD ANNEALING
(M.Sc. Thesis)
AUTH
OR(S) : Ismail Haritaoglu
ABSTRACT: Mean Field Annealing algorithm which was proposed for solving
combinatorial optimization problems combines the properties of neural
networks and Simulated Annealing.
In this thesis, MFA is formulated for mapping problem in
parallel processing and global routing problem in physical design
automation of Field Programmable Gate Array (FPGAs)
A new Mean Field Annealing (MFA) formulation is proposed for the
mapping problem for mesh-connected and hypercube architectures.
The proposed MFA heuristic exploits the conventional routing scheme
used in mesh and hypercube interconnection topologies to
introduce an efficient encoding scheme.
An efficient implementation scheme which decreases the complexity
of the proposed algorithm by asymptotical factors is also developed.
Experimental results also show that the proposed MFA heuristic
approaches the speed performance of the fast Kernighan-Lin heuristic
while approaching the solution quality of the powerful simulated
annealing heuristic.
Also, we propose an order-independent global routing algorithm
for SRAM type FPGAs based on Mean Field Annealing.
The performance of the proposed global routing algorithm is evaluated in
comparison with LocusRoute global router on ACM/SIGDA Design Automation
benchmarks.
Experimental results indicate that the proposed MFA heuristic performs
better than the LocusRoute.
BU-CEIS-9437
TITLE: DESIGN AND IMPLEMENTATION OF A VERB LEXICON AND VERB-SENSE
DISAMBIGUATOR FOR TURKISH
(M.Sc. Thesis)
AUTH
OR(S) : Okan Yilmaz
ABSTRACT:The lexicon has a crucial role in all natural language processing
systems and has special importance in machine translation systems.
This thesis presents the design and implementation of a verb lexicon
and a verb sense disambiguator for Turkish. The lexicon contains only
verbs because verbs encode events in sentences and play the most
important role in natural language processing systems, especially in
parsing (syntactic analyzing) and machine translation. The verb sense
disambiguator uses the information stored in the verb lexicon that we
developed. The main purpose of this tool is to disambiguate senses of
verbs having several meanings, some of which are idiomatic. We also
present a tool implemented in Lucid Common Lisp under X-Windows for
adding, accessing, modifying, and removing entries of the lexicon, and
a semantic concept ontology containing semantic features of commonly
used Turkish nouns.
BU-CEIS-9438
TITLE: ERROR-TOLERANT FINITE STATE RECOGNITION WITH APPLICATIONS TO
MORPHOLOGICAL ANALYSIS AND SPELLING CORRECTION
AUTH
OR(S): Kemal Oflazer
(Updated Feb 15 1995)
ABSTRACT: This paper presents a notion of error-tolerant recognition
with finite state recognizers. Error-tolerant recognition enables the
recognition of strings that deviate {\em mildly} from any string in
the regular set recognized by the underlying finite state recognizer.
Such recognition has applications in error-tolerant morphological
processing, spelling correction, and approximate string matching in
information retrieval. After a description of the basic concepts and
algorithms involved, we give examples from two applications: In the
context of morphological analysis, error-tolerant recognition allows
misspelled input word forms to be corrected, and morphologically
analyzed concurrently. We present an application of this to
error-tolerant analysis of agglutinative morphology of Turkish words.
The algorithm can be applied to morphological analysis of {\it any\/}
language whose morphology is {\em fully} described by two--level
finite state transducers, regardless of the word formation processes.
In the context of spelling correction, error-tolerant recognition can
be used to enumerate correct candidate forms from a given misspelled
string within a certain edit distance. Again it can be applied to any
language with a word list comprising all inflected forms, or whose
morphology is fully described by two--level finite state transducers.
We present experimental results for spelling correction for a number
of languages. These results indicate that such recognition works very
efficiently for candidate generation in spelling correction for many
European languages such as English, Dutch, French, German, Italian
(and others) with very large word lists of root and inflected forms
(some containing well over 200,000 forms), generating all candidate
solutions within 10 to 45 milliseconds (with edit distance 1) on a
SparcStation 10/41. For spelling correction in Turkish, error-tolerant
recognition operating with an (circular) recognizer of Turkish words
(with about 29,000 states and 119,000 transitions) can generate all
candidate words in less than 20 milliseconds, with edit distance 1.
BU-CEIS-9439
TITLE: ON THE USE OF INDUCTIVE REASONING IN PROGRAM SYNTHESIS:
PREJUDICE AND PROSPECTS
AUTHOR(S): Pierre Flener, Lubos Popeli
nsky
ABSTRACT: In this position paper, we give a critical analysis of the
deductive and inductive approaches to program synthesis, and of the
current research in these fields. From the shortcomings of these
approaches and works, we identify future research directions for these
fields, as well as a need for cooperation and cross-fertilization
between them.
Published in: L. Fribourg and F. Turini (eds), Proc. of LOPSTR/META'94,
LNCS, Springer-Verlag, 1994.
BU-CEIS-9440
TITLE: ILP AND
AUTOMATIC PROGRAMMING: TOWARDS THREE APPROACHES
AUTHOR(S): Pierre Flen
er, Lubos Popelinsky, Olga Stepankova
ABSTRACT: The prospects of inductive logic programming (ILP) with
respect to automatic programming (program synthesis) are discussed.
We argue that logic program synthesis from incomplete information is
but a niche of ILP, and study consequences of this statement. Then,
three approaches are described: schema-driven synthesis of logic
programs from incomplete specifications, the role of transformation
techniques in ILP, and interactive assumption-based inductive learning.
Published in: S. Wrobel (ed), Proc. of ILP'94 (GMD-Studien Nr. 237),
pp. 351--364, Bad Honnef, Germany, September 1994.
BU-CEIS-9441
TITLE: AN APPLICATION OF GENETIC PROGRAMMING TO THE 4-OP PROBLEM USING
MAP-TREES
AUTHOR(S): Tevfik Ayt
ekin, E. Erkan Korkmaz and H. Altay Guvenir
In Genetic programming (GP) applications the programs are expressed as
parse trees. A node of a parse tree is an element either from the
function-set or terminal-set, and an element of a terminal set can be
used in a parse tree more than once. However, when we attempt to use
the elements in the terminal set at most once, we encounter problems
in creating the initial random population and in crossover and mutation
operations. 4-Op problem is an example for such a situation. We
developed a technique called map-trees to overcome these anomalies.
Experimental results on 4-Op using map-trees are presented.
BU-CEIS-9442
TITLE: LEARNING SOIL CLASSIFICATION USING CFP
AUTHOR(S): Hakime Gulay Unsal and H. Altay Guveni
ABSTRACT: The paper presents an application of the Classification by
Feature Partitioning (CFP) algorithm to the problem of soil
classification. CFP is an exemplar based, incremental and supervised
learning algorithm. Learning in CFP is accomplished by storing the
objects separately in each feature dimension as disjoint partitions
of values. Application of the CFP algorithm to soil classification,
one of the important problems of soil engineering and civil
engineering, is described. The classification helps the soil engineer
by giving general guidance. In many soil engineering problems there
are no rational expression available for the analysis for the solution.