Ugur Dogrusoz - CS 390 / 490 Individual Research Projects

The purpose of these courses are to introduce students to research techniques in computer engineering and science. Over the years, undergraduate students have participated in many of our projects within this course, leading to useful components of larger software systems and many new methods published in leading journals.

Refer to course pages CS390 and CS490 for prerequisites and conditions.

Projects

Design and implementation of a simulation component for biological models in SBML in the Newt editor

Newt is a free, web based, open source viewer and editor for pathways in Systems Biological Graphical Notation (SBGN), Systems Biology Markup Language (SBML), and Simple Interaction Format (SIF). It was written with a series of libraries and extensions based on Cytoscape.js with utmost customization in mind.

We would like to build a software component for simulation and analysis of biochemical networks in SBML and their dynamics. The component is to be integrated into the Newt repository.

Design and implementation of querying algorithms for pangenome graphs

A pangenome graph is a sequence model that encodes the mutual alignment of many genomes. Efficient visualization of pangenome graphs is crucial for understanding the genetic diversity of a species. As these graphs are rather huge, one would like to fetch a "subgraph of interest" by specifying their segments or regions of interest in the pangenome graph.

Towards this end, we would like to design and implement some queries (such as finding a chain of segments of interest) in a pangenome graph. The algorithm is to be used within a visual Pangenome graph analysis tool named PG2. The tool is based on Cytoscape.js.

Design and implementation of a customized layout algorithm for pangenome graphs

A pangenome graph is a sequence model that encodes the mutual alignment of many genomes. Efficient visualization of pangenome graphs is crucial for understanding the genetic diversity of a species. The layout of a pangenome graph is a key factor in the readability of the visualization. The layout should be able to handle the complexity of the graph, and should be able to reveal the underlying structure of the graph.

Towards this end, we would like to design and implement a customized layout highlighting the sequencing of the segments in a pangenome graph. The algorithm is to be used within a visual Pangenome graph analysis tool named PG2. The tool is based on Cytoscape.js.

Design and implementation of an algorithm for initial placement of new graph elements in an evolving graph to enable better incremental layouts

Incremental layout algorithms are used to update the layout of a graph when the graph is modified by adding or removing nodes or edges. The goal is to preserve the user's mental map by making minimal changes to the existing layout while ensuring that the new elements are placed in a way that maintains the overall readability and aesthetics of the graph.

Place these new elements in appropriate locations is critical in achieving a good incremental layout. The algorithm should consider the existing layout and the new elements' neighborhood. This is tricky when the graph contains subgraphs (nesting relationships).

The new algorithm is to be implemented and integrated into this GitHub repository for use by Cytoscape.js visualization components.

Implementation and improvement of CiSE algorithm for layout of clustered graphs

CiSE (Circular Spring Embedder) is an algorithm for automatic layout of clustered graphs using a circular style. The algorithm tries to determine optimal location and orientation of individual clusters intrinsically within a modified spring embedder. Heuristics such as reversal of the order of nodes in a cluster and swap of neighboring node pairs in the same cluster are employed intermittently to further relax the spring embedder system, resulting in reduced inter-cluster edge crossings. Unlike other algorithms generating circular drawings, CiSE does not require the quotient graph to be acyclic, nor does it sacrifice the edge crossing number of individual clusters to improve respective positioning of the clusters. Moreover, it reduces the total area required by a cluster by using the space inside the associated circle.

Cytoscape.js is a graph library for effective visualization and analysis of networks.

This project aims to implement CiSE as a Cytoscape.js extension and to improve it with respect to commonly accepted graph layout criteria such as number of edge crossings and total area.

Design & implementation of a fast compound spring embedder layout

CoSE (Compound Spring Embedder) is a layout algorithm for undirected compound graphs, relaxing restrictions of previously known algorithms in regards to topology and geometry. The algorithm is based on the traditional force-directed layout scheme with extensions to handle multi-level nesting, edges between nodes of arbitrary nesting levels, varying node sizes, and other possible application-specific constraints.

Cytoscape.js is a graph library for effective visualization and analysis of networks. Existing implementations of CoSE are available, including one as a Cytoscape.js extension. Even though CoSE is fairly fast as a spring embedder algorithm and works well with small to medium sized graphs, it fails to complete within acceptable time when used as part of an interactive visualization tool for large sized graphs.

This project aims to speed up the algorithm by potentially using techniques such as:

  • GPU processing (parallel processing)
  • Multi-level scaling or spectral graph drawing techniques

    Implementation and improvement of polyomino based approach to disconnected graph layout

    Disconnected graphs occur rather frequently in real life applications either during the construction of a graph interactively or because of the nature of the application. Most graph layout algorithms assume a graph to be connected and try to minimize the area needed for the resulting drawing. No matter how effective such an algorithm is, the space wasted overall could be arbitrarily large if the relative locations of disconnected objects of a graph are chosen by a naive, inefficient method. Another key parameter here is the aspect ratio of the region (e.g., a window) within which the graph is to be displayed. When displaying a graph, the larger the wasted space is, the less visible objects will be, making the visualization process more difficult. Thus, a disconnected graph layout algorithm must strive for a packing of disconnected objects which respects the aspect ratio of the region in which it is to be displayed.

    Cytoscape.js is a graph library for effective visualization and analysis of networks. The aim of this project is to implement a polyomino based approach developed in the past for disconnected graph layout as a Cytoscape.js extension, and improve the algorithm in certain ways. In particular, we need a heuristic to determine the size of the grid used in forming polyominoes based on the input graph.

    Design & implementation of a schema & time bar based designer for a highly customizable network visualization tools

    Most network visualization based analysis tools have similar requirements. These include ability to

  • inspect properties of nodes and links,
  • filter nodes/edges based on type,
  • filter nodes/edges based on property values,
  • filter nodes/edges based on a time bar (date range),
  • support incremental and regular automatic layout,
  • provide complexity management mechanisms.

    Cytoscape.js is a graph library for effective visualization and analysis of networks. This project aims to design and implement a designer tool that takes the schema of a graph (could be defined using rdf) from an arbitrary domain and generates the necessary style sheets / code for a Cytoscape.js based visualization tool that supports above capabilities.

    Integration of PathwayMapper into the cancer genomics portal cBioPortal

    The cBioPortal for Cancer Genomics provides visualization, analysis and download of large-scale cancer genomics data sets. A tool named PathwayMapper was previously developed with the motivation of creating a platform providing more intuitive template pathway diagrams with rich data decoration and analysis with access to data in cBioPortal. PathwayMapper can also be used as a valuable tool for constructing pathways from scratch.

    PathwayMapper is based on Cytoscape.js, which is a graph library for effective visualization and analysis of networks.

    This project aims to integrate PathwayMapper into cBioPortal as an alternative to the cBioPortal's current network view. Ways to instantiate PathwayMapper from existing cBioPortal pages need to be designed.

    Design & implementation of a network visualization tool for detecting fraud in telecommunication networks

    Recently, detection of fraud is an increasingly important and challenging task. Since we maintain more and more personal data online, potential losses due to fraud increased to billions of dollars. Use of advanced querying of databases and visualization, especially of CDR, is critical in detecting fraud.

    Cytoscape.js is a graph library for effective visualization and analysis of networks. This project aims to study fraud detection methods and develop a Cytoscape.js based tool for detecting fraud in CDR data stored in graph databases thorough advanced querying and visualization.

    Design & implementation of a network visualization tool for intelligence analysis

    Often times huge amounts of data available for intelligence and security tasks is not very useful since they are mostly kept as tables in relational databases. Recent rise of graph based databases not only facilitate application of advanced querying in such large databases and raising alerts as needed but also facilitate mechanisms to effectively visualize the indirect connections in the data in detecting anomalies and security problems.

    Cytoscape.js is a graph library for effective visualization and analysis of networks. This project aims to study intelligence and security analysis through advanced querying and network visualization.