Bilkent University
Department of Computer Engineering
CS 590/690 SEMINAR

 

A Toolkit for Pangenomic File Formats: Data Extraction, Analysis, and Alignment Verification

 

Ebrar Bozkurt
Master's Student
(Supervisor: Assoc. Prof. Can Alkan)

Computer Engineering Department
Bilkent University

Abstract: Linear reference genomes are widely used even though they do not capture genetic diversity, which results in reference bias in read mapping. Pangenomes address this limitation by representing multiple reference samples, usually as graphs, although other representations exist. With the growing interest in pangenomics research, new file formats and methods have been developed to represent pangenome graphs and alignments to them. However, additional functionality is needed to efficiently extract, analyze, and manipulate data from these file formats. This work introduces a novel toolkit offering new utilities for pangenomic file formats. The toolkit supports extracting graph characteristics, comparing alignment files, and verifying alignment accuracy, facilitating more robust and efficient pangenomic analyses.

 

DATE: Monday, March 17 @ 14:10
PLACE: EA 502