Bilkent University
Department of Computer Engineering
CS 590/690 SEMINAR

 

Type-aware Bug Localization: A Hybrid Approach Merging LLM and IR

 

Mayasah Lami
Master's Student
(Supervisor: Asst. Prof. Anıl Koyuncu)
Computer Engineering Department
Bilkent University

Abstract: Bug localization remains a core challenge in software maintenance, often accounting for nearly half of developers' programming effort. Information Retrieval (IR) approaches are widely adopted for this task, but they often struggle with the semantic gap between natural language bug descriptions and source code. This research proposes a novel two-stage framework that integrates Large Language Models (LLMs) with traditional IR methods through a bug-type-aware approach. The framework first uses the semantic comprehension of LLMs to classify bugs into specific categories, and then performs targeted localization using type-specific strategies. Unlike existing approaches that treat all bugs uniformly, our method recognizes and exploits the distinct characteristics of different bug types to improve localization accuracy and efficiency. The proposed approach targets three research gaps: (1) the underutilization of bug-type information in localization strategies, (2) the limited semantic understanding in traditional IR methods, and (3) the lack of integrated frameworks that combine the strengths of both IR and LLM approaches while still preserving performance. By implementing type-specific preprocessing, pattern recognition, and adaptive scoring mechanisms, the framework aims to reduce the search space and significantly improve localization precision. We will evaluate our approach on the BugsJS dataset for both classification and localization, comparing it against generic bug localization approaches, type-specific bug detection tools, and traditional IR methods. The evaluation will include classification accuracy measures and standard IR metrics for localization performance.
This research contributes to the field by providing not only a more accurate bug localization framework but also valuable insights into type-specific prompt engineering patterns and guidelines for optimal deployment across diverse software projects.

 

DATE: April 07, Monday @ 14:30 Place: EA 409