Bilkent University
Department of Computer Engineering
CS 590/690 SEMINAR


1D and 2D Partitioning Schemes for Efficient Distributed GNN Training


Ahmet Can Bağırgan
Master's Student
(Supervisor: Prof. Dr. Cevdet Aykanat)
Computer Engineering Department
Bilkent University

Abstract: In recent years, graph neural networks (GNNs) have gained much attention as a growing area of deep learning capable of learning on graph-structured data. However, training GNNs on large-scale graphs often demands more computational power and memory than a single machine can provide, making distributed GNN training a promising approach for scaling up. A key requirement for distributed GNN training is to partition the input graph into smaller parts, which can then be distributed across the machines of a compute cluster. The objective of the partitioning is to minimize computational and communication overhead. Communication overhead encompasses several metrics, such as the number of messages, the total message volume, and the maximum communication volume handled by a single process. Graph partitioning methods and tools have been used for parallel GNN training. In these methods, reducing computational overhead is encoded as balancing the loads of the processes through proper vertex weighting, while reducing communication overhead is modeled as minimizing the cut size of the partition, which corresponds to minimizing the total communication volume. However, 1D partitioning models suffer from computational load imbalance and communication volume imbalance because of the scale-free nature of the graphs encountered in GNN training. In this work, we compare various 1D and 2D partitioning models and methods for scaling distributed GNN training on distributed-memory parallel systems.
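The sketch below is a minimal illustration, not the speaker's actual pipeline, of how the modeling described above can be set up in practice: a 1D vertex partitioning with PyMetis, where hypothetical vertex weights encode per-vertex computational load (here simply the degree) and the reported edge cut serves as a proxy for total communication volume. The toy graph is invented for illustration.

    import pymetis

    # Toy undirected graph in adjacency-list form (hypothetical example):
    # adjacency[i] lists the neighbors of vertex i.
    adjacency = [
        [1, 2],        # vertex 0
        [0, 2, 3],     # vertex 1
        [0, 1, 3],     # vertex 2
        [1, 2, 4],     # vertex 3
        [3],           # vertex 4
    ]

    # Vertex weights encoding per-vertex computational load
    # (assumed here to be proportional to vertex degree).
    vweights = [len(nbrs) for nbrs in adjacency]

    # 2-way partition: METIS balances the weighted vertex loads across
    # the parts while minimizing the edge cut, which approximates the
    # total communication volume in distributed GNN training.
    cutsize, parts = pymetis.part_graph(2, adjacency=adjacency,
                                        vweights=vweights)
    print("cut size:", cutsize)
    print("part of each vertex:", parts)

Note that this balances only the total vertex weight per part; as the abstract points out, on scale-free graphs such 1D models can still leave the communication volume itself poorly balanced, which motivates the 2D alternatives compared in the talk.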


DATE: Monday, November 18 @ 16:00 PLACE: EA 502