Reduction Fusion for Optimized Distributed Data-Parallel Computations via Inverse Recomputation

Haoxiang Lin; Yang Wang; Yanjie Gao; Hongyu Zhang; Ming Wu; Mao Yang

Reduction Fusion for Optimized Distributed Data-Parallel Computations via Inverse Recomputation

Haoxiang Lin ,
Yang Wang ,
Yanjie Gao ,
Hongyu Zhang ,
Ming Wu ,
Mao Yang

FSE 2025 | June 2025

Published by ACM

The ACM International Conference on the Foundations of Software Engineering, Ideas, Visions and Reflections Track

DOI

Download BibTex

Distributed data-parallel computations are critical for both traditional big data applications and emerging large language model tasks. The efficiency of these computations largely depends on reducer performance, particularly in handling extensive data access. This paper introduces a novel reduction fusion algorithm that optimizes distributed data-parallel programs by fusing dependent reducers and mappers into a single, unified reducer. Employing inverse recomputation, the algorithm preserves partial aggregation and reduces storage, network I/O, memory, and cache overheads. Our preliminary evaluation reveals performance improvements of up to 2.47×, demonstrating the practicality and effectiveness of this approach, while also highlighting its potential to address challenges posed by extensive data access in modern distributed computing environments.