Reduction Fusion for Optimized Distributed Data-Parallel Computations via Inverse Recomputation

FSE 2025 |

Published by ACM

The ACM International Conference on the Foundations of Software Engineering, Ideas, Visions and Reflections Track

DOI

Distributed data-parallel computations are critical for both traditional big data applications and emerging large language model tasks. The efficiency of these computations largely depends on reducer performance, particularly in handling extensive data access. This paper introduces a novel reduction fusion algorithm that optimizes distributed data-parallel programs by fusing dependent reducers and mappers into a single, unified reducer. Employing inverse recomputation, the algorithm preserves partial aggregation and reduces storage, network I/O, memory, and cache overheads. Our preliminary evaluation reveals performance improvements of up to 2.47×, demonstrating the practicality and effectiveness of this approach, while also highlighting its potential to address challenges posed by extensive data access in modern distributed computing environments.