Reduction Fusion for Optimized Distributed Data-Parallel Computations via Inverse Recomputation
- Haoxiang Lin ,
- Yang Wang ,
- Yanjie Gao ,
- Hongyu Zhang ,
- Ming Wu ,
- Mao Yang
FSE 2025 |
Published by ACM
The ACM International Conference on the Foundations of Software Engineering, Ideas, Visions and Reflections Track
Distributed data-parallel computations are critical for both traditional big data applications and emerging large language model tasks. The efficiency of these computations largely depends on reducer performance, particularly in handling extensive data access. This paper introduces a novel reduction fusion algorithm that optimizes distributed data-parallel programs by fusing dependent reducers and mappers into a single, unified reducer. Employing inverse recomputation, the algorithm preserves partial aggregation and reduces storage, network I/O, memory, and cache overheads. Our preliminary evaluation reveals performance improvements of up to 2.47×, demonstrating the practicality and effectiveness of this approach, while also highlighting its potential to address challenges posed by extensive data access in modern distributed computing environments.