Abstract:
On the DCU platform, existing automatic parallelization tools rely on source-to-source transformation and cannot generate efficient DCU executables directly from serial code, which limits the efficiency of program development and migration and causes performance loss. To address this problem, end-to-end translation from serial code to DCU parallel programs is realized by exploiting the decoupling of front end and back end in the LLVM framework and implementing passes for automatic parallel mapping and code generation from serial LLVM IR to DCU executable programs. The compiler uses the polyhedral model to identify parallelism in the serial LLVM IR and applies a heuristic to map suitable program regions to the DCU automatically, generating DCU kernel code for them. Test results on the PolyBench benchmark suite show that the automatically parallelized benchmarks achieve, on average, 1.8× the performance of the original programs, and several benchmarks reach speedups of up to 3.7×.