Automatic Mapping and Generation of Heterogeneous Code for DCU Accelerators

Abstract: On the DCU platform, existing automatic parallelization tools rely on source-to-source transformation and cannot directly generate efficient DCU executables from serial code, which limits the efficiency of program development and porting and incurs performance loss. To address this problem, this work exploits the decoupling of front end and back end in the LLVM framework and implements an automatic parallel mapping and code generation pass that takes serial LLVM IR to a DCU executable, achieving end-to-end translation from serial code to DCU parallel programs. The pass uses polyhedral model algorithms to discover parallelism in the serial LLVM IR, maps suitable program regions to the DCU automatically using heuristics, and generates DCU kernel code. Results on the PolyBench test suite show that the automatically parallelized test cases reach on average 1.8 times the original performance, with several cases achieving a speedup of up to 3.7×.
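As an illustration of what mapping a program region to the accelerator involves, the following is a minimal sketch and not the paper's implementation. It shows a dependence-free loop nest of the kind a polyhedral analysis could mark as parallel, and a CPU emulation of how its iteration space might be distributed over a two-dimensional grid of blocks and threads. The names `serial_scale_add`, `kernel_body`, and `launch_emulated`, and the block size of 16, are hypothetical and chosen only for this example.

```cpp
#include <cstddef>
#include <vector>

// Serial form: each (i, j) iteration writes a distinct element of C and
// reads only A and B, so no iteration depends on any other.
void serial_scale_add(const std::vector<double>& A, const std::vector<double>& B,
                      std::vector<double>& C, std::size_t n, std::size_t m,
                      double alpha) {
    for (std::size_t i = 0; i < n; ++i)
        for (std::size_t j = 0; j < m; ++j)
            C[i * m + j] = alpha * A[i * m + j] + B[i * m + j];
}

// Hypothetical kernel body: one logical thread computes one (i, j) point.
// The loop indices are recovered from block and thread coordinates.
void kernel_body(const double* A, const double* B, double* C,
                 std::size_t n, std::size_t m, double alpha,
                 std::size_t block_x, std::size_t block_y,
                 std::size_t thread_x, std::size_t thread_y,
                 std::size_t block_dim) {
    const std::size_t i = block_y * block_dim + thread_y;
    const std::size_t j = block_x * block_dim + thread_x;
    if (i < n && j < m)  // guard: edge blocks may extend past the loop bounds
        C[i * m + j] = alpha * A[i * m + j] + B[i * m + j];
}

// CPU emulation of a 2D kernel launch. On a real accelerator the four
// loops below would be replaced by hardware-scheduled blocks and threads.
void launch_emulated(const std::vector<double>& A, const std::vector<double>& B,
                     std::vector<double>& C, std::size_t n, std::size_t m,
                     double alpha) {
    const std::size_t block_dim = 16;  // assumed block size, not from the paper
    const std::size_t grid_x = (m + block_dim - 1) / block_dim;  // ceil(m / 16)
    const std::size_t grid_y = (n + block_dim - 1) / block_dim;  // ceil(n / 16)
    for (std::size_t by = 0; by < grid_y; ++by)
        for (std::size_t bx = 0; bx < grid_x; ++bx)
            for (std::size_t ty = 0; ty < block_dim; ++ty)
                for (std::size_t tx = 0; tx < block_dim; ++tx)
                    kernel_body(A.data(), B.data(), C.data(), n, m, alpha,
                                bx, by, tx, ty, block_dim);
}
```

Because the iterations are independent, the order in which the emulated blocks and threads run does not affect the result, which is the property that makes such a region a candidate for offloading. Whether offloading pays off in practice also depends on factors such as data transfer cost, which is presumably what a heuristic mapping decision has to weigh.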

     
