Heterogeneous hardware support in BEAGLE, a high-performance computing library for statistical phylogenetics

Abstract

We describe our approach to extending the BEAGLE library for high-performance statistical phylogenetic inference (maximum likelihood estimation and Bayesian analysis) to support a wider range of modern accelerators and multicore CPUs, and we present the corresponding performance results from these platforms. Our solution includes a shared code design that provides a uniform interface to a variety of compute platforms available under the CUDA and OpenCL parallel computing frameworks. We have also implemented CPU threading in BEAGLE. Together, these improvements allow the library to exploit a wide range of hardware parallelism, including CPUs and Xeon Phi, vectorization intrinsics (e.g., SSE, AVX), and GPUs. Although code reuse and maintainability are features of our design, our approach also includes hardware-specific optimizations for performance-critical portions of the code. Extending the BEAGLE library in this manner allows a greater variety of users to exploit the hardware resources available to them. As an example of the performance gain, using BEAGLE on a system with two Intel Xeon E5-2680v4 CPUs, we observe a 39-fold speedup for a MrBayes 3.2.6 codon-model analysis compared to the native MrBayes MPI-SSE implementation. The general design features of the library also provide a model for software development with parallel computing frameworks that is applicable to other domains.

Publication
2017 46th International Conference on Parallel Processing Workshops (ICPPW) (pp. 23–32). IEEE