Configuring concurrent computation of phylogenetic partial likelihoods: accelerating analyses using the BEAGLE library


We describe our approach to augmenting the BEAGLE library for high-performance statistical phylogenetic inference to support concurrent computation of independent partial likelihoods arrays. Our solution involves identifying independent likelihood estimates in analyses of partitioned datasets and in proposed tree topologies, and configuring concurrent computation of these likelihoods via the CUDA and OpenCL frameworks. We evaluate the effect of each increase in concurrency on the throughput of our partial likelihoods kernel for a four-state nucleotide substitution model on a variety of parallel computing hardware, including NVIDIA and AMD GPUs and Intel multicore CPUs, observing up to 16-fold speedups over our previous implementation. Finally, we evaluate the effect of these gains on a domain application program, MrBayes. For a partitioned nucleotide-model analysis, we observe an average speedup in overall run time of 2.1-fold over our previous parallel implementation and 10-fold over native MrBayes with SSE.
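As an illustration of what identifying independent partial likelihoods in a tree topology can mean, the following is a minimal Python sketch, not the BEAGLE implementation or its API. It assumes a rooted binary tree stored as a dictionary, and the function name `independent_batches` and the example tree are hypothetical. Internal nodes are grouped by height above the tips; nodes in the same group do not depend on one another, so their partial likelihoods arrays could in principle be computed concurrently.

```python
from collections import defaultdict


def independent_batches(children):
    """Group internal nodes into batches with no mutual dependencies.

    `children` maps each internal node to its (left, right) children; any
    node that is not a key is treated as a tip whose data is already known.
    A node's height is one more than the taller of its two children, so two
    nodes with the same height can never be ancestor and descendant.
    """
    height = {}

    def node_height(node):
        if node not in children:  # tip: nothing to compute
            return 0
        if node not in height:
            left, right = children[node]
            height[node] = 1 + max(node_height(left), node_height(right))
        return height[node]

    batches = defaultdict(list)
    for node in children:
        batches[node_height(node)].append(node)
    # Batches are returned in dependency order, lowest height first.
    return [sorted(batches[h]) for h in sorted(batches)]


# Hypothetical balanced four-tip tree: ((A,B),(C,D))
balanced = {"AB": ("A", "B"), "CD": ("C", "D"), "root": ("AB", "CD")}
balanced_batches = independent_batches(balanced)

# Hypothetical ladder-shaped four-tip tree: (((A,B),C),D)
ladder = {"n1": ("A", "B"), "n2": ("n1", "C"), "root": ("n2", "D")}
ladder_batches = independent_batches(ladder)
```

In this sketch the balanced tree yields a first batch of two independent nodes, whereas the ladder-shaped tree yields one node per batch and so offers no concurrency within a single tree. In a partitioned analysis, each data partition is a further independent unit, so batches from different partitions could be combined.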

In S Ibrahim, KK Choo, Z Yan, and W Pedrycz (Eds.), Algorithms and Architectures for Parallel Processing. ICA3PP 2017. Lecture Notes in Computer Science (pp. 533–547). Springer, Cham