[an error occurred while processing this directive]

Cache-Oblivious Wavefront: Improving Parallelism of Recursive Dynamic Programming Algorithms without Losing Cache-Efficiency

State-of-the-art cache-oblivious parallel algorithms for dynamic programming (DP) problems usually guarantee asymptotically optimal cache performance without any tuning of cache parameters, but they often fail to exploit the theoretically best parallelism at the same time. While these algorithms achieve cache-optimality through the use of a recursive divide-and-conquer (DAC) strategy, scheduling tasks at the granularity of task dependency introduces artificial dependencies in addition to those arising from the defining recurrence equations. We remove the artificial dependency by scheduling tasks ready for execution as soon as all real dependency constraints are satisfied, while preserving the cache-optimality, by inheriting the DAC strategy. We applied our approach to a set of widely known dynamic programming problems, such as Floyd-Warshall’s All-Pairs Shortest Paths, Stencil, and LCS. Theoretical analyses show that our techniques improve the span of 2-way DAC-based Floyd Warshall’s algorithm on an n node graph from Θ(n log^2 n) to Θ(n), stencil computations on a d dimensional hypercubic grid of width w for h time steps from Θ((d^2h) w^(log⁡(d+2)-1)to Θ(h), and LCS on two sequences of length n each from Θ(n^log_2⁡3 ) to Θ(n). In each case, the total work and cache complexity remain asymptotically optimal. Experimental measurements exhibit a 3-5 times improvement in absolute running time, 10-20 times improvement in burdened span by Cilkview, and approximately the same L1/L2 cache misses by PAPI.

Bio

Yuan Tang got his bachelor’s degree from Peking University and later a PhD degree in computer science from the Institute of Software, Chinese Academy of Science. He is now an assistant professor at the Software School, Fudan University, Shanghai, China. His research interests are parallel programming, in particular the theoretical and experimental study of parallel algorithms and data structure. He has spent more than three years at MIT CSAIL (Computer Science and Artificial Intelligence Laboratory) as a visiting scientist.
Yuan Tang

Speaker

Yuan Tang

Date

Thursday, February 5, 2015

Time

1 pm - 2 pm

Location

Laufer Center Room 101