“Efficient Parallelization of an Ubiquitous Sequential Computation”, 2023-10-27 ():
We find a succinct expression for computing the sequence xt = at xt−1 + bt in parallel with two prefix sums, given t = (1, 2, …, n), at ∈ ℝn, bt ∈ ℝn, and initial value x0 ∈ ℝ.
On n parallel processors, the computation of n elements incurs ℒ(𝓁𝓸𝓰 n) time and 𝒪(n) space.
Sequences of this form are ubiquitous in science and engineering, making efficient parallelization useful for a vast number of applications.
We implement our expression in software, test it on parallel hardware, and verify that it executes faster than sequential computation by a factor of n⁄log n.