@mhoye @jonathankoren it's branchless too, and easy to parallelize on a GPU to speed it up