How we parallelized 600+ pandas functions with Modin

Scaling up pandas is hard. With Modin, we took a first-principles approach to parallelizing the pandas API. Rather than focus on the operations we knew were easy to implement, we developed a theoretical basis for dataframes, the abstraction underlying pandas, and derived a dataframe algebra that expresses the 600+ pandas operators using fewer than 20 algebraic operators.
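To give a feel for what "many functions, few operators" means, here is a minimal sketch in plain Python. It is not Modin's implementation or API: the toy `Frame` type and the operator names `map_cells` and `reduce_columns` are hypothetical, chosen only for illustration. The point is that once a small set of core operators exists, several pandas-style functions become thin compositions of them, so only the core operators need a parallel implementation.

```python
from functools import reduce
from typing import Any, Callable, Dict, List

# Toy stand-in for a dataframe (an assumption for this sketch):
# column name -> list of cell values, with None marking a missing value.
Frame = Dict[str, List[Any]]


def map_cells(frame: Frame, fn: Callable[[Any], Any]) -> Frame:
    """Hypothetical MAP operator: apply fn to every cell independently."""
    return {col: [fn(v) for v in values] for col, values in frame.items()}


def reduce_columns(
    frame: Frame, fn: Callable[[Any, Any], Any], init: Any
) -> Dict[str, Any]:
    """Hypothetical REDUCE operator: fold each column down to one value."""
    return {col: reduce(fn, values, init) for col, values in frame.items()}


# pandas-style functions written purely in terms of the two operators above.
def isna(frame: Frame) -> Frame:
    return map_cells(frame, lambda v: v is None)


def abs_(frame: Frame) -> Frame:
    return map_cells(frame, lambda v: None if v is None else abs(v))


def sum_(frame: Frame) -> Dict[str, Any]:
    # Missing values are skipped while accumulating.
    return reduce_columns(frame, lambda acc, v: acc if v is None else acc + v, 0)


def count(frame: Frame) -> Dict[str, Any]:
    # Compose MAP (cell -> 0 or 1) with REDUCE (add up the flags).
    flags = map_cells(frame, lambda v: 0 if v is None else 1)
    return reduce_columns(flags, lambda a, b: a + b, 0)


df: Frame = {"a": [1, -2, None], "b": [4, 5, -6]}
```

In this sketch, `map_cells` touches each cell independently, so its work could be split across partitions of the data without coordination. `reduce_columns` would need a combine step across partitions. The four functions built on top inherit whatever parallelism the two operators provide.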
