Coiled: A Dask Company Tech Blog

Posts tagged p2p

Shuffling large data at constant memory in Dask

  • 15 March 2023
  • Hendrik Makait
  • shuffling distributed p2p dask

With release 2023.2.1, dask.dataframe introduces a new shuffling method called P2P, which makes sorts, merges, and joins faster and lets them run in constant memory. Benchmarks show impressive improvements:

P2P shuffling uses constant memory, while the memory use of task-based shuffling scales linearly with dataset size.
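
For readers new to the term, a shuffle regroups rows so that every row with the same key ends up in the same output partition, which is the data movement behind sorts, merges, and joins. The sketch below is a plain-Python illustration of that idea only, not Dask's P2P implementation; the function name `shuffle_partitions`, the sample rows, and the choice of `zlib.crc32` as the hash are all assumptions made for the example.

```python
import zlib
from collections import defaultdict


def shuffle_partitions(partitions, key, npartitions):
    """Route every row to an output partition chosen by hashing its key.

    Hypothetical helper for illustration; it is not part of Dask.
    """
    out = defaultdict(list)
    for part in partitions:
        for row in part:
            # crc32 is stable across runs, unlike Python's built-in str hash
            target = zlib.crc32(str(row[key]).encode()) % npartitions
            out[target].append(row)
    # Return a dense list so empty output partitions are still represented
    return [out[i] for i in range(npartitions)]


# Two input partitions whose keys are interleaved
inputs = [
    [{"id": "a", "v": 1}, {"id": "b", "v": 2}],
    [{"id": "a", "v": 3}, {"id": "c", "v": 4}],
]
outputs = shuffle_partitions(inputs, key="id", npartitions=2)
```

After the call, all rows sharing an `id` sit in the same output partition, so a join or groupby on `id` can proceed one partition at a time.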



© Copyright 2024 Coiled Computing.