Coiled: A Dask Company, Tech Blog

Site Navigation

  • Login
  • Sign Up
  • Product
  • Pricing
  • Docs

Posts tagged p2p

Shuffling large data at constant memory in Dask

  • 15 March 2023
  • Hendrik Makait
  • distributed p2p shuffling dask

With release 2023.2.1, dask.dataframe introduces a new shuffling method called P2P, which makes sorts, merges, and joins faster and lets them run in constant memory. Benchmarks show impressive improvements:

P2P shuffling uses constant memory, while the memory use of task-based shuffling scales linearly.
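
For readers new to the term, a shuffle regroups rows so that all rows sharing a key end up in the same output partition, which is the step that sorts, merges, and joins depend on. The following is a toy, in-memory sketch of that idea only. It is not Dask's P2P implementation, and the function name `shuffle` and the sample rows are hypothetical and chosen purely for illustration.

```python
from collections import defaultdict


def shuffle(partitions, key, n_out):
    """Regroup rows so that rows with equal `key` share an output partition.

    Illustrative only: a real distributed shuffle moves data between
    workers instead of appending to local lists.
    """
    outputs = defaultdict(list)
    for part in partitions:  # each input partition is scanned independently
        for row in part:
            # The destination partition is chosen from the key's hash.
            target = hash(row[key]) % n_out
            outputs[target].append(row)
    return [outputs[i] for i in range(n_out)]


# Hypothetical input: two partitions with overlapping keys.
inputs = [
    [{"id": 1, "v": "a"}, {"id": 2, "v": "b"}],
    [{"id": 1, "v": "c"}, {"id": 3, "v": "d"}],
]
result = shuffle(inputs, key="id", n_out=2)
```

After the call, both rows with `id` 1 sit in the same output partition, so a later join or group-by on `id` can treat each output partition on its own.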


© Copyright 2023 Coiled Computing.