Coiled: A Dask Company Tech Blog

Recent Posts

  • 15 March - Shuffling large data at constant memory in Dask
  • 23 February - Just in time Python environments
  • 17 January - How many PEPs does it take to install a package?
  • 06 January - Scaling Hyperparameter Optimization With XGBoost, Optuna, and Dask
  • 06 January - Handling Unexpected AWS IAM Changes

Tags

  • aws
  • costs
  • dask
  • distributed
  • ec2
  • large scale model training
  • optuna
  • p2p
  • package sync
  • packaging
  • prefect
  • python environments
  • shuffling
  • workflow automation
  • xgboost

Authors

  • David Chudzicki (1)
  • Greg Hayes (2)
  • Hendrik Makait (1)
  • Nat Tabris (1)
  • Samantha Hughes (2)
  • Sarah Johnson (1)

Coiled Tech Blog

Latest post

  • Shuffling large data at constant memory in Dask by Hendrik Makait on 15 March

    With release 2023.2.1, dask.dataframe introduces a new shuffling method called P2P that makes sorts, merges, and joins faster while using constant memory. Benchmarks show impressive improvements:

    P2P shuffling uses constant memory while task-based shuffling scales linearly.
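
For readers new to the term, a shuffle regroups rows across partitions so that rows sharing a key land in the same output partition, which is the step that sorts, merges, and joins depend on. The block below is a toy, single-process sketch of that idea in plain Python. It is not Dask's P2P implementation and says nothing about its memory behavior; the function name `shuffle_partitions` and its parameters are hypothetical and chosen only for illustration.

```python
from collections import defaultdict


def shuffle_partitions(partitions, key, n_out):
    """Regroup rows so that equal keys end up in the same output partition.

    partitions: list of input partitions, each a list of dict rows
    key:        name of the column to shuffle on
    n_out:      number of output partitions
    """
    outputs = defaultdict(list)
    for part in partitions:
        for row in part:
            # Hash partitioning: the key alone decides the destination,
            # so matching keys from different inputs meet in one place.
            outputs[hash(row[key]) % n_out].append(row)
    return [outputs[i] for i in range(n_out)]


if __name__ == "__main__":
    inputs = [
        [{"id": 1, "v": "a"}, {"id": 2, "v": "b"}],
        [{"id": 1, "v": "c"}, {"id": 3, "v": "d"}],
    ]
    for i, part in enumerate(shuffle_partitions(inputs, "id", 2)):
        print(i, part)
```

In this sketch every input row is routed to exactly one output, which is the all-to-all movement pattern the post refers to; how that movement is scheduled and buffered is where the task-based and P2P methods differ.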

    Read more...

© Copyright 2023 Coiled Computing.