Posts by Greg Hayes

Scaling Hyperparameter Optimization With XGBoost, Optuna, and Dask

XGBoost is one of the best-known libraries among data scientists, having become one of the top choices among Kaggle competitors. It performs well on a wide array of supervised machine learning problems, implements scalable training through the rabit library, and integrates with many big data processing tools, including Dask.

Read more ...


Automated Data Pipelines On Dask With Coiled & Prefect

Dask is widely used among data scientists and engineers proficient in Python for interacting with big data, doing statistical analysis, and developing machine learning models. Operationalizing this work has traditionally required lengthy code rewrites, which makes moving from development to production hard. This gap slows business progress and increases risk for data science and data engineering projects in an enterprise setting. The need to remove this bottleneck has prompted the emergence of production deployment solutions that allow code written by data scientists and engineers to be deployed directly to production, unlocking the power of continuous deployment for pure Python data science and engineering.

Read more ...