Enhancing Data Science Outcomes With Efficient Workflow (EDSOEW)


Course Overview

Learn how to create an end-to-end, hardware-accelerated machine learning pipeline for large datasets. Throughout the development process, you’ll use diagnostic tools to identify delays and learn to mitigate common pitfalls.

Please note that confirmed bookings are non-refundable: once you have confirmed your seat for an event, it cannot be cancelled and no refund will be issued, regardless of attendance.


Prerequisites

  • Basic knowledge of a standard data science workflow on tabular data. To gain an adequate understanding, we recommend this article.
  • Knowledge of distributed computing using Dask. To gain an adequate understanding, we recommend the “Get Started” guide from Dask.
  • Completion of the DLI course Fundamentals of Accelerated Data Science, or the ability to manipulate data using cuDF together with some experience building machine learning models using cuML.

Course Objectives

  • Develop and deploy an accelerated end-to-end data processing pipeline for large datasets
  • Scale data science workflows using distributed computing
  • Perform DataFrame transformations that take advantage of hardware acceleration and avoid hidden slowdowns
  • Enhance machine learning solutions through feature engineering and rapid experimentation
  • Improve data processing pipeline performance by optimizing memory management and hardware utilization

Prices & Delivery methods

Online Training

0.5 days

  • on request

Classroom Training

0.5 days

  • on request

Currently there are no training dates scheduled for this course.