Fundamentals of Accelerated Data Science
- Course Code GK847013
- Duration 1 day
Course Delivery
This course is available in the following formats:
- Company Event
Course Overview
Company Events
These events can be delivered exclusively for your company at our locations or yours, specifically for your delegates and your needs. The Company Events can be tailored or standard course deliveries.
Course Schedule
Target Audience
- Experience with Python, ideally including pandas and NumPy
- Suggested resources to satisfy prerequisites: Kaggle's pandas Tutorials, Kaggle's Intro to Machine Learning, Accelerating Data Science Workflows with RAPIDS
Course Objectives
- Use cuDF to accelerate pandas, Polars, and Dask for analyzing datasets of all sizes efficiently
- Utilize a wide variety of machine learning algorithms, including XGBoost, for different data science problems
- Deploy machine learning models on a Triton Inference Server to deliver optimal performance
- Learn and apply powerful graph algorithms to analyze complex networks with NetworkX and cuGraph
- Perform multiple analysis tasks on massive datasets to stave off a simulated epidemic outbreak affecting the UK
Course Content
Module 1: Introduction
- Meet the instructor.
- Create an account.
Module 2: GPU-Accelerated Data Manipulation
Ingest and prepare several datasets (some larger-than-memory) for use in multiple machine learning exercises later in the workshop:
- Read data directly to single and multiple GPUs with pandas, Polars, cuDF, and Dask.
- Prepare population, road network, and clinic information for machine learning tasks on the GPU with cuDF.
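As a rough illustration of the kind of data preparation this module covers, the sketch below aggregates a small table per county. It is not course material: the column names and values are hypothetical, and it assumes only that cuDF mirrors the pandas DataFrame API closely enough for this groupby to run unchanged. When cuDF is not installed, the code falls back to pandas on the CPU.

```python
# Use cuDF when RAPIDS is installed; otherwise fall back to pandas on the CPU.
try:
    import cudf as xdf
except ImportError:
    import pandas as xdf

# Hypothetical toy data standing in for a population table.
population = xdf.DataFrame({
    "county": ["A", "A", "B", "B", "C"],
    "age": [34, 71, 25, 52, 48],
    "infected": [0, 1, 1, 1, 0],
})

# Mean age and number of infected people per county.
summary = (
    population.groupby("county")
    .agg({"age": "mean", "infected": "sum"})
    .sort_index()
)

# cuDF results live in GPU memory; copy them to the host for printing.
if hasattr(summary, "to_pandas"):
    summary = summary.to_pandas()

print(summary)
```

Existing pandas scripts can also be run through the cuDF pandas accelerator mode (`python -m cudf.pandas script.py`) without changing the imports.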
Module 3: GPU-Accelerated Machine Learning
Apply several essential machine learning techniques to the data prepared in Module 2:
- Use supervised and unsupervised GPU-accelerated algorithms with cuML.
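For a sense of what an unsupervised GPU-accelerated algorithm looks like in practice, here is a minimal k-means sketch. It is an illustration under assumptions, not the workshop's exercise: the six points are made up, and the code relies on `cuml.cluster.KMeans` following the scikit-learn estimator interface, falling back to scikit-learn when cuML is not available.

```python
import numpy as np

# Prefer the GPU implementation; fall back to scikit-learn on the CPU.
try:
    from cuml.cluster import KMeans
except ImportError:
    from sklearn.cluster import KMeans

# Hypothetical 2-D coordinates forming two well-separated groups.
points = np.array(
    [[0.0, 0.0], [0.1, 0.2], [0.2, 0.1],
     [10.0, 10.0], [10.1, 9.9], [9.9, 10.2]],
    dtype=np.float32,
)

# Fit two clusters and get one cluster label per point.
model = KMeans(n_clusters=2, random_state=0, n_init=10)
labels = np.asarray(model.fit_predict(points))

print(labels)
```

The numeric label assigned to each group is arbitrary, so only the grouping itself is meaningful.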
Module 4: Graph Analytics
- Create and analyze graph data on the GPU with cuGraph.
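The sketch below shows the flavor of graph analysis involved, using a tiny weighted road network. The node names and distances are hypothetical, and the code uses plain NetworkX on the CPU. With the nx-cugraph backend installed and enabled, supported NetworkX algorithms can be dispatched to cuGraph on the GPU; whether a given call is accelerated depends on the installed version.

```python
import networkx as nx

# Hypothetical road network: nodes are junctions, weights are distances in km.
roads = nx.Graph()
roads.add_weighted_edges_from(
    [("A", "B", 4), ("B", "C", 3), ("A", "C", 9), ("C", "D", 2)],
    weight="distance_km",
)

# Shortest route from junction A to junction D by total distance.
route = nx.shortest_path(roads, "A", "D", weight="distance_km")
distance = nx.shortest_path_length(roads, "A", "D", weight="distance_km")

print(route, distance)
```

The route through B is chosen because 4 + 3 + 2 = 9 km is shorter than the 9 + 2 = 11 km route that goes directly from A to C.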
Module 5: Project: Data Analysis to Save the UK
Apply new GPU-accelerated data manipulation and analysis skills with population-scale data to help stave off a simulated epidemic affecting the entire UK population:
- Use RAPIDS to integrate multiple massive datasets and perform real-world analysis.
- Pivot and iterate on your analysis as the simulated epidemic provides new data for each simulated day.