Apache Spark Basics

Quick introduction to Spark Basics & the Ecosystem, RDDs and Spark SQL

GK# 9336

Course Overview

Apache Spark Basics is a two-day, fast-paced course that gives students a quick introduction to the Spark environment, its benefits and features, and its common uses and tools.

What You'll Learn

Working in a hands-on learning environment, students will learn where Spark fits into the Big Data ecosystem and how to use core Spark features for data analysis. The course also explores, at a high level, key Spark technologies such as the Spark shell for interactive data analysis, Spark internals, RDDs, DataFrames, and Spark SQL.

Outline

Spark Basics

  • Background and history
  • Spark and Hadoop
  • Spark concepts and architecture
  • Spark ecosystem (Core, Spark SQL, MLlib, Streaming)

First look at Spark 

  • Spark in local mode
  • Spark web UI
  • Spark shell
  • Analyzing dataset - part 1
  • Inspecting RDDs

 

RDDs In Depth

  • Partitions
  • RDD Operations / transformations
  • RDD types
  • MapReduce on RDD
  • Caching and persistence
  • Sharing cached RDDs

 

Spark SQL & DataFrames

  • DataFrames
  • DataFrames DDL
  • Spark SQL
  • Defining tables and importing datasets
  • Queries

Labs

This “skills-centric” course is about 50% hands-on lab and 50% lecture. It is designed to train attendees in core Spark development skills, combining current techniques with sound industry practices. Throughout the course, students are led through a series of progressively advanced topics, each consisting of lecture, group discussion, comprehensive hands-on lab exercises, and lab review.

Prerequisites

Students should have attended the course(s) below, or should have basic skills in these areas:

  • Java Programming Fundamentals (for Java edition training)
  • Introduction to Python Programming (for Python edition training)
  • Introduction to SQL (basic familiarity is needed, not in-depth SQL skills)

Who Should Attend

This introductory-level course is geared toward developers and architects seeking to become proficient in Spark tools and technologies. Attendees should be experienced developers who are comfortable with Java, Scala, or Python programming. Students should also be able to navigate the Linux command line and have basic knowledge of Linux editors (such as vi or nano) for editing code.

Follow-On Courses

NOTE: This is a quick-start-style course. Students who wish to explore Spark in more depth, with more hands-on practice, should consider:

  • JumpStart to Developing in Spark (TTDS6503) – 3 days – more depth and hands-on, developer-focused
  • Developing Apache Spark for Big Data & The Hadoop Ecosystem (TTDS6505) – 5 days – most depth, developer-focused

Course Delivery

This course is available in the following formats:

Virtual Classroom Live

Experience expert-led online training from the convenience of your home, office or anywhere with an internet connection.

Duration: 2 days
