
Processing Big Data with Hadoop in Azure HDInsight (DAT202.1x)

Learn how to use the Hadoop technologies with Microsoft Azure HDInsight.

GK# 6993

Course Overview


More and more organizations are taking on the challenge of analyzing big data. This course teaches you how to use the Hadoop technologies in Microsoft Azure HDInsight to build batch processing solutions that cleanse and reshape data for analysis. In this five-week course, you'll learn how to use technologies like Hive, Pig, Oozie, and Sqoop with Hadoop in HDInsight, and how to work with HDInsight clusters from Windows, Linux, and Mac OS X client computers.

NOTE: To complete the hands-on elements in this course, you will require an Azure subscription and a Windows, Linux, or Mac OS X client computer. You can sign up for a free Azure trial subscription (a valid credit card is required for verification, but you will not be charged for Azure services). Note that the free trial is not available in all regions. It is possible to complete the course and earn a certificate without completing the hands-on practices.

Add a course completion certificate to your cart when you enroll for your course, or purchase it at any time prior to course completion. A voucher code will be added to your MyGK dashboard that can be redeemed upon passing the exam in the course completion chapter.

Add Course Completion Certificate for $100




What You'll Learn

  • Provision an HDInsight cluster.
  • Connect to an HDInsight cluster, upload data, and run MapReduce jobs.
  • Use Hive to store and process data.
  • Process data using Pig.
  • Use custom Python user-defined functions from Hive and Pig.
  • Define and run workflows for data processing using Oozie.
  • Transfer data between HDInsight and databases using Sqoop.


On-Demand Outline

Week 1

Course Introduction

01 | Module 1: Getting Started with HDInsight


Week 2

02 | Module 2: Processing Big Data with Hive


Week 3

03 | Module 3: Going Beyond Hive with Pig and Python


Week 4

04 | Module 4: Building a Big Data Workflow


Week 5

Course Exam


Expected Effort

Each week, you should expect to spend 3-4 hours on the course, including:

  • Viewing the lecture videos and demonstrations.
  • Further reading.
  • Trying the labs.
  • Completing module assessments (see below).


Coursework and Grading

This course includes coursework, some of which is graded. Each module in the course includes an ungraded lab (which is designed to give you hands-on practice with the Hadoop technologies taught in the module), and a graded assessment, in which you must answer all questions. Additionally, at the end of the course you must complete a final exam.

The module assessments account for 60% of the total grading for the course, and the final exam accounts for the remaining 40%. You must achieve an overall score of 70% or more to pass this course. In the module assessments, you have two attempts at each question.

In the final exam, you are restricted to one attempt per question.
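
As a rough illustration of the weighting described above, the overall score can be worked out as in the sketch below. It assumes both components are expressed as percentages from 0 to 100 and that the module assessments have already been combined into a single percentage; the function names are hypothetical and not part of the course platform.

```python
# Hypothetical helper for estimating an overall course score.
# Assumption: both inputs are percentages in the range 0-100.

ASSESSMENT_WEIGHT = 60  # module assessments: 60% of the total grade
EXAM_WEIGHT = 40        # final exam: remaining 40%
PASS_MARK = 70.0        # overall score needed to pass


def overall_score(assessment_pct: float, exam_pct: float) -> float:
    """Return the weighted overall score as a percentage."""
    return (ASSESSMENT_WEIGHT * assessment_pct + EXAM_WEIGHT * exam_pct) / 100


def passes(assessment_pct: float, exam_pct: float) -> bool:
    """Return True if the weighted score reaches the pass mark."""
    return overall_score(assessment_pct, exam_pct) >= PASS_MARK


if __name__ == "__main__":
    # 80% on the assessments and 55% on the exam: 48 + 22 = 70, a pass.
    print(overall_score(80, 55), passes(80, 55))
    # 75% on the assessments and 60% on the exam: 45 + 24 = 69, not a pass.
    print(overall_score(75, 60), passes(75, 60))
```

Because the module assessments carry the larger weight, a strong assessment score leaves some margin on the final exam, but neither component alone is enough to reach 70%.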



We encourage all students to submit questions, observations, and comments in the Discussion section. If you have any issues while working on the course, check here first – your fellow students may have already found a resolution!

Please remember that the discussion forum is open to all students and staff, and while we love to see passionate engagement, abusive or inflammatory behavior will not be tolerated.

Due to the volume of students attending this course, it will not be possible for the course staff to answer every question individually. You should still post questions, however, because in many cases your fellow students may be able to help.


Prerequisites

  • Familiarity with database concepts and basic SQL query syntax
  • Familiarity with programming fundamentals
  • A willingness to learn actively and persevere

Who Should Attend


If Data Analyst can be found in your email signature line ... this course is for you.

Course Delivery

This course is available in the following formats:


Train at your own pace with 24/7 access to courses that help you acquire must-have technology skills.
