Introduction to Hadoop Administration (TTCHADADM3)

Learn how to install, configure, and maintain the Apache Hadoop framework.

GK# 5122

Course Overview

Apache Hadoop is an open source framework for building reliable, distributed compute clusters. Credited with a role in IBM Watson's Jeopardy! win in 2011, Hadoop can be used (with other related frameworks) to process large unstructured or semi-structured data sets from multiple sources in order to dissect, classify, learn from, and make suggestions for business analytics, decision support, and other advanced forms of machine intelligence.

This introductory-level, hands-on, lab-intensive course is geared for the administrator who is new to Hadoop and responsible for maintaining a Hadoop cluster and its related components. Hadoop is designed for massive scalability and is extremely fault-tolerant compared to other cluster architectures. As an administrator, you will need to install, configure, and maintain Hadoop on Linux in various compute environments.

This course agenda can be easily customized to address areas of specific interest to your team. Lab variations that support the Cloudera and Hortonworks distributions are also available.

What You'll Learn

  • Install, configure, and maintain the Apache Hadoop framework
  • Explore MapReduce, YARN, Spark
  • Explore Mahout, MLlib, and other frameworks
  • Understand the Hadoop architecture
  • Install Hadoop
  • Test Hadoop programs
  • Optimize and tune Hadoop’s performance
  • Install Hadoop in the cloud and install HBase

Outline

Virtual Classroom Live Outline

1. Hadoop Overview

  • MapReduce
  • Hadoop, YARN, and Spark
  • Mahout and MLlib
  • Alternate Frameworks

2. Hadoop Architecture

  • Hadoop MapReduce
  • YARN
  • HDFS
  • Spark
  • Cassandra
  • HBase
  • Hive
  • Pig

3. Installing Hadoop

  • Linux Considerations
  • SSH Configuration
  • Hadoop Installation
  • OS Security
  • NameNodes
  • JobTrackers

4. Test-Running Hadoop Programs

  • Simple MapReduce Test
  • Spark Test
  • Pig Test

5. Cloud Installations

  • Amazon EC2
  • Amazon Elastic MapReduce
  • Rackspace
  • Installing with Docker

6. Optimization and Tuning

  • Performance Metrics
  • Node Sizing
  • Kernel Tuning

7. Installing HBase

  • HBase Installation
  • ZooKeeper

8. Previewing Hadoop 3

Who Should Attend

Administrators who need to maintain a Hadoop cluster and its related components in a Linux environment.

Course Delivery

This course is available in the following formats:

Virtual Classroom Live

Experience expert-led online training from the convenience of your home, office, or anywhere with an internet connection.

Duration: 3 days
