Apache HBase Fundamentals

Learn about the HBase architecture and data models, how to install HBase, and how to use the shell and client APIs.

GK# 7284

Course Overview


Apache HBase is a NoSQL database that runs on top of Hadoop's HDFS and is fully integrated with Hadoop. It is designed to handle very large data sets, with tables that can span billions of rows and millions of columns. It is an ideal choice for storing sparse and semi-structured data, and it provides fault tolerance through replication, automatic failover, sharding, and load balancing, along with fast real-time lookups, in-memory caching, and server-side processing. Apache HBase can be accessed through various client APIs, including Java, Thrift, and REST. This learning path discusses the HBase architecture and data models, shows how to install HBase, and demonstrates how to use the shell and client APIs to access data.
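
To give a concrete feel for that client access, here is a minimal sketch of opening a connection with the Java client API (HBase 1.x+ style). The table name demo_table is a placeholder, and the cluster settings are assumed to come from an hbase-site.xml on the classpath.

import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.ConnectionFactory;
import org.apache.hadoop.hbase.client.Table;

public class HBaseConnectExample {
    public static void main(String[] args) throws IOException {
        // Reads hbase-site.xml from the classpath (ZooKeeper quorum and friends).
        Configuration conf = HBaseConfiguration.create();

        // Connections are heavyweight; create one and share it across the application.
        try (Connection connection = ConnectionFactory.createConnection(conf);
             Table table = connection.getTable(TableName.valueOf("demo_table"))) {
            System.out.println("Connected to table: " + table.getName());
        }
    }
}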


What You'll Learn

  • Installation, Architecture, and Data Modeling
  • Access Data through the Shell and Client API
  • Advanced API, Administration, and MapReduce

Outline


On-Demand Outline

Apache HBase Fundamentals: Installation, Architecture, and Data Modeling

Installation

  • Overview of HBase
  • HBase Requirements
  • HBase Software Requirements
  • HBase Filesystems
  • HBase Installation Modes
  • Installing HBase in Local Mode
  • Installing HBase in Fully Distributed Mode
  • Accessing the Web-Based Management Console
  • Using the HBase Shell

Architecture

  • HBase Components
  • HFiles and Regions
  • The Write-Ahead Log and MemStore
  • Compaction and Splits
  • Data Replication
  • Accessing HBase
  • Securing HBase
  • Hadoop’s MapReduce Integration with HBase

Data Modeling

  • HBase Schema Overview
  • Designing Tables
  • Designing Rowkeys for Tables
  • Versions, DataTypes, and Joins
  • Time to Live and Deleted Cells

Apache HBase Fundamentals: Access Data through the Shell and Client API

Table Creation in the Shell

  • Creating Tables Using the Shell
  • Disabling, Enabling, and Dropping a Table
  • Altering a Table’s Properties

Data Management in the Shell

  • Adding Data to a Table
  • Using the scan and get Commands
  • Deleting Data from a Table
  • Using Counters

Insert Data Using Java Client API

  • Establishing a Connection
  • Creating Tables Using the Java Client API
  • Creating a Put Class Instance
  • Adding Data Using the add() Method
  • Using Timestamp with Put for Versioning
  • Using the get() and has() Methods
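
As a rough illustration of these topics, the following sketch inserts data with a Put. The table, column family, and values are made up for the example, and addColumn() is the current name for the older add() method referenced above.

import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.ConnectionFactory;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.client.Table;
import org.apache.hadoop.hbase.util.Bytes;

public class PutExample {
    public static void main(String[] args) throws Exception {
        try (Connection connection = ConnectionFactory.createConnection(HBaseConfiguration.create());
             Table table = connection.getTable(TableName.valueOf("demo_table"))) {

            // A Put is keyed by the rowkey it will write to.
            Put put = new Put(Bytes.toBytes("row-001"));

            // Column family, qualifier, and value are all byte arrays.
            put.addColumn(Bytes.toBytes("cf"), Bytes.toBytes("name"), Bytes.toBytes("Alice"));

            // An explicit timestamp controls which version of the cell is written.
            put.addColumn(Bytes.toBytes("cf"), Bytes.toBytes("city"),
                          System.currentTimeMillis(), Bytes.toBytes("Boston"));

            // has() checks whether the Put already contains a given family/qualifier pair.
            boolean hasName = put.has(Bytes.toBytes("cf"), Bytes.toBytes("name"));

            table.put(put);
            System.out.println("Wrote row-001 (name present in Put: " + hasName + ")");
        }
    }
}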

Get Data Using Java Client API

  • Using the Get Class
  • Retrieving Columns using the Get Class
  • Retrieving Versions of Columns using the Get Class
  • Retrieving Specific Values from a Cell
  • Using List with the Get Class
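
A short sketch of a single-row read and a multi-row read with the Get class; row keys and column names are placeholders.

import java.util.ArrayList;
import java.util.List;

import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.ConnectionFactory;
import org.apache.hadoop.hbase.client.Get;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.client.Table;
import org.apache.hadoop.hbase.util.Bytes;

public class GetExample {
    public static void main(String[] args) throws Exception {
        try (Connection connection = ConnectionFactory.createConnection(HBaseConfiguration.create());
             Table table = connection.getTable(TableName.valueOf("demo_table"))) {

            // Single-row read: restrict it to one column and ask for several versions.
            Get get = new Get(Bytes.toBytes("row-001"));
            get.addColumn(Bytes.toBytes("cf"), Bytes.toBytes("name"));
            get.readVersions(3);  // older clients use setMaxVersions(3)

            Result result = table.get(get);
            byte[] value = result.getValue(Bytes.toBytes("cf"), Bytes.toBytes("name"));
            System.out.println("name = " + (value == null ? "<none>" : Bytes.toString(value)));

            // A List of Gets fetches several rows in one round trip.
            List<Get> gets = new ArrayList<>();
            gets.add(new Get(Bytes.toBytes("row-001")));
            gets.add(new Get(Bytes.toBytes("row-002")));
            Result[] results = table.get(gets);
            System.out.println("Fetched " + results.length + " rows");
        }
    }
}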

Scan Data Using Java Client API

  • Using Scan() to Read an Entire Table
  • Scanning Rows Starting at a Specific Row or a Range
  • Using Constructors to Narrow Search Results
  • Using the getScanner() Method
  • Using the ResultScanner Class
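
A minimal scan sketch against the same placeholder table; withStartRow()/withStopRow() are the HBase 2.x equivalents of the start/stop-row constructors covered in the course.

import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.ConnectionFactory;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.client.ResultScanner;
import org.apache.hadoop.hbase.client.Scan;
import org.apache.hadoop.hbase.client.Table;
import org.apache.hadoop.hbase.util.Bytes;

public class ScanExample {
    public static void main(String[] args) throws Exception {
        try (Connection connection = ConnectionFactory.createConnection(HBaseConfiguration.create());
             Table table = connection.getTable(TableName.valueOf("demo_table"))) {

            // An empty Scan reads the whole table; start/stop rows narrow it to a range.
            Scan scan = new Scan();
            scan.withStartRow(Bytes.toBytes("row-001"));
            scan.withStopRow(Bytes.toBytes("row-100"));   // stop row is exclusive
            scan.addFamily(Bytes.toBytes("cf"));          // limit to one column family

            // getScanner() hands back a ResultScanner, which streams rows lazily.
            try (ResultScanner scanner = table.getScanner(scan)) {
                for (Result row : scanner) {
                    System.out.println(Bytes.toString(row.getRow()));
                }
            }
        }
    }
}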

Delete and Update Data Using Java Client API

  • Updating Data
  • Deleting Data
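
A short sketch of both operations against the placeholder table: in HBase an update is simply another Put to an existing rowkey, while a Delete can target a column, a family, or a whole row.

import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.ConnectionFactory;
import org.apache.hadoop.hbase.client.Delete;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.client.Table;
import org.apache.hadoop.hbase.util.Bytes;

public class DeleteUpdateExample {
    public static void main(String[] args) throws Exception {
        try (Connection connection = ConnectionFactory.createConnection(HBaseConfiguration.create());
             Table table = connection.getTable(TableName.valueOf("demo_table"))) {

            // Updating: a new Put to the same rowkey stores a new version of the cell.
            Put update = new Put(Bytes.toBytes("row-001"));
            update.addColumn(Bytes.toBytes("cf"), Bytes.toBytes("city"), Bytes.toBytes("Chicago"));
            table.put(update);

            // Deleting: here only a single column is removed from the row.
            Delete delete = new Delete(Bytes.toBytes("row-002"));
            delete.addColumn(Bytes.toBytes("cf"), Bytes.toBytes("city"));
            table.delete(delete);
        }
    }
}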

Apache HBase Fundamentals: Advanced API, Administration, and MapReduce

Filters

  • Implementing Utility Filters
  • Implementing Comparison Filters
  • Implementing Custom Filters
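
For illustration, a sketch that combines two built-in filters on a scan; the table, family, and values are placeholders.

import org.apache.hadoop.hbase.CompareOperator;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.ConnectionFactory;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.client.ResultScanner;
import org.apache.hadoop.hbase.client.Scan;
import org.apache.hadoop.hbase.client.Table;
import org.apache.hadoop.hbase.filter.FilterList;
import org.apache.hadoop.hbase.filter.PrefixFilter;
import org.apache.hadoop.hbase.filter.SingleColumnValueFilter;
import org.apache.hadoop.hbase.util.Bytes;

public class FilterExample {
    public static void main(String[] args) throws Exception {
        try (Connection connection = ConnectionFactory.createConnection(HBaseConfiguration.create());
             Table table = connection.getTable(TableName.valueOf("demo_table"))) {

            // Keep only rows whose key starts with "row-0".
            PrefixFilter prefix = new PrefixFilter(Bytes.toBytes("row-0"));

            // Keep only rows where cf:city equals "Boston".
            SingleColumnValueFilter cityIsBoston = new SingleColumnValueFilter(
                    Bytes.toBytes("cf"), Bytes.toBytes("city"),
                    CompareOperator.EQUAL, Bytes.toBytes("Boston"));

            // Both conditions must pass for a row to be returned.
            FilterList filters = new FilterList(FilterList.Operator.MUST_PASS_ALL, prefix, cityIsBoston);

            Scan scan = new Scan();
            scan.setFilter(filters);
            try (ResultScanner scanner = table.getScanner(scan)) {
                for (Result row : scanner) {
                    System.out.println(Bytes.toString(row.getRow()));
                }
            }
        }
    }
}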

Cluster Administration

  • Checking the Status of the HBase Instance
  • Listing the User Space Tables
  • Deleting Tables
  • Completing a Major Compaction Manually
  • Merging Adjoining Regions
  • Stopping and Decommissioning a RegionServer
  • Performing a Rolling Restart
  • Adding a New Node
  • Monitoring HBase
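
Several of these tasks can be driven from the Admin API as well as the shell; the following is a minimal sketch with placeholder table names.

import org.apache.hadoop.hbase.ClusterMetrics;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.Admin;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.ConnectionFactory;

public class AdminExample {
    public static void main(String[] args) throws Exception {
        try (Connection connection = ConnectionFactory.createConnection(HBaseConfiguration.create());
             Admin admin = connection.getAdmin()) {

            // Cluster status: live region servers and related metrics.
            ClusterMetrics metrics = admin.getClusterMetrics();
            System.out.println("Live region servers: " + metrics.getLiveServerMetrics().size());

            // List the user-space tables (system tables are excluded).
            for (TableName name : admin.listTableNames()) {
                System.out.println("Table: " + name);
            }

            // Trigger a major compaction manually for one table.
            admin.majorCompact(TableName.valueOf("demo_table"));

            // Deleting a table requires disabling it first.
            TableName obsolete = TableName.valueOf("old_table");
            if (admin.tableExists(obsolete)) {
                admin.disableTable(obsolete);
                admin.deleteTable(obsolete);
            }
        }
    }
}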

Snapshots and Backups

  • Taking a Snapshot
  • Using a Snapshot to Clone a Table
  • Exporting and Restoring Snapshots
  • Performing a Full Shutdown Backup
  • Performing a Backup on a Live Cluster
  • Performing a Restore
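
A brief sketch of snapshot operations through the Admin API; the snapshot and table names are placeholders, and exporting a snapshot to another cluster is done separately with the ExportSnapshot tool.

import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.Admin;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.ConnectionFactory;

public class SnapshotExample {
    public static void main(String[] args) throws Exception {
        try (Connection connection = ConnectionFactory.createConnection(HBaseConfiguration.create());
             Admin admin = connection.getAdmin()) {

            TableName table = TableName.valueOf("demo_table");

            // Take an online snapshot; the table stays available to clients.
            admin.snapshot("demo_snapshot", table);

            // Clone the snapshot into a brand-new table.
            admin.cloneSnapshot("demo_snapshot", TableName.valueOf("demo_table_copy"));

            // Restoring over the original table requires disabling it first.
            admin.disableTable(table);
            admin.restoreSnapshot("demo_snapshot");
            admin.enableTable(table);
        }
    }
}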

MapReduce

  • Using HBase as a Data Sink for MapReduce Jobs
  • Using HBase as a Data Source for MapReduce Jobs
  • Bulk Loading Data
  • Splitting Map Tasks When Sourcing an HBase Table
  • Accessing Other HBase Tables within a MapReduce Job
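
A compact sketch of a job that uses one HBase table as the source and another as the sink via TableMapReduceUtil; the table and column names (source_table, sink_table, cf:name) are placeholders.

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.client.Scan;
import org.apache.hadoop.hbase.io.ImmutableBytesWritable;
import org.apache.hadoop.hbase.mapreduce.IdentityTableReducer;
import org.apache.hadoop.hbase.mapreduce.TableMapReduceUtil;
import org.apache.hadoop.hbase.mapreduce.TableMapper;
import org.apache.hadoop.hbase.util.Bytes;
import org.apache.hadoop.mapreduce.Job;

public class CopyColumnJob {

    // The mapper reads rows from the source table and emits Puts for the sink table.
    static class CopyMapper extends TableMapper<ImmutableBytesWritable, Put> {
        @Override
        protected void map(ImmutableBytesWritable rowKey, Result row, Context context)
                throws java.io.IOException, InterruptedException {
            byte[] value = row.getValue(Bytes.toBytes("cf"), Bytes.toBytes("name"));
            if (value != null) {
                Put put = new Put(rowKey.get());
                put.addColumn(Bytes.toBytes("cf"), Bytes.toBytes("name"), value);
                context.write(rowKey, put);
            }
        }
    }

    public static void main(String[] args) throws Exception {
        Configuration conf = HBaseConfiguration.create();
        Job job = Job.getInstance(conf, "copy-name-column");
        job.setJarByClass(CopyColumnJob.class);

        // HBase as a data source: the Scan defines what the mappers read.
        Scan scan = new Scan();
        scan.addColumn(Bytes.toBytes("cf"), Bytes.toBytes("name"));
        TableMapReduceUtil.initTableMapperJob(
                "source_table", scan, CopyMapper.class,
                ImmutableBytesWritable.class, Put.class, job);

        // HBase as a data sink: IdentityTableReducer writes the emitted Puts.
        TableMapReduceUtil.initTableReducerJob("sink_table", IdentityTableReducer.class, job);

        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}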

Prerequisites


Data is being generated at an exponential rate, and business data is no exception. Only a small percentage of it is structured data stored in the rows and columns of traditional databases. This proliferation requires rethinking traditional techniques for capturing, storing, and processing data. Big data describes data sets so large they cannot be managed with traditional database systems, as well as the collection of tools and techniques aimed at working with them. This learning path covers the current thinking and state of the art for managing and manipulating large data sets using big data tools and techniques.

Who Should Attend


Administrators and developers who need experience using HBase.

Course Delivery

This course is available in the following formats:

On-Demand

Train at your own pace with 24/7 access to courses that help you acquire must-have technology skills.



Request this course in a different delivery format.