Cloudera Training for Apache HBase
Learn to deploy and manage Apache HBase in your Hadoop environment.
HBase is an open-source, non-relational, distributed database that provides a fault-tolerant, scalable way to store massive quantities of data. In this course, Hadoop developers and administrators will gain the skills needed to install and maintain HBase and develop client code. You will cover concepts addressed on the Cloudera Certified Specialist in Apache HBase (CCSHB) exam.
You will receive one CCSHB exam voucher at the end of class.
What You'll Learn
- Understand the HBase architecture
- Use the HBase shell to directly manipulate HBase tables
- Design optimal HBase schemas for efficient data storage and retrieval
- Connect to HBase using the Java API
- Bulk-load data into HBase using MapReduce
- Administer an HBase cluster
- Resolve performance bottlenecks
Who Needs to Attend
Developers familiar with Apache Hadoop
Prerequisites
- Familiarity with Hadoop's architecture and APIs
- Experience writing basic applications
- Prior programming experience, preferably Java
- Experience with databases and data modeling is helpful, but it is not required
Follow-On Courses
There are no follow-ons for this course.
Course Outline
1. HBase
2. Data Model
- Tables, Row Keys, and Column Families
- Choosing Column Attributes
- Version and HBase Operations
3. HBase Shell
- Creating and Manipulating Data Using the Command-Line Shell
4. Cluster Architecture
- HMaster, RegionServers, and ZooKeeper
- Compactions in HBase
- Crash Recovery
5. Storage Architecture
- Client Caching
- Data Storage and Bloom Filters
- Modifying Rows
6. Schema Design
- Creating Column Families
- Designing for Locality and Access Patterns
- Detecting and Preventing Hot Spots
7. HBase API
- Connecting to HBase Using the Java API
- Administrative Actions Using the Java API
- Accessing Data Using the Java API
8. MapReduce and Bulk Loads
- MapReduce Integration
- Bulk-Load into HBase
9. HBase Configuration
- Standalone and Distributed Run Modes
- Required ZooKeeper Configurations
- Required Configuration Settings
10. HBase Administration
- Monitoring HBase Processes
- Performing HBase Backups
- Planning for HBase Capacity
11. Performance Tuning
- Preventing Network Bandwidth Bottlenecks
- Java Garbage Collection and HBase Operations
- Tuning for Client Operations
- Logging Locations and Troubleshooting Tools