Data Engineering on AWS
- Reference: GK910032
- Duration: 3 days
Delivery format: public in-person class
Price: EUR 2,450.00, excluding VAT
Delivery formats
This course is available in the following formats:
- Public virtual class
From any room equipped with an internet connection, join the training class delivered in an inter-company format.
- Public in-person class
Training delivered in an inter-company format. This learning method allows interaction between the instructor and the participants in the classroom.
- In-company (private)
This course can be delivered to a private group and adapted to the company's needs. Contact us.
Summary
This comprehensive course provides a deep dive into data engineering practices and solutions on Amazon Web Services (AWS). Participants will learn how to design, build, optimize, and secure data engineering solutions by using AWS services. Topics range from foundational concepts to hands-on implementation of data lakes, data warehouses, and both batch and streaming data pipelines.
This course includes presentations, demonstrations, and hands-on labs.
Audience
- Solutions architects
- DevOps engineers
- IT professionals
- Data analysts looking to expand into data engineering
Course objectives
In this course, you will learn to do the following:
- Design and implement scalable data lakes and data warehouses on AWS.
- Build, optimize, and secure batch data processing pipelines.
- Develop and manage streaming data solutions.
- Apply best practices for data governance and security.
- Automate data engineering workflows by using AWS services.
- Implement access control and security measures for data solutions.
Detailed course outline
Module 1: Data Engineering Roles and Key Concepts
- The role of a data engineer
- Data discovery for a data analytics system
- AWS services for data workflows
- Continuous integration and continuous delivery
- Networking considerations
Module 2: Designing and Implementing Data Lakes
- Data lake introduction
- Data lake storage
- Ingest data
- Catalog data
- Transform data
- Serve data for consumption
- Lab: Setting up a Data Lake on AWS
Module 3: Optimizing and Securing Data Lake Solutions
- Optimizing performance
- Security using Lake Formation
- Setting permissions with Lake Formation
- Security and governance
- Troubleshooting
- Lab: Automating Data Lake Creation using AWS Lake Formation Blueprints
Module 4: Data Warehouse Architecture and Design Principles
- Introduction to data warehouses
- Amazon Redshift overview
- Ingesting data into Amazon Redshift
- Processing data
- Serving data for consumption
- Lab: Setting up a Data Warehouse using Amazon Redshift Serverless
Module 5: Performance Optimization Techniques for Data Warehouses
- Monitoring and optimization options
- Data optimization in Amazon Redshift
- Query optimization in Amazon Redshift
- Data orchestration
Module 6: Security and Access Control for Data Warehouses
- Authentication and access control in Amazon Redshift
- Data security in Amazon Redshift
- Lab: Working with Amazon Redshift
Module 7: Designing Batch Data Pipelines
- Introduction to batch data pipelines
- Designing a batch data pipeline
- Ingesting batch data
Module 8: Implementing Strategies for Batch Data Pipelines
- Processing and transforming data
- Transforming data formats
- Integrating your data
- Cataloging data
- Serving data for consumption
- Lab: A Day in the Life of a Data Engineer
Module 9: Optimizing, Orchestrating, and Securing Batch Data Pipelines
- Optimizing the batch data pipeline
- Orchestrating the batch data pipeline
- Securing the batch data pipeline
- Lab: Orchestrating Data Processing in Spark using AWS Step Functions
Module 10: Streaming Data Architecture Patterns
- Introduction to streaming data pipelines
- Ingesting data from stream sources
- Storing streaming data
- Processing streaming data
- Analyzing streaming data
- Lab: Streaming Analytics with Amazon Managed Service for Apache Flink
Module 11: Optimizing and Securing Streaming Solutions
- Optimizing a streaming data solution
- Securing a streaming data pipeline
- Lab: Access Control with Amazon Managed Streaming for Apache Kafka
Module 12: Compliance and Cost Optimization
- Compliance considerations
- Cost optimization tools
Module 13: Course Wrap-Up
Prerequisites
- Basic understanding of AWS services
- Familiarity with database concepts
- Basic programming or scripting knowledge
- Understanding of data processing fundamentals