Building Data Lakes on AWS (BDLA)

 

Course Overview

In this course, you will learn how to build an operational data lake that supports the analysis of structured and unstructured data. You will learn about the components and functions of the services involved in creating a data lake. You will use AWS Lake Formation to build a data lake, AWS Glue to build a data catalog, and Amazon Athena to analyze data. The course presentations and exercises reinforce what you have learned by analyzing several common data lake architectures.

Who should attend

This course is designed for:

  • Solutions Architects
  • Big Data Developers
  • Data Architects and Analysts
  • Other data analysis experts

Prerequisites

We recommend that participants in this course meet the following requirements:

  • Good practical knowledge of key AWS services such as Amazon Elastic Compute Cloud (EC2) and Amazon Simple Storage Service (S3)
  • Experience with a programming or scripting language
  • Basic knowledge of the Linux operating system and the command line interface
  • A laptop is required to take part in the exercises; tablets are not suitable

Course Objectives

What you will learn in this course:

  • Collect large amounts of data with services such as Amazon Kinesis Data Streams and Firehose, and store the data securely and for the long term in Amazon Simple Storage Service (S3).
  • Create a metadata index of your data lake.
  • Choose the best tools to capture, store, process, and analyze the data in your data lake.
  • Apply this knowledge in hands-on labs by building a complete solution.

Course Content

The course covers the following concepts:

  • The key services for building a serverless data lake architecture
  • A data analysis solution that addresses the capture, storage, processing, and analysis workflows
  • Repeatable deployment of templates to implement a data lake solution
  • Creating a metadata index and enabling search
  • Setting up a large data transfer pipeline for multiple data sources
  • Data transformation using simple functions triggered by events
  • Data processing using the appropriate tools and services for the application
  • Available options for optimized analysis of processed data
  • Best practices for deployment and operations

Prices & Delivery methods

Online Training

Duration
1 day

Price
  • on request

Classroom Training

Duration
1 day

Price
  • on request

Schedule

Guaranteed date: Sessions marked "Guaranteed to Run" in the schedule below are guaranteed to take place.
Instructor-led Online Training: This is an Instructor-Led Online (ILO) course. These sessions are conducted via WebEx in a VoIP environment and require an internet connection and a headset with microphone connected to your computer or laptop.
FLEX course: A FLEX course is delivered simultaneously in two modalities. Choose to attend either the Instructor-Led Online (ILO) virtual session or the Instructor-Led Classroom (ILT) session.

Europe

Germany

  • Online Training. Time zone: Central European Time (CET)
  • Hamburg (FLEX course)
  • Online Training. Time zone: Central European Time (CET)

Italy

  • Online Training. Time zone: Central European Time (CET)
  • Online Training. Time zone: Central European Time (CET)
  • Online Training. Time zone: Central European Summer Time (CEST)

Switzerland

  • Zurich (FLEX course)
  • Online Training. Time zone: Central European Time (CET)
  • Zurich (FLEX course)
  • Online Training. Time zone: Central European Time (CET)
  • Zurich (FLEX course)
  • Online Training. Time zone: Central European Time (CET)

United Kingdom

  • Online Training. Time zone: Greenwich Mean Time (GMT). Guaranteed to Run
  • Online Training. Time zone: Greenwich Mean Time (GMT)
  • Online Training. Time zone: British Summer Time (BST)