In this course, you will learn how to build an operational data lake that supports the analysis of structured and unstructured data. You will learn about the components and functions of the services involved in creating a data lake. You will use AWS Lake Formation to build a data lake, AWS Glue to build a data catalog, and Amazon Athena to analyze data. The course presentations and exercises reinforce what you have learned by analyzing several common data lake architectures.
Who should attend
This course is designed for:
Big data developers
Data Architects and Analysts
Other data analysis experts
We recommend that participants in this course meet the following requirements:
Solid working knowledge of key AWS services such as Amazon Elastic Compute Cloud (EC2) and Amazon Simple Storage Service (S3)
Experience with a programming or scripting language
Basic knowledge of the Linux operating system and the command line interface
A laptop is required to take part in the exercises; tablets are not suitable
Course objectives
What you will learn in this course:
Collect large amounts of data with services such as Kinesis Streams and Firehose, and store the data securely and for the long term in Amazon Simple Storage Service (S3).
Create a metadata index of your data lake.
Choose the best tools to capture, store, process, and analyze the data in your data lake.
Apply this knowledge in hands-on labs, where you gain practical experience by building a complete solution.
Course content
The course covers the following concepts:
The key services for building a serverless data lake architecture
A data analysis solution that addresses the capture, storage, processing, and analysis workflows
Repeatable, template-based deployment of a data lake solution
Creating a metadata index and enabling search
Setting up a large-scale data transfer pipeline for multiple data sources
Data transformation using simple functions triggered by events
Data processing using the tools and services appropriate to the use case
Available options for optimized analysis of processed data