Detailed Course Outline
Unit 1: IBM Open Platform with Apache Hadoop
- Exercise 1: Exploring the HDFS
 
Unit 2: Apache Ambari
- Exercise 2: Managing Hadoop clusters with Apache Ambari
 
Unit 3: Hadoop Distributed File System
- Exercise 3: File access and basic commands with HDFS
 
Unit 4: MapReduce and YARN
- Topic 1: Introduction to MapReduce based on MR1
- Topic 2: Limitations of MR1
- Topic 3: YARN and MR2
- Exercise 4: Creating and coding a simple MapReduce job
- Possibly a second, more complex exercise
 
Unit 5: Apache Spark
- Exercise 5: Working with Spark's RDDs in a Spark job
 
Unit 6: Coordination, management, and governance
- Exercise 6: Apache ZooKeeper, Apache Slider, Apache Knox
 
Unit 7: Data Movement
- Exercise 7: Moving data into Hadoop with Flume and Sqoop
 
Unit 8: Storing and Accessing Data
- Topic 1: Representing Data: CSV, XML, JSON, and YAML
- Topic 2: Open Source Programming Languages: Pig, Hive, and others (R, Python, etc.)
- Topic 3: NoSQL Concepts
- Topic 4: Accessing Hadoop data using Hive
- Exercise 8: Performing CRUD operations using the HBase shell
- Topic 5: Querying Hadoop data using Hive
- Exercise 9: Using Hive to Access Hadoop / HBase Data
 
Unit 9: Advanced Topics
- Topic 1: Controlling job workflows with Oozie
- Topic 2: Search using Apache Solr
- No lab exercises