Apache Accumulo Training (AAT) – Profile

Detailed Course Outline

Module 1: Introduction to Accumulo
  • NoSQL concepts
  • Other NoSQL datastores
  • What is special about Accumulo: design goals and implementation
Module 2: Installation and quick start
  • Environment prerequisites
  • Accumulo configuration
  • Process control scripts
  • Shell and monitoring tools
  • Lab
Module 3: Accumulo architecture
  • Key/Value spaces
  • Range scans and filtering
  • Tables and tablets
  • Internal Accumulo communication
  • Anatomy of reads and writes
Module 4: Writing and reading with the API
  • Row keys, row values
  • Mutations
  • Instances and connectors
  • Batch operations: Scanner, BatchWriter, BatchScanner
  • Lab
Module 5: Accumulo design patterns
  • How to present your design
  • Flexible schemas
  • Use of indexing
  • Single-entity tables
  • Unique keys
  • Design lab
  • Time series data
  • Use of denormalization
  • Joins and pre-joins
  • Indices
  • Teams lab
Module 6: Hadoop integration
  • Using Accumulo with Hadoop and other Hadoop ecosystem tools
  • Imitating relational operations
  • Client-side iterators
  • Lab
Module 7: Server-side optimizations
  • Iterators
  • Constraints
  • Initial load (bulk load)
  • Lab
Module 8: Cells and partitions
  • Domain-specific authorization
  • Wide vs tall
  • Reasoning about locality
Module 9: Data retrieval patterns
  • Statistics
  • Query time optimization
  • Partitioned joins
Module 10: Data science with Accumulo, conclusion
  • Graph search
  • Machine learning
  • Geo information
  • Administration and performance optimization