Course Outline

Big Data Overview:

  • Defining Big Data
  • The reasons behind the rising popularity of Big Data
  • Case studies in Big Data
  • Key characteristics of Big Data
  • Solutions for managing Big Data

Hadoop and Its Components:

  • An introduction to Hadoop and its core components
  • Hadoop architecture and the characteristics of data it can handle and process
  • A brief history of Hadoop, companies adopting it, and the motivations behind their adoption
  • Detailed explanation of the Hadoop framework and its components
  • Understanding HDFS: reading from and writing to the Hadoop Distributed File System
  • Setting up a Hadoop cluster in various modes: standalone, pseudo-distributed, and multi-node

(This section covers setting up a Hadoop cluster using VirtualBox, KVM, or VMware, including the necessary network configuration, running the Hadoop daemons, and testing the cluster.)

  • Overview of the MapReduce framework and its operational mechanics
  • Executing MapReduce jobs on a Hadoop cluster
  • Understanding replication, mirroring, and rack awareness within Hadoop clusters

Hadoop Cluster Planning:

  • Strategies for planning your Hadoop cluster
  • Aligning hardware and software requirements for cluster planning
  • Analyzing workloads to plan a cluster that prevents failures and ensures optimal performance

What Is MapR and Why Choose It:

  • Overview of MapR and its architecture
  • Understanding and working with MapR Control System, MapR Volumes, snapshots, and mirrors
  • Planning a cluster specifically for MapR
  • Comparing MapR with other distributions and Apache Hadoop
  • MapR installation and cluster deployment

Cluster Setup and Administration:

  • Managing services, snapshots, mirrored volumes, and remote clusters
  • Understanding and managing nodes: assigning roles, commissioning, and decommissioning
  • Understanding Hadoop components and installing them alongside MapR services
  • Accessing data on the cluster, including via NFS
  • Managing data using volumes
  • Handling users and groups
  • Cluster administration and performance monitoring
  • Configuring and analyzing metrics
  • Administering MapR security
  • Understanding and working with M7, native storage for MapR tables
  • Cluster configuration and tuning for optimum performance

Cluster Upgrade and Integration with Other Setups:

  • Upgrading MapR software versions and types of upgrades
  • Configuring the MapR cluster to access HDFS clusters
  • Setting up a MapR cluster on Amazon Elastic MapReduce

All of the topics above include demonstrations and practice sessions that give learners hands-on experience with the technology.

Requirements

  • Basic knowledge of Linux file systems
  • Fundamentals of Java programming
  • Knowledge of Apache Hadoop (recommended)

Duration: 28 hours
