Course Outline
Big Data Overview:
- Defining Big Data
- The reasons behind the rising popularity of Big Data
- Case studies in Big Data
- Key characteristics of Big Data
- Solutions for managing Big Data
Hadoop and Its Components:
- An introduction to Hadoop and its core components
- Hadoop architecture and the characteristics of data it can handle and process
- A brief history of Hadoop, companies adopting it, and the motivations behind their adoption
- Detailed explanation of the Hadoop framework and its components
- Understanding HDFS: reading from and writing to the Hadoop Distributed File System
- Setting up a Hadoop cluster in various modes: standalone, pseudo-distributed, and multi-node
(This section covers setting up a Hadoop cluster using VirtualBox, KVM, or VMware, including the necessary network configuration, running the Hadoop daemons, and testing the cluster.)
- Overview of the MapReduce framework and its operational mechanics
- Executing MapReduce jobs on a Hadoop cluster
- Understanding replication, mirroring, and rack awareness within Hadoop clusters
Hadoop Cluster Planning:
- Strategies for planning your Hadoop cluster
- Aligning hardware and software requirements for cluster planning
- Analyzing workloads to plan a cluster that prevents failures and ensures optimal performance
What is MapR and Why Choose MapR:
- Overview of MapR and its architecture
- Understanding and working with MapR Control System, MapR Volumes, snapshots, and mirrors
- Planning a cluster specifically for MapR
- Comparing MapR with other distributions and Apache Hadoop
- MapR installation and cluster deployment
Cluster Setup and Administration:
- Managing services, nodes, snapshots, mirrored volumes, and remote clusters
- Understanding and managing nodes
- Understanding Hadoop components and installing them alongside MapR Services
- Accessing data on the cluster, including via NFS
- Managing data using volumes
- Handling users and groups
- Assigning roles to nodes
- Commissioning and decommissioning nodes
- Cluster administration and performance monitoring
- Configuring and analyzing metrics
- Administering MapR security
- Understanding and working with M7, native storage for MapR tables
- Cluster configuration and tuning for optimum performance
Cluster Upgrade and Integration with Other Setups:
- Upgrading MapR software versions and types of upgrades
- Configuring the MapR cluster to access HDFS clusters
- Setting up a MapR cluster on Amazon Elastic MapReduce
All of the above topics include demonstrations and practice sessions that give learners hands-on experience with the technology.
Requirements
- Basic knowledge of Linux file systems
- Fundamental Java programming
- Knowledge of Apache Hadoop (recommended)
Duration: 28 hours
Testimonials (1)
practical things of doing, also theory was served good by Ajay