Big Data Hadoop Training Institute in Pune

Fee: 25,000/-

COURSE DESCRIPTION

Hadoop development + Spark & Scala + NoSQL + HDFS (storage) + YARN (Hadoop resource management and job scheduling) + MapReduce using Java (data processing) + Apache Hive + Apache Pig + HBase (NoSQL) + Sqoop + Flume + Oozie + Kafka with ZooKeeper + Cassandra + MongoDB + Splunk.

Big Data: Open Source Technology

Hadoop is an open-source solution to the Big Data problem. It contains several tools that cover the entire ETL process and provides a data processing framework. It can process distributed data, so there is no need to store all the data in centralized storage, as SQL-based tools require.

  • Hadoop is suited to write-once, read-many data stores.
  • Hadoop divides a large dataset into smaller blocks (64 or 128 MB) that are spread across many machines in the cluster via the Hadoop Distributed File System (HDFS).
  • The key features of Hadoop are:
  • Accessible: Hadoop runs on large clusters of commodity hardware.
  • Robust: Because it is intended to run on clusters of commodity hardware, Hadoop is architected with the assumption of frequent hardware malfunctions. It can handle most such failures.
  • Scalable: Hadoop scales to handle larger data by adding more nodes to the cluster.
  • Simple: Hadoop allows users to quickly write efficient parallel code.
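
To make the block splitting mentioned above concrete, here is a minimal Python sketch. It is not Hadoop code; it only does the arithmetic for how many blocks a file would occupy at the 64 MB and 128 MB block sizes. The function name `split_into_blocks` and the 300 MB example file are assumptions chosen for illustration.

```python
MB = 1024 * 1024

def split_into_blocks(file_size_bytes, block_size_bytes):
    """Return the sizes of the blocks a file of this size would be divided into."""
    if block_size_bytes <= 0:
        raise ValueError("block size must be positive")
    full_blocks, remainder = divmod(file_size_bytes, block_size_bytes)
    blocks = [block_size_bytes] * full_blocks
    if remainder:
        # The last block only holds the leftover bytes.
        blocks.append(remainder)
    return blocks

example_file = 300 * MB  # hypothetical file size
for block_mb in (64, 128):
    blocks = split_into_blocks(example_file, block_mb * MB)
    print(f"{block_mb} MB block size: {len(blocks)} blocks, "
          f"last block {blocks[-1] // MB} MB")
```

With these assumed numbers, the 300 MB file needs five blocks at 64 MB and three blocks at 128 MB; in both cases the last block holds the remaining 44 MB.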

Who Can Do this Course?

  • Freshers
  • BE / B.Sc. candidates
  • Engineers
  • Graduates
  • Post-graduates
  • Working professionals

INSTRUCTOR

Zubair Ansari

Big Data Hadoop

  • +91 8668770390
  • info@mapstechhub.com
Lectures 48
Duration 60 hrs
Skill level
Language English
Students 10
Certificate Internship

CURRICULUM

SECTION 1: INTRODUCTION TO BIG DATA

  • Distributed computing
  • Data management: industry challenges
  • Overview of Big Data
  • Characteristics of Big Data
  • Types of data
  • Sources of Big Data
  • Big Data examples
  • What is streaming data?
  • Batch vs. streaming data processing
  • Overview of analytics
  • Big Data Hadoop opportunities

  • Why we need Hadoop
  • Data centers and Hadoop cluster overview
  • Overview of Hadoop daemons
  • Hadoop cluster and racks
  • Learning the Linux required for Hadoop
  • Hadoop ecosystem tools overview
  • Understanding Hadoop configuration and installation

  • HDFS
  • HDFS daemons: NameNode, DataNode, Secondary NameNode
  • Hadoop FS and processing environment UIs
  • Fault tolerance
  • High availability
  • Block replication
  • How to read and write files
  • Hadoop FS shell commands

  • YARN
  • YARN daemons: ResourceManager, NodeManager, etc.
  • Job assignment and execution flow

  • Introduction to MapReduce
  • MapReduce architecture
  • Data flow in MapReduce
  • Difference between a block and an InputSplit
  • Role of the RecordReader
  • Basic configuration of MapReduce
  • MapReduce life cycle
  • How MapReduce works
  • Writing and executing a basic MapReduce program using Java
  • Submission and initialization of a MapReduce job
  • File input/output formats in MapReduce jobs
  • Text input format
  • Key-value input format
  • Sequence file input format
  • NLine input format
  • Joins: map-side joins and reduce-side joins
  • Word count example (or election vote count)
  • Five to ten MapReduce examples with real-time data
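
As a rough illustration of the word count example listed above, the following Python sketch imitates the map, shuffle, and reduce steps in a single process. It is not the Java MapReduce program taught in the course and does not use any Hadoop API; the function names and the sample lines are assumptions made for this sketch.

```python
from collections import defaultdict

def map_phase(line):
    """Map step: emit a (word, 1) pair for every word in one input line."""
    for word in line.lower().split():
        yield word, 1

def shuffle(pairs):
    """Shuffle step: group all emitted values by their key."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped

def reduce_phase(key, values):
    """Reduce step: sum the counts collected for one word."""
    return key, sum(values)

def word_count(lines):
    pairs = (pair for line in lines for pair in map_phase(line))
    grouped = shuffle(pairs)
    return dict(reduce_phase(key, values) for key, values in grouped.items())

# Hypothetical input; a real job would read its lines from files in HDFS.
sample_lines = ["big data hadoop", "hadoop spark", "big data"]
counts = word_count(sample_lines)
print(counts)
```

In a real cluster the map and reduce steps run in parallel on different machines and the framework performs the shuffle; the sketch only shows how the data flows between the three steps.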

  • Data warehouse basics
  • OLTP vs. OLAP concepts
  • Hive
  • Hive architecture
  • Metastore DB and Metastore service
  • Hive Query Language (HQL)
  • Managed and external tables
  • Partitioning and bucketing
  • Query optimization
  • HiveServer2 (Thrift server)
  • JDBC and ODBC connections to Hive
  • Hive transactions
  • Hive UDFs
  • Working with Avro schemas and the Avro file format
  • Hands-on with multiple real-time datasets

  • Apache Pig
  • Advantages of Pig over MapReduce
  • Pig Latin (scripting language for Pig)
  • Schema and schema-less data in Pig
  • Structured and semi-structured data processing in Pig
  • Pig UDFs
  • HCatalog
  • Pig vs. Hive use case
  • Hands-on: two more examples of daily use-case data analysis in Google
  • Analysis of a date-time dataset

  • Introduction to HBase
  • Basic configuration of HBase
  • Fundamentals of HBase
  • What is NoSQL?
  • HBase data model: table and row, column family and column qualifier, cell and its versioning
  • Categories of NoSQL databases: key-value database, document database, column-family database
  • HBase architecture: HMaster, RegionServers, regions, MemStore, Store
  • SQL vs. NoSQL
  • How HBase differs from an RDBMS
  • HDFS vs. HBase
  • Client-side buffering or bulk uploads
  • Designing HBase tables
  • HBase operations: Get, Scan, Put, Delete
  • Live dataset
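
The data model topics above (row, column family, column qualifier, versioned cell) and the four basic operations can be pictured with a small in-memory Python sketch. This is a toy model, not the HBase client API: the class `TinyTable`, its method signatures, and the `max_versions=3` setting are all assumptions made for illustration, and the delete here simply drops the row.

```python
class TinyTable:
    """Toy model: row key -> (column family, qualifier) -> list of (timestamp, value)."""

    def __init__(self, max_versions=3):  # assumed value, configurable in this sketch
        self.rows = {}
        self.max_versions = max_versions

    def put(self, row, family, qualifier, value, timestamp):
        cell = self.rows.setdefault(row, {}).setdefault((family, qualifier), [])
        cell.append((timestamp, value))
        # Keep the newest versions first and drop the ones beyond the limit.
        cell.sort(key=lambda version: version[0], reverse=True)
        del cell[self.max_versions:]

    def get(self, row, family, qualifier):
        """Return the newest value of one cell, or None if it does not exist."""
        versions = self.rows.get(row, {}).get((family, qualifier), [])
        return versions[0][1] if versions else None

    def scan(self, start_row, stop_row):
        """Yield row keys in sorted order, start inclusive and stop exclusive."""
        for row in sorted(self.rows):
            if start_row <= row < stop_row:
                yield row

    def delete(self, row):
        self.rows.pop(row, None)


table = TinyTable()
table.put("user1", "info", "city", "Mumbai", timestamp=1)
table.put("user1", "info", "city", "Pune", timestamp=2)
table.put("user2", "info", "city", "Delhi", timestamp=1)
print(table.get("user1", "info", "city"))
print(list(table.scan("user1", "user2")))
```

The point of the sketch is that a value is addressed by row key, column family, column qualifier, and timestamp, and that rows are kept in sorted key order, which is what makes range scans cheap.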

  • Sqoop commands
  • Sqoop practical implementation
  • Importing data to HDFS
  • Importing data to Hive
  • Exporting data to an RDBMS
  • Sqoop connectors

  • Flume commands
  • Configuration of source, channel, and sink
  • Fan-out Flume agents
  • How to load data into Hadoop from a web server or other storage
  • How to load streaming Twitter data into HDFS

  • Oozie
  • Action nodes and control flow nodes
  • Designing workflow jobs
  • How to schedule jobs using Oozie
  • How to schedule time-based jobs
  • Oozie configuration file

  • Scala
  • Syntax, data types, and variables
  • Classes and objects
  • Basic types and operations
  • Functional objects
  • Built-in control structures
  • Functions and closures
  • Composition and inheritance
  • Scala's hierarchy
  • Traits
  • Packages and imports
  • Working with lists
  • Collections
  • Abstract members
  • Implicit conversions and parameters
  • For expressions revisited
  • The Scala collections API
  • Extractors
  • Modular programming using objects

  • Spark architecture and Spark APIs
  • Spark components: Spark master, driver, executor, worker
  • Significance of the SparkContext
  • Concept of resilient distributed datasets (RDDs)
  • Properties of RDDs
  • Creating RDDs
  • Transformations in RDDs
  • Actions in RDDs
  • Saving data through RDDs
  • Key-value pair RDDs
  • Invoking the Spark shell
  • Loading a file in the shell
  • Performing basic operations on files in the Spark shell
  • Spark application overview
  • Job scheduling process
  • DAG scheduler
  • RDD graph and lineage
  • Life cycle of a Spark application
  • How to choose between the different persistence levels for caching RDDs
  • Submitting in cluster mode
  • Web UI: application monitoring
  • Important Spark configuration properties
  • Spark SQL overview
  • Spark SQL demo
  • SchemaRDD and DataFrames
  • Joining, filtering, and sorting datasets
  • Spark SQL example program demo and code walkthrough

  • What is Kafka
  • Cluster architecture with hands-on
  • Basic operations
  • Integration with Spark
  • Integration with Camel
  • Additional configuration
  • Security and authentication
  • Apache Kafka with Spring Boot integration
  • Running use case

  • Introduction to and installation of Splunk
  • Play with data and feed the data
  • Searching and reporting
  • Visualizing your data
  • Advanced Splunk concepts

  • Introduction to NoSQL
  • What is NoSQL and No-SQL
  • Data types
  • System setup process
  • MongoDB introduction
  • MongoDB installation
  • Database creation in MongoDB
  • ACID and the CAP theorem
  • What is JSON, and what are its features?
  • Difference between JSON and XML
  • CRUD operations: create, read, update, delete
  • Cassandra introduction
  • Cassandra: different data supports
  • Cassandra: architecture in detail
  • Cassandra's SPOF and replication factor
  • Cassandra: installation and different data types
  • Database creation in Cassandra
  • Table creation in Cassandra
  • Cassandra database and table schema and data
  • Update, delete, and insert data in a Cassandra table
  • Insert data from a file into a Cassandra table
  • Add and delete columns in a Cassandra table
  • Cassandra collections

Get trained in the latest technologies. Click to join demos and batches (Weekdays / Weekends / Online / Offline / Project / Internship).
Office address: 602 & 603, "Fortuna Venture", Pimple Saudagar, Pune-411027
Call: +918668770390
E-mail: info@mapstechhub.com