Hadoop Spark Hive Big Data Admin Class Bootcamp Course NYC
Learn the installation and architecture of Hadoop, Hive, Spark, and other tools. Handle structured and unstructured data.
Course Description
Introduction to the Hadoop Big Data Course
- Introduction to the Course
Top Ubuntu commands
Understand NameNode, DataNode, YARN and Hadoop Infrastructure
Hadoop Install
- Hadoop Installation & HDFS Commands
- Java-based MapReduce
Hadoop 2.7 / 2.8.4
Learn HDFS commands
Setting up Java for MapReduce
Intro to Cloudera Hadoop and studying for Cloudera certification
SQL and NoSQL
- SQL, Hive and Pig Installation (RDBMS world and NoSQL world)
- More Hive and Sqoop (Sqoop and Hive on Cloudera)
- JDBC drivers
- Pig
- Intro to NoSQL, MongoDB, HBase Installation
Understanding different databases
Hive:
- Hive Partitions and Bucketing
- Hive External and Internal Tables
Spark Scala Python
- Spark Installations and Commands
- Spark Scala, Scala Sheets
- Hadoop Streaming Python MapReduce
- PySpark (Python basics), RDDs
Running spark-shell and importing data from CSV files
PySpark: running RDDs
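The Hadoop Streaming Python MapReduce topic above can be previewed without a cluster. The following is a minimal sketch, not course material: the function names `mapper`, `reducer`, and `word_count` are hypothetical, and the in-memory `sorted` call only stands in for the shuffle-and-sort step that Hadoop performs between the two phases. In an actual Hadoop Streaming job the mapper and reducer would normally be separate scripts that read lines from standard input and write tab-separated key/value lines to standard output.

```python
from itertools import groupby
from operator import itemgetter


def mapper(lines):
    """Map phase: emit a (word, 1) pair for every whitespace-separated word."""
    for line in lines:
        for word in line.strip().lower().split():
            yield word, 1


def reducer(pairs):
    """Reduce phase: sum the counts per word. Input must be sorted by word."""
    for word, group in groupby(pairs, key=itemgetter(0)):
        yield word, sum(count for _, count in group)


def word_count(lines):
    """Run map, sort (a stand-in for Hadoop's shuffle), then reduce."""
    return dict(reducer(sorted(mapper(lines))))


if __name__ == "__main__":
    # Hypothetical sample input; any iterable of text lines works.
    sample = ["big data big cluster", "data node"]
    for word, total in sorted(word_count(sample).items()):
        print(f"{word}\t{total}")
```

The same map-then-reduce shape carries over to an RDD word count in PySpark, where the grouping and summing are handled by the framework instead of `sorted` and `groupby`.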
Midterm Projects
- Pull data from an online CSV file and move it to Hive using hive import
- Pull data in spark-shell and run MapReduce on the Fox News front page
- Create data in MySQL and move it to HDFS using Sqoop
- Using Jupyter (Anaconda) and SparkContext, run a count on a file containing the Fox News front page
- Save raw data using comma, space, tab, and pipe delimiters, and load it into SparkContext and spark-shell
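For the delimiter project above, the saving and re-reading step can be tried in plain Python before involving Spark. This is a small sketch using only the standard `csv` module. The helper names `write_delimited` and `read_delimited` and the sample rows are assumptions made for illustration, and the Spark loading step is left out.

```python
import csv
import io

# The four delimiters named in the project description.
DELIMITERS = {"comma": ",", "space": " ", "tab": "\t", "pipe": "|"}


def write_delimited(rows, delimiter):
    """Serialize rows to text, quoting any field that contains the delimiter."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, delimiter=delimiter, lineterminator="\n")
    writer.writerows(rows)
    return buffer.getvalue()


def read_delimited(text, delimiter):
    """Parse delimited text back into a list of rows (lists of strings)."""
    return list(csv.reader(io.StringIO(text), delimiter=delimiter))


if __name__ == "__main__":
    # Hypothetical sample data.
    rows = [["id", "name"], ["1", "hadoop"], ["2", "spark"]]
    for label, delimiter in DELIMITERS.items():
        text = write_delimited(rows, delimiter)
        # Each format should round-trip back to the original rows.
        assert read_delimited(text, delimiter) == rows
        print(label, repr(text.splitlines()[0]))
```

Going through `csv.writer` rather than joining strings by hand matters mostly for the space delimiter, since any field that itself contains a space is quoted and still parses back as one field.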
Broadcasting Data: Streams of Data
Kafka Message Broadcasting
Free