Introduction to PySpark: Apache PySpark Programming
Description
Introduction to PySpark
What is PySpark?
PySpark Job Roles and Opportunities
PySpark Course Content
PySpark Developer vs PySpark Machine Learning Developer
Why Spark Was Developed
What is PySpark? Spark vs. Hadoop
Spark Main Components
PySpark Deployment Modes
Introduction to PySpark Programming
PySpark Installations on Windows Operating System
PySpark Programming Introduction
Different Ways of PySpark Programming with Examples (a minimal example follows this list)
First PySpark Program using PySpark Shell
First PySpark Program using Interactive Mode
First PySpark Program using Script Mode
First PySpark Program using Jupyter Notebook
First PySpark Program using PyCharm IDE
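As a preview of the "First PySpark Program" topics above, here is a minimal sketch of such a program; the file name, application name, and sample data are assumptions made purely for illustration:

    # first_app.py -- a minimal first PySpark program (illustrative sketch)
    from pyspark.sql import SparkSession

    # Create (or reuse) a SparkSession, the entry point of a PySpark application
    spark = SparkSession.builder.appName("FirstPySparkApp").getOrCreate()

    # Build an RDD from a small in-memory collection
    words = spark.sparkContext.parallelize(["spark", "pyspark", "hadoop", "hive"])

    # map() is a lazy transformation; collect() is an action that triggers execution
    pairs = words.map(lambda w: (w, len(w)))
    print(pairs.collect())

    spark.stop()

In script mode this file would be submitted with spark-submit first_app.py; in the PySpark shell the builder line can be skipped, since a ready-made SparkSession named spark (and a SparkContext named sc) is already available there.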
Prerequisites:
- Python for Data Engineering
- SQL for Data Engineering (SQL with MySQL, Oracle, PostgreSQL, or any other RDBMS)
- Linux Essentials / Linux Basics / Linux Foundation
- Fundamentals of any Cloud Computing platform (AWS, Azure, or GCP)
PySpark Programming / PySpark Developer:
- PySpark Foundation
- PySpark Core Programming – RDD Programming with Transformations & Actions (see the sketch after this list)
- PySpark SQL – DataFrames, Tables & Datasets
- PySpark Streaming – Streaming + Live Analytics
- PySpark Cluster Setup on AWS Cloud (Java, Scala, Python, Jupyter Notebook, MySQL, Hadoop, Apache Hive, Apache Kafka, Apache Cassandra, Apache Spark)
- PySpark Integrations
PySpark Integration with Apache Hadoop
PySpark Integration with Apache Hive
PySpark Integration with RDBMS (MySQL) – see the JDBC sketch after this list
PySpark Integration with Apache Kafka
PySpark Integration with Apache Cassandra
PySpark Integration with Different Cloud Services
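To give a flavor of the RDD Programming and PySpark SQL modules listed above, here is a minimal sketch; the column names and sample rows are assumptions used only for illustration:

    from pyspark.sql import SparkSession

    spark = SparkSession.builder.appName("CoreAndSqlSketch").getOrCreate()
    sc = spark.sparkContext

    # RDD programming: transformations are lazy, actions trigger the computation
    nums = sc.parallelize([1, 2, 3, 4, 5, 6])
    evens_squared = nums.filter(lambda n: n % 2 == 0).map(lambda n: n * n)  # transformations
    print(evens_squared.collect())  # action -> [4, 16, 36]

    # PySpark SQL: DataFrames and temporary views queried with SQL
    df = spark.createDataFrame(
        [("alice", 34), ("bob", 29), ("carol", 41)],
        ["name", "age"],
    )
    df.createOrReplaceTempView("people")
    spark.sql("SELECT name FROM people WHERE age > 30").show()

    spark.stop()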
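Likewise, a sketch of the RDBMS (MySQL) integration using Spark's built-in JDBC data source; the host, database, table, credentials, and connector version below are placeholders, and the MySQL JDBC driver must actually be available to Spark (for example via spark.jars.packages or --jars):

    from pyspark.sql import SparkSession

    # The connector coordinates are an example; match them to the driver version you use
    spark = (
        SparkSession.builder
        .appName("MySQLIntegrationSketch")
        .config("spark.jars.packages", "mysql:mysql-connector-java:8.0.33")
        .getOrCreate()
    )

    # Placeholder connection details -- replace with your own host, database, and credentials
    employees = (
        spark.read.format("jdbc")
        .option("url", "jdbc:mysql://localhost:3306/testdb")
        .option("dbtable", "employees")
        .option("user", "spark_user")
        .option("password", "spark_password")
        .option("driver", "com.mysql.cj.jdbc.Driver")
        .load()
    )

    employees.show(5)
    spark.stop()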
Target Audience:
- Anyone who wants to become a Data Engineer (Apache, AWS, GCP, or Azure Data Engineer)
- Freshers as well as experienced professionals