Getting Started with Apache Flink: An Overview of Apache Flink
Apache Flink is an open-source, distributed framework for stateful stream and batch processing. It is available in commercial offerings from vendors such as Cloudera and Amazon. The examples provided in this course have been developed using Cloudera Apache Flink. This course is intended for those who want to learn Apache Flink.
Apache Flink is used to process huge volumes of data at high speed, and its Table API and SQL support let you apply traditional SQL knowledge.
To make the most of this course, you should have a good understanding of the basics of Hadoop and HDFS commands. It is also recommended to have a basic knowledge of SQL before going through this course.
Apache Flink is a next-generation Big Data tool, sometimes described as the "4G of Big Data".
It is a true stream processing framework: it processes events one at a time as they arrive rather than cutting the stream into micro-batches.
Flink’s kernel (core) is a streaming runtime that also provides distributed processing, fault tolerance, etc.
Flink processes events with consistently high throughput and low latency.
This makes it a large-scale data processing framework that can keep up with data generated at very high velocity.
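To make the latency difference between event-at-a-time processing and micro-batching concrete, here is a minimal sketch in plain Python. It is a toy model, not Flink code and not the Flink API: the `Event` class, both functions, the 1 ms processing cost, and the 500 ms batch interval are all assumptions chosen for illustration, and the model ignores queuing and network delays.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Event:
    arrival_ms: int  # hypothetical time at which the event reaches the system
    value: int


def per_event_latencies(events: List[Event], cost_ms: int = 1) -> List[int]:
    """Event-at-a-time model: each event is handled as soon as it arrives.

    Simplifying assumption: the operator is never busy, so the latency of
    every event is just the processing cost.
    """
    return [cost_ms for _ in events]


def micro_batch_latencies(
    events: List[Event], batch_interval_ms: int, cost_ms: int = 1
) -> List[int]:
    """Micro-batch model: an event waits until its batch interval closes."""
    latencies = []
    for e in events:
        # End of the fixed interval that contains this event's arrival time.
        batch_end = (e.arrival_ms // batch_interval_ms + 1) * batch_interval_ms
        latencies.append(batch_end - e.arrival_ms + cost_ms)
    return latencies


events = [Event(arrival_ms=t, value=t) for t in (0, 120, 250, 499, 500, 990)]
streaming = per_event_latencies(events)
batched = micro_batch_latencies(events, batch_interval_ms=500)

print("event-at-a-time:", streaming)  # [1, 1, 1, 1, 1, 1]
print("micro-batch:    ", batched)    # [501, 381, 251, 2, 501, 11]
```

In this model, an event that arrives just after a batch boundary waits almost the full interval, while the event-at-a-time path has a latency that does not depend on arrival time. Real systems add queuing, checkpointing, and network costs on top of this.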
Flink is an alternative to MapReduce and is often cited as processing data more than 100 times faster, although the actual speedup depends on the workload. It is independent of Hadoop, but it can use HDFS to read, write, and store data. Flink does not provide its own data storage system; it reads its input from distributed storage.