Big data refers to the use of advanced data analytics methods to extract value from large data sets, both structured and unstructured. As such data sets become available, tools are needed to analyse them computationally and reveal the patterns, trends, and associations that support meaningful decisions.
Module 1 - Introduction to Big Data
Rise of Big Data, Hadoop vs traditional systems, Hadoop Master-Slave architecture, HDFS Architecture, NameNode, DataNode, Secondary NameNode, JobTracker, TaskTracker.
Module 2 - HDFS and MapReduce Architecture
Core components of Hadoop, Anatomy of read and write operations on HDFS, MapReduce architecture flow, JobTracker and TaskTracker.
Module 3 - Hadoop Configuration
Hadoop modes, Hadoop terminal commands, Cluster configuration, Web ports, Hadoop configuration files, Reporting, Recovery, MapReduce in action.