A software framework that facilitates the secure and efficient distributed storage and analysis of large amounts of data from varied sources, such as databases, web servers, and file systems.
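To make the storage side concrete, here is a minimal Java sketch, assuming a configured Hadoop client on the classpath, that reads a file back out of the cluster's distributed file system through the standard FileSystem API (the path /data/web/access.log is a hypothetical example):

    import java.io.InputStream;
    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.io.IOUtils;

    // Minimal sketch: stream one file from the cluster's distributed
    // file system to stdout.
    public class CatHdfsFile {
        public static void main(String[] args) throws Exception {
            Configuration conf = new Configuration();  // picks up core-site.xml from the classpath
            FileSystem fs = FileSystem.get(conf);      // the cluster's default file system
            // Hypothetical input path; substitute a real file on your cluster.
            try (InputStream in = fs.open(new Path("/data/web/access.log"))) {
                IOUtils.copyBytes(in, System.out, 4096, false); // copy contents to stdout
            }
        }
    }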
A collection of interconnected databases that allows data to be stored, accessed, and analyzed. It includes three main components: a data warehouse, an analytical framework, and an integration layer.
A resource-management and job-scheduling daemon: it decides which jobs receive which cluster resources, and when those resources become available.
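As one hedged illustration, a client can declare per-task resource needs through the stock Hadoop configuration properties below; the scheduling daemon then grants containers as capacity becomes available (the job itself is a placeholder, with setup elided):

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.mapreduce.Job;

    // Minimal sketch: ask the scheduler for specific memory and CPU per task.
    public class ResourceRequestSketch {
        public static void main(String[] args) throws Exception {
            Configuration conf = new Configuration();
            conf.setInt("mapreduce.map.memory.mb", 2048); // request 2 GB per map container
            conf.setInt("mapreduce.map.cpu.vcores", 2);   // request 2 virtual cores per map task
            // Placeholder job; mapper, reducer, and I/O paths are elided here.
            Job job = Job.getInstance(conf, "resource-request-sketch");
            // On submit(), the daemon decides when and where these containers run.
        }
    }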
A software framework that facilitates the reliable and efficient processing of large amounts of data. It is well suited to data-intensive, real-time, and streaming applications.
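If this definition refers to a MapReduce-style processing framework, the canonical example is word counting; here is a minimal sketch of the map and reduce steps (driver setup omitted):

    import java.io.IOException;
    import java.util.StringTokenizer;
    import org.apache.hadoop.io.IntWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Mapper;
    import org.apache.hadoop.mapreduce.Reducer;

    // Minimal word-count sketch: the map step emits (word, 1) pairs and the
    // reduce step sums the counts for each word across the whole data set.
    public class WordCount {
        public static class TokenizerMapper
                extends Mapper<Object, Text, Text, IntWritable> {
            private static final IntWritable ONE = new IntWritable(1);
            private final Text word = new Text();

            public void map(Object key, Text value, Context context)
                    throws IOException, InterruptedException {
                StringTokenizer itr = new StringTokenizer(value.toString());
                while (itr.hasMoreTokens()) {
                    word.set(itr.nextToken());
                    context.write(word, ONE); // one count per occurrence
                }
            }
        }

        public static class IntSumReducer
                extends Reducer<Text, IntWritable, Text, IntWritable> {
            public void reduce(Text key, Iterable<IntWritable> values, Context context)
                    throws IOException, InterruptedException {
                int sum = 0;
                for (IntWritable v : values) sum += v.get();
                context.write(key, new IntWritable(sum)); // total count per word
            }
        }
    }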
Hadoop Common is the set of Java libraries, files, and scripts needed by all Hadoop cluster components; it is used by HDFS, YARN, and MapReduce to run the cluster.
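A small sketch of how these shared libraries appear in practice: the Configuration and Path classes below live in Hadoop Common and are the same classes HDFS, YARN, and MapReduce code builds on (the path /tmp/example is hypothetical):

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.Path;

    // Minimal sketch: read a shared cluster setting through Hadoop Common.
    public class CommonLibsSketch {
        public static void main(String[] args) {
            Configuration conf = new Configuration();            // loads core-site.xml from the classpath
            String fsUri = conf.get("fs.defaultFS", "file:///"); // cluster-wide file system setting
            Path input = new Path("/tmp/example");               // hypothetical path
            System.out.println(fsUri + " -> " + input);
        }
    }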