Big Data
Distributed Framework
Distributed File System
Master/Slave Architecture
Distributed Processing
Batch processing
Real-time processing
Map/Reduce
Mapping
Reducing
Resource management
Job Scheduling
Resource allocation
Job management
Job execution
Fail-safe measure
Block Replica
Rack Awareness
4Vs
Principle of Data Locality
Tools
Hadoop
YARN
MapReduce
HDFS
Name node
Secondary name node
Data node
Standby name node
Hive
Tables
Internal
External
Sqoop
Spark
Data type
Structured
Unstructured
Semi-structured
Concept
Programming
Programming
Programming operations
Data Ingestion
Export
Import
Data commands
Query Optimization
Partitions
Buckets
Process Cycle
Data Storage