HDFS Process, MapReduce - Coggle Diagram
HDFS Process
Data Storage
HDFS splits large files into blocks (typically 128 MB). These blocks are distributed across multiple nodes (servers) in a cluster.
Data Replication: Each block is replicated on multiple nodes (typically 3 copies) to ensure data reliability and fault tolerance.
Write Process: the client contacts the NameNode (master), which directs DataNodes (slaves) to store the blocks.
Read Process: the client asks the NameNode for the block locations, then reads the blocks in parallel from the DataNodes.
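The splitting and replication described above can be sketched as follows. The 128 MB block size and replication factor of 3 are the defaults mentioned above; the DataNode names and the round-robin placement are illustrative assumptions (real HDFS uses a rack-aware placement policy).

```python
# Illustrative sketch of HDFS block splitting and replica placement.
# Round-robin placement is a simplification of HDFS's rack-aware policy.

BLOCK_SIZE = 128 * 1024 * 1024  # 128 MB, the typical HDFS block size
REPLICATION = 3                 # typical replication factor

def split_into_blocks(file_size):
    """Return the number of blocks a file of file_size bytes needs."""
    return (file_size + BLOCK_SIZE - 1) // BLOCK_SIZE

def place_replicas(num_blocks, datanodes):
    """Assign each block REPLICATION replicas on distinct DataNodes."""
    placement = {}
    for b in range(num_blocks):
        placement[b] = [datanodes[(b + i) % len(datanodes)]
                        for i in range(REPLICATION)]
    return placement

nodes = ["dn1", "dn2", "dn3", "dn4"]           # hypothetical DataNodes
blocks = split_into_blocks(300 * 1024 * 1024)  # a 300 MB file -> 3 blocks
print(blocks)                                  # 3
print(place_replicas(blocks, nodes))
```

A 300 MB file needs three 128 MB blocks (the last one partially filled), and each block ends up on three distinct nodes, so losing any single node never loses data.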
MapReduce
Explanation
After the Map phase, the system shuffles the data: it ensures that all records with the same key go to the same Reducer.
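The shuffle step can be sketched as partitioning by key. The hash-based routing below is an assumed (but common) way to decide which reducer receives a key; it guarantees that all records with the same key land on the same reducer.

```python
from collections import defaultdict

def shuffle(mapped_pairs, num_reducers):
    """Route each (key, value) pair to a reducer by hashing the key,
    so all pairs with the same key reach the same reducer."""
    partitions = [defaultdict(list) for _ in range(num_reducers)]
    for key, value in mapped_pairs:
        partitions[hash(key) % num_reducers][key].append(value)
    return partitions

pairs = [("hadoop", 1), ("hdfs", 1), ("hadoop", 1)]
parts = shuffle(pairs, 2)
# every key's values end up grouped in exactly one partition
```

Because the partition is a pure function of the key, the two ("hadoop", 1) pairs are guaranteed to arrive at the same reducer, already grouped as [1, 1].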
map
Function map(String name, String document):
    for each word in document:
        emit(word, 1)
reduce
Reduce collects the values grouped under each key, processes them (for word count, summing the 1s), and produces the final result.
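In the same style as the map pseudocode above, a word-count reduce might look like this (a sketch, not a Hadoop API):

Function reduce(String word, Iterator values):
    sum = 0
    for each v in values:
        sum += v
    emit(word, sum)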
task
Task: a unit of work, either a map task (which takes the input data for analysis) or a reduce task (which takes the mappers' results and combines them into the final result).
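The whole map / shuffle / reduce pipeline described above can be simulated in a few lines of Python. The function names below are illustrative, not the Hadoop API.

```python
from collections import defaultdict

def map_task(document):
    """Map task: takes input data and emits (word, 1) per word."""
    return [(word, 1) for word in document.split()]

def reduce_task(word, counts):
    """Reduce task: combines the mappers' results into a final total."""
    return (word, sum(counts))

def run_job(documents):
    # Map phase: each document is handled by one map task.
    mapped = [pair for doc in documents for pair in map_task(doc)]
    # Shuffle: group all values with the same key together.
    groups = defaultdict(list)
    for word, count in mapped:
        groups[word].append(count)
    # Reduce phase: one reduce call per key.
    return dict(reduce_task(w, c) for w, c in groups.items())

result = run_job(["big data big cluster", "big data"])
print(result)  # {'big': 3, 'data': 2, 'cluster': 1}
```

In a real cluster the map tasks run in parallel on the DataNodes that hold the input blocks, and the shuffle moves data over the network; this sketch only shows the dataflow.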