Big Data Concepts and Tools
What is Big Data?
exponential growth in the availability and use of information, both structured and unstructured
massive volumes of data
Characteristics of Big Data
Volume (scale)
data volume increasing exponentially
Variety (complexity)
structured data (spreadsheet data)
unstructured data (video/images)
semi-structured data (email, documents)
big public data (weather/finance)
Velocity (speed)
data is generated and processed fast
Variability, value, veracity
Fundamentals of Big Data Analytics
regardless of its size, type, or speed, Big Data is worthless by itself
Big Data + "big" analytics = value
Big Data brought about big challenges
a new breed of technologies is needed to effectively and efficiently capture, store, and analyze Big Data
Limitations of Data Warehouses
Fixed Schema
Scalability
unable to scale to the huge volumes of data from new data sources
Speed
unable to handle the speed at which Big Data arrives
Others
unable to handle sophisticated processing
unable to perform queries on Big Data efficiently
Challenges of Big Data Analytics
Data Volume
ability to capture, store, and process the huge volume of data in a timely manner
Data Integration
ability to combine data quickly and at a reasonable cost
Processing Capabilities
ability to process the data quickly, as it is captured (e.g., stream analytics)
Data Governance
Skill Availability
shortage of data scientists
Solution Cost
Return on Investment
Critical Success Factors for Big Data Analytics
a clear business need
strong, committed sponsorship
alignment between the business and IT strategy
a fact-based decision-making culture
a strong data infrastructure
the right analytics tools
personnel with advanced analytical skills
High-Performance Computing for Big Data
In-memory analytics
In-database analytics
Grid computing & MPP (massively parallel processing)
Appliances
Popular Big Data Technologies
Hadoop
"how to process big data with reasonable cost and time?"
an open-source framework for storing and analyzing massive amounts of distributed, semi-structured, and unstructured data
runs on inexpensive commodity hardware
Hadoop + MapReduce = Big Data core technology
How does Hadoop work?
consists of the Hadoop Distributed File System (HDFS) and MapReduce
data is broken up into "parts", which are then loaded into a file system (cluster) made up of multiple nodes
each "part" is
replicated multiple times
and loaded into the file system for replication and failsafe procesing
Jobs are distributed to clients and, once completed, the results are collected and aggregated using MapReduce
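A minimal Python sketch of the flow just described: break the data into "parts", replicate each part across nodes, process the parts, and aggregate the results. The node names, block size, and stand-in compute task are illustrative assumptions; HDFS itself uses large blocks (e.g., 128 MB), replicated three times by default.

```python
# Conceptual sketch of the Hadoop flow: split -> replicate -> process -> aggregate.
# Illustrative only: node names, block size, and the compute step are assumptions.

BLOCK_SIZE = 16          # bytes per "part" (HDFS uses far larger blocks)
REPLICATION = 3          # each part is stored on 3 nodes (HDFS's usual default)
NODES = ["node1", "node2", "node3", "node4"]

def split_into_parts(data: bytes) -> list[bytes]:
    """Break the input into fixed-size parts (HDFS blocks)."""
    return [data[i:i + BLOCK_SIZE] for i in range(0, len(data), BLOCK_SIZE)]

def replicate(parts: list[bytes]) -> dict[int, list[str]]:
    """Assign each part to REPLICATION different nodes for failsafe processing."""
    return {i: [NODES[(i + r) % len(NODES)] for r in range(REPLICATION)]
            for i in range(len(parts))}

def process_part(part: bytes) -> int:
    """Stand-in compute task: count the bytes in one part."""
    return len(part)

data = b"big data is worthless without big analytics" * 4
parts = split_into_parts(data)
placement = replicate(parts)
results = [process_part(p) for p in parts]     # tasks would run on the nodes holding each part
print(len(parts), "parts, total bytes processed:", sum(results))   # aggregated result
print("part 0 stored on:", placement[0])
```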
Hadoop Cluster
consists of two types of nodes
Master
Name Node
keeps track of the files and directories
provides information on where in the cluster data is stored and whether any nodes have failed
Job Tracker
initiates and coordinates jobs, or the processing of data, and dispatches compute tasks to the Task Tracker
Slave
Data Node
referred to as a "storage node"
where data is stored
Task Tracker
known as "compute node"
whereby data is processed
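A toy Python sketch of how these master and slave roles fit together, assuming a tiny in-memory "cluster": a Name Node-style index maps each file's blocks to the Data Nodes holding them (and knows which nodes have failed), while a Job Tracker-style dispatcher sends one compute task per block to a Task Tracker on a healthy node. All names and data are made up.

```python
# Toy model of the master/slave roles described above (names are illustrative).

# Name Node: tracks files/directories and where each block is stored in the cluster.
name_node = {
    "/logs/2024.txt": {"blk_0": ["dn1", "dn2"], "blk_1": ["dn2", "dn3"]},
}
alive = {"dn1": True, "dn2": True, "dn3": False}   # Name Node also knows which nodes failed

# Data Nodes ("storage nodes"): hold the actual block contents.
data_nodes = {
    "dn1": {"blk_0": "error warn info"},
    "dn2": {"blk_0": "error warn info", "blk_1": "info error error"},
    "dn3": {"blk_1": "info error error"},
}

def task_tracker(node: str, block: str) -> int:
    """Task Tracker ("compute node"): runs a task on the node where the data lives."""
    return data_nodes[node][block].split().count("error")

def job_tracker(path: str) -> int:
    """Job Tracker: initiates the job, dispatches one task per block to a healthy
    replica, then aggregates the results."""
    total = 0
    for block, replicas in name_node[path].items():
        node = next(n for n in replicas if alive[n])   # skip failed nodes
        total += task_tracker(node, block)
    return total

print(job_tracker("/logs/2024.txt"))   # -> 3 "error" entries across both blocks
```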
MapReduce
goal is to achieve high performance with "simple" computers
good at processing and analyzing large volumes of multi-structured data in a timely manner
distributes the processing of very large multi-structured data files across a large cluster of ordinary machines/processors
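A minimal pure-Python word-count sketch of the MapReduce pattern itself: map tasks emit (key, value) pairs from each part of the data, the shuffle groups the pairs by key, and reduce tasks aggregate each group. Hadoop would run the equivalent logic (usually written in Java) across many machines; the toy input here is an assumption.

```python
from collections import defaultdict

def map_phase(part: str):
    """Map: emit a (word, 1) pair for every word in one part of the data."""
    for word in part.split():
        yield word, 1

def shuffle(pairs):
    """Shuffle: group all emitted values by key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups

def reduce_phase(key, values):
    """Reduce: aggregate the values for one key (here, sum the counts)."""
    return key, sum(values)

# Each "part" would be an HDFS block processed on a different machine.
parts = ["big data needs big analytics", "big analytics creates value"]
pairs = [pair for part in parts for pair in map_phase(part)]
counts = dict(reduce_phase(k, v) for k, v in shuffle(pairs).items())
print(counts)   # {'big': 3, 'data': 1, 'needs': 1, 'analytics': 2, 'creates': 1, 'value': 1}
```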
NoSQL
a new style of database (DBMS)
processes large volumes of multi-structured data
works in conjunction with Hadoop
serves discrete data stored among large volumes of multi-structured data to end-users and Big Data applications
e.g. Cassandra, MongoDB, CouchDB, HBase
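A short sketch of the NoSQL access style using MongoDB's Python driver (pymongo); it assumes a MongoDB server on localhost, and the database, collection, and field names are made up.

```python
from pymongo import MongoClient

# Assumes a MongoDB server on localhost; database/collection names are hypothetical.
client = MongoClient("mongodb://localhost:27017")
events = client["bigdata_demo"]["click_events"]

# Multi-structured documents: records need not share a fixed schema.
events.insert_one({"user": "u1", "page": "/home", "device": {"type": "mobile"}})
events.insert_one({"user": "u2", "page": "/cart", "referrer": "email-campaign"})

# Serve discrete records quickly to an application, rather than scanning everything.
events.create_index("user")
for doc in events.find({"user": "u1"}):
    print(doc["page"])
```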
HIVE
Hadoop-based data warehousing-like framework
allows users to write queries in an SQL-like language called HiveQL, which are then converted into MapReduce jobs
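A brief sketch of querying Hive from Python, assuming a reachable HiveServer2 endpoint and the third-party pyhive package; the web_logs table and its columns are hypothetical. The HiveQL itself reads like ordinary SQL, and Hive translates it into MapReduce jobs over the data in HDFS.

```python
from pyhive import hive   # third-party package; assumes HiveServer2 is running

conn = hive.Connection(host="localhost", port=10000, database="default")
cursor = conn.cursor()

# HiveQL looks like SQL; Hive compiles it into MapReduce jobs behind the scenes.
cursor.execute("""
    SELECT page, COUNT(*) AS visits
    FROM web_logs            -- hypothetical table over files stored in HDFS
    GROUP BY page
    ORDER BY visits DESC
    LIMIT 10
""")
for page, visits in cursor.fetchall():
    print(page, visits)
```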
PIG
Hadoop-based query language
easy to learn
adept at very long data pipelines (a known limitation of SQL)
Coexistence of Hadoop and DW
use Hadoop for storing and archiving multi-structured data
use Hadoop for filtering, transforming, and consolidating multi-structured data
use Hadoop to analyze large volumes of multi-structured data and publish the analytical results
use a relational DBMS that provides MapReduce capabilities as an investigative computing platform
use a front-end query tool to access and analyze data
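A minimal sketch of that hand-off, assuming a Hadoop job has already filtered and aggregated raw multi-structured files into simple (page, visits) rows: the analytical results are published into a relational DBMS (SQLite as a stand-in) where a front-end query tool can reach them with ordinary SQL.

```python
import sqlite3

# Pretend these aggregates were produced by a Hadoop job over raw click-stream files.
hadoop_results = [("/home", 1200), ("/cart", 430), ("/checkout", 95)]

# Publish the analytical results into a relational DBMS for BI / front-end tools.
dw = sqlite3.connect("warehouse.db")
dw.execute("CREATE TABLE IF NOT EXISTS page_visits (page TEXT, visits INTEGER)")
dw.executemany("INSERT INTO page_visits VALUES (?, ?)", hadoop_results)
dw.commit()

# A front-end query tool would now run ordinary SQL against the warehouse.
for row in dw.execute("SELECT page, visits FROM page_visits ORDER BY visits DESC"):
    print(row)
```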
Stream Analytics
extracting actionable information from continuously flowing or streaming data sources
the store-everything approach becomes infeasible as the number of data sources increases
need for critical event processing: complex pattern variations need to be detected and acted on as soon as they happen
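A small Python sketch of critical event processing, assuming the stream arrives as a simple iterator of sensor readings: only a sliding window is kept in memory (no store-everything), and an action fires the moment the pattern, here three consecutive readings above a threshold, is detected.

```python
from collections import deque

THRESHOLD = 90          # illustrative limit for a sensor reading
WINDOW = 3              # pattern: three consecutive readings above the limit

def act(reading_index: int) -> None:
    """Placeholder for the real-time action (alert, block transaction, etc.)."""
    print(f"critical event at reading {reading_index}: acting immediately")

def process_stream(readings) -> None:
    window = deque(maxlen=WINDOW)      # only the sliding window is stored, not the stream
    for i, value in enumerate(readings):
        window.append(value)
        if len(window) == WINDOW and all(v > THRESHOLD for v in window):
            act(i)

# Simulated continuously flowing data source (e.g., a traffic or health sensor).
process_stream([72, 85, 91, 95, 97, 60, 92, 94, 96])
```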
Stream Analytics Applications
E-Commerce
use of click-stream data to make product recommendations and bundles
Law Enforcement and Cyber Security
use video surveillance and face recognition
Financial Services
use transactional data to detect fraud and illegal activities
Health Services
use medical data to detect anomalies, improving patient conditions and saving lives
Government
use data from traffic sensors to ease the pain caused by traffic congestion
Big Data Vendors
Cloudera
MapR
Hortonworks
IBM
Oracle
Google