Hadoop Ecosystem (
Managers / task planners,
Suppliers,
Engines,
SQL tools
for analysis of historical records,
Data Types,
Hadoop Distributed File System
A distributed file system: files are split into blocks that are replicated across the nodes of the cluster,
Advanced Analytics,
NoSQL: HBase
Allows working with individual records in real time.
New records are added to a sorted structure in memory, and only when that structure reaches a size limit is it written to disk,
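The write path described above can be illustrated with a small toy model. This is only a sketch, not HBase code: the class name `MemStore`, the `flush_threshold` parameter (counted in records rather than bytes), and the in-memory list standing in for disk files are all assumptions made for illustration.

```python
import bisect


class MemStore:
    """Toy model of a sorted in-memory write buffer with a size-triggered flush."""

    def __init__(self, flush_threshold):
        # Hypothetical limit, counted in records to keep the sketch simple.
        self.flush_threshold = flush_threshold
        self.keys = []           # row keys, always kept in sorted order
        self.values = {}         # row key -> latest value
        self.flushed_files = []  # each flush produces one immutable sorted "file"

    def put(self, key, value):
        # Insert new keys at their sorted position; overwrite existing ones.
        if key not in self.values:
            bisect.insort(self.keys, key)
        self.values[key] = value
        # Write to "disk" only once the buffer reaches the limit.
        if len(self.keys) >= self.flush_threshold:
            self.flush()

    def flush(self):
        if not self.keys:
            return
        self.flushed_files.append([(k, self.values[k]) for k in self.keys])
        self.keys = []
        self.values = {}


store = MemStore(flush_threshold=3)
for key, value in [("row3", "c"), ("row1", "a"), ("row2", "b"), ("row0", "z")]:
    store.put(key, value)
```

After the third `put` the buffer is flushed as one sorted file, and the fourth record stays in memory until the limit is reached again.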
Import: Apache Kafka
Writes messages to disk immediately and keeps them for a configured number of days. Scales easily.
- Kafka does not lie about its reliability
- consumer groups do not work (all messages are delivered to all consumers)
- the server does not store offsets for consumers
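The retention and offset behaviour in these notes can be sketched as a toy append-only log. This is not the Kafka client API: `RetainedLog`, `retention_seconds`, and the explicit `now` arguments are hypothetical names chosen for the example, and the consumer keeps its own offset, as the notes describe.

```python
class RetainedLog:
    """Toy append-only log with time-based retention (not the Kafka API)."""

    def __init__(self, retention_seconds):
        self.retention_seconds = retention_seconds
        self.records = []     # list of (offset, timestamp, message)
        self.next_offset = 0

    def append(self, message, now):
        # Every message is stored immediately and gets the next offset.
        offset = self.next_offset
        self.records.append((offset, now, message))
        self.next_offset += 1
        return offset

    def expire(self, now):
        # Drop records older than the configured retention period.
        cutoff = now - self.retention_seconds
        self.records = [r for r in self.records if r[1] >= cutoff]

    def read_from(self, offset):
        # Return (offset, message) pairs starting at the requested offset.
        return [(o, m) for (o, _, m) in self.records if o >= offset]


DAY = 24 * 3600
log = RetainedLog(retention_seconds=7 * DAY)
log.append("a", now=0)
log.append("b", now=100)
log.append("c", now=8 * DAY)

# The consumer, not the log, remembers where it stopped reading.
consumer_offset = 1
batch = log.read_from(consumer_offset)
consumer_offset = batch[-1][0] + 1

log.expire(now=8 * DAY)
remaining = log.read_from(0)
```

Because the position lives on the consumer side, a consumer can re-read old messages by resetting its offset, as long as they are still within the retention period.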
,
Spark Streaming
Can take data from Kafka, ZeroMQ, sockets, Twitter, etc.
DStream interface: a sequence of small RDDs, each collected over a fixed time interval)
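The micro-batch idea behind DStream can be illustrated without Spark. The sketch below is plain Python, not the Spark Streaming API: `micro_batches` and `batch_interval` are hypothetical names, and each inner list merely stands in for one small RDD.

```python
from collections import defaultdict


def micro_batches(events, batch_interval):
    """Group (timestamp, value) events into consecutive fixed-length intervals.

    Intervals that received no events are skipped in this sketch.
    """
    buckets = defaultdict(list)
    for timestamp, value in events:
        # Events whose timestamps fall in the same interval share a batch.
        buckets[int(timestamp // batch_interval)].append(value)
    return [buckets[i] for i in sorted(buckets)]


events = [(0.2, "a"), (0.9, "b"), (1.1, "c"), (2.5, "d"), (2.7, "e")]
batches = micro_batches(events, batch_interval=1.0)

# The same operation is applied to every batch, one batch at a time.
batch_sizes = [len(batch) for batch in batches]
```

Each batch is processed independently with ordinary batch operations, which is what lets a streaming job reuse batch-style code.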