Big Data Concepts and Tools (Keys to Success with Big Data Analytics…
Big Data Concepts and Tools
Characteristics of Big Data
Variety
Structured Data, Spreadsheet Data
Semi-structured data
Unstructured data
Online Data Analytics
Velocity
data is generated fast and needs to be processed fast
Volume
Data volume is increasing exponentially
Other Vs of Big Data
Variability
data flows can be inconsistent with periodic peaks
Value
provides business value
Veracity
accuracy, quality, truthfulness, trustworthiness
Limitations of Data Warehouse/Relational Database
Scalability
unable to handle huge amounts of new/contemporary data sources
Speed
Unable to handle speed at which big data is arriving
Schema (fixed)
Others
Unable to perform queries on big data efficiently
Challenges of Big Data Analytics
Data governance
Skill availability
Processing capabilities
Data integration
Solution cost
Data volume
Keys to Success with Big Data Analytics
Alignment between the business & IT strategy
A fact-based decision-making culture
Strong, committed sponsorship
A strong data infrastructure
A clear business need
The right analytics tools
Personnel with advanced analytical skills
High-Performance Computing for Big Data
In-database analytics
Grid computing & MPP
In-memory analytics
Appliances
Big Data Technologies: Hadoop
open source framework for storing and analyzing massive amounts of distributed, semi-structured and unstructured data
Open source
MapReduce + Hadoop = Big Data core technology
Demystifying Facts
an ecosystem, not a single product
a file system, not a DBMS
open source but available from vendors
Hadoop and MapReduce are related but not the same
consists of multiple products
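The relationship between MapReduce and Hadoop noted above can be made concrete with a small sketch of the MapReduce idea itself. This is a single-process illustration in plain Python, not Hadoop code: the function names (map_phase, shuffle_phase, reduce_phase, word_count) and the sample documents are hypothetical, and a real cluster would run the map and reduce steps in parallel across many nodes.

```python
from collections import defaultdict

def map_phase(document):
    """Emit a (word, 1) pair for every word in one input document."""
    return [(word.lower(), 1) for word in document.split()]

def shuffle_phase(pairs):
    """Group the emitted values by key, the step that sits between map and reduce."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped

def reduce_phase(grouped):
    """Sum the values collected for each key."""
    return {key: sum(values) for key, values in grouped.items()}

def word_count(documents):
    """Run map, shuffle and reduce in sequence over a list of documents."""
    pairs = []
    for document in documents:
        pairs.extend(map_phase(document))
    return reduce_phase(shuffle_phase(pairs))

# Hypothetical sample input, for illustration only.
counts = word_count(["big data tools", "big data analytics"])
print(counts)
```

The map step only looks at one document at a time and the reduce step only looks at one key at a time, which is what lets a framework spread the work over many machines.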
Hadoop Cluster
name node keeps track of the files and directories and provides information on where in the cluster data is stored
job tracker and task trackers are for processing data
data nodes are referred to as storage nodes
job tracker initiates and co-ordinates jobs
2 Nodes
master
slave
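The bookkeeping role of the name node described above can be sketched as a toy lookup table. This is a minimal illustration, assuming a made-up NameNode class and a simple round-robin placement rule; it is not the HDFS API, and it ignores replication, block sizes and failure handling.

```python
class NameNode:
    """Toy stand-in for a name node: it stores no data, only where data lives."""

    def __init__(self, data_nodes):
        self.data_nodes = list(data_nodes)
        # Maps a file path to a list of (block_id, data_node) placements.
        self.block_map = {}

    def add_file(self, path, num_blocks):
        """Record which data node holds each block (round-robin is an assumption)."""
        placements = []
        for block_id in range(num_blocks):
            node = self.data_nodes[block_id % len(self.data_nodes)]
            placements.append((block_id, node))
        self.block_map[path] = placements
        return placements

    def locate(self, path):
        """Answer the question: where in the cluster is this file stored?"""
        return self.block_map[path]

# Hypothetical node names and file path, for illustration only.
name_node = NameNode(["datanode-1", "datanode-2"])
name_node.add_file("/logs/clicks.txt", 3)
print(name_node.locate("/logs/clicks.txt"))
```

A client would first ask the name node where a file's blocks are and then read the blocks from the data nodes directly.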
Other Big Data Technologies
HIVE
data warehousing-like framework developed by Facebook
allows users to write queries in an SQL-like language
PIG
query language developed by Yahoo!
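To give a feel for what an SQL-like query over log-style data looks like, here is a small sketch. It uses Python's built-in sqlite3 module as a stand-in, not Hive itself; the page_views table and its rows are hypothetical, and actual HiveQL syntax and execution differ in the details.

```python
import sqlite3

# In-memory database used only as a stand-in for a large distributed store.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE page_views (user_id TEXT, page TEXT)")
conn.executemany(
    "INSERT INTO page_views VALUES (?, ?)",
    [("u1", "home"), ("u2", "home"), ("u1", "cart")],  # hypothetical rows
)

# A declarative aggregation: the user states what to compute, not how.
rows = conn.execute(
    "SELECT page, COUNT(*) AS views "
    "FROM page_views GROUP BY page ORDER BY views DESC, page"
).fetchall()
print(rows)
conn.close()
```

The appeal of an SQL-like layer is that analysts can express an aggregation like this without writing the underlying processing jobs by hand.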
Stream Analytics Applications
Financial Services
Health Services
Law Enforcement and Cyber Security
Government
e-Commerce