Big Data Concepts & Tools
Characteristics of Big Data
Volume (Scale)
Data volume is increasing exponentially
Variety (Complexity)
Velocity (Speed)
Data is being generated fast and needs to be processed fast
Veracity
Accuracy, quality, truthfulness & trustworthiness of the data
Variability
Data flows can be inconsistent, with periodic peaks
Value
Data must provide business value
Limitations of Data Warehouse/Relational Database
Schema (fixed)
Scalability
Unable to handle huge amounts of new/contemporary data sources
Speed
Unable to handle speed at which big data is arriving
Others
Unable to handle sophisticated processing such as machine learning
Unable to perform queries on big data efficiently
Challenges of Big Data Analytics
Data volume
Ability to capture, store & process the huge volume of data in a timely manner
Data integration
Ability to combine data quickly and at reasonable cost
Processing capabilities
Ability to process the data quickly, as it is captured
Data governance
Security, privacy, ownership, quality issues
Skill availability: shortage of data scientists
Solution cost: Return on Investment
Critical Success Factors
A clear business need
Strong, committed sponsorship
Alignment between the business & IT strategy
A fact-based decision-making culture
A strong data infrastructure
The right analytics tools
Personnel with advanced analytical skills
Business Problems
Process efficiency and cost reduction
Brand management
Revenue maximization, cross-selling/up-selling
Enhanced customer experience
Churn identification, customer recruiting
Improved customer service
Identifying new products & market opportunities
Risk Management
Regulatory compliance
Enhanced security capabilities
High-Performance Computing
In-memory analytics
Storing & processing the complete data set in RAM
In-database analytics
Placing analytic procedures close to where data is stored
Grid computing & MPP
Use of many machines & processors in parallel
Appliances
Combining hardware, software & storage in a single unit for performance & scalability
Big Data Technologies
Hadoop
Master: NameNode & JobTracker
Slave: DataNode & TaskTracker
MapReduce
NoSQL
Hive
Pig
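
The MapReduce node above can be illustrated with a minimal word-count sketch. It runs in a single process in plain Python and does not use Hadoop's API; the function names (map_phase, shuffle_phase, reduce_phase, word_count) and the sample documents are hypothetical, chosen only to show the map, group-by-key and reduce steps.

```python
from collections import defaultdict
from itertools import chain


def map_phase(document):
    """Emit a (word, 1) pair for every word in one input document."""
    return [(word.lower(), 1) for word in document.split()]


def shuffle_phase(pairs):
    """Group emitted values by key, as a framework would between map and reduce."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Combine all values collected for one key into a single result."""
    return key, sum(values)


def word_count(documents):
    """Run map, shuffle and reduce in sequence over a list of documents."""
    mapped = chain.from_iterable(map_phase(doc) for doc in documents)
    grouped = shuffle_phase(mapped)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


# Hypothetical sample input; each string stands in for one input split.
counts = word_count(["big data needs big tools", "data tools"])
print(counts)
```

In a real cluster the map and reduce calls would run on many worker nodes in parallel, with the framework handling the grouping step; here everything runs sequentially to keep the idea visible.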