IBM_DE
Introduction to DE
W2: Data Ecosystem
Types of Data
Semi-structured
emails, XML, TCP/IP packets, zip files
Unstructured
audio, video, images, PDFs, web pages, social media feeds
NoSQL (not only SQL)
Different types
Document-based
preferred for e-commerce platforms, medical records storage, CRM platforms, analytics platforms
MongoDB, DocumentDB, CouchDB, Cloudant
Column-based
all cells in a column are stored contiguously on disk, making access and search easier and faster
Suitable for:
(1) systems that handle heavy write loads
(2) storing time-series data
(3) weather data
(4) IoT data
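The column-based layout described above can be illustrated with a minimal sketch. It is not how any particular database stores data; the sensor readings and the `to_columns` helper are hypothetical and exist only to show why scanning one column is cheap once its values sit together.

```python
# Hypothetical time-series readings, used only for illustration.
rows = [
    {"sensor": "a", "ts": 1, "temp": 20.5},
    {"sensor": "b", "ts": 2, "temp": 21.0},
    {"sensor": "a", "ts": 3, "temp": 19.5},
]


def to_columns(records):
    """Pivot row-oriented records into one contiguous list per column."""
    columns = {}
    for record in records:
        for name, value in record.items():
            columns.setdefault(name, []).append(value)
    return columns


columns = to_columns(rows)

# A column scan reads a single list instead of touching every record.
average_temp = sum(columns["temp"]) / len(columns["temp"])
```

An aggregate over `temp` only touches the `temp` list, which is the access pattern the node above describes.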
Graph-based
(1) social networks
(2) product recommendations
(3) network diagrams
(4) fraud detection
(5) access management
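As a rough sketch of why graph-shaped data suits use cases such as social networks and recommendations, the example below stores relationships as an adjacency map and suggests "friends of friends". The names, edges, and helper functions are hypothetical, and a real graph database would expose its own query language rather than this code.

```python
from collections import defaultdict

# Hypothetical undirected friendships, used only for illustration.
edges = [("ann", "bob"), ("bob", "cara"), ("cara", "dev"), ("ann", "eli")]


def build_graph(pairs):
    """Build an adjacency map: each person points to a set of neighbours."""
    graph = defaultdict(set)
    for a, b in pairs:
        graph[a].add(b)
        graph[b].add(a)
    return graph


def friends_of_friends(graph, person):
    """Return people two hops away who are not already direct friends."""
    direct = set(graph[person])
    candidates = set()
    for friend in direct:
        candidates |= graph[friend]
    return candidates - direct - {person}


graph = build_graph(edges)
suggestions = sorted(friends_of_friends(graph, "ann"))
```

Each suggestion is found by following edges from a node, without scanning unrelated records.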
Key-value
Redis, Memcached, DynamoDB
Data Repositories
Data warehouses
Teradata, Oracle, IBM DB2, Amazon Redshift, Google BigQuery, Cloudera, Snowflake
Data Marts
subsection of a data warehouse, built for a particular business function, purpose, or community of users
Data Lakes
store large amounts of structured, semi-structured, and unstructured data in their native format
exist as a repository of raw data straight from the source, to be transformed based on the use case
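The idea of keeping raw data in its native format and transforming it per use case can be sketched as follows. The in-memory `raw_lake` dictionary, its object names, and the `read_orders` helper are assumptions made for illustration; an actual data lake would sit on object or file storage.

```python
import csv
import io
import json

# Hypothetical raw objects, stored exactly as they arrived from the source.
raw_lake = {
    "orders/2024-01.json": '[{"id": 1, "total": "19.90"}, {"id": 2, "total": "5.10"}]',
    "orders/2024-02.csv": "id,total\n3,7.50\n",
}


def read_orders(lake):
    """Apply structure only at read time, for one specific use case."""
    orders = []
    for name, payload in lake.items():
        if name.endswith(".json"):
            records = json.loads(payload)
        elif name.endswith(".csv"):
            records = list(csv.DictReader(io.StringIO(payload)))
        else:
            continue  # unknown formats stay in the lake untouched
        for record in records:
            orders.append({"id": int(record["id"]), "total": float(record["total"])})
    return orders


orders = read_orders(raw_lake)
```

The stored payloads are never rewritten; a different use case could read the same objects with a different transformation.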
Big Data
Velocity, Volume, Variety, Veracity, Value