Databricks Fundamentals - Coggle Diagram
What is a Data Lakehouse?
1980s - Data warehouses
Pros:
- Business intelligence (BI)
- Analytics
- Structured & clean data
- Predefined schemas
Cons:
- No support for semi-structured or unstructured data
- Inflexible schemas
- Struggled with volume and velocity
- Long processing time
2000s - Data lakes
Pros:
- Flexible data storage
- Streaming support
- Cost efficient in the cloud
- Support for AI and ML
Cons:
- No transaction support
- Poor data reliability
- Slow analysis performance
- Data governance concerns
- Data warehouses still needed
- Required complex technology stack environments
- Introduced complexity and delay as data teams were stuck in silos
- Data duplication
- AI implementation was difficult
Data teams needed:
- Systems to support data applications, SQL analytics, real-time analysis, data science, and machine learning
Lakehouse platform:
- A single reliable source of truth
- Transaction support
- Schema enforcement and governance
- Data governance
- BI support
- Decoupled storage from compute
- Open storage formats
- Support for diverse data types
- Support for diverse workloads
- End-to-end streaming
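Two of the properties listed above, schema enforcement and transaction support, can be illustrated with a small sketch. This is a toy in-memory model in plain Python, not a Databricks or Delta Lake API: the `ToyTable` class, the `SchemaError` exception, and the column names are all hypothetical and exist only for illustration. The idea it tries to show is that a write is checked against the declared schema and is committed either in full or not at all.

```python
from dataclasses import dataclass, field


class SchemaError(ValueError):
    """Raised when a row does not match the table's declared schema."""


@dataclass
class ToyTable:
    # Hypothetical in-memory table; an assumption for illustration only.
    schema: dict                                # column name -> expected Python type
    rows: list = field(default_factory=list)    # committed rows

    def _validate(self, row: dict) -> None:
        # Schema enforcement: column names and value types must match exactly.
        if set(row) != set(self.schema):
            raise SchemaError(
                f"columns {sorted(row)} do not match {sorted(self.schema)}"
            )
        for name, expected in self.schema.items():
            if not isinstance(row[name], expected):
                raise SchemaError(f"column {name!r} expects {expected.__name__}")

    def append(self, batch: list) -> None:
        # Validate the entire batch first, then commit it in one step,
        # so a bad row never leaves a partial write behind.
        for row in batch:
            self._validate(row)
        self.rows.extend(batch)


table = ToyTable(schema={"id": int, "event": str})
table.append([{"id": 1, "event": "click"}])

try:
    # The second row has a string where an int is expected.
    table.append([{"id": 2, "event": "view"}, {"id": "3", "event": "view"}])
except SchemaError:
    pass  # the whole batch is rejected, including its valid first row
```

A plain data lake, by contrast, would typically accept both files as written, leaving readers to discover the mismatch later.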
Supports the work of data analysts, data engineers, and data scientists.
It is the modernized version of a data warehouse, without compromising the flexibility and depth of a data lake.