Chapter 1: Introducing Presto
Problems with Big Data
Huge amounts of data of many different types
data spread across different datastores that cannot be consolidated into one
each datastore has its own query language or access mechanism
analysis becomes very expensive and slow
Presto to the rescue
Presto solution
open source
distributed SQL query engine
queries many different data sources
Designed for performance and scale
Presto is Not
a database: it has no storage of its own
an OLTP system
Presto Is
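a query engine that queries data where it lives
built for OLAP (analytics and reporting)
an alternative to Hive for querying data in HDFS (it does not replace HDFS storage)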
Presto Query Techniques
in-memory parallel processing
pipelined execution
multithreaded execution to keep all CPU cores busy
flat in-memory data structures to minimize Java garbage collection
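The pipelined-execution idea above can be illustrated with a small Python sketch. This is a toy model, not Presto's implementation: the stage names `scan`, `filter_rows`, and `project` and the sample `orders` rows are hypothetical. Each stage is a generator, so a row flows through every stage as soon as it is produced and no stage waits for the previous one to finish.

```python
from typing import Callable, Dict, Iterable, Iterator, List

Row = Dict[str, object]


def scan(rows: Iterable[Row]) -> Iterator[Row]:
    """Source stage: hands rows downstream one at a time."""
    for row in rows:
        yield row


def filter_rows(rows: Iterable[Row], predicate: Callable[[Row], bool]) -> Iterator[Row]:
    """Filter stage: passes on only the rows that match the predicate."""
    for row in rows:
        if predicate(row):
            yield row


def project(rows: Iterable[Row], columns: List[str]) -> Iterator[Row]:
    """Projection stage: keeps only the requested columns."""
    for row in rows:
        yield {name: row[name] for name in columns}


# Hypothetical sample data standing in for a table scan.
orders = [
    {"id": 1, "region": "EU", "total": 120.0},
    {"id": 2, "region": "US", "total": 80.0},
    {"id": 3, "region": "EU", "total": 45.5},
]

# Stages are chained lazily; nothing runs until the result is consumed.
pipeline = project(
    filter_rows(scan(orders), lambda r: r["region"] == "EU"),
    ["id", "total"],
)
result = list(pipeline)
print(result)
```

A real engine runs many such pipelines in parallel across threads and workers; the sketch only shows why no intermediate result has to be fully materialized between stages.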
SQL on anything
Presto can query
object storage
AWS S3
Azure Blob Storage
Google Cloud Storage
RDBMS
PostgreSQL
MySQL
Oracle
NoSQL
Cassandra
Kafka
MongoDB
Elasticsearch
Separation of storage and compute
Presto represents the compute layer
databases represent the storage layer
this allows compute resources to scale independently of storage
Use Cases
One SQL analytics access point
single access point to the data warehouse and source systems
SQL-based access to anything
Federated Queries
semantic layer for a virtual data warehouse
Data Lake query engine
ETL
Better insights due to faster response
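The federated-query use case above can be sketched in Python. This is only an analogy under stated assumptions: SQLite stands in for Presto, and two attached in-memory databases (hypothetically named `crm` and `sales`) stand in for two separate source systems. In Presto, tables are addressed as `catalog.schema.table`, and each catalog is backed by a connector to a real system.

```python
import sqlite3

# One connection plays the role of the single SQL access point.
conn = sqlite3.connect(":memory:")

# Two independent in-memory databases stand in for two source systems.
conn.execute("ATTACH DATABASE ':memory:' AS crm")
conn.execute("ATTACH DATABASE ':memory:' AS sales")

# Hypothetical tables and rows, one table per "system".
conn.execute("CREATE TABLE crm.customers (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("CREATE TABLE sales.orders (customer_id INTEGER, total REAL)")
conn.executemany(
    "INSERT INTO crm.customers VALUES (?, ?)",
    [(1, "Ada"), (2, "Grace")],
)
conn.executemany(
    "INSERT INTO sales.orders VALUES (?, ?)",
    [(1, 120.0), (1, 30.0), (2, 80.0)],
)

# A single SQL statement joins data that lives in both "systems".
federated_sql = """
SELECT c.name, SUM(o.total) AS revenue
FROM crm.customers AS c
JOIN sales.orders AS o ON o.customer_id = c.id
GROUP BY c.name
ORDER BY c.name
"""
rows = conn.execute(federated_sql).fetchall()
print(rows)
conn.close()
```

The point of the sketch is the shape of the query: the join is written once, in standard SQL, and the data is never copied into a single store beforehand.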