Azure DP-203, Azure Synapse Analytics, Azure Storage, Azure Databricks,…
Azure Synapse Analytics
Azure Synapse Serverless SQL pools
- Capabilities
- Secure data & Manage user
- Query data in Data Lake
- Create metadata objects
External tables with Synapse SQL: data located in Hadoop, Azure Storage blob, or ADLS
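As a rough illustration of querying files in the data lake from a serverless SQL pool, the sketch below builds a T-SQL `OPENROWSET` query string in Python. The helper name `build_openrowset_query` and the storage URL are hypothetical and not from these notes; the code only builds text and does not connect to Synapse.

```python
def build_openrowset_query(file_url: str, file_format: str = "PARQUET", top: int = 10) -> str:
    """Return a T-SQL query string that reads files directly from storage.

    Hypothetical helper for illustration only.
    """
    allowed = {"PARQUET", "DELTA"}
    fmt = file_format.upper()
    if fmt not in allowed:
        raise ValueError(f"unsupported format: {file_format}")
    # Double any single quotes so the URL stays a valid T-SQL string literal.
    safe_url = file_url.replace("'", "''")
    return (
        f"SELECT TOP {int(top)} * "
        f"FROM OPENROWSET(BULK '{safe_url}', FORMAT = '{fmt}') AS [result]"
    )


# Placeholder account, container, and path.
query = build_openrowset_query(
    "https://examplelake.dfs.core.windows.net/data/sales/*.parquet"
)
print(query)
```

The resulting string could be pasted into a serverless SQL pool query window, assuming the caller has access to the storage location.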
Apache Spark
- can use ADLS Gen2 and Blob storage
- process big data workloads that cannot be handled by Azure Synapse SQL
USE CASE:
- Modern data warehousing
- Advanced analytics
- Data exploration and discovery
- Real time analytics
- Data integration
- Integrated analytics
Ingest & Prep:
- Azure Synapse SQL Serverless
- Azure Synapse Spark
- Azure Synapse Pipeline
- Azure Data Factory
- Azure Databricks
- Dedicated SQL pool (formerly SQL DW) refers to the enterprise data warehousing features that are available in Azure Synapse Analytics
- The size of a dedicated SQL pool (formerly SQL DW) is determined by Data Warehouse Units (DWU)
A Data Warehouse Unit (DWU) represents a collection of provisioned analytic resources, defined as a combination of CPU, memory, and IO.
The Service Level Objective (SLO) is the scalability setting that determines the cost and performance level of your data warehouse.
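To make the link between SLO and DWU concrete, here is a minimal sketch that reads the DWU count out of an SLO name. It assumes SLO names follow the pattern `DW<number>c` (for example `DW1000c`), which is not stated in these notes; the function names are hypothetical.

```python
import re


def dwu_from_slo(slo: str) -> int:
    """Extract the DWU count from an SLO name such as 'DW1000c'.

    Assumes the 'DW<number>' naming pattern with an optional trailing 'c'.
    """
    match = re.fullmatch(r"DW(\d+)c?", slo.strip(), flags=re.IGNORECASE)
    if match is None:
        raise ValueError(f"not a recognised SLO name: {slo!r}")
    return int(match.group(1))


def is_scale_up(current_slo: str, target_slo: str) -> bool:
    """True when the target SLO provisions more DWUs than the current one."""
    return dwu_from_slo(target_slo) > dwu_from_slo(current_slo)


print(dwu_from_slo("DW1000c"))
print(is_scale_up("DW100c", "DW500c"))
```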
Azure Storage
Azure Blob storage
Blob storage is designed for:
- Serving images or documents directly to a browser.
- Storing files for distributed access.
- Streaming video and audio.
- Writing to log files.
- Storing data for backup and restore, disaster recovery, and archiving.
- Storing data for analysis by an on-premises or Azure-hosted service.
Language: .NET, Java, Node.js, Python, Go, PHP, Ruby
3 types of resources:
- The storage account
- A container in the storage account
- A blob in a container
A storage account can include an unlimited number of containers, and a container can store an unlimited number of blobs
3 types of blobs:
- Block blobs
- Append blobs
- Page blobs
A container organizes a set of blobs, similar to a directory in a file system
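The account, container, blob hierarchy shows up directly in a blob's address. The sketch below assembles such a URL with the standard library only; the account, container, and blob names are placeholders, and `blob_url` is a hypothetical helper rather than part of any Azure SDK.

```python
from urllib.parse import quote


def blob_url(account: str, container: str, blob_name: str) -> str:
    """Build the public endpoint URL for a blob from its three resource levels."""
    # Keep "/" unescaped so directory-like prefixes in the blob name stay readable.
    encoded_name = quote(blob_name, safe="/")
    return f"https://{account}.blob.core.windows.net/{container}/{encoded_name}"


print(blob_url("exampleaccount", "images", "2024/cat photo.png"))
```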
Azure Table storage
4 components:
- Storage account
- Table
- Entity
- Property
A service that stores non-relational structured data (also known as structured NoSQL data) in the cloud, providing a key/attribute store with a schemaless design
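The key/attribute, schemaless idea can be sketched with an in-memory stand-in. This is not the Azure SDK: `InMemoryTable` is a hypothetical class, and it assumes each entity is identified by a `PartitionKey` and `RowKey` pair, which these notes do not spell out.

```python
from typing import Any, Dict, List, Optional, Tuple


class InMemoryTable:
    """Toy model of a table: entities keyed by (PartitionKey, RowKey)."""

    def __init__(self, name: str) -> None:
        self.name = name
        self._entities: Dict[Tuple[str, str], Dict[str, Any]] = {}

    def upsert(self, entity: Dict[str, Any]) -> None:
        # Only the two key properties are required; everything else is free-form.
        key = (entity["PartitionKey"], entity["RowKey"])
        self._entities[key] = dict(entity)

    def get(self, partition_key: str, row_key: str) -> Optional[Dict[str, Any]]:
        return self._entities.get((partition_key, row_key))

    def query_partition(self, partition_key: str) -> List[Dict[str, Any]]:
        items = sorted(self._entities.items(), key=lambda kv: kv[0])
        return [entity for (pk, _), entity in items if pk == partition_key]


table = InMemoryTable("customers")
# Two entities in the same table with different sets of properties.
table.upsert({"PartitionKey": "US", "RowKey": "001", "Name": "Ada", "Email": "ada@example.com"})
table.upsert({"PartitionKey": "US", "RowKey": "002", "Name": "Lin", "LoyaltyPoints": 120})
print(table.get("US", "002"))
print(len(table.query_partition("US")))
```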
Azure Files
fully managed file shares in the cloud that are accessible via the industry standard Server Message Block (SMB) protocol or Network File System (NFS) protocol
Azure SQL Database
Plan & Manage costs
Virtual core (vCore)
- Provisioned compute
- Serverless
For both Azure SQL Database and Azure SQL Managed Instance
- a fully managed backup service
- built-in high availability
- use Azure Advanced Threat Protection
Azure Cosmos DB
INTERFACE:
- Cassandra: a column family database management system
- Gremlin: a graph database interface to Cosmos DB
- Table: supports multiple read & write replicas
- MongoDB: lets a MongoDB application run unchanged against a Cosmos DB database
Azure Data Factory
- Control - Until
- Data movement - Copy
- Data transformation - Mapping data flow
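The Until control activity behaves like a do-until loop: the inner activities run, then the condition is checked, and the loop repeats until the condition is true. The plain-Python sketch below imitates that behavior only; `run_until` and `max_iterations` are hypothetical stand-ins (a real pipeline would use a timeout setting), and the "copy" step is simulated with lists.

```python
from typing import Callable, List


def run_until(body: Callable[[], None], condition: Callable[[], bool],
              max_iterations: int = 100) -> int:
    """Run body at least once, repeating until condition() is true.

    Returns the number of iterations executed. max_iterations is an
    illustrative safety limit, not a Data Factory setting.
    """
    for iteration in range(1, max_iterations + 1):
        body()
        if condition():
            return iteration
    raise TimeoutError("condition was never satisfied")


pending: List[str] = ["a.csv", "b.csv", "c.csv"]
copied: List[str] = []


def copy_next() -> None:
    # Simulated data movement step: move one file name per iteration.
    if pending:
        copied.append(pending.pop(0))


iterations = run_until(copy_next, lambda: not pending)
print(iterations, copied)
```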
Azure Storage Explorer
Signing in with an Azure account connects to:
- Blob containers
- ADLS Gen2 containers
- ADLS Gen2 directories
- Queues
Azure Stream Analytics
4 kinds of resources as inputs:
- Azure Event Hubs
- Azure IoT Hub
- Azure Blob storage
- Azure Data Lake Storage Gen2
Delta Lake architecture
- Bronze tables contain raw data ingested from various sources (JSON files, RDBMS data, IoT data, etc.).
- Silver tables will provide a more refined view of our data. We can join fields from various bronze tables to enrich streaming records, or update account statuses based on recent activity.
- Gold tables provide business level aggregates often used for reporting and dashboarding. This would include aggregations such as daily active website users, weekly sales per store, or gross revenue per quarter by department.
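The bronze, silver, gold flow can be sketched in plain Python. This only illustrates the layering; an actual implementation would typically use Spark with Delta tables, and the sample events, field names, and function names here are made up for the example.

```python
import json
from collections import defaultdict
from typing import Dict, List

# Bronze: raw records exactly as ingested, including a malformed one.
bronze: List[str] = [
    '{"user": "alice", "event": "login", "ts": "2024-01-01T09:00:00"}',
    '{"user": "bob", "event": "login", "ts": "2024-01-01T10:30:00"}',
    '{"user": "alice", "event": "view", "ts": "2024-01-01T11:00:00"}',
    'not valid json',
    '{"user": "alice", "event": "login", "ts": "2024-01-02T08:15:00"}',
]


def to_silver(raw_rows: List[str]) -> List[Dict[str, str]]:
    """Silver: parse, drop unusable rows, and normalise the fields."""
    silver = []
    for raw in raw_rows:
        try:
            record = json.loads(raw)
        except json.JSONDecodeError:
            continue  # skip rows that cannot be parsed
        if not isinstance(record, dict) or "user" not in record or "ts" not in record:
            continue
        silver.append({
            "user": record["user"],
            "event": record.get("event", ""),
            "date": record["ts"][:10],  # keep only YYYY-MM-DD
        })
    return silver


def to_gold(silver_rows: List[Dict[str, str]]) -> Dict[str, int]:
    """Gold: business-level aggregate, here daily active users."""
    users_per_day = defaultdict(set)
    for row in silver_rows:
        users_per_day[row["date"]].add(row["user"])
    return {day: len(users) for day, users in sorted(users_per_day.items())}


silver = to_silver(bronze)
gold = to_gold(silver)
print(gold)
```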
Azure HDInsight
Apache Kafka
an open-source distributed streaming platform that can be used to build real-time streaming data pipelines and applications
Azure Data Studio
is a cross-platform database tool for data professionals using on-premises and cloud data platforms on Windows, macOS, and Linux