DP-203 - Chapter 12 - Designing Security for Data Policies and Standards
Data encryption
at rest
Azure Storage
built-in, cannot be disabled
higher protection
Customer Managed Key (CMK)
Azure SQL & Dedicated SQL pools
Always Encrypted
Column master key
Column encryption key
Permissions
ALTER ANY COLUMN MASTER KEY
ALTER ANY COLUMN ENCRYPTION KEY
VIEW ANY COLUMN MASTER KEY DEFINITION
VIEW ANY COLUMN ENCRYPTION KEY DEFINITION
Randomized vs Deterministic encryption
the client driver fetches the key from a secure vault, so the key is not available to DBAs
purpose: encrypt specific columns
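To make the randomized vs deterministic distinction concrete, here is a toy Python sketch. It is not the Always Encrypted algorithm and it is not secure; the helper names (`toy_encrypt`, `toy_decrypt`) and the HMAC-based keystream are assumptions chosen only to show the behavioral difference: deterministic mode yields the same ciphertext for the same plaintext (so equality lookups, joins and grouping still work), while randomized mode yields a different ciphertext each time.

```python
import hashlib
import hmac
import os


def _keystream(key: bytes, iv: bytes, length: int) -> bytes:
    """Toy keystream: HMAC-SHA256 over iv + counter, concatenated to `length` bytes."""
    out = b""
    counter = 0
    while len(out) < length:
        out += hmac.new(key, iv + counter.to_bytes(4, "big"), hashlib.sha256).digest()
        counter += 1
    return out[:length]


def toy_encrypt(key: bytes, plaintext: str, deterministic: bool) -> bytes:
    data = plaintext.encode("utf-8")
    if deterministic:
        # IV derived from the plaintext: equal inputs give equal ciphertexts.
        iv = hmac.new(key, b"iv" + data, hashlib.sha256).digest()[:16]
    else:
        # Fresh random IV: equal inputs give different ciphertexts.
        iv = os.urandom(16)
    stream = _keystream(key, iv, len(data))
    return iv + bytes(a ^ b for a, b in zip(data, stream))


def toy_decrypt(key: bytes, blob: bytes) -> str:
    iv, body = blob[:16], blob[16:]
    stream = _keystream(key, iv, len(body))
    return bytes(a ^ b for a, b in zip(body, stream)).decode("utf-8")


key = os.urandom(32)
d1 = toy_encrypt(key, "123-45-6789", deterministic=True)
d2 = toy_encrypt(key, "123-45-6789", deterministic=True)
r1 = toy_encrypt(key, "123-45-6789", deterministic=False)
r2 = toy_encrypt(key, "123-45-6789", deterministic=False)
print("deterministic ciphertexts equal:", d1 == d2)
print("randomized ciphertexts equal:", r1 == r2)
```

The trade-off is that deterministic encryption leaks which rows share a value, so it fits columns that must be searchable, while randomized encryption is the safer choice for everything else.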
Transparent Data Encryption (TDE)
applied at database level
Azure SQL
built-in
Azure Synapse (Dedicated SQL pool)
manual activation
optional: double encryption using CMK
in transit
Azure Storage
network protocol
TLS 1.2
SSL (being discontinued)
Azure Synapse SQL
TLS is built-in (enforced)
ignores the Encrypt and TrustServerCertificate settings in the connection string
dedicated VPN
Azure ExpressRoute
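On the client side, the TLS 1.2 requirement can also be enforced explicitly. A minimal sketch using Python's standard `ssl` module; the helper name `make_tls12_context` is hypothetical, and how the context is passed to a given Azure client library is left out because it depends on that library.

```python
import ssl


def make_tls12_context() -> ssl.SSLContext:
    """Client context that verifies certificates and refuses anything older than TLS 1.2."""
    ctx = ssl.create_default_context()  # certificate and hostname checks enabled
    ctx.minimum_version = ssl.TLSVersion.TLSv1_2
    return ctx


ctx = make_tls12_context()
print(ctx.minimum_version)
```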
data auditing
Storage auditing
Classic diagnostic logging
Azure Monitor
SQL auditing
switch on and select destination(s)
for compliance
Blob storage
for real-time dashboarding
Event Hub
for ad-hoc querying
Log Analytics
data masking
Azure (Synapse) SQL
Dynamic Data Masking (DDM)
> Add mask
credit card number
email
default value
custom string
random number
T-SQL
ALTER TABLE <table> ALTER COLUMN emailId ADD MASKED WITH (FUNCTION = 'email()');
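The mask types listed above can be pictured with a small Python emulation. This is only an illustration of what a low-privileged reader sees, not how DDM is implemented inside the database; the function names are hypothetical and the output shapes (first letter plus `XXX@XXXX.com` for email, last four digits for credit cards) are assumptions based on the documented defaults.

```python
import random


def mask_default(value: str) -> str:
    """Default mask for string columns: the value is fully hidden."""
    return "XXXX"


def mask_email(value: str) -> str:
    """Email mask: keep the first character, hide the rest behind a constant suffix."""
    return (value[:1] or "X") + "XXX@XXXX.com"


def mask_partial(value: str, prefix: int, padding: str, suffix: int) -> str:
    """Custom string mask: keep `prefix` leading and `suffix` trailing characters."""
    head = value[:prefix]
    tail = value[-suffix:] if suffix > 0 else ""
    return head + padding + tail


def mask_credit_card(value: str) -> str:
    """Credit card mask: expose only the last four digits."""
    digits = [c for c in value if c.isdigit()]
    return "XXXX-XXXX-XXXX-" + "".join(digits[-4:])


def mask_random(low: int, high: int) -> int:
    """Random number mask: replace a numeric value with a random one in [low, high]."""
    return random.randint(low, high)


print(mask_email("john.doe@contoso.org"))
print(mask_credit_card("4111 1111 1111 1234"))
print(mask_partial("555-123-4567", 0, "XXX-XXX-", 4))
```

The stored data is unchanged in all cases; masking applies only to query results for users without the UNMASK permission.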
RBAC and ACL
Azure RBAC
scope
Role
set of permissions
predefined
Security principal
managed identity
user
group
service principal
Conflict resolution:
RBAC is evaluated first, then ACL
limitation
max 2,000 role assignments per subscription (later raised to 4,000)
Other (less recommended)
ACL
permissions
Write
Execute
Read
File/Folder
Complementary
limitation
max 32 entries per file/directory
assign groups instead of users
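The evaluation order (RBAC first, then ACL) can be sketched as follows. This is a deliberately simplified model: real ACL evaluation also involves owners, masks and the "other" entry, and the names `AclEntry` and `is_allowed` are hypothetical. Permissions are modeled as the letters `r`, `w`, `x`.

```python
from dataclasses import dataclass
from typing import Dict, List, Set

MAX_ACL_ENTRIES = 32  # per file/directory, as noted above


@dataclass(frozen=True)
class AclEntry:
    principal: str    # user or group name
    permissions: str  # any combination of "r", "w", "x"


def is_allowed(user: str, groups: List[str], operation: str,
               rbac_grants: Dict[str, Set[str]], acl_entries: List[AclEntry]) -> bool:
    if len(acl_entries) > MAX_ACL_ENTRIES:
        raise ValueError("too many ACL entries for one file/directory")
    identities = {user, *groups}
    # Step 1: RBAC. If a role assignment grants the operation, ACLs are not consulted.
    if any(operation in rbac_grants.get(i, set()) for i in identities):
        return True
    # Step 2: fall back to the ACL entries on the file or folder.
    return any(e.principal in identities and operation in e.permissions
               for e in acl_entries)


rbac = {"data-engineers": {"r", "w"}}
acl = [AclEntry("analysts", "r")]
print(is_allowed("alice", ["data-engineers"], "w", rbac, acl))
print(is_allowed("bob", ["analysts"], "r", rbac, acl))
print(is_allowed("bob", ["analysts"], "w", rbac, acl))
```

Granting to groups rather than individual users, as in the example, keeps both the role assignment count and the ACL entry count low.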
Row-level and column-level security
always enforced: rules are stored inside database
row-level
CREATE SECURITY POLICY securityfilter
ADD FILTER PREDICATE schema.udf_hasAccess(param1)
ON dbo.table WITH (STATE = ON);
column-level
GRANT SELECT ON dbo.DimCustomer (customerId, name, city) TO LowPriv_User;
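The combined effect of a filter predicate and a column-level grant can be emulated in a few lines of Python. This is only a mental model of what the database enforces; the `owner` column, the `has_access` rule and the sample rows are hypothetical stand-ins for `schema.udf_hasAccess` and real data.

```python
from typing import Dict, List

# Mirrors: GRANT SELECT ON dbo.DimCustomer (customerId, name, city) TO LowPriv_User;
COLUMN_GRANTS = {"LowPriv_User": {"customerId", "name", "city"}}


def has_access(row: Dict, user: str) -> bool:
    """Stand-in for the filter predicate function (hypothetical rule)."""
    return row["owner"] == user


def select(rows: List[Dict], user: str, columns: List[str]) -> List[Dict]:
    # Column-level security: asking for a non-granted column fails the whole query.
    denied = [c for c in columns if c not in COLUMN_GRANTS.get(user, set())]
    if denied:
        raise PermissionError("SELECT denied on column(s): " + ", ".join(denied))
    # Row-level security: rows failing the predicate are silently filtered out.
    return [{c: row[c] for c in columns} for row in rows if has_access(row, user)]


rows = [
    {"customerId": 1, "name": "Ann", "city": "Oslo", "email": "ann@example.com", "owner": "LowPriv_User"},
    {"customerId": 2, "name": "Bob", "city": "Rome", "email": "bob@example.com", "owner": "Other_User"},
]
visible = select(rows, "LowPriv_User", ["customerId", "city"])
print(visible)
```

Note the asymmetry: a denied column produces an error, whereas a filtered row simply does not appear.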
Data retention
Azure storage
Data lifecycle management service
cool
archive
delete
purge data based on business requirements
ADF
Delete activity
Azure Synapse SQL
TRUNCATE TABLE dbo.DimCustomer;
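For Azure storage, the cool / archive / delete progression is expressed as a lifecycle management policy document. A sketch that builds such a policy in Python; the rule name, the day thresholds and the prefix are hypothetical, and the JSON shape follows the lifecycle policy schema as I understand it, so verify against the current schema before use.

```python
import json


def build_lifecycle_policy(cool_after: int, archive_after: int,
                           delete_after: int, prefix: str) -> dict:
    """Build a policy that tiers blobs to cool, then archive, then deletes them."""
    if not (cool_after < archive_after < delete_after):
        raise ValueError("thresholds must increase: cool < archive < delete")
    return {
        "rules": [{
            "enabled": True,
            "name": "tier-and-expire",  # hypothetical rule name
            "type": "Lifecycle",
            "definition": {
                "filters": {"blobTypes": ["blockBlob"], "prefixMatch": [prefix]},
                "actions": {"baseBlob": {
                    "tierToCool": {"daysAfterModificationGreaterThan": cool_after},
                    "tierToArchive": {"daysAfterModificationGreaterThan": archive_after},
                    "delete": {"daysAfterModificationGreaterThan": delete_after},
                }},
            },
        }]
    }


policy = build_lifecycle_policy(30, 90, 365, "raw/")
print(json.dumps(policy, indent=2))
```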
Identities, keys and secrets
Access Keys
admin access to storage accounts
used to sign Account SAS and Service SAS tokens
Shared Access Signatures (SAS)
restricted access for a limited period of time to storage account(s)
SAS types
signed with Storage Access Key
Account SAS
access to multiple storage services
(blob, file, table, queue)
Service SAS
access to one storage service
(blob, file, table, queue)
signed by AAD
User delegation SAS
access to blob store only
recommended SAS approach
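The idea behind a key-signed SAS (restricted permissions, limited lifetime, signature made with a secret key) can be illustrated with a toy token. This is not Azure's actual string-to-sign or query format; the field names `res`, `sp`, `se`, `sig` and both functions are assumptions for the sketch.

```python
import base64
import hashlib
import hmac
import time
from urllib.parse import parse_qs, urlencode


def _sign(key: bytes, resource: str, permissions: str, expiry: int) -> str:
    message = f"{resource}\n{permissions}\n{expiry}".encode("utf-8")
    return base64.b64encode(hmac.new(key, message, hashlib.sha256).digest()).decode("ascii")


def make_token(key: bytes, resource: str, permissions: str, expiry: int) -> str:
    """Issue a token granting `permissions` on `resource` until `expiry` (epoch seconds)."""
    return urlencode({"res": resource, "sp": permissions, "se": expiry,
                      "sig": _sign(key, resource, permissions, expiry)})


def verify_token(key: bytes, token: str, wanted: str, now: float = None) -> bool:
    """Accept only an untampered, unexpired token that includes the wanted permission."""
    fields = {k: v[0] for k, v in parse_qs(token).items()}
    now = time.time() if now is None else now
    try:
        expiry = int(fields["se"])
        expected = _sign(key, fields["res"], fields["sp"], expiry)
        signature_ok = hmac.compare_digest(expected, fields["sig"])
    except (KeyError, ValueError):
        return False
    return signature_ok and now < expiry and wanted in fields["sp"]


key = b"k" * 32  # stands in for a storage access key
token = make_token(key, "container/file.csv", "r", expiry=2000)
print(verify_token(key, token, "r", now=1000))
```

Because anyone holding the key can mint such tokens, a leaked access key compromises every SAS derived from it, which is one reason the user delegation SAS is the recommended approach.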
Azure Active Directory (AAD)
Users
Groups
Service principals
Managed identities
Azure Key Vault (AKV)
stores and manages
Secrets
Certificates
Keys
Secure endpoints
allowed accessibility of resource?
from public IP address (unsecured)
public endpoint
(default)
from specific private IP addresses only
private endpoint
existing resource?
yes
Create VNET
Private Link Service:
Create private endpoint within VNET
Link it with resource
no, new resource
Create managed VNET and managed private endpoint while creating resource
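The decision tree above can be written down as a small function, which makes the three outcomes explicit. The function name and the wording of the returned steps are hypothetical; the branches follow the notes directly.

```python
from typing import List


def endpoint_plan(private_only: bool, resource_exists: bool) -> List[str]:
    """Return the setup steps for the chosen accessibility of a resource."""
    if not private_only:
        # Reachable from public IP addresses (unsecured).
        return ["Use the public endpoint (default)"]
    if resource_exists:
        return [
            "Create VNET",
            "Create private endpoint within VNET (Private Link Service)",
            "Link private endpoint with resource",
        ]
    return ["Create managed VNET and managed private endpoint while creating resource"]


for step in endpoint_plan(private_only=True, resource_exists=True):
    print(step)
```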
Resource tokens in Databricks
Generate Personal Access Token (PAT) in settings > user settings
limited lifetime!
alternative
regular AAD tokens can also be used
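Either kind of token is sent as a bearer token on REST calls. A minimal sketch that only builds the request, without sending it; the workspace URL and token value are placeholders, and the clusters-list path is used purely as an example endpoint.

```python
import urllib.request


def build_databricks_request(workspace_url: str, token: str,
                             api_path: str = "/api/2.0/clusters/list") -> urllib.request.Request:
    """Build an authenticated GET request; the token may be a PAT or an AAD token."""
    url = workspace_url.rstrip("/") + api_path
    return urllib.request.Request(url, headers={"Authorization": f"Bearer {token}"})


req = build_databricks_request(
    "https://adb-0000000000000000.0.azuredatabricks.net",  # placeholder workspace URL
    "dapi-EXAMPLE-TOKEN",                                  # placeholder, never hard-code real tokens
)
print(req.full_url)
```

In practice the token would be read from a secret store such as Azure Key Vault rather than embedded in code.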
Loading a Spark dataframe with sensitive information
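from cryptography.fernet import Fernet
from pyspark.sql.functions import udf, lit
from pyspark.sql.types import StringType
encryptionKey = Fernet.generate_key()
def encryptUdf(plaintext, KEY):
    # sketch of the elided body: encrypt one string value with the Fernet key
    return Fernet(KEY).encrypt(plaintext.encode("utf-8")).decode("ascii")
def decryptUdf(encryptedtext, KEY):
    # sketch of the elided body: reverse of encryptUdf
    return Fernet(KEY).decrypt(encryptedtext.encode("ascii")).decode("utf-8")
encrypt = udf(encryptUdf, StringType())
decrypt = udf(decryptUdf, StringType())
df = piidf.withColumn("SSN", encrypt("SSN", lit(encryptionKey)))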
Writing encrypted data to tables or parquet files
df.write.format("delta").mode("overwrite").option("overwriteSchema", "true").saveAsTable("PIIEncryptedTable")
df.write.mode("overwrite").parquet("abfss://path/to/store")
data privacy and sensitive information
strategy
Identify PII
Classify
Protect
Monitor & audit
Microsoft Defender
For Storage
For Azure SQL
Azure Synapse SQL
feature
Data Discovery and Classification
separate storage account
RBAC and ACL
encryption
data masking
row- and column-level security
Trust no one
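The identify and classify steps of the strategy above can be approximated with pattern matching over sampled values. This is a rough sketch, not how Data Discovery and Classification works internally; the patterns, the 0.8 threshold and the function name are assumptions.

```python
import re
from typing import Dict, List

PII_PATTERNS = {
    "ssn": re.compile(r"^\d{3}-\d{2}-\d{4}$"),
    "email": re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$"),
    "credit_card": re.compile(r"^(?:\d{4}[- ]?){3}\d{4}$"),
}


def classify_columns(sample_rows: List[Dict], threshold: float = 0.8) -> Dict[str, str]:
    """Label a column as PII when enough of its sampled values match a pattern."""
    labels: Dict[str, str] = {}
    columns = list(sample_rows[0].keys()) if sample_rows else []
    for col in columns:
        values = [str(r[col]) for r in sample_rows if r.get(col) is not None]
        if not values:
            continue
        for label, pattern in PII_PATTERNS.items():
            hits = sum(1 for v in values if pattern.match(v))
            if hits / len(values) >= threshold:
                labels[col] = label
                break
    return labels


sample = [
    {"id": 1, "SSN": "123-45-6789", "mail": "a@b.com"},
    {"id": 2, "SSN": "987-65-4321", "mail": "c@d.org"},
]
print(classify_columns(sample))
```

Columns flagged this way are the candidates for the protection measures listed above (encryption, masking, row- and column-level security).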