Execution Environments
Dataflow
Apache Beam
Java
Python
Autoscaling
CPU utilization
Work remaining
Throughput
Dynamic Workload
Re-balancing
Keep all workers busy
Eliminates Stragglers
(small chunk of data
takes longer to process)
uses work
stealing/shedding
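A minimal sketch of the work-stealing idea above, in plain Python. This is not Dataflow's scheduler; the function name `rebalance` and the "steal half of the busiest queue" policy are assumptions made for illustration only.

```python
from collections import deque


def rebalance(queues):
    """Let each idle worker steal half of the busiest worker's pending items."""
    for idle in queues:
        if idle:
            continue  # this worker still has work of its own
        busiest = max(queues, key=len)
        for _ in range(len(busiest) // 2):
            # Steal from the tail, the end the owner is not currently working on.
            idle.appendleft(busiest.pop())
    return queues


# Worker 0 is a straggler with 8 items, worker 1 is idle, worker 2 has 1 item.
workers = [deque(range(8)), deque(), deque([100])]
rebalance(workers)
sizes = [len(q) for q in workers]
```

After the call the idle worker holds half of the straggler's backlog and the total amount of work is unchanged.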
Batch/Stream processing
fraud detection
IoT
gaming
click stream,
point of sale, segmentation
Cloud Functions
node.js
index.js
package.json
pub/sub
cloud storage
http
App Engine Flexible
use cases
backends for mobile
web apps
HTTP APIs
SSH to VMs
disabled by default
microservices
gcloud app deploy
uploads sources into GCS
builds Docker image
pushes image into
Container Registry
creates Load Balancer
creates/manages
VMs in 3 zones
setup monitoring, logging,
health checks, error reporting
0 downtime
traffic split
canary
A/B testing
limitations
just http(s)
access to tmpfs only
- in memory
no persistent disks!
at least 2 nodes running at all times!
App Engine Standard
scales to 0
must use App Engine Standard APIs
not portable to other
compute environment:
- not using client libraries
Execution Environments (cont.)
Kubernetes Engine
for complex, portable applications:
- as it runs containers
HTTP(S) and any other protocol
worker node(s)
pod(s)
container(s)
kube proxy
kubelet
master node(s)
scheduler
controller manager
api server
Compute Engine
Predefined and Custom
machine types
Persistent Disks (standard, balanced), Local SSDs
Instance Groups
lift and shift
VM Startup
provisioning
boot
request
plan ahead for traffic boost
1 Application
1..n Services
1..n Versions
1..n Instances
at least 'default'
traffic migration -> 100% to certain version
traffic split
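The Application / Services / Versions / Instances hierarchy and the two traffic operations can be pictured with a small data model. This is a sketch only: the class and method names (`Application`, `Service`, `deploy`, `migrate`, `split`) are hypothetical and are not the App Engine Admin API.

```python
from dataclasses import dataclass, field


@dataclass
class Service:
    name: str
    versions: dict = field(default_factory=dict)  # version name -> instance count
    traffic: dict = field(default_factory=dict)   # version name -> share of traffic

    def deploy(self, version, instances=1):
        self.versions[version] = instances
        if not self.traffic:
            # The first deployed version receives all traffic.
            self.traffic = {version: 1.0}

    def migrate(self, version):
        """Traffic migration: send 100% of traffic to one version."""
        if version not in self.versions:
            raise KeyError(version)
        self.traffic = {version: 1.0}

    def split(self, shares):
        """Traffic split: shares must name known versions and sum to 1."""
        if set(shares) - set(self.versions):
            raise KeyError("unknown version in split")
        if abs(sum(shares.values()) - 1.0) > 1e-9:
            raise ValueError("shares must sum to 1")
        self.traffic = dict(shares)


@dataclass
class Application:
    # Every application has at least the 'default' service.
    services: dict = field(
        default_factory=lambda: {"default": Service("default")}
    )


app = Application()
default = app.services["default"]
default.deploy("v1", instances=2)
default.deploy("v2")
default.split({"v1": 0.9, "v2": 0.1})  # canary: 10% to the new version
canary_traffic = dict(default.traffic)
default.migrate("v2")                   # full cut-over
```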
App Engine SDK to run locally
services available
task queue
scheduled tasks
memcache
search
logs
auth
60 sec timeout
no PD or local persistence
pay per hour
pay per class
Endpoints
ESP v2
Extensible Service
Proxy (ESP)
app engine flex
GCE
GKE
k8s
app engine
cloud functions
GKE
Cloud Run
GCE
k8s
auth
logging
monitoring
Apigee
analytics
monetization
business oriented
select region
select region
go
java
python
Network bandwidth scales per vCPU
2 Gbps/vCPU
max 200 Gbps (for 176 vCPUs)
min 10 Gbps (for 2/4 vCPUs)
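The figures above can be read as "per-vCPU rate, clamped between a floor and a ceiling". A rough sketch of that arithmetic follows; it uses only the numbers from these notes, the function name `egress_cap_gbps` is hypothetical, and real limits depend on machine series and networking tier.

```python
def egress_cap_gbps(vcpus, per_vcpu=2, floor=10, ceiling=200):
    """Bandwidth cap in Gbps: per-vCPU rate clamped to [floor, ceiling]."""
    return min(max(vcpus * per_vcpu, floor), ceiling)


caps = {n: egress_cap_gbps(n) for n in (2, 4, 16, 176)}
```

With these assumptions, 2 and 4 vCPUs both land on the 10 Gbps floor, 16 vCPUs get 32 Gbps, and 176 vCPUs hit the 200 Gbps ceiling.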
predefined
shared core
memory cpu
high mem
memory optimized
cpu optimized
preemptible
up to 24 hours
sole tenant nodes
Cloud Run
regional
stateless
http
pubsub
can be run inside GKE
shares network & file system
etcd
cloud controller manager
GKE
Node pools
assign based on label
private cluster
possible
$HOME/.kube/config
Deployment
kubectl run --generator=deployment/apps.v1 (deprecated; newer kubectl: kubectl create deployment)
kubectl apply
cloud console
kubectl autoscale deployment <name> --min=3 --max=10 --cpu-percent=<percent>
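What the autoscaler does with the CPU target can be sketched as below. The formula (current replicas scaled by observed/target utilization, rounded up, then clamped to min/max) follows the documented Horizontal Pod Autoscaler algorithm; the function name and the 60% target used in the example are assumptions, and the real controller also applies tolerances and stabilization windows that are left out here.

```python
import math


def desired_replicas(current, cpu_percent, target_percent, minimum=3, maximum=10):
    """ceil(current * observed / target), clamped to [minimum, maximum]."""
    wanted = math.ceil(current * cpu_percent / target_percent)
    return max(minimum, min(maximum, wanted))


scale_up = desired_replicas(current=4, cpu_percent=90, target_percent=60)
scale_floor = desired_replicas(current=2, cpu_percent=20, target_percent=60)
scale_cap = desired_replicas(current=8, cpu_percent=100, target_percent=50)
```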
rollingUpdate
recreate
pods deleted and recreated
kubectl rollout undo deployment <name>
kubectl rollout pause/resume deployment <name>
A Deployment's rollout is triggered if and only if the Deployment's Pod template (that is, .spec.template) is changed.
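The rule can be illustrated by comparing only the `.spec.template` part of two manifests. This is a sketch of the idea, not the Deployment controller's code; `template_hash` and `triggers_rollout` are hypothetical helper names and the manifests are trimmed-down dictionaries.

```python
import copy
import hashlib
import json


def template_hash(deployment):
    """Hash only .spec.template; everything else in the spec is ignored."""
    template = deployment["spec"]["template"]
    canonical = json.dumps(template, sort_keys=True)
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def triggers_rollout(old, new):
    return template_hash(old) != template_hash(new)


base = {
    "spec": {
        "replicas": 3,
        "template": {"spec": {"containers": [{"name": "web", "image": "web:1"}]}},
    }
}

scaled = copy.deepcopy(base)
scaled["spec"]["replicas"] = 5  # scaling does not touch the Pod template

new_image = copy.deepcopy(base)
new_image["spec"]["template"]["spec"]["containers"][0]["image"] = "web:2"

scaling_rolls = triggers_rollout(base, scaled)
image_rolls = triggers_rollout(base, new_image)
```

Changing the replica count leaves the template hash untouched, while changing the image produces a new hash and therefore a rollout.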
Jobs
Job
Parallel Job
Cron Job
parallelism > 1
either in yaml
or via
scale
Kind: CronJob
Scheduling
nodeSelector
nodeAffinity
podAffinity
topology: node, zone, region
preferredDuringSchedulingIgnoredDuringExecution
requiredDuringSchedulingIgnoredDuringExecution
Taints - defined on nodes
tolerations - on pods
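How nodeSelector, taints and tolerations combine into a "can this pod land on this node" decision can be sketched as follows. It is a simplified model of the scheduler's filtering, not its implementation: affinity rules are omitted, `PreferNoSchedule` taints are treated as non-blocking, and the helper names are hypothetical.

```python
BLOCKING_EFFECTS = {"NoSchedule", "NoExecute"}


def tolerates(toleration, taint):
    """True if one toleration (on the pod) matches one taint (on the node)."""
    key = toleration.get("key")
    if key and key != taint["key"]:
        return False
    effect = toleration.get("effect")
    if effect and effect != taint["effect"]:
        return False
    if toleration.get("operator", "Equal") == "Exists":
        return True  # Exists ignores the taint's value
    return toleration.get("value") == taint.get("value")


def can_schedule(pod, node):
    labels = node.get("labels", {})
    # nodeSelector: every requested label must be present with the same value.
    selector_ok = all(
        labels.get(k) == v for k, v in pod.get("nodeSelector", {}).items()
    )
    # Every blocking taint on the node needs at least one matching toleration.
    blocking = [t for t in node.get("taints", []) if t["effect"] in BLOCKING_EFFECTS]
    taints_ok = all(
        any(tolerates(tol, taint) for tol in pod.get("tolerations", []))
        for taint in blocking
    )
    return selector_ok and taints_ok


node = {
    "labels": {"pool": "gpu"},
    "taints": [{"key": "dedicated", "value": "ml", "effect": "NoSchedule"}],
}
tolerant_pod = {
    "nodeSelector": {"pool": "gpu"},
    "tolerations": [
        {"key": "dedicated", "operator": "Equal", "value": "ml", "effect": "NoSchedule"}
    ],
}
plain_pod = {"nodeSelector": {"pool": "gpu"}}
wrong_pool_pod = {"nodeSelector": {"pool": "cpu"}, "tolerations": [{"operator": "Exists"}]}

results = [can_schedule(p, node) for p in (tolerant_pod, plain_pod, wrong_pool_pod)]
```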
Networking
alias IP ranges
~4000 IPs per cluster
NEG: Network Endpoint Groups
used by LB - traffic from LB to PODS directly
(not to NODES)
Security
network policy
pod level firewall rules
nodes recreated
enable for master and nodes
deploy network policies
Storage
Volume
EmptyDir
ephemeral
shares pod's lifecycle
from node's local disk
or memory
ConfigMap
Secret
= ConfigMap but secured
always in-mem (tmp file system)
downwardAPI
pods metadata
ephemeral
ephemeral
PV, PVC
persistent
size, class, access (R/W)
ReadWriteOnce
mounted to single node
ReadOnlyMany
mounted to several nodes
ReadWriteMany
not supported by GCP disks
but supported by NFS systems
k8s has only ServiceAccounts,
user identities managed outside
RBAC
Roles
(namespace level)
and ClusterRoles
Subject
resources + verbs
user
group
serviceAccount
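The RBAC pieces above (a Role or ClusterRole as a set of resources + verbs, bound to subjects) can be sketched as a small permission check. This is a deliberately simplified model with hypothetical dictionary shapes: a role with `namespace` set to `None` stands in for a ClusterRole bound cluster-wide, and RoleBinding vs ClusterRoleBinding details are not modelled.

```python
def is_allowed(bindings, subject, namespace, resource, verb):
    """RBAC is additive: access is granted if any bound rule matches."""
    for binding in bindings:
        if subject not in binding["subjects"]:
            continue
        role = binding["role"]
        # A Role applies only inside its namespace; None means cluster-wide.
        if role["namespace"] not in (None, namespace):
            continue
        for rule in role["rules"]:
            if resource in rule["resources"] and verb in rule["verbs"]:
                return True
    return False  # no matching rule means no access


pod_reader = {
    "namespace": "dev",
    "rules": [{"resources": {"pods"}, "verbs": {"get", "list", "watch"}}],
}
node_viewer = {
    "namespace": None,
    "rules": [{"resources": {"nodes"}, "verbs": {"get", "list"}}],
}
bindings = [
    {"role": pod_reader, "subjects": {"user:alice", "serviceAccount:dev/ci"}},
    {"role": node_viewer, "subjects": {"group:ops"}},
]

checks = [
    is_allowed(bindings, "user:alice", "dev", "pods", "list"),
    is_allowed(bindings, "user:alice", "prod", "pods", "list"),
    is_allowed(bindings, "user:alice", "dev", "pods", "delete"),
    is_allowed(bindings, "group:ops", "prod", "nodes", "get"),
]
```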
disable access to node's metadata
by removing role
compute.instances.get
as the node's metadata contains a secret with the cert the kubelet uses to talk to the API server
Security Context
Pod spec
Pod Security
Policy
PCollections
PTransforms
Map
FlatMap
ParDo
GroupByKey
CombinePerKey - better than GroupByKey
Flatten == union
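What each transform does to a collection can be shown with plain Python lists standing in for PCollections. This is an analogy only and does not use the Apache Beam SDK; the sample data is made up. It also hints at why a per-key combine is preferred over grouping: the combine keeps one running value per key instead of materializing every value.

```python
from collections import defaultdict
from itertools import chain

lines = ["a b", "b c b"]

# FlatMap: one input element yields zero or more output elements.
words = list(chain.from_iterable(line.split() for line in lines))

# Map: exactly one output element per input element.
pairs = [(word, 1) for word in words]

# GroupByKey: collect every value for a key, then reduce afterwards.
grouped = defaultdict(list)
for key, value in pairs:
    grouped[key].append(value)
counts_grouped = {key: sum(values) for key, values in grouped.items()}

# Per-key combine: fold values into a running total, no per-key lists kept.
counts_combined = defaultdict(int)
for key, value in pairs:
    counts_combined[key] += value

# Flatten: union of several collections into one.
flattened = list(chain(["x"], ["y", "z"]))
```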
firebase
60sec default timeout
/tmp memory mount
use cases
tiny ETL
webhooks
event handling
can generate Dockerfile
can migrate to GCE, GKE, functions
by IP
by cookie
rand
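Splitting by IP or by cookie means the same client keeps landing on the same version. A minimal sketch of that idea, assuming a hash of the key is mapped onto cumulative weights; `pick_version` is a hypothetical helper and this is not how App Engine implements it. For the random mode, the hashed point would be replaced by a fresh random number per request.

```python
import hashlib


def pick_version(key, split):
    """Map a stable key (client IP or cookie value) to a version by weight."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    # Interpret the first 8 bytes as a fraction in [0, 1).
    point = int.from_bytes(digest[:8], "big") / 2**64
    cumulative = 0.0
    for version, weight in split.items():
        cumulative += weight
        if point < cumulative:
            return version
    return version  # fallback if rounding leaves the weights just short of 1


split = {"v1": 0.9, "v2": 0.1}
first = pick_version("203.0.113.7", split)
second = pick_version("203.0.113.7", split)
```

The two calls return the same version because the key, not the request, decides the bucket.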
uses IP tables
accessed by GCP tools via private IP
accessed by Authorized networks (trusted IP ranges)
gcloud container clusters get-credentials <cluster> --zone <zone>
Spot VMs
next gen of preemptible
no live migrate
no auto restart
no max runtime limit
Confidential VMs
encrypt data being processed
Shielded VMs
Secure Boot, virtual
trusted platform module or vTPM-enabled Measured Boot
NEW Stateful IP addresses