Please enable JavaScript.
Coggle requires JavaScript to display documents.
Stream Processing - Coggle Diagram
Stream Processing
Stream Processing Design Patterns
Single Event Processing
most basic
simple producer and consumer
each event isolated
consume
process
produce to another stream
Processing with Local state
example : minimum and maximum stock prices
Group by aggregation
store the min and max till now and compare
Multiphase Processing and repartitioning
similar to map-reduce
top 10 trades
Processing with external lookup
integration with external data
to improve latency we use cache and update the database table cache during each change
enriched click topics
Streaming Join
join two streams
match events of one stream with the other during a time window
Out of sequence events
older events arriving late
How to handle
recognize that there is out of sequence event
define the time period
have an in band capability to reconcile this event
update the results
Reprocessing
improved version
safer
switch between versions
spin up the new version in new consumer and start processing
bug in current version
How to choose stream processing framework
Ingest
low latency
asynchronous microservices
near real time data analytics
Operability of the system
usability of API and easy debugging
Community
Concepts
Time
event time
record created
log append time
arrived to broker and stored
processing time
stream processing received and started calculation
State
Internal state
External state
Stream-Table Duality
table
collection of records
mutable
current state only
Stream
history of changes
all states and history
Materializing
convert streams to table
Time Windows
size of the window
how often the window moves
how long it remains updatable
aligned windows
sliding windows
Kafka Streams Architecture
Building a Topology
set of operations and transitions
starts with source processors and end with sink processors
Scaling the topology
multiple threads of execution
split tasks and distribute them according to partitions
Kafka will automatically coordinate the work
Surviving failures
restart ability
change threads ability
Use Cases
Customer service
IoT
Fraud detection
What is stream processing
stream data
unbounded
ordered
immutable
replayable
Stream Processing
contentious and non-blocking option filling the gap between request response and batch processing
processing stream data in real time