Please enable JavaScript.
Coggle requires JavaScript to display documents.
Set Operations & Aggregation - Coggle Diagram
Set Operations & Aggregation
Set
Intersection
special case of JOIN (equi join)
Union and Difference with duplicates
Sort-based
Sort both relations (on all attributes)
Merge and eliminate duplicates
Hash-based
Partition R and S
Build in-memory hash table for partition Ri
Probe with tuples in Si, add to table if not duplicates
Buffer = (B-2) to read Ri + (1) to read S to probe in Ri + (1) to write
Mr / (B-1) <= B-2
Aggregation: Sorting
Algorithm
Sort on group by attributes
Scan sorted tuples, compute running aggregate
When group by attribute changes, output aggregate result
Cost = sorting cost, no scanning cost b/c aggregate while sorting
Aggregation: Hashing - Most useful
Algorithm
Hash on group by attribute
hash entry: key = group attribute, value = running aggregate
Scan tuples, probe hash table, update hash entry
Scan hash table, output hash entry
Cost = scan relation
Problem: 1 entry / group, hashtable does not fit in memory
Aggregation: Index
Without Grouping
Use a B+ Tree on aggregate attributes
only need to look at B+ Tree
With grouping
B+ tree on all attributes in SELECT, WHERE, GROUP BY
index-only scan
if group-by attributes form a prefix of search key,
tuples can be retrieved in group-by order