Outstanding. Let's explore the most advanced and effective probing technique.Topic 10: Open Addressing - Double Hashing
1. Core Concept & Definition
Double Hashing is considered the best open addressing method because it effectively eliminates both primary and secondary clustering. It uses a second hash function to calculate the probe step size.The probing function is:
h(k, i) = (h₁(k) + i * h₂(k)) % mwhere:
-
h₁(k) is the primary hash function (determines the starting point).
-
h₂(k) is the secondary hash function (determines the step size for probing).
-
i is the probe number (0, 1, 2, ...).
-
m is the size of the hash table.
2. Text Diagram: Double Hashing in Action
Let's use a table of size m=7. Our hash functions will be:
-
h₁(k) = k % 7
-
h₂(k) = 5 - (k % 5) // Must never evaluate to 0! This is a simple example.
We will insert the keys: 50, 700, 76, 85, 92.Step 1: Insert 50 -> h₁(50)=1 -> Slot 1 is free.
Index: 0 1 2 3 4 5 6
[ ] [50] [ ] [ ] [ ] [ ] [ ]
Step 2: Insert 700 -> h₁(700)=0 -> Slot 0 is free.
Index: 0 1 2 3 4 5 6
[700] [50] [ ] [ ] [ ] [ ] [ ]
Step 3: Insert 76 -> h₁(76)=76%7=6 -> Slot 6 is free.
Index: 0 1 2 3 4 5 6
[700] [50] [ ] [ ] [ ] [ ] [76]
Step 4: Insert 85 -> h₁(85)=1 -> COLLISION at slot 1 (50).
Now, calculate step size h₂(85)=5 - (85%5)=5-0=5.
i=1: h(85,1) = (1 + 1*5) % 7 = 6 -> COLLISION at slot 6 (76).
i=2: h(85,2) = (1 + 2*5) % 7 = 11 % 7 = 4 -> Slot 4 is free. Insert.
Index: 0 1 2 3 4 5 6
[700] [50] [ ] [ ] [85] [ ] [76]
Step 5: Insert 92 -> h₁(92)=1 -> COLLISION at slot 1 (50).
Calculate step size h₂(92)=5 - (92%5)=5-2=3.
i=1: h(92,1) = (1 + 1*3) % 7 = 4 -> COLLISION at slot 4 (85).
i=2: h(92,2) = (1 + 2*3) % 7 = 7 % 7 = 0 -> COLLISION at slot 0 (700).
i=3: h(92,3) = (1 + 3*3) % 7 = 10 % 7 = 3 -> Slot 3 is free. Insert.
Index: 0 1 2 3 4 5 6
[700] [50] [ ] [92] [85] [ ] [76]
Notice how the step size h₂(k) is different for 85 (step=5) and 92 (step=3). This means they follow completely different probe sequences, effectively avoiding secondary clustering.3. Choosing the Second Hash Function h₂(k)
The choice of h₂(k) is critical. It must have one fundamental property:
-
h₂(k) must never evaluate to zero. If it does, the probe sequence becomes h₁(k), h₁(k), h₁(k), ... which is useless.
A common and effective choice is:
h₂(k) = R - (k % R)
where R is a prime number slightly smaller than the table size m (e.g., if m=7, choose R=5). This guarantees 1 <= h₂(k) <= R.4. Code Snippet (Search with Double Hashing)
#define TABLE_SIZE 7 // m (prime)
#define R 5 // Prime number < m
int hash1(int key) {
return key % TABLE_SIZE;
}
int hash2(int key) {
return R - (key % R); // Ensures step size is between 1 and R
}
// Search function for Double Hashing
int search(int* table, int key) {
int start = hash1(key);
int step = hash2(key); // Calculate the step size for this key
for (int i = 0; i < TABLE_SIZE; i++) {
int j = (start + i * step) % TABLE_SIZE; // Calculate probe index
if (table[j] == key) {
return j; // Key found
}
if (table[j] == 0) { // Assuming 0 means empty
return -1; // Key not found
}
// If DELETED, continue
}
return -1; // Key not found after full cycle
}