Please enable JavaScript.
Coggle requires JavaScript to display documents.
Fundamental Data Structures - Coggle Diagram
Fundamental Data Structures
Sets
Operations
Checking membership of a given item in a given set
Finding the union of two sets, which comprises all the elements in either or both of them
Finding the intersection of two sets, which comprises all the common elements in the sets
Implementations
Using the list structure to indicate the set's elements
Considering only sets that are subsets of some large set
U
, called the universal set. If set
U
has
n
elements, then any subset
S
of
U
can be represented by a bit string of size n, called a
bit vector
, in which the ith element is 1 if and only the ith element of
U
is included in set
S
A
set
is an unordered collection (possibly empty) of distinct items called
elements
of the set.
Dictionaries
Operations
void clear(Dictionary d);
Empties the dictionary
void insert(Dictionary d, Key k, E e);
Inserts one element indexing it with
h(k)
value
E remove(Dictionary d, Key k);
Removes and returns the element indexed by
h(k)
value
E removeAny(Dictionary d);
Pass by through all the elements in a Dictionary while removing it
E find(Dictionary d, Key k);
Finds and returns the element that is indexed by
h(k)
value
int size(Dictionary d);
Returns the size of the Dictionary
A dictionary is a
set
of elements with the options of
searching
,
insertion
and
deletion
. The elements of a dictionary are indexed by
comparable
keys
Implementations
Hashing
It is a very efficient way to implement dictionaries and is based on the idea of distributing keys among a hash table
Hash Table
Is a one-dimensional array H[0..
m
-1]
A hash table's
size
should not be excessively large compared to the number of keys, but it should be sufficient to not jeopardize the implementation's time efficiency
Hash Function
It is a predefined function
h
that computes its value for each of the keys
The hash function assigns an interger between 0 and m-1, called the
hash adress
, to a key
Requirements
Has to be
easy to compute
Distribute keys as
evenly
as possible
Adequate
relation
between m and |Key|
Limitations
Not ideal in the presence of
multiple entries
per key
Not ideal for
accessing
elements based on some
order
Not ideal for searching based on a
range of keys
Versions of Hashing
Open Hashing (Separate Chaining)
Each hash address is associated with a linked list
Temporal Efficiency
Insertion (unsorted): θ(1)
Search: depends on the load factor α = n/m
When α is aproximately equal to 1, then θ(1) (in average)
Deletion: similar to the temporal efficiency of searching
Closed Hashing (Open Adressing)
All the elements are stored in the hash table itself (requirement: m >= n)
Strategies for defining an
offset
in the presence of collisions
Linear probing
Pseudo-random probing
Quadratic probing
Double hashing
The strategy should ensure that every hash address can be reached when solving a collision
Balanced Trees
Array
Linked List
Space and Time Trade-Offs
Algorithm design techniques related to the space-time
trade-off idea
Input enhancement
The idea is to preprocess the problem’s input,
in whole or in part, and store the additional information obtained to accelerate solving the problem afterward
Algorithms based on it
Counting methods for sorting
Boyer-Moore algorithm for string matching and its simplified version suggested by Horspool
Prestructuring
This technique simply uses extra space to facilitate faster and/or more flexible acess to the data. Some processing is done before a problem in question is actually solved but, unlike the input enchacement variety, it deals with acess structuring
Algorithms based on it
Hashing
Indexing with B-trees
Dynamic programming
This strategy is based on recording solutions to overlapping subproblems of a given problem in a table from which a solution to the problem in question is then obtained
A
space–time tradeoff
is a case where an algorithm or program trades increased space usage with decreased time.