Please enable JavaScript.
Coggle requires JavaScript to display documents.
OLAP optimization (Query processing on a centralized DBMS architecture…
OLAP optimization
-
Query optimization
- express SQL query with relational algebra
Evaluation plans
expression of query execution with detailed
evaluation strategy - cost based query optimization
Heuristic approach, do many plans and choose
lowest, avoid the most expensive ones, don't search the
cheapest/lowest
Search Space
many algortithms to use, possibilties to implement
all operators from relational algebra, to do all beyond the scope
Tuning
Query optimizer can visualize evaluations plans
possible rules:
- definition of primary keys for all table
- explicit creation of indexes
- optimization of main memory allocation
Further techniques for OLAP
- Denormalization
- Fragmentation
- Materialized views
-
Fragmentation
Horizontal Fragmentation
- partition table into disjoint subsets of rows (with partition attribute)
- subset can be on different machine (scalability, parallel processing, I/O reduction)
- supported by lots of DBMSs
- Sharding: horizontal fragmentation in shared-nothing architecture
Vertical Fragmentation
- subsets of attributes
- each subset includes the key attribute
- fragment tables are projections (parallel processing, scalability, acces only on necessary attributes not whole rows)
- typically no DBMS support, but often column store
Indexing
Basic Concepts
- Search Key (attribute to look up files search key not key
- Index file (records of form Search Key/Pointer, smaller then original file)
Two kinds of index:
- ordered index (search key in sorted order)
- hash index (seach keys are distributed uniformly across "buckets" using a hash function
Index structures
- B+- trees
- Bitmap index
- Hash tables
Clustered Index (primary Index)
- index on attribute x pointing to records sorted by X-values
-
Sparse Index
index records for only some search key values
- find index record with largest search key value < K
- search sequentially starting at record to which index records points
Non-clustered Index (secondary Index)
- no table organization required
- possible several non-clustered indexes per table
Example
- index record points to a bucket that contains pointers to all the actual records with that particular search key value
- have to be dense
Materialized views
- relation that stores tuples of a view
- acts like a cache for the view
- can be indexed
- clustered, partitioned
- supported by major DBMS