Engine Essentials
metrics
Each property of the metric editor provides control over different clauses in SQL statements used to calculate metrics.
Transformation
where, and/or from
Condition
where
Level(Dimensionality)
Use a hierarchy as the target, and the level metric dynamically changes when you drill.
The target changes as you drill, while the filtering and grouping settings stay fixed, so the percentage can be calculated at every level of the hierarchy.
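A minimal sketch of the percent-contribution idea, using a hypothetical mini-schema (table and column names are illustrative): a level metric with target Region keeps its grouping at Region even when the report shows Item, so dividing the report-level revenue by it yields each item's contribution.

```python
import sqlite3

# Hypothetical schema: item-level sales rows.
con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE item_sls (region TEXT, item TEXT, revenue REAL);
INSERT INTO item_sls VALUES
  ('East','A',100),('East','B',300),('West','A',200);
""")
rows = con.execute("""
SELECT a.region, a.item,
       SUM(a.revenue)                      AS item_rev,    -- report-level metric
       b.region_rev,                                       -- level metric (target: Region)
       SUM(a.revenue) * 1.0 / b.region_rev AS pct_contrib
FROM item_sls a
JOIN (SELECT region, SUM(revenue) AS region_rev
      FROM item_sls GROUP BY region) b ON a.region = b.region
GROUP BY a.region, a.item, b.region_rev
ORDER BY a.region, a.item
""").fetchall()
```

Drilling from Region to Item only changes the outer GROUP BY; the region-level subquery (the level metric) is unchanged.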
Order of Resolution
Metric looks to Template
Template looks to Filter
use shortcut to the prompt "Choose an Attribute from Geography" as the report filter
Use "Geography" in the template
Advanced Settings
filter setting: uncheck to exclude attributes absent in report or level (dimensionality)
For example, the report filter contains (Year in 2007) but Year is not on the report template; with the setting above unchecked, the Year filter will not take effect.
3.Target
affects the SELECT and GROUP BY clauses
target is the attribute level at which the metric calculation will group
2.Grouping
None
GROUP BY is often omitted from the SQL pass for the level metric; if there is an unrelated attribute on the template, the level metric will still select and group by that unrelated attribute.
Does not group by the target attribute or any of its children.
Standard
Groups by the attribute level specified in the target
determine how the metric will be aggregated
1.Filtering
none
e.g. the level metric will be calculated from the most detailed fact table, ITEM_EMP_SLS, instead of referencing the higher level fact table, CITY_CTR_SALES, because its target is ITEM and it is defined with None filtering and None grouping.
Points the calculation of the level metric to a particular table
Typically used in combination with the None grouping option and referencing a particular attribute as the target
ignore
Completely ignores any related filtering criteria; unrelated criteria (those not in the same schema) are unaffected;
absolute
Place the filtering criteria that are related to the target in a subquery
Raise the level of the filter to the target level;
not only the ones that are in the filter or in the report template
standard
affects the WHERE clause
commonly used to achieve a percentage contribution compound metric
used to control the level of aggregation of a metric on a report
where, group by, and/or from
Formula
select, and aggregate functions (sum, count, avg, min, max)
Non-aggregate Metrics
A simple metric that does not aggregate across a particular attribute or hierarchy.
e.g. Inventory or stock calculations.
An inventory metric may aggregate across Geography and Product, but not for Time.
Non-aggregatable options
Beginning Value
Sets the aggregation to the beginning on hand values (BOH)
Ending Value
Sets the aggregation to the ending on hand values(EOH)
first or last value in the lookup table, using the min() or max() function, e.g. min() or max() applied on Month_ID
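A minimal sketch of the ending-value option, with an illustrative table: EOH over a period is the on-hand value at MAX(month_id); BOH would use MIN instead.

```python
import sqlite3

# Hypothetical inventory table: monthly on-hand values per store.
con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE inv (month_id INTEGER, store TEXT, on_hand INTEGER);
INSERT INTO inv VALUES (200701,'S1',50),(200702,'S1',40),(200703,'S1',70);
""")
# Ending on hand (EOH): take the row at MAX(month_id) rather than summing.
eoh = con.execute("""
SELECT i.store, i.on_hand AS eoh
FROM inv i
JOIN (SELECT store, MAX(month_id) AS m FROM inv GROUP BY store) x
  ON i.store = x.store AND i.month_id = x.m
""").fetchall()
```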
Fact Table versus Lookup Table
fact table
temporary tables are used
lookup table
subquery is used
sets in Metric Level, Grouping
It's recommended to use a hierarchy in the template, e.g. the "Time" hierarchy; then in the metric level set the target to Time and Grouping to Beginning, so that all Time-related attributes become non-aggregatable.
Conditional Metrics
Metric definition that contains filtering criteria
Only one filter can be referenced, but that filter may contain multiple qualifications
Transformations
Used for time-series analysis.
e.g. This year Sales compared to Last Year Sales, or Month to Date calculations.
Time-series analysis presents a difficult challenge because it cannot be assumed that each day maps to the same day from the previous year.
examples:
Financial Calendars
Monday to Monday analysis
this is a common scenario in Retail
Floating holidays
Company buy-outs
Expression-based Transformation
Table-based Transformation
which are schema objects
What are the distinguishing parts of SQL for a transformation metric
The transformation table will be referenced in the FROM clause of the SQL, and particular joins will be performed.
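A minimal sketch of a table-based transformation, with hypothetical table names: LY_DAY maps each day to last year's day, and the engine joins the fact table through it to shift the data one year back.

```python
import sqlite3

# Hypothetical schema: daily sales plus a transformation table.
con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE day_sls (day_id INTEGER, rev REAL);
INSERT INTO day_sls VALUES (20070101, 10), (20060102, 7);
CREATE TABLE ly_day (day_id INTEGER, ly_day_id INTEGER);
INSERT INTO ly_day VALUES (20070101, 20060102);
""")
rows = con.execute("""
SELECT t.day_id, SUM(f.rev) AS last_year_rev
FROM ly_day t
JOIN day_sls f ON f.day_id = t.ly_day_id   -- the transformation join
GROUP BY t.day_id
""").fetchall()
```

Because the mapping is a table, irregular cases (floating holidays, Monday-to-Monday calendars) are just rows in LY_DAY rather than date arithmetic.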
Nested Metrics
enable users to break calculations into many levels
each level can have its own dimensionality, conditionality, and transformations.
scenario
How can I find the Minimum Item Revenue for each Category?
schema
Category->Subcate->Item->Fact_Table
Metric: Min(Sum(Revenue){Item}){Report Level}
inner: Revenue for each Item; outer: Minimum Item Revenue for each Category
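The two nesting levels can be sketched as nested SELECTs (table and column names are illustrative): the inner pass aggregates Revenue to Item, the outer pass takes MIN per Category.

```python
import sqlite3

# Hypothetical fact table at the Item level.
con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE fact_tbl (category TEXT, item TEXT, revenue REAL);
INSERT INTO fact_tbl VALUES
  ('C1','A',10),('C1','A',20),('C1','B',5),('C2','C',40);
""")
rows = con.execute("""
SELECT category, MIN(item_rev) AS min_item_rev
FROM (SELECT category, item, SUM(revenue) AS item_rev   -- inner: {Item}
      FROM fact_tbl GROUP BY category, item)
GROUP BY category                                       -- outer: {Report Level}
ORDER BY category
""").fetchall()
```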
Metric Join Type
types
Inner Join
Outer Join
can be changed via
metric editor
report data option
for all the metrics in the report, overwrite the settings in metric editor
These settings apply when multiple metrics are on a report.
Whether a row appears on the report is determined by whether a corresponding metric value exists
analytical engine
Three primary duties
Fill in SQL logic where applicable during the Query stage;
advanced functions with respect to resolution of metrics or metric qualifications, e.g. RunningSum(Revenue) > 0
Custom Grouping banding
Populate data during the Populate/Evaluate stage
Smart metric
Consolidation
Report Limit
report->data menu->'report data' option
Appends a HAVING clause to the existing SQL; a metric qualification instead generates a separate SQL pass
Compared with a metric qualification filter, a report limit is more efficient because no SQL pass is added; however, when the qualification must be evaluated at a specific output level, only a metric qualification can be used
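The difference can be sketched in SQL (illustrative table names): the report limit appends HAVING to the existing pass, while the metric-qualification style spends an extra pass building the qualifying set and joining back.

```python
import sqlite3

# Hypothetical sales table.
con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE sales (cust TEXT, rev REAL);
INSERT INTO sales VALUES ('c1',100),('c1',50),('c2',80);
""")
limit_rows = con.execute("""
SELECT cust, SUM(rev) FROM sales
GROUP BY cust HAVING SUM(rev) > 100                 -- report limit: same pass
""").fetchall()
mq_rows = con.execute("""
SELECT s.cust, SUM(s.rev) FROM sales s
JOIN (SELECT cust FROM sales GROUP BY cust
      HAVING SUM(rev) > 100) q ON s.cust = q.cust   -- metric qualification: extra pass
GROUP BY s.cust
""").fetchall()
```

Both return the same rows here; the report limit just gets there in one pass.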
Subtotal
subtotals in the report grid, such as total(), maximum(), average(), are all done by analytical engine, and not reflected in the SQL for database;
Cross-tab final results for display during the Cross-tab stage
Sorting, pivoting, page-by, thresholds
Analytical SQL calculated by Analytical Engine
The median() function is not supported by the database's SQL, so the Analytical Engine computes it;
scenario: report limit, Median(Revenue) > 1000
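A minimal sketch of that scenario (values are illustrative): the SQL Engine fetches the revenue rows, then the Analytical Engine evaluates Median(Revenue) > 1000 in memory rather than in SQL.

```python
from statistics import median

# Illustrative revenue values fetched by the SQL pass.
revenues = [500, 1200, 2000, 800, 1500]

# The Analytical Engine's part: evaluate the limit on the result set.
passes_limit = median(revenues) > 1000
```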
Compound metrics
Created by combining 1 or more pre-existing metrics using 1 or more math operators;
such as median(Revenue), RunningSum(Revenue), Revenue - Cost = Profit;
Note the Smart Metric setting; a common scenario is computing ratios such as year-over-year growth: the effect is that the subtotal is the ratio of the sums, not the sum of the ratios
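The smart-metric subtotal behavior can be sketched with illustrative numbers: for a ratio metric Revenue/Cost, the smart subtotal divides the subtotaled components instead of subtotaling the row-level ratios.

```python
# Illustrative (revenue, cost) pairs for two rows of a report.
rows = [(100, 50), (300, 100)]

# Plain subtotal of a ratio metric: sums the row-level ratios.
plain_subtotal = sum(r / c for r, c in rows)

# Smart metric subtotal: ratio of the subtotaled components.
smart_subtotal = sum(r for r, _ in rows) / sum(c for _, c in rows)
```

Here the plain subtotal gives 5.0 while the smart subtotal gives 400/150, which is usually the number the business actually wants.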
Evaluation Order of Analytical Engine
default
first... Metric Qualification
then...
Compound Metric(smart)
Consolidation
Report Limit
Subtotal
lastly...page-by, sorting, and cross-tabbing
scenario
One scenario that requires changing the order is a smart metric applied to a consolidation; the default order calculates the smart compound metric before the consolidation
The evaluation order can be changed in report -> data -> report data options;
analytical functions
functions
operators
plug-in packages
three categories:
can be processed only by the analytical engine
can be processed only by the database
can be processed by either
if the database can perform the calculation, the SQL engine sends the instructions to the database, otherwise, the Analytical Engine processes the task;
Case
The CASE function relies on the Analytical Engine for processing rather than using the database;
the case function allows for multiple comparisons;
The Case expression can be used in an ApplySimple function. Using ApplySimple ensures that the database handles the processing.
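A minimal sketch of the CASE pattern that ApplySimple would push down to the database (table and band names are illustrative): the multiple comparisons are evaluated in SQL rather than by the Analytical Engine.

```python
import sqlite3

# Hypothetical sales table.
con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE sls (item TEXT, revenue REAL);
INSERT INTO sls VALUES ('A', 150), ('B', 60);
""")
# The database evaluates the CASE comparisons directly.
rows = con.execute("""
SELECT item,
       CASE WHEN revenue >= 100 THEN 'high' ELSE 'low' END AS band
FROM sls ORDER BY item
""").fetchall()
```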
OLAP functions - Moving Average
Depending on the database, the moving average can be processed by the database or the Analytical Engine;
Date and Time Functions
Not supported by the Analytical Engine
Have to be resolved by the database;
templates
Attributes
Basic schema objects of the logical data model
Provide context for metrics on reports
Belong to the overall System Hierarchy, give hints on the level of metrics, such as region, or call center
Attribute Form
Every attribute must have at least one attribute form defined in MSTR Architect
Typically, an attribute will have the ID form and the primary description form.
Typically, the ID form will be the attribute key (the unique identifier for the attribute)
Hierarchies
hierarchies can be placed on a template as "wildcard" output levels;
At least one attribute of the hierarchy needs to be used in the report filter.
Final report output level will be determined by the filter selections.
We don't see hierarchies in sql, they are replaced by attributes.
Consolidations
Allow users to create customized element groupings and calculation,
No change to the warehouse schema or the logical data model is required;
Cons & limitations: consolidations cannot be targeted for level metrics; cannot be used in set qualifications; cannot be used as function parameter; cannot be used exactly the same as the attribute.
The SQL Engine generates the SQL required to retrieve the necessary data; the Analytical Engine performs the calculations contained in the consolidation definition;
custom groups
A more advanced way of grouping attribute elements from the same or different attributes;
Filters are used to define custom group elements; e.g. name starts with "north"
types:
an attribute qualification
a set qualification
a shortcut to a report
An existing report is used to determine output level;
The filter and attributes of the original report are used to form the filter for the new report
example scenario:
step1. Top Senior Employees report SQL
Report limit (Top 5 by length of Employment) -> report returns 5 rows
Step 2. Revenue for Senior US Employees report SQL; the total and the individual values are each computed in a separate SQL pass
a shortcut to a filter
a custom group banding qualification
this option only exists in the custom group building editor; the other four are general filter definition scenarios;
slice the data into multiple ranges or bands based on metric values, rank or percent;
custom group banding allows you to create these bands in a variety of ways so that each band appears as a row on your report
banding types
band size
band count
In the band count example, plain SQL is used; there is no CREATE statement, only SELECT INTO
banding points
In the banding points example, the Analytical Engine computes analytical SQL and performs the band creation; temporary tables are created with CREATE along the way;
banding for each distinct metric value
you must define an output level when defining custom group banding
each custom group element generates its own SQL pass, and the Analytical Engine is in charge of combining the data and displaying it together;
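Band-size banding can be sketched with illustrative values: metric values are sliced into fixed-width bands, and each band becomes a row on the report once the Analytical Engine combines the data.

```python
# Illustrative band-size banding: width-100 bands over a revenue metric.
band_size = 100
revenue_by_cust = {'c1': 40, 'c2': 150, 'c3': 260}

# Each customer falls into the band floor(value / band_size).
band_of = {cust: int(v // band_size) for cust, v in revenue_by_cust.items()}
```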
Consolidations generate more efficient SQL than custom groups.
When a custom group is defined only with attribute qualifications, consider whether a consolidation should be used instead;
example scenario
step 1. SQL pass for metric qualification
step 2a. SQL pass to calculate the metrics on the report for the corresponding Custom Group element; total value for the custom group
step 2b. SQL pass to calculate the metrics on the report for the individual items within the Custom Group element; individual value for every item in the custom group
VLDB properties
basic intro
VLDB
stands for Very Large Database
A setting that modifies the Engine behavior to perform database-specific optimizations (customizing SQL)
can be defined at the following levels
Database Instance
Project
Analytical Engine VLDB properties
The Project level settings contain only Analytical Engine-related and MDX-related VLDB properties.
Report
Template
Metric
Those that have been changed are marked with a "check" label.
How are VLDB properties applied?
The engines query the metadata to gather schema information, and VLDB property settings, and then generate the SQL, which will be used to query the data warehouse.
Facts or Myth?
All VLDB properties modify the SQL generated by the Engine.
False; some settings, such as the SQL size limit and the SQL execution time limit, do not change the SQL
MSTR applies default VLDB property values for every database.
true
These settings are stored in the metadata.
True, so they are preserved during a metadata upgrade
Benefits
Optimization
database-specific settings enhance the performance of queries
Flexibility
VLDB properties can be set at multiple levels, even on report-by-report basis.
Ongoing database support
new versions will take advantage of new database platforms and versions as they become available.
VLDB Property Groupings
a series of folders into which the VLDB and Analytical Engine properties are grouped.
Analytical Engine
Freeform SQL
Dynamic Sourcing
Dynamic Sourcing logs in SQL View
Shows why a certain report was not able to use Dynamic Sourcing.
Four types of logs available in report SQL view.
Intelligent Cube Parse log
whether or not a cube has been defined to be available for dynamic sourcing
Report Parse log
Mismatch log
why a report cannot use a specific cube
Extended Mismatch log
related to a metric, why a report cannot use a cube
use VLDB sourcing to enable appropriate logs
Governing
Max SQL/MDX Size, SQL Time Out (Per Pass)
Maximum SQL/MDX Size: sets the maximum size (bytes) of the SQL that is accepted by the ODBC driver.
SQL Time Out: maximum duration (seconds) allowed for each SQL pass.
Result Set Row Limit
only in the Report level
limits the number of rows returned to the Intelligence Server for the final result set.
Indexing
An index organizes data to enable queries to access the data faster.
Proper tuning of indexes is essential to high performance.
Common types of indexes
Primary Key Index
B-tree
Bitmap
Index-organized tables
Indexes can be created for lookup tables or fact tables.
Using MicroStrategy's VLDB properties, you can specify if you want to have indexes created on intermediate tables.
determine whether and when to create an index on an intermediate table.
Joins
Preserve All Lookup Table Elements
Performs an outer join to the lookup table in the final pass.
Values: Preserve common elements of lookup and final pass result table.
Preserve All Final Pass Result Elements.
The final pass result is usually the fact table with the retrieved data; joining it with the lookup table to obtain the descriptive forms for the leftmost ID column produces the final result.
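A minimal sketch of the Preserve All Lookup Table Elements behavior (illustrative table names): an outer join from the lookup table to the final pass keeps regions that have no fact rows.

```python
import sqlite3

# Hypothetical lookup table and final-pass result table.
con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE lu_region (region_id INTEGER, region TEXT);
INSERT INTO lu_region VALUES (1,'East'),(2,'West');
CREATE TABLE final_pass (region_id INTEGER, rev REAL);
INSERT INTO final_pass VALUES (1, 100);
""")
# Outer join to the lookup table in the final pass: 'West' survives with NULL.
rows = con.execute("""
SELECT l.region, f.rev
FROM lu_region l LEFT JOIN final_pass f ON l.region_id = f.region_id
ORDER BY l.region
""").fetchall()
```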
MDX
Metrics
Pre/Post Statements
The Pre/Post Statements group enables you to hard-code SQL statements in different parts of a query.
report pre/post statement
table pre/post statement
insert pre/mid/post statement
cleanup poststatement
When using datamarts, the report Poststatement is executed against the datamart database.
wildcards
example
Custom INSERT statement captures information about report and RDBMS server usage.
good for logging purpose.
Query Optimization
Significantly reduces the number of SQL passes that the MicroStrategy SQL Engine generates, by
eliminating unused passes
reusing intermediate SQL passes
combining SQL passes
values
level 0: No optimization
level 1: Remove unused and duplicate passes
level 2: level 1 + Merge passes with different SELECT
level 3: level 2 + merge passes that hit only database tables, with different WHERE
level 4: level 2 + merge all passes with different WHERE (this is the default value)
Select/Insert Statements
Tables
Intermediate Table Types
Defines the type of object used to store intermediate results for multipass SQL.
values
permanent table
by using "Create Table"
Derived table
by using sub-query (nested)
true temporary table
create table #zzmd00
common table expression , DB2 UDB only
with a21 as (sub-query for temp table) select * from a21
no "create table"
temporary view
create view zzmd00
MicroStrategy's general recommendation is to use derived tables and common table expressions in place of permanent tables whenever possible, depending on your database platform, because derived tables and common table expressions don't use CREATE TABLE and therefore need no DROP TABLE.
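The CTE form can be sketched directly (illustrative table names): the same intermediate result as a permanent table, but expressed inline, with no CREATE TABLE or DROP TABLE.

```python
import sqlite3

# Hypothetical sales table.
con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE sales (cust TEXT, rev REAL);
INSERT INTO sales VALUES ('c1',100),('c1',50),('c2',80);
""")
# Intermediate result a21 lives only inside this one statement.
rows = con.execute("""
WITH a21 AS (SELECT cust, SUM(rev) AS tot FROM sales GROUP BY cust)
SELECT cust, tot FROM a21 WHERE tot > 100
""").fetchall()
```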
FallBack Table Type
If the Intermediate Table Type VLDB property is set to either derived tables, common table expressions, or views, you may run into a scenario where a particular reporting requirement cannot be resolved using that intermediate table type.
example scenario
ODBC ranking intermediate tables (on a metric qualification) cannot be resolved with derived tables.
In this case, the Engine needs to refer to the Fallback Table Type to know how to store intermediate result sets when resolving the report.
values
Permanent Table
True Temporary Table
Fail Report
Maximum SQL Passes Before Fallback / Maximum Tables in FROM clause before Fallback
If the report exceeds this number of SQL passes, then the FallBack Table Type will be used for intermediate tables instead of the Intermediate Table Type.
Designed to increase the number of reports that can use derived tables or common table expressions as the intermediate table types.
basic optimizations
multipass SQL
example scenarios
reports with more than one metric(stored in separate fact tables)
reports using metric qualifications (filtering is based on metric qualification)
reports using custom groups(which use metric qualifications)
reports using metrics defined at different levels(e.g. one in report level, the other defined in a metric qualificaiton)
simulating outer joins on RDBMS platforms that do not support them natively
and others
Composing optimized SQL
composing the SELECT and GROUP BY clauses: use max() on description columns to keep them out of the GROUP BY clause; with GROUP BY the sort is on the description column, while with max() the sort is on the ID column, and by default sorting on the ID column is faster; an optimization technique for attribute descriptions in the SELECT clause;
optimizing the FROM clause: by default, the MSTR engine orders tables in the FROM clause as (1) fact tables, (2) metric qualification (MQ) tables, (3) relationship tables, (4) lookup tables; for most modern DBMSs the order does not matter, but for some databases it does, and it can be changed in the VLDB settings;
Logical table size: determines which table to choose (tip: if a filter exists on an attribute of a table, that table will always be chosen)
Optimizing Joins
the SQL engine also tries to optimize how it performs the joins between tables;
If the two tables to join have more than one column in common, the column corresponding to the lowest level attribute in each table will be referenced to perform the necessary joins; e.g. call-center or region? call-center will be chosen because it's the lower level in hierarchy;
how you define the relationships between attributes in MSTR Architect will ultimately affect how the SQL engine performs these joins;
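The column-selection rule above can be sketched with illustrative column names: when two tables share both Region and Call Center columns, the join uses the lower-level attribute, since it implies the higher one.

```python
# Illustrative column sets for a fact table and a lookup table.
fact_cols   = {'call_ctr_id', 'region_id', 'revenue'}
lookup_cols = {'call_ctr_id', 'region_id', 'call_ctr_desc'}

# Hierarchy depth per attribute column: deeper number = lower level.
level = {'region_id': 1, 'call_ctr_id': 2}

# Of the common attribute columns, join on the lowest-level one.
common = fact_cols & lookup_cols & set(level)
join_col = max(common, key=level.get)
```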
Attribute Definitions - Impact on joins
Be careful assigning parent-child relationships. e.g. A missing relationship in Architect may yield incorrect numbers on reports;
Be sure to include all necessary form expressions. e.g. "Day" attribute maps to Day_Date and Order_Date in a total of eight warehouse tables;
The Engine sees every MSTR object as some type of expression:
attribute form expressions
fact expressions
metric expressions
filter expressions
consolidation expressions
custom group expressions
subtotal expressions
transform expressions
intermediate tables
An intermediate table is a table that the MSTR SQL Engine creates on the database to store temporary data;
Is used to calculate the final result set;
depending on the RDBMS, to implement multi-pass sql
Intermediate table types
On databases where true temporary tables are supported, the MSTR SQL Engine can be configured to create the table type of choice using the Intermediate table type VLDB property
permanent table
Access (no true temporary tables)
to be avoided unless there is no alternative
true temporary table: Oracle, SQL Server, DB2, Teradata; may have different names on different platforms; no logging, no index; when the database connection is closed, the data is lost;
common table expression: DB2; similar to derived table, different syntax, also multipass sql in a single statement
derived table: Teradata
execute multipass sql in a single statement, done in a nested select statement, no additional table creating and dropping, so no logging, so this is faster;
temporary view, least used temp table type
MSTR generally recommends using derived tables or common table expressions if your RDBMS platform supports them, such as DB2 and Teradata;
the default Intermediate Table Type VLDB property is determined by the RDBMS
naming rules: database connection parameters; 'SP' stands for "split metric" (which explains why temp tables are needed: metrics stored in different tables are referenced on one report)
filters
attribute qualifications
All attribute qualifications, including advanced types like Dynamic date filters, Attribute-to-attribute comparisons, and Joint element list, are applied in the WHERE clause.
set qualifications
on metric
metric qualification
basic: on a value, rank, or percent
in SQL
Generates a temporary table holding the customer IDs that satisfy the metric condition
Retrieves the customer list from the temporary table; the data is fetched in this step
In this example scenario, a report limit would actually produce more efficient SQL, but the scenario is only meant to show how a basic metric qualification works.
Note that the report limit works here only because the metric qualification's output level is the default report level
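The two passes can be sketched as follows (illustrative table names; MSTR names these intermediate tables like ZZMQ00): pass 1 stores the qualifying customer IDs in a temporary table, pass 2 joins back to fetch the data.

```python
import sqlite3

# Hypothetical orders table.
con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE orders (cust TEXT, rev REAL);
INSERT INTO orders VALUES ('c1',100),('c1',50),('c2',80);
CREATE TEMP TABLE zzmq00 AS
  SELECT cust FROM orders GROUP BY cust HAVING SUM(rev) > 100;  -- pass 1
""")
rows = con.execute("""
SELECT o.cust, SUM(o.rev)
FROM orders o JOIN zzmq00 t ON o.cust = t.cust                  -- pass 2
GROUP BY o.cust
""").fetchall()
```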
Metric Qualification with Output Level
the output level will be applied in a "group by" clause
Metric Qualification with Advanced Option
Default
The default setting is "When resolving this qualification in a report, other qualifications will be taken into consideration and merged."
Disabled
scenario: the report filter contains the two conditions (Revenue > 6000) and (Year = 2007)
With the advanced option of (Revenue > 6000) set to Default (merged with other qualifications), both conditions appear in the first SQL pass that creates the temporary table; if it is Disabled, only (Revenue > 6000) appears in the first pass, and (Year = 2007) appears only in the second, data-retrieval pass.
In this example the two settings return different data.
Controls whether other qualifications in the filter are taken into consideration when calculating this metric qualification
on relationships
A fact or table must be given to relate the output level and filter qualification when defining the relationship filter; in practice a fact is itself defined on a table, so what is ultimately referenced is still the table behind the fact.
Check the report SQL to verify that the relationship filter is working correctly.
relationship filters
Scenario #1: report request: Show me a simple list of customers who purchased the movie "Caddyshack".
examples
List of stores that sell Nike shoes in the Washington D.C. area.
List of catalogs that carry the book Fitness for Dummies.
List of customers that bought items A and B
parts
3.Output level
resulting attribute elements
2.Filter qualification
desired criteria for analysis
1.Relationship
established via a fact, a table, or the system default
Relates two attributes that are not logically related.
Scenario #2: report request: Show me a list of customers who purchased the movie "CaddyShack" and the movie "Austin Powers" and their Revenue.
implemented by sub-query in where;
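Scenario #2 can be sketched as sub-queries in the WHERE clause (illustrative table names): customers who bought both movies, with their revenue.

```python
import sqlite3

# Hypothetical movie sales table.
con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE mv_sls (cust TEXT, movie TEXT, rev REAL);
INSERT INTO mv_sls VALUES
  ('c1','Caddyshack',10),('c1','Austin Powers',12),('c2','Caddyshack',9);
""")
# The relationship filter becomes IN sub-queries on the customer list.
rows = con.execute("""
SELECT cust, SUM(rev) FROM mv_sls
WHERE cust IN (SELECT cust FROM mv_sls WHERE movie = 'Caddyshack')
  AND cust IN (SELECT cust FROM mv_sls WHERE movie = 'Austin Powers')
GROUP BY cust
""").fetchall()
```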
Advanced Option
default-disabled
the filter qualification is applied only in the relationship sub-query
enabled
the filter qualification is applied to the overall report SQL
Report As Filter
You can use the report as filter option in the Filter Editor to create similar reports to those that use relationship filters.
Report as Filter option in the Filter Editor
To create similar reports to those that use relationship filters
With the report as filter option, the report that is used as a filter for a second report contains the same logical components that comprise the typical relationship filter
Relationship
the metric on the template
Filter qualification
the report filter
Output level
the attributes on the template
Check the generated SQL carefully when using this option
For the same scenario, a relationship filter is implemented with a sub-query, while Report As Filter is implemented with multipass SQL
Sometimes this approach makes the logic easier to reason about
e.g. the scenario: for customers who bought Item A, what is their revenue on Item B?
Multiple-pass SQL
The engine builds "sets" via SQL passes and joins them together in later SQL passes in a report.
Both types use an "output level" to determine the attribute level at which the qualification is applied.
filters are treated as expressions by the engines.
tips:
Disable dynamic sourcing when doing the exercises; otherwise reports will source from cubes if they exist, and in the exercises we want to see the SQL statements;
report level: the lowest level attribute on the report
Whenever possible, the MSTR engines let the database do the calculation, relying on SQL and the database's native functions;
intro to mstr engines
Engines
SQL engine
generates SQL and dictates the logic in the report execution process;
manages the other two engines;
resolves metric dimensionality, and anything that can be resolved in SQL;
identifies the appropriate lookup columns and fact table;
Query Engine
Executes the SQL generated by the SQL engine.
interact with the analytical engine to complete the SQL logic;
Analytical engine
performs advanced analytical calculations(anything non-sql), cross-tabs the report and handles multidimensional storage.