Data Strategy (Data Warehouse
& Data Marts (Questions to be
Resolved…
Data Strategy
Data Lake
Principles
Use the Data Lake as an "event lake" to
store events. Ex: transactional events,
supply chain events,
shipping events, order events
From the event store data,
build profile DBs (polyglot DBs),
ex: a Customer hub (customer profiles)
that has nested JSON structures (schema) with keys & hierarchy.
Ex: loyalty transactions in one array
and all behavioral events (click events) in another array.
Use atomic schema objects
(ex: a loyalty schema object) together with other schema objects
to assemble Data products
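A minimal sketch of how such a nested customer profile might be assembled from raw events, assuming each event carries a `customer_id` and a `type` field. The field names, the sample events, and the `build_profiles` helper are hypothetical and only illustrate the "one array per event family" idea:

```python
from collections import defaultdict

# Hypothetical event records as they might be read from the event lake.
events = [
    {"customer_id": "c1", "type": "loyalty", "points": 120, "ts": "2024-01-05"},
    {"customer_id": "c1", "type": "click", "page": "/offers", "ts": "2024-01-06"},
    {"customer_id": "c2", "type": "loyalty", "points": 40, "ts": "2024-01-07"},
]


def build_profiles(events):
    """Group events by customer into one nested document per customer."""
    profiles = defaultdict(
        lambda: {"loyalty_transactions": [], "behavioral_events": []}
    )
    for event in events:
        profile = profiles[event["customer_id"]]
        # Keep only the payload; the routing fields are implied by the nesting.
        payload = {k: v for k, v in event.items() if k not in ("customer_id", "type")}
        if event["type"] == "loyalty":
            profile["loyalty_transactions"].append(payload)
        else:
            # Assumption: everything that is not a loyalty event is behavioral.
            profile["behavioral_events"].append(payload)
    return {cid: {"customer_id": cid, **doc} for cid, doc in profiles.items()}


profiles = build_profiles(events)
```

Each value in `profiles` is a JSON-serializable document; the `loyalty_transactions` array plays the role of an atomic schema object that a data product could reuse on its own.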
Questions to be
Resolved
What are the costs associated
with having HANA, Teradata
and Hadoop?
Can we reduce the footprint of one or
two of them? Maybe even
do without one or two of them?
Data Services
Principles
Major data services must be domain-based
data services. For ex:
Sales, Transportation, Customer,
Product, Doctors, Patients, etc.
Questions to be
Resolved
Should data services be built on
source systems, data marts,
the Data Warehouse or the Data Lake?
When & why?
Should data services
use an IMDG (in-memory data grid)?
If so, when/why?
Should apps get data
from multiple sources and
then join them,
vs.
a data service joining them
prior to the app's consumption?
Should apps that get data
through data services (domain or source
data services) keep a copy of that data
for their own use? If yes, what are the
guidelines?
Do you create domain data
services by calling source
system data services? (Sometimes
domain data services may need
to call an ODS if there are no data services available
on the source systems.)
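One possible shape for this, sketched under the assumption that every backing system can be reached either through its own source data service or, failing that, through an ODS reader. `make_domain_service`, the system names, and the sample data are all hypothetical:

```python
from typing import Callable, Optional

# A fetcher takes a customer id and returns that system's record, or None.
Fetcher = Callable[[str], Optional[dict]]


def make_domain_service(
    source_services: dict[str, Fetcher],
    ods_readers: dict[str, Fetcher],
) -> Callable[[str], dict]:
    """Build a customer domain service that prefers source data services
    and falls back to an ODS reader where no source service exists."""

    def get_customer(customer_id: str) -> dict:
        result = {"customer_id": customer_id}
        for system in sorted(set(source_services) | set(ods_readers)):
            # Prefer the source system's own data service when available.
            fetch = source_services.get(system) or ods_readers[system]
            record = fetch(customer_id)
            if record:
                result[system] = record
        return result

    return get_customer


# Stand-ins for real backends: a CRM with a data service, and a legacy
# order system that is only reachable through an ODS copy.
crm = {"c1": {"name": "Ada"}}
legacy_ods = {"c1": {"open_orders": 2}}

get_customer = make_domain_service(
    source_services={"crm": crm.get},
    ods_readers={"legacy_orders": legacy_ods.get},
)
customer = get_customer("c1")
```

The consuming app sees one domain call and does not need to know which parts came from a source data service and which from the ODS.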
Data Science
/Analytics
Questions to
be resolved
Apps that get data from data services (source or
domain data services) should not use it for
analytics and should use it for operational purposes only, right?
All analytical apps should get data from data services built on top of the Data Lake, or directly from the Data Lake.
Data
Principles
For data from legacy sources
(ex: SAP, PKMS, etc.),
pass the link to the schema along with the event,
so every event points to an 'Event schema'
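As an illustration, an event envelope could carry the schema link next to the payload. This is only a sketch: the `schema` and `data` field names, the example URL, and the tiny in-memory registry are assumptions, not an agreed format:

```python
# Hypothetical in-memory schema registry keyed by schema link. A real setup
# would resolve the link against a schema store instead.
SCHEMAS = {
    "https://schemas.example.com/sap/order-created/v1": {
        "required": ["order_id", "amount"],
    },
}


def wrap_event(payload: dict, schema_url: str) -> dict:
    """Attach the schema link to the payload so the event is self-describing."""
    return {"schema": schema_url, "data": payload}


def is_valid(event: dict, schemas: dict = SCHEMAS) -> bool:
    """Check that the event's schema is known and its required fields exist."""
    schema = schemas.get(event["schema"])
    if schema is None:
        return False
    return all(field in event["data"] for field in schema["required"])


event = wrap_event(
    {"order_id": "4711", "amount": 99.5},
    "https://schemas.example.com/sap/order-created/v1",
)
```

A consumer reading from the lake can then look up the schema from the event itself instead of relying on out-of-band knowledge about the legacy source.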
Questions to
be resolved
What about the legacy systems
that do not have APIs for data?
Option: write your own,
or, worst case, extract data into an ODS layer
What about legacy systems
whose operations may be
impacted by data service calls?
In such a case, create an ODS and copy data
from the source legacy system into it
The ODS should keep 3 yrs of historical
data, right? The rest of the history should be
in the Data Lake or DWH
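If the 3-year split is adopted, the routing rule could look roughly like the sketch below. The cutoff handling and the store labels `"ods"` and `"lake_or_dwh"` are assumptions for illustration; the retention period itself is still an open question above:

```python
from datetime import date

ODS_RETENTION_YEARS = 3  # assumption taken from the open question above


def ods_cutoff(today: date) -> date:
    """Oldest record date that still belongs in the ODS."""
    try:
        return today.replace(year=today.year - ODS_RETENTION_YEARS)
    except ValueError:
        # Feb 29 has no counterpart in a non-leap year; fall back to Feb 28.
        return today.replace(year=today.year - ODS_RETENTION_YEARS, day=28)


def store_for(record_date: date, today: date) -> str:
    """Decide where a record of the given date should be kept."""
    return "ods" if record_date >= ods_cutoff(today) else "lake_or_dwh"
```

For example, with `today = date(2024, 6, 1)` a record dated 2023 stays in the ODS, while one dated 2020 belongs in the Data Lake or DWH.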
What model of DB to use for what purpose?
Ex: graph DB, columnar, RDBMS, or ??