5 Broad Groups of Data Products FUNCTIONS
1⃣ Raw Data
2⃣ Derived Data
3⃣ Algorithms
4⃣ Decision Support
5⃣ Automated Decision Making
WHAT ❓
WHY ❓
WHO ❓
Definition
- Data products are products that
derive substantial value from analytics
DATA product
INFORMATION product
"The future belongs to the companies & people that turn data into products" Mike Loukides, O'Reilly
the reason Data Scientists are treated as rockstars
Google Analytics
Google Search
- Data products created from DS workflow
Data Science produces insights
- Data product is an economic engine
derives value from data & produces more data & therefore more value
a cycle: data contributes value, value creates data
insights are most useful when they are actionable
application of models
usually a domain-specific dataset
predictive model
inferential model
e.g.
Recommender system
e.g. Amazon, Youtube, social tags, online news, WAZE
a subclass of information filtering systems that are meant to predict the preferences or ratings that a user would give to a product
examines customers/users behaviours and preferences
use recommendation algorithms to make predictions
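A minimal sketch of the recommender idea above, assuming user-based collaborative filtering (only one of many recommendation algorithms). The ratings table, user names and function names are hypothetical and exist only for illustration.

```python
from math import sqrt

# Hypothetical user -> {item: rating} data, for illustration only.
ratings = {
    "ana": {"film_a": 5, "film_b": 4, "film_c": 1},
    "ben": {"film_a": 4, "film_b": 5, "film_d": 4},
    "cat": {"film_a": 1, "film_c": 5, "film_d": 2},
}

def cosine_similarity(a, b):
    """Cosine similarity computed over the items both users rated."""
    shared = set(a) & set(b)
    if not shared:
        return 0.0
    dot = sum(a[i] * b[i] for i in shared)
    norm_a = sqrt(sum(a[i] ** 2 for i in shared))
    norm_b = sqrt(sum(b[i] ** 2 for i in shared))
    return dot / (norm_a * norm_b)

def predict_rating(user, item):
    """Similarity-weighted average of other users' ratings for the item."""
    num = den = 0.0
    for other, other_ratings in ratings.items():
        if other == user or item not in other_ratings:
            continue
        sim = cosine_similarity(ratings[user], other_ratings)
        num += sim * other_ratings[item]
        den += sim
    return num / den if den else None

def recommend(user, top_n=3):
    """Rank the items the user has not rated yet by predicted rating."""
    seen = ratings[user]
    candidates = {i for r in ratings.values() for i in r} - set(seen)
    scored = [(predict_rating(user, i), i) for i in candidates]
    scored = [(s, i) for s, i in scored if s is not None]
    return [i for s, i in sorted(scored, reverse=True)][:top_n]
```

Calling `recommend("ana")` ranks the items that user has not rated. Because "ben" behaves more like "ana" than "cat" does, his rating counts for more in the prediction.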
references
details
REFERENCES
Datapreneurs
classified
- Data Science Services
- Data Science Training
- Data Science Communities
Digital Natives
the generation that grew up with, uses and is shaped by digital content, online connectivity and technological devices
Data natives
expect not only a digital but a smart world, i.e. data products that seemingly adapt to their lifestyle and habits, personalized and prescriptive
e.g. want the Starbucks app to recommend new drinks at a suitable time that meet their preferences, not just offer a way to order online
concerned with what technology can do for them
beyond digital natives
- Data Products
👍: data products can enable behaviour change
sleep better, better diet etc.
👍 data products provide deeply personalized experiences
data products are built upon core analytics
ENTREPRENEURS focused on data science and related topics: BI, Business Analytics, Predictive modelling, Machine learning etc.
Andrew Ng, Coursera
Anthony Goldbloom, Kaggle
Christian Chabot, Tableau
Arun C Murthy, Hortonworks
Its primary objective is bringing a quantitative understanding of online behaviour to the user. Here data is central to the interaction with the user and unlike the other products mentioned
a service to search information
we collect data and make it available as it is (perhaps with some small processing or cleansing steps).
The user can then choose to use the data as appropriate, but most of the work is done on the user’s side
In providing users with derived data, we are doing some of the processing on our side.
We could, in the case of customer data, add additional attributes like assigning a customer segment to each customer
e.g.
we could add their likelihood of clicking on an ad or of buying a product from a certain category.
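A minimal sketch of derived data as described above: raw customer records are enriched with a segment and a click likelihood before being handed to the user. The field names, segment thresholds and the smoothed click-rate formula are all assumptions made for illustration, not a prescribed method.

```python
# Hypothetical raw customer records; field names are assumptions.
raw_customers = [
    {"id": 1, "orders": 12, "ads_shown": 40, "ads_clicked": 10},
    {"id": 2, "orders": 1, "ads_shown": 50, "ads_clicked": 1},
    {"id": 3, "orders": 0, "ads_shown": 0, "ads_clicked": 0},
]

def segment(orders):
    """Assign a coarse segment from the order count (illustrative thresholds)."""
    if orders >= 10:
        return "loyal"
    if orders >= 1:
        return "occasional"
    return "prospect"

def click_likelihood(shown, clicked):
    """Smoothed click rate, (clicked + 1) / (shown + 2), so unseen users get 0.5."""
    return (clicked + 1) / (shown + 2)

def derive(record):
    """Return a copy of the raw record with the derived attributes added."""
    return {
        **record,
        "segment": segment(record["orders"]),
        "click_likelihood": round(
            click_likelihood(record["ads_shown"], record["ads_clicked"]), 3
        ),
    }

derived_customers = [derive(r) for r in raw_customers]
```

The raw records stay untouched. The derived attributes are the part of the processing done on the provider's side.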
DOSM sells data with statistics input
e.g.
can purchase domain-specific raw data from the government
algorithms-as-a-service. We are given some data, we run it through the algorithm — be that machine learning or otherwise — and we return information or insights.
e.g. Google Image: the user uploads a picture, and receives a set of images that are the same or similar to the one uploaded.
Behind the scenes, the product extracts features, classifies the image and matches it to stored images, returning the ones that are most similar.
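A toy sketch of the extract-features-then-match flow described above. It assumes a simple intensity histogram as the feature vector and cosine similarity for matching, and it leaves out the classification step. A real image search product would use far richer features. The stored images and all names are hypothetical.

```python
from math import sqrt

def extract_features(pixels, bins=4):
    """Normalized intensity histogram of a flat list of 0-255 pixel values."""
    hist = [0] * bins
    for p in pixels:
        hist[min(p * bins // 256, bins - 1)] += 1
    total = len(pixels)
    return [h / total for h in hist]

def cosine(a, b):
    """Cosine similarity between two feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = sqrt(sum(x * x for x in a))
    nb = sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0

# Hypothetical stored images: name -> flat list of grayscale pixels.
stored = {
    "night_sky": [10, 20, 15, 30, 5, 25],
    "snow_field": [240, 250, 230, 245, 235, 255],
    "dusk": [40, 90, 70, 60, 100, 30],
}

# Features are extracted once and kept in an index for matching.
index = {name: extract_features(px) for name, px in stored.items()}

def most_similar(pixels, top_n=2):
    """Return the names of the stored images closest to the uploaded one."""
    query = extract_features(pixels)
    ranked = sorted(index, key=lambda name: cosine(query, index[name]), reverse=True)
    return ranked[:top_n]
```

The user only supplies the pixels and receives the ranked matches. The feature extraction and matching stay behind the service boundary.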
to provide information to the user to help them with decision-making
Analytics dashboards such as Google Analytics, Flurry, or WGSN
for Google Analytics, that could mean changing the editorial strategy, addressing leaks in the conversion funnel, or doubling down on a given product strategy.
the user stays in control: they interpret the data and decide whether or not to act
we have made design decisions in data collection, in the derivation of new data, and in choosing what data to display and how to display it
WAZE?
in the back end, it computes the best route at different times for a given trip set by the user
outsource all of the intelligence within a given domain
e.g.
Netflix product recommendations
Spotify’s Discover Weekly
Self-driving cars or automated drones are more physical manifestations of this closed decision-loop
sometimes with an explanation as to why the AI chose that option, other times completely opaque
google history?
Statista, statistics portal
ESRI demographic data
Updated regularly, Esri Demographics datasets are of high value. Integrate datasets into your workflows by consuming them as maps, charts, infographics, reports, and more.
allow the algorithm to do the work and present the user with the final output
Feature Selection
art of data science
to identify suitable & important features
from a large dataset
to help make predictions
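A minimal sketch of one simple way to pick features, assuming a filter approach that ranks each column by the absolute Pearson correlation with the target. This is only one of several feature selection techniques, and the column names and values are made up for illustration.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation; returns 0.0 when either series is constant."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy) if sx and sy else 0.0

def select_features(columns, target, k=2):
    """Keep the k columns most strongly correlated with the target."""
    scores = {name: abs(pearson(vals, target)) for name, vals in columns.items()}
    return sorted(scores, key=scores.get, reverse=True)[:k]

# Hypothetical dataset: candidate features and the value to predict.
columns = {
    "visits": [1, 2, 3, 4, 5, 6],
    "basket_size": [2, 1, 4, 3, 6, 5],
    "shoe_size": [7, 7, 7, 7, 7, 7],  # constant, so it carries no signal
    "account_age": [3, 1, 2, 3, 1, 2],
}
target = [10, 20, 30, 40, 50, 60]

selected = select_features(columns, target, k=2)
```

Correlation only captures linear, one-feature-at-a-time relationships, so this ranking is a starting point rather than a final choice.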
build different types of models
fall into different types of categories
The need for DATA PRODUCT
An abundance of data is being generated
users (DATA NATIVES) expect highly personalized products
Hence, data increasingly affects every aspect of our lives
Information revolution
internet & connectivity create a surplus of DATA material
transforming us all into
data producers
Data producers
a user interface, system or device that collects data that's relevant to an organization
CREATES DATA
Data consumers
A user interface, system or tool that uses data.
Extra info
Active data
Data you need to request from a user. The user needs to actively provide this data. Also called “explicit data”.
Passive data
Data that's collected without asking the user for it, via some other means. Also called “implicit data”.
e.g.
user browser
timestamp in fb message conversation
data consumers