DATA PRODUCTS

5 Broad Groups of Data Products (by FUNCTION)

1⃣ Raw Data

2⃣ Derived Data

3⃣ Algorithms

4⃣ Decision Support

5⃣ Automated Decision Making

WHAT ❓

WHY ❓

WHO ❓

Definition

  • Data products are products that
    derive substantial value from analytics

DATA product

INFORMATION product

"The future belongs to the companies & people that turn data into products" Mike Loukides, O'Reilly

the reason Data Scientists are treated as rockstars

Google Analytics

Google Search

  • Data products created from DS workflow

Data Science produces insights

  • Data product is an economic engine

derives value from data & produces more data & therefore more value

a cycle: data contributes value, value creates data

insights are most useful when they are actionable

application of models

usually, domain specific dataset

predictive model

inferential model

e.g.

Recommender system

e.g. Amazon, YouTube, social tags, online news, Waze

a subclass of information filtering systems that are meant to predict the preferences or ratings that a user would give to a product

examines customers'/users' behaviours and preferences

uses recommendation algorithms to make predictions
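A minimal sketch of how a recommender could predict a rating from other users' behaviour. It assumes user-based collaborative filtering with cosine similarity, which is only one of several possible recommendation algorithms. The `ratings` data and the names `cosine_similarity` and `predict_rating` are hypothetical, made up for illustration.

```python
from math import sqrt

# Hypothetical ratings: user -> {item: rating on a 1-5 scale}
ratings = {
    "ana": {"film_a": 5, "film_b": 4, "film_c": 1},
    "ben": {"film_a": 4, "film_b": 5, "film_d": 4},
    "cho": {"film_a": 1, "film_c": 5, "film_d": 2},
}


def cosine_similarity(a, b):
    """Cosine similarity computed over the items both users have rated."""
    shared = set(a) & set(b)
    if not shared:
        return 0.0
    dot = sum(a[i] * b[i] for i in shared)
    norm_a = sqrt(sum(a[i] ** 2 for i in shared))
    norm_b = sqrt(sum(b[i] ** 2 for i in shared))
    return dot / (norm_a * norm_b)


def predict_rating(user, item):
    """Similarity-weighted average of other users' ratings for the item."""
    weighted_sum = 0.0
    weight_total = 0.0
    for other, other_ratings in ratings.items():
        if other == user or item not in other_ratings:
            continue
        sim = cosine_similarity(ratings[user], other_ratings)
        weighted_sum += sim * other_ratings[item]
        weight_total += sim
    # None means nobody comparable has rated the item yet
    return weighted_sum / weight_total if weight_total else None


predicted = predict_rating("ana", "film_d")
print(predicted)
```

Because "ana" rates films much like "ben" and unlike "cho", the prediction for "film_d" lands closer to ben's rating than to cho's.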


REFERENCES

Datapreneurs

classified

  • Data Science Services
  • Data Science Training
  • Data Science Communities

Digital Natives

the generation that grew up with, is used to, and was shaped by digital content, online connectivity and technological devices

Data natives

expect not only a digital but a smart world, i.e. data products that seemingly adapt to their lifestyle and habits, personalized and prescriptive

e.g. want the Starbucks app to recommend new drinks at the right time that match their preferences, not just offer a way to order online

concerned with what technology can do for them

beyond digital natives

  • Data Products

👍 data products can enable behaviour change

sleep better, better diet etc.

👍 data products provide deeply personalized experiences

data products are built upon core analytics

ENTREPRENEURS focused on data science and related topics: BI, Business Analytics, Predictive modelling, Machine learning etc.

Andrew Ng, Coursera

Anthony Goldbloom, Kaggle

Christian Chabot, Tableau

Arun C Murthy, Hortonworks

Its primary objective is bringing a quantitative understanding of online behaviour to the user. Here data is central to the interaction with the user and unlike the other products mentioned

a service to search information

We collect data and make it available as it is (with only some small processing or cleansing steps).

The user can then choose to use the data as appropriate, but most of the work is done on the user’s side

In providing users with derived data, we are doing some of the processing on our side.

We could, in the case of customer data, add additional attributes like assigning a customer segment to each customer

e.g.

we could add their likelihood of clicking on an ad or of buying a product from a certain category.
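A minimal sketch of turning raw customer records into derived data. The field names, the spend thresholds and the smoothed click-rate formula are all assumptions chosen for illustration; a real product would more likely use a trained segmentation or propensity model.

```python
# Hypothetical raw customer records
customers = [
    {"id": 1, "total_spend": 1200.0, "ads_shown": 40, "ads_clicked": 10},
    {"id": 2, "total_spend": 90.0, "ads_shown": 50, "ads_clicked": 1},
    {"id": 3, "total_spend": 450.0, "ads_shown": 0, "ads_clicked": 0},
]


def segment(total_spend):
    """Assign a customer segment from total spend (thresholds are made up)."""
    if total_spend >= 1000:
        return "high_value"
    if total_spend >= 200:
        return "mid_value"
    return "low_value"


def click_likelihood(shown, clicked):
    """Historical click rate with add-one smoothing, so zero-history customers get 0.5."""
    return (clicked + 1) / (shown + 2)


def enrich(customer):
    """Return a copy of the raw record with the derived attributes added."""
    derived = dict(customer)
    derived["segment"] = segment(customer["total_spend"])
    derived["click_likelihood"] = click_likelihood(
        customer["ads_shown"], customer["ads_clicked"]
    )
    return derived


enriched = [enrich(c) for c in customers]
for row in enriched:
    print(row)
```

The raw records are left untouched, so the same source can still be offered as raw data alongside the derived version.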

DOSM sells data with statistics input

e.g.

can purchase domain specific raw data from government

algorithms-as-a-service. We are given some data, we run it through the algorithm — be that machine learning or otherwise — and we return information or insights.

e.g. Google Images: the user uploads a picture, and receives a set of images that are the same or similar to the one uploaded.

Behind the scenes, the product extracts features, classifies the image and matches it to stored images, returning the ones that are most similar.
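A toy sketch of the extract-features-then-match idea. It is not how Google's product works: the "images" are flat lists of grayscale pixel values, the feature is a simple intensity histogram, the classification step is skipped, and the names `extract_features`, `most_similar` and `stored` are hypothetical.

```python
from math import dist


def extract_features(pixels, bins=4):
    """Normalised intensity histogram for a flat list of 0-255 pixel values."""
    counts = [0] * bins
    for p in pixels:
        counts[min(p * bins // 256, bins - 1)] += 1
    return [c / len(pixels) for c in counts]


# Hypothetical stored images, each reduced to a handful of grayscale pixels
stored = {
    "night_sky": [10, 20, 15, 30, 5, 12],
    "snow_field": [250, 240, 245, 230, 255, 235],
    "grey_wall": [120, 130, 125, 128, 122, 127],
}


def most_similar(query_pixels, k=2):
    """Return the names of the k stored images closest to the query in feature space."""
    query = extract_features(query_pixels)
    ranked = sorted(
        stored,
        key=lambda name: dist(query, extract_features(stored[name])),
    )
    return ranked[:k]


# A dark uploaded image should match the dark stored image first
result = most_similar([18, 25, 8, 40, 60, 22], k=1)
print(result)
```

The user only sees the returned matches; the feature extraction and distance ranking stay on the service side.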

to provide information to the user to help them with decision-making

Analytics dashboards such as Google Analytics, Flurry, or WGSN

With Google Analytics, that could mean changing the editorial strategy, addressing leaks in the conversion funnel, or doubling down on a given product strategy.

the user stays in control: they interpret the data and decide whether to act or not

we have made design decisions in data collection, in the derivation of new data, and in choosing what data to display and how to display it
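A minimal sketch of the kind of figure a decision-support dashboard might show for the conversion funnel. The step names and counts are hypothetical and every count is assumed to be non-zero. The code only reports where users drop off; deciding what to do about it is left to the person reading the dashboard.

```python
# Hypothetical funnel: (step name, number of users who reached it)
funnel = [
    ("visit", 10000),
    ("product_view", 4000),
    ("add_to_cart", 1000),
    ("purchase", 700),
]


def step_conversion(steps):
    """Conversion rate from each step to the next one."""
    rates = []
    for (prev_name, prev_count), (name, count) in zip(steps, steps[1:]):
        rates.append((f"{prev_name} -> {name}", count / prev_count))
    return rates


rates = step_conversion(funnel)
# The step with the lowest conversion rate is the biggest leak
biggest_leak = min(rates, key=lambda r: r[1])

for label, rate in rates:
    print(f"{label}: {rate:.0%}")
print("biggest leak:", biggest_leak[0])
```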

WAZE?

on the back end, it computes the best route in different time frames for a trip set by the user
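A minimal sketch of computing the best route per time frame. It assumes Dijkstra's shortest-path algorithm over a tiny hypothetical road graph whose travel times depend on the time frame. Nothing here reflects Waze's actual routing; `roads`, `best_route` and the minute values are made up.

```python
import heapq

# Hypothetical road graph: node -> {neighbour: travel minutes per time frame}
roads = {
    "home": {
        "highway": {"rush": 30, "off_peak": 10},
        "back_road": {"rush": 15, "off_peak": 14},
    },
    "highway": {"office": {"rush": 20, "off_peak": 8}},
    "back_road": {"office": {"rush": 18, "off_peak": 17}},
    "office": {},
}


def best_route(start, goal, time_frame):
    """Dijkstra's algorithm using the travel times of the given time frame."""
    queue = [(0, start, [start])]
    visited = set()
    while queue:
        minutes, node, path = heapq.heappop(queue)
        if node == goal:
            return minutes, path
        if node in visited:
            continue
        visited.add(node)
        for neighbour, times in roads[node].items():
            if neighbour not in visited:
                heapq.heappush(
                    queue,
                    (minutes + times[time_frame], neighbour, path + [neighbour]),
                )
    return None  # goal not reachable


print(best_route("home", "office", "rush"))
print(best_route("home", "office", "off_peak"))
```

With these numbers the back road wins during rush hour and the highway wins off-peak, so the same trip gets a different recommendation depending on when it is taken.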

outsource all of the intelligence within a given domain

e.g.

Netflix product recommendations

Spotify’s Discover Weekly

Self-driving cars or automated drones are more physical manifestations of this closed decision-loop


sometimes with an explanation as to why the AI chose that option, other times completely opaque

google history?

Statista, statistics portal

ESRI demographic data

Updated regularly, Esri Demographics datasets are of high value. Integrate datasets into your workflows by consuming them as maps, charts, infographics, reports, and more.

allow the algorithm to do the work and present the user with the final output

Feature Selection

art of data science

to identify suitable & important features

from a large dataset

to help make predictions
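A minimal sketch of one simple feature selection approach: ranking candidate features by the absolute Pearson correlation with the target. This filter method is an assumption chosen for illustration and is only one of many techniques. The feature names and values are hypothetical.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient; returns 0.0 when either series is constant."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    if spread_x == 0 or spread_y == 0:
        return 0.0
    return cov / (spread_x * spread_y)


# Hypothetical candidate features and prediction target (e.g. monthly spend)
features = {
    "visits_per_week": [1, 2, 3, 4, 5, 6],
    "account_age_days": [300, 20, 150, 90, 400, 60],
    "constant_flag": [1, 1, 1, 1, 1, 1],
}
target = [10, 22, 29, 41, 52, 58]


def rank_features(candidates, y, top_k=1):
    """Names of the top_k features with the strongest linear link to the target."""
    scores = {name: abs(pearson(values, y)) for name, values in candidates.items()}
    return sorted(scores, key=scores.get, reverse=True)[:top_k]


selected = rank_features(features, target, top_k=1)
print(selected)
```

Correlation only captures linear relationships with a single feature at a time, so a ranking like this is a starting point for judgement, not a replacement for it.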

build different types of models

fall into different types of categories

The need for DATA PRODUCT

Generating abundance of Data

users (DATA NATIVES) are expecting highly personalized products

Hence, Data increasingly affects every aspect of our lives

Information revolution

internet & connectivity create a surplus of DATA material

transform us all into

data producers

Data producers

a user interface, system or device that collects data that's relevant to an organization

CREATES DATA

Data consumers

A user interface, system or tool that uses data.

Extra info

Active data

Data you need to request from a user. The user needs to actively provide this data. Also called “explicit data”.

Passive data

Data that's collected without asking the user for it, via some other means. Also called “implicit data”.

e.g.

user browser

timestamp in fb message conversation
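A minimal sketch of how a data producer might record both kinds of data in one event. The `SignupEvent` class, its fields and the `handle_signup` helper are hypothetical: the form fields stand in for data the user actively provides, while the browser string and timestamp are captured without asking.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass
class SignupEvent:
    # Data the user is asked for and types in
    email: str
    favourite_drink: str
    # Data collected without asking the user
    user_agent: str
    received_at: datetime = field(
        default_factory=lambda: datetime.now(timezone.utc)
    )


def handle_signup(form, headers):
    """Combine submitted form fields with data taken from the request itself."""
    return SignupEvent(
        email=form["email"],
        favourite_drink=form["favourite_drink"],
        user_agent=headers.get("User-Agent", "unknown"),
    )


event = handle_signup(
    {"email": "user@example.com", "favourite_drink": "latte"},
    {"User-Agent": "ExampleBrowser/1.0"},
)
print(event)
```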

data consumers