Data Science Methodology 101
From problem to approach
1. What is the problem you are trying to solve?
2. How can you use data to answer the question?
Working with the data
What data do you need to answer the question?
Where is the data coming from (identify all sources), and how will you get it?
Is the data that you collected representative of the problem to be solved?
What additional work is required to manipulate and work with the data?
Deriving the answer
In what way can the data be visualized to get the answer that is required?
Does the model used really answer the initial question, or does it need to be adjusted?
Can you put the model into practice?
Can you get constructive feedback on how well the question was answered?
Syllabus
Module 1: From Problem to Approach
Business Understanding
What is the goal?
Objectives supporting the goal
Getting stakeholder buy-in and support
Analytic Approach
Descriptive
Diagnostic (Statistical Analysis)
Predictive (Forecasting)
Prescriptive
Classification
Module 2: From Requirements to Collection
Data Requirements
Identifying the necessary data content, formats, and sources for initial data collection.
Data Collection
Once the data is collected, the data scientist will have a good understanding of what they will be working with.
Techniques such as descriptive statistics and visualization can be applied to the data set to assess its content and quality and to gain initial insights.
Gaps in the data will be identified, and plans must be made to either fill them or make substitutions.
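The descriptive statistics and gap-filling steps above can be sketched in Python using only the standard library. This is a minimal sketch under stated assumptions: the `records` sample, the column names, and the helpers `summarize` and `fill_missing` are hypothetical, and substituting the median is just one possible substitution plan.

```python
import statistics

# Hypothetical sample records for illustration; None marks a missing value.
records = [
    {"age": 34, "income": 52000.0},
    {"age": 45, "income": None},
    {"age": None, "income": 61000.0},
    {"age": 29, "income": 48000.0},
]


def summarize(rows, column):
    """Return basic descriptive statistics and the gap count for one column."""
    values = [row[column] for row in rows]
    present = [v for v in values if v is not None]
    return {
        "count": len(present),
        "missing": len(values) - len(present),
        "mean": statistics.mean(present) if present else None,
        "median": statistics.median(present) if present else None,
    }


def fill_missing(rows, column, value):
    """Return copies of the rows with gaps in `column` replaced by `value`."""
    return [
        {**row, column: value if row[column] is None else row[column]}
        for row in rows
    ]


# Assess the column first, then apply one possible substitution (the median).
age_summary = summarize(records, "age")
filled = fill_missing(records, "age", age_summary["median"])
```

Here `age_summary` reports how many values are missing before anything is changed, and `fill_missing` leaves the original `records` untouched so the raw data can still be inspected.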
Module 3: From Understanding to Preparation
Data Understanding
Encompasses all activities related to constructing the data set.
Is the data that you collected representative of the problem to be solved?
Data Preparation
The process of getting the data into a state where it is easier to work with.
What are the ways in which data is prepared?
Feature engineering is the process of using domain knowledge of the data to create features that make machine learning algorithms work.
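As an illustration of feature engineering, the sketch below derives new features from raw records using simple domain knowledge. The `visits` data, the field names, and the `engineer_features` helper are hypothetical and chosen only for this example; they are not taken from the course material.

```python
from datetime import date

# Hypothetical raw records for illustration only.
visits = [
    {"admitted": date(2023, 1, 2), "discharged": date(2023, 1, 9),
     "weight_kg": 80.0, "height_m": 1.80},
    {"admitted": date(2023, 3, 11), "discharged": date(2023, 3, 13),
     "weight_kg": 65.0, "height_m": 1.65},
]


def engineer_features(row):
    """Derive model-ready features from one raw record."""
    # Domain knowledge: the duration of a stay is more informative than two raw dates.
    length_of_stay = (row["discharged"] - row["admitted"]).days
    # Domain knowledge: body mass index combines weight and height into one measure.
    bmi = row["weight_kg"] / row["height_m"] ** 2
    return {
        "length_of_stay_days": length_of_stay,
        "bmi": round(bmi, 1),
        # date.weekday() returns 5 for Saturday and 6 for Sunday.
        "weekend_admission": row["admitted"].weekday() >= 5,
    }


features = [engineer_features(row) for row in visits]
```

Each derived column encodes something a model could not easily learn from the raw fields on its own, which is the point of the technique.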
Module 4: From Modeling to Evaluation
Modeling
What is the purpose of data modeling?
What are some characteristics of this process?
Data modeling focuses on developing models that are either descriptive or predictive.
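To make the descriptive versus predictive distinction concrete, the sketch below summarizes a small data set (descriptive) and then fits a straight line by ordinary least squares to estimate an unseen value (predictive). The data and the helpers `describe`, `fit_line`, and `predict` are hypothetical; least squares is used here only as a simple stand-in for whatever model a real project would choose.

```python
def describe(values):
    """Descriptive view: summarize what the data already shows."""
    return {"min": min(values), "max": max(values), "mean": sum(values) / len(values)}


def fit_line(xs, ys):
    """Predictive view: fit y = slope * x + intercept by ordinary least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(model, x):
    """Apply a fitted (slope, intercept) pair to a new input."""
    slope, intercept = model
    return slope * x + intercept


# Hypothetical observations for illustration only.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [3.0, 5.0, 7.0, 9.0]

summary = describe(ys)              # what happened
model = fit_line(xs, ys)            # learn a relationship from the data
forecast = predict(model, 5.0)      # estimate an outcome not yet observed
```

The descriptive summary only restates the observed values, while the fitted line can be applied to an input that was never in the data.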
Evaluation
Does the model used really answer the initial question or does it need to be adjusted?
The first phase of evaluation is the diagnostic measures phase, which is used to ensure the model is working as intended.
The second phase of evaluation, which may be used, is statistical significance testing.
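The two evaluation phases above can be sketched for a binary classifier as follows. This is only one possible reading: the diagnostic measures are computed from a confusion matrix, and the significance test is a one-sided exact binomial test against random guessing. The labels and the helpers `confusion_counts`, `diagnostic_measures`, and `binomial_p_value` are hypothetical.

```python
from math import comb


def confusion_counts(actual, predicted):
    """Count true/false positives and negatives for binary labels (1 or 0)."""
    pairs = list(zip(actual, predicted))
    tp = sum(1 for a, p in pairs if a == 1 and p == 1)
    tn = sum(1 for a, p in pairs if a == 0 and p == 0)
    fp = sum(1 for a, p in pairs if a == 0 and p == 1)
    fn = sum(1 for a, p in pairs if a == 1 and p == 0)
    return tp, tn, fp, fn


def diagnostic_measures(actual, predicted):
    """Phase 1: check that the model behaves as intended."""
    tp, tn, fp, fn = confusion_counts(actual, predicted)
    total = tp + tn + fp + fn
    return {
        "accuracy": (tp + tn) / total,
        "precision": tp / (tp + fp) if tp + fp else 0.0,
        "recall": tp / (tp + fn) if tp + fn else 0.0,
    }


def binomial_p_value(correct, total, chance=0.5):
    """Phase 2: probability of at least `correct` hits if the model only guessed."""
    return sum(
        comb(total, k) * chance ** k * (1 - chance) ** (total - k)
        for k in range(correct, total + 1)
    )


# Hypothetical held-out labels and model predictions for illustration only.
actual = [1, 0, 1, 1, 0, 0, 1, 0, 1, 0]
predicted = [1, 0, 1, 0, 0, 0, 1, 1, 1, 0]

measures = diagnostic_measures(actual, predicted)
correct = sum(1 for a, p in zip(actual, predicted) if a == p)
p_value = binomial_p_value(correct, len(actual))
```

A small `p_value` suggests the observed accuracy is unlikely to come from guessing alone; with only ten examples, as here, the test has little power, so a real evaluation would use a larger held-out set.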
Module 5: From Deployment to Feedback
Deployment
Getting the stakeholders familiar with the tool produced.
Feedback
Refine the model and assess it for performance and impact.