Decision tree, Evaluation Method, Clustering - Coggle Diagram
Decision tree
What is it?
A structure that includes a root node, branches, and leaf nodes. Each internal node denotes a test on an attribute, each branch denotes the outcome of a test, and each leaf node holds a class label. The topmost node in the tree is the root node.
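The structure described above can be sketched as a small data type. This is only an illustration: the names `Node`, `Leaf`, and `classify`, and the toy loan attributes, are assumptions and do not come from the diagram.

```python
from dataclasses import dataclass, field
from typing import Dict, Union


@dataclass
class Leaf:
    label: str  # class label held by the leaf node


@dataclass
class Node:
    attribute: str  # attribute tested at this internal node
    # Each branch maps one outcome of the test to a subtree or a leaf.
    branches: Dict[str, Union["Node", Leaf]] = field(default_factory=dict)


def classify(tree, record):
    """Follow branches from the root until a leaf is reached."""
    while isinstance(tree, Node):
        tree = tree.branches[record[tree.attribute]]
    return tree.label


# Hypothetical toy tree: the root tests "income", one branch tests "has_debt".
root = Node("income", {
    "high": Leaf("approve"),
    "low": Node("has_debt", {"yes": Leaf("reject"), "no": Leaf("approve")}),
})

decision = classify(root, {"income": "low", "has_debt": "yes"})
```

The topmost `Node` is the root; classification walks one branch per test until it lands on a `Leaf`.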
How to use it
Start with your overarching objective or “big decision” at the top (root) > Draw your arrows > Attach leaf nodes at the end of your branches > Determine the odds of success of each decision point > Evaluate risk vs. reward
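One possible way to weigh risk against reward in the last two steps is an expected-value calculation. The sketch below assumes each decision point has a single probability of success, a reward, and a loss; the function name and all numbers are made up for illustration.

```python
def expected_value(p_success, reward, loss):
    """Weigh reward against risk: p * reward + (1 - p) * loss."""
    return p_success * reward + (1 - p_success) * loss


# Hypothetical decision points; probabilities and payoffs are made-up numbers.
options = {
    "expand": expected_value(0.6, 100.0, -50.0),
    "hold": expected_value(1.0, 10.0, 0.0),
}

# Pick the branch with the highest expected value.
best = max(options, key=options.get)
```

Expected value is only one criterion; a risk-averse decision maker might also look at the worst-case loss of each branch.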
when to use it
Decision trees are used in a wide range of industries to solve many types of problems. Because of their flexibility, they are used in sectors from technology and health to financial planning. Examples include a technology business evaluating expansion opportunities based on analysis of past sales data, and banks and mortgage providers using historical data to predict how likely a borrower is to default on their payments.
where to use it
Another application of decision trees is the use of demographic data to find prospective clients. They can help streamline a marketing budget and support informed decisions about the target market that the business is focused on. Without decision trees, the business may spend its marketing budget without a specific demographic in mind, which will affect its overall revenues.
Evaluation Method
when to use it
Before using an evaluation method, we must know what information is involved: the information that is needed to make decisions, and the information that can feasibly be collected and analyzed. The information must be accurate. Lastly, the information must be credible.
where to use it
Evaluation methods are designed to help explore the options that we have when creating program and project designs. For example, we use evaluation methods for overall health policy assessment.
how to use it
Construction and evaluation of a classifier require partitioning labeled data into a training set and a test set. Two common approaches are holdout and cross-validation.
Split or Holdout: i) The given data is randomly partitioned into two independent sets. ii) The training set is used for model construction. iii) The test set is used for accuracy estimation.
Cross-Validation: i) k-fold, where k = 10 is the most popular. ii) Randomly partition the data into k mutually exclusive subsets D_1, ..., D_k, each of approximately equal size. iii) At the i-th iteration, use D_i as the test set and the others as the training set.
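The two partitioning schemes above can be sketched with the standard library alone. The function names, the 25% test fraction, the seed, and the 20-record stand-in dataset are all assumptions made for illustration.

```python
import random


def holdout_split(data, test_fraction=0.25, seed=0):
    """Randomly partition data into independent training and test sets."""
    items = list(data)
    random.Random(seed).shuffle(items)
    n_test = int(len(items) * test_fraction)
    return items[n_test:], items[:n_test]  # (train, test)


def k_fold_splits(data, k=10, seed=0):
    """Yield (train, test) pairs; fold i is the test set at iteration i."""
    items = list(data)
    random.Random(seed).shuffle(items)
    # k mutually exclusive subsets of approximately equal size
    folds = [items[i::k] for i in range(k)]
    for i in range(k):
        train = [x for j, fold in enumerate(folds) if j != i for x in fold]
        yield train, folds[i]


records = list(range(20))  # stand-in for labeled examples
train, test = holdout_split(records)
splits = list(k_fold_splits(records, k=10))
```

In a real evaluation, a model would be fitted on each `train` list and scored on the matching `test` list, and the k accuracy estimates would then be averaged.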
What is it?
An evaluation method is an assessment of the strengths and limitations of the various data sources. This assessment is critical to selecting appropriate data for use, and to establishing the uncertainty associated with dose-response models that are developed from different data sets and test protocols.
Clustering
What is it?
Clustering is a type of unsupervised learning method in machine learning. In unsupervised learning, inferences are drawn from data sets that do not contain a labelled output variable. It is an exploratory data analysis technique that allows us to analyze multivariate data sets. Clustering is the task of dividing a data set into a certain number of clusters in such a manner that the data points belonging to a cluster have similar characteristics. Clusters are groupings of data points such that the distance between the data points within a cluster is minimal.
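As an illustration of grouping points so that within-cluster distances are small, here is a minimal sketch of k-means, one common clustering algorithm. The diagram does not name a specific algorithm, so this choice is an assumption, as are the sample points and the starting centroids.

```python
import math


def kmeans(points, centroids, iterations=10):
    """Assign each point to its nearest centroid, then recompute centroids."""
    for _ in range(iterations):
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)),
                          key=lambda i: math.dist(p, centroids[i]))
            clusters[nearest].append(p)
        # New centroid = mean of the cluster; keep the old one if it is empty.
        centroids = [
            tuple(sum(coord) / len(cluster) for coord in zip(*cluster))
            if cluster else centroids[i]
            for i, cluster in enumerate(clusters)
        ]
    return centroids, clusters


# Made-up, unlabeled 2-D points forming two visibly separate groups.
points = [(1, 1), (1, 2), (2, 1), (8, 8), (9, 8), (8, 9)]
centroids, clusters = kmeans(points, centroids=[(0, 0), (10, 10)])
```

No labels are supplied; the grouping comes only from the distances between points, which is what makes the method unsupervised.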
How to use it
For example, in biology, clustering in data mining helps classify animals and plants using similar functions or genes. It helps in gaining insight into the structure of species. Clustering is also used to identify areas: in an earth observation database, lands that are similar to each other are identified.
Where to use it
Clustering can be used in various areas such as data mining, academics, image processing and transformation, bioinformatics, and many more. Common applications where clustering can be implemented as a tool include recommendation engines, market and customer segmentation, social network analysis, and many more.
When to use it
When the dataset is large and unstructured. When the classes are not known, or it is not known how many classes the data divides into. When manually dividing and annotating your data is too resource-intensive. When looking for anomalies in data.