Session 6 - Introducing Machine Learning - Coggle Diagram

- - - - 4-step process
        
        Selection: 1st step = select the data relevant to the analysis (includes cleaning the data and removing errors)
        
        Preprocessing: 2nd step = prepare the data for mining (transforming the data into a different format)
        
        Data mining: 3rd step = apply data mining algorithms to the data to extract patterns & knowledge
        
        Interpretation/Evaluation: Last step = interpret & evaluate the results/significance of the data mining process
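
The four steps above can be sketched as a chain of small functions. This is a minimal illustration, not a reference implementation: the records, the field names, and the spend threshold used as the "mining" step are all hypothetical.

```python
from statistics import mean

# Hypothetical raw customer records; field names are illustrative only.
raw = [
    {"id": 1, "age": 34, "spend": "120.5", "notes": "vip"},
    {"id": 2, "age": 51, "spend": "80.0", "notes": ""},
    {"id": 3, "age": None, "spend": "95.0", "notes": ""},  # missing age
    {"id": 4, "age": 27, "spend": "210.0", "notes": "new"},
]

def select(records):
    """Step 1 (Selection): keep only the fields relevant to the analysis."""
    return [{"id": r["id"], "age": r["age"], "spend": r["spend"]} for r in records]

def preprocess(records):
    """Step 2 (Preprocessing): drop incomplete rows, convert spend from text to float."""
    return [{**r, "spend": float(r["spend"])} for r in records if r["age"] is not None]

def mine(records, threshold=100.0):
    """Step 3 (Data mining): a stand-in pattern, splitting customers at a spend threshold."""
    return {
        "high": [r for r in records if r["spend"] > threshold],
        "low": [r for r in records if r["spend"] <= threshold],
    }

def evaluate(groups):
    """Step 4 (Interpretation/Evaluation): summarise each group so it can be interpreted."""
    return {
        name: {"n": len(rows), "mean_age": mean(r["age"] for r in rows)}
        for name, rows in groups.items()
    }

result = evaluate(mine(preprocess(select(raw))))
print(result)
```

In a real project step 3 would be an actual mining algorithm (classification, clustering, etc.); the threshold split only keeps the sketch short.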
  - - - uses summary statistics -> average, counts, median, variance, relative frequency, counts by a certain group, etc.
      - Ex: total number of cust. last year; % of cust. who spent > 100€; % of female cust.; average spending of male cust.
      - Type of data: Structured data (cust. records, sales transactions, website logs) & semi-structured data (XML, JSON, email messages, web pages); typically transactional, customer, and website data.
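
The summary statistics listed above can be computed with Python's standard library alone. The sketch below uses a small, made-up customer list (gender and last year's spending in euros); the data and variable names are assumptions for illustration.

```python
from statistics import mean, median, pvariance

# Hypothetical customer records: (gender, spending in euros last year).
customers = [
    ("F", 150.0), ("M", 80.0), ("F", 60.0), ("M", 120.0), ("F", 200.0),
]

spend = [s for _, s in customers]

total_customers = len(customers)                                   # count
pct_over_100 = 100 * sum(s > 100 for s in spend) / total_customers # relative frequency
pct_female = 100 * sum(g == "F" for g, _ in customers) / total_customers
avg_male_spend = mean(s for g, s in customers if g == "M")         # average by group
median_spend = median(spend)
variance_spend = pvariance(spend)                                  # population variance

print(total_customers, pct_over_100, pct_female, avg_male_spend, median_spend, variance_spend)
```

`pvariance` treats the list as the whole population; `statistics.variance` would be the choice if the records were only a sample.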
    - - Ex: Predicting cust. churn; Predicting conversions (= predicting whether a customer will convert based on their characteristics)
      - Type of data: Structured & unstructured data (emails, social media posts, customer reviews, medical images)
- - - - Customer Attrition or Churn: Ex: Goal = Predict whether a customer is likely to churn or leave for a competitor; Approach = Label customers as loyal or disloyal
      - Fraud Detection: Ex: Goal = predict fraudulent cases in credit card transactions; Approach = use credit card transaction data and account holder information as attributes
      - Digital Marketing: Ex: Goal = Reduce the cost of direct mailing by targeting a set of consumers likely to purchase a new cellphone; Approach = Use historical data for a similar cellphone product to identify customers who purchased and those who did not
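
All three examples above follow the same pattern: learn from labelled historical cases, then predict the label of a new case. As one possible illustration for the churn example, here is a tiny k-nearest-neighbours classifier in plain Python. The choice of k-NN, the two attributes, and the data are assumptions made for this sketch; the notes do not prescribe a specific algorithm.

```python
from collections import Counter
from math import dist

# Hypothetical labelled history: (months_as_customer, support_calls) -> label.
history = [
    ((36, 0), "loyal"),
    ((48, 1), "loyal"),
    ((30, 1), "loyal"),
    ((3, 5), "disloyal"),
    ((6, 4), "disloyal"),
    ((2, 6), "disloyal"),
]

def predict(features, k=3):
    """Label a customer by majority vote among the k closest historical customers."""
    nearest = sorted(history, key=lambda row: dist(row[0], features))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]

print(predict((4, 5)))   # disloyal
print(predict((40, 1)))  # loyal
```

The attributes are used unscaled to keep the sketch short; with real data, attributes on very different scales would normally be standardised before computing distances.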
  - - - can be done using clustering algorithms to group customers together based on their demographics, lifestyle, and purchase history.
    - - can be done using clustering algorithms to group documents together based on their keywords, topics, and sentiment.
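
As a rough sketch of how such grouping works, the block below runs a plain k-means on a handful of made-up customers. k-means is only one of several clustering algorithms; the data, the two attributes, and the fixed starting centroids are assumptions chosen so the result is reproducible.

```python
from math import dist

# Hypothetical customers: (age, yearly spend in hundreds of euros).
points = [(22, 3), (25, 4), (24, 2), (58, 12), (61, 14), (60, 11)]

def kmeans(points, centroids, iterations=10):
    """Plain k-means: assign each point to its nearest centroid, then recompute centroids."""
    for _ in range(iterations):
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)), key=lambda i: dist(p, centroids[i]))
            clusters[nearest].append(p)
        # Recompute each centroid as the mean of its cluster; keep the old one if the cluster is empty.
        centroids = [
            tuple(sum(c) / len(cluster) for c in zip(*cluster)) if cluster else old
            for cluster, old in zip(clusters, centroids)
        ]
    return clusters, centroids

# Start from two fixed points instead of random ones so the run is deterministic.
clusters, centroids = kmeans(points, centroids=[points[0], points[3]])
print(clusters)
print(centroids)
```

No labels are supplied here, which is the difference from the classification examples. Documents can be clustered the same way once each one is turned into a numeric vector (for instance keyword counts).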