Logistic Regression, Logistic Function, Maximum Likelihood, Confounding,…

- - - - e.g.
        
        z-value = Estimate / Std. Error = num of std devs the Estimate is away from 0 on the Std Normal Curve
        
        For every 1 unit of weight gained, log(odds of obesity) increases by 1.83
        
        Both Intercept and Slope are NOT statistically significant if alpha = 0.05
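A minimal sketch of the z-value calculation above, using only the Python standard library. The slope 1.83 comes from the notes; the standard error 1.09 is a made-up placeholder, and `wald_z_test` is a hypothetical helper name.

```python
import math

def wald_z_test(estimate, std_error):
    """Return (z, two-sided p-value) for the null hypothesis that the coefficient is 0."""
    z = estimate / std_error
    # Area in both tails of the standard normal curve beyond |z|
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value

# 1.83 is the slope from the notes; 1.09 is an assumed standard error
z, p = wald_z_test(1.83, 1.09)
print(f"z = {z:.2f}, p = {p:.3f}")
print("significant at alpha = 0.05" if p < 0.05 else "NOT significant at alpha = 0.05")
```

With a standard error of this size the estimate is fewer than 2 std devs from 0, which is why it would not be significant at alpha = 0.05.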
- - - - Overall probability of class 1 =
        num of class 1 data / total num of data, e.g. p = obese/(obese + not_obese)
        
        LL(overall probability) = -6.18
      - From Best fitting line =>
        
        LL(fit) = -3.77
      - p-value
        
        Degrees of freedom = 2 -1 = 1
        
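        Model behind LL(fit) has 2 parameters (y-intercept, slope)
        Model behind LL(overall) has 1 parameter (y-intercept)
        
        P.S. For other GLMs:
        Chi-square statistic = 2[(LL(saturated) - LL(overall)) - (LL(saturated) - LL(fit))]
        Logistic regression: LL(saturated model) = 0, so this reduces to 2[LL(fit) - LL(overall)]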
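As a rough check of the p-value step, the sketch below plugs the two log-likelihoods from the notes into 2[LL(fit) - LL(overall)]. It relies on the fact that a chi-square variable with 1 degree of freedom is a squared standard normal, so the shortcut with `math.erfc` only holds for df = 1; the variable names are my own.

```python
import math

ll_overall = -6.18  # LL(overall probability), from the notes
ll_fit = -3.77      # LL(fit), from the notes

# Likelihood-ratio statistic; LL(saturated) cancels out of the difference
chi_square = 2 * (ll_fit - ll_overall)

# Upper-tail probability of a chi-square distribution with 1 degree of freedom
p_value = math.erfc(math.sqrt(chi_square / 2))

print(f"chi-square = {chi_square:.2f}, p-value = {p_value:.3f}")
```

For more than 1 degree of freedom a general chi-square tail function would be needed instead of this shortcut.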
- - - - (ln(b)/k , L/2)
- - - - log(0.5/0.5)=0; log(0.731/(1-0.731))=1;
        log(0.88/(1-0.88))=2; .....
      - log(0) taken as -ive ∞
        log(1/0)=log(1) - log(0) = +ive ∞
        log(0/1)=log(0) = -ive ∞
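The log(odds) values above can be reproduced with a small helper. This is a sketch: `log_odds` is a hypothetical name, and returning infinities at p = 0 and p = 1 mirrors the convention in the notes rather than anything `math.log` does by itself.

```python
import math

def log_odds(p):
    """Map a probability in [0, 1] to log(p / (1 - p))."""
    if p == 0:
        return -math.inf  # log(0/1) taken as -infinity
    if p == 1:
        return math.inf   # log(1/0) taken as +infinity
    return math.log(p / (1 - p))

for p in (0.5, 0.731, 0.88, 0.0, 1.0):
    print(p, log_odds(p))
```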
    - - 2.1 Project original data points (at +ive & -ive ∞) onto candidate line
      - 2.2 Each data point will have candidate log(odds) (i.e. y-value) according to candidate line; Transform candidate log(odds) to candidate probabilities
        
        In logistic function form:
        p = 1/[1+e^(-log(odds))]
        = 1/[1+e^(-β0-β1*X)]
        
        ~ Sigmoid function:
        
        Logistic function in detail
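Step 2.2 can be sketched as follows: a candidate line gives each data point a log(odds), and the logistic function turns that into a probability. The function names and the example coefficients are assumptions for illustration; the two-branch form is only there to avoid overflow for large negative inputs.

```python
import math

def probability_from_log_odds(log_odds):
    """p = 1 / (1 + e^(-log_odds)), written to stay stable for large |log_odds|."""
    if log_odds >= 0:
        return 1 / (1 + math.exp(-log_odds))
    e = math.exp(log_odds)
    return e / (1 + e)

def candidate_probability(x, beta0, beta1):
    """Probability for one data point under the candidate line beta0 + beta1 * x."""
    return probability_from_log_odds(beta0 + beta1 * x)

# Round trip of the log(odds) examples 0, 1, 2
for value in (0, 1, 2):
    print(value, round(probability_from_log_odds(value), 3))

# Hypothetical candidate line: intercept -3, slope 1.5, evaluated at x = 2
print(candidate_probability(2.0, -3.0, 1.5))
```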
      - Likelihood = (y-axis) value of continuous probability density function
        
        For data with an unknown distribution, we assume one kind of distribution (which gives the equation for the y-axis value) and find its parameters using Maximum Likelihood
        BUT here the curve that gives the likelihoods is the logistic function, which is fully described by the β's => we want the parameters of the fitting line (β's) that maximize the likelihood
        
        Can use Log(likelihood) as well
      - 2.3 Maximizing Likelihood
        
        Matrix form: z = xθ (per observation: z_i = θ'x_i) ; σ(z) = 1/[1+e^(-z)] (sigmoid)
        θ = parameters ((m+1) x 1); z = n x 1
        x = data matrix (n x (m+1)); x0 = column of 1s for the intercept
        n = num of observations; m = num of independent vars
        
        Log likelihood:
        LL(θ) = Σ_i [ y_i*log(σ(z_i)) + (1 - y_i)*log(1 - σ(z_i)) ]
        
        Uses Gradient Ascent Algorithm: θ := θ + η * ∂LL/∂θ
        (or) Gradient DESCENT if you take -ive LL as Loss function
        
        η = magnitude of the step size that we take; learning rate
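A minimal pure-Python sketch of step 2.3, assuming plain batch gradient ascent on the log likelihood. The toy weights and labels are invented, `eta` stands for the learning rate, and the step count is an arbitrary choice that happens to be enough for this small data set.

```python
import math

def sigmoid(z):
    if z >= 0:
        return 1 / (1 + math.exp(-z))
    e = math.exp(z)
    return e / (1 + e)

def log_likelihood(theta, rows, labels):
    """Sum of y*log(p) + (1-y)*log(1-p) over all observations."""
    total = 0.0
    for x, y in zip(rows, labels):
        p = sigmoid(sum(t * xi for t, xi in zip(theta, x)))
        p = min(max(p, 1e-12), 1 - 1e-12)  # guard against log(0)
        total += y * math.log(p) + (1 - y) * math.log(1 - p)
    return total

def fit_gradient_ascent(rows, labels, eta=0.02, steps=20000):
    """Repeatedly move theta in the direction that increases the log likelihood."""
    theta = [0.0] * len(rows[0])
    for _ in range(steps):
        gradient = [0.0] * len(theta)
        for x, y in zip(rows, labels):
            error = y - sigmoid(sum(t * xi for t, xi in zip(theta, x)))
            for j, xi in enumerate(x):
                gradient[j] += error * xi  # d(LL)/d(theta_j) = sum (y - p) * x_j
        theta = [t + eta * g for t, g in zip(theta, gradient)]
    return theta

# Invented toy data: weight -> obese (1) or not (0); classes overlap on purpose
weights = [1.0, 1.5, 2.0, 2.5, 3.0, 3.5, 4.0, 4.5]
obese = [0, 0, 0, 1, 0, 1, 1, 1]
rows = [[1.0, w] for w in weights]  # x0 = 1 for the intercept

theta = fit_gradient_ascent(rows, obese)
print("intercept, slope:", theta)
print("LL at start:", log_likelihood([0.0, 0.0], rows, obese))
print("LL at fit:  ", log_likelihood(theta, rows, obese))
```

Flipping the sign of the gradient and minimizing the negative log likelihood gives the gradient descent version with the same fitted parameters.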
  - - - log(odds of obesity for normal gene) = log(2/9)
    - - size = log(2/9) x B1 + log((7/3)/(2/9)) x B2
        size = -1.5 x B1 + 2.35 x B2
      - geneMutant = log(odds ratio)
        = Tells on a log scale how much having the mutated gene increases/decreases the odds of being obese
        
        z-value = Num of std devs the estimated coefficients are away from 0 on std normal curve
        
        Intercept = NOT statistically significant
        geneMutant = Statistically significant
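The two coefficients can be recomputed from the odds in the notes. This sketch assumes 2/9 and 7/3 are counts of obese versus not obese for the normal and mutated gene respectively; the variable names are my own.

```python
import math

# Assumed counts behind the odds 2/9 and 7/3 in the notes
obese_normal, not_obese_normal = 2, 9
obese_mutant, not_obese_mutant = 7, 3

odds_normal = obese_normal / not_obese_normal
odds_mutant = obese_mutant / not_obese_mutant

intercept = math.log(odds_normal)                  # log(odds) for the normal gene
gene_mutant = math.log(odds_mutant / odds_normal)  # log(odds ratio)

print(f"intercept = {intercept:.2f}, geneMutant = {gene_mutant:.2f}")
# Intercept plus geneMutant gives the log(odds) for the mutated gene
print(f"log(odds) for mutant gene = {intercept + gene_mutant:.2f}")
```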
- - - - e.g. L(mean=32, std=2.5|weight=34) = 0.12 (at weight=32, the mean itself, the value would be about 0.16)
- - - - To solve for μ, take σ as constant and find where the slope of the likelihood function ∂L/∂μ = 0
      - To solve for σ, take μ as constant and find where the slope of the likelihood function ∂L/∂σ = 0
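A small sketch of both ideas for a normal distribution: the likelihood is the height of the density curve at the observed value, and setting the two derivatives to zero gives the sample mean for μ and the square root of the average squared deviation (dividing by n, not n - 1) for σ. The function names and the three sample weights are invented for illustration.

```python
import math

def normal_likelihood(x, mean, std):
    """Height of the normal density with the given mean and std at x."""
    return math.exp(-((x - mean) ** 2) / (2 * std ** 2)) / (std * math.sqrt(2 * math.pi))

def normal_mle(data):
    """Closed-form maximum likelihood estimates of mu and sigma."""
    n = len(data)
    mu = sum(data) / n
    sigma = math.sqrt(sum((x - mu) ** 2 for x in data) / n)
    return mu, sigma

print(round(normal_likelihood(34, mean=32, std=2.5), 2))

# Invented sample of weights
mu_hat, sigma_hat = normal_mle([30.0, 32.0, 34.0])
print(mu_hat, sigma_hat)
```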