Machine learning
Model evaluation metrics
Mean absolute error
\( MAE = \frac{1}{n}\sum_{i=1}^{n} \left | y_{i}-\hat{y}_{i}\right |\)
np.mean(np.absolute(test_y_ - test_y))
\(\hat{y}_{i}\): the predicted value of \(y_{i}\)
Root Mean Squared Error
\(RMSE= \sqrt{\frac{1}{n}\sum_{i=1}^{n} (y_{i}-\hat{y}_{i})^2}\)
np.sqrt(np.mean((test_y_ - test_y) ** 2))
Mean squared error
\(MSE= \frac{1}{n}\sum_{i=1}^{n} (y_{i}-\hat{y}_{i})^2\)
np.mean((test_y_ - test_y) ** 2)
Relative Absolute Error
\( RAE = \frac{\sum_{i=1}^{n} \left | y_{i}-\hat{y}_{i}\right |}{\sum_{i=1}^{n} \left | y_{i}-\bar{y}\right |}\)
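A minimal NumPy sketch of RAE, assuming test_y holds the actual values and test_y_ the predictions (as in the other metric snippets here):
np.sum(np.absolute(test_y_ - test_y)) / np.sum(np.absolute(test_y - np.mean(test_y)))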
Relative Squared Error
\( RSE = \frac{\sum_{i=1}^{n} \left ( y_{i}-\hat{y}_{i}\right )^{2}}{\sum_{i=1}^{n} \left (y_{i}-\bar{y}\right )^{2}}\)
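A matching sketch for RSE, under the same test_y / test_y_ assumption:
np.sum((test_y_ - test_y) ** 2) / np.sum((test_y - np.mean(test_y)) ** 2)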
\(R^2\)
\( R^{2} = 1 - RSE\)
from sklearn.metrics import r2_score
r2_score(test_y, test_y_)
Supervised learning
Classification
K-Nearest neighbors
from sklearn.neighbors import KNeighborsClassifier
k = 4
Train Model and Predict
neigh = KNeighborsClassifier(n_neighbors = k).fit(X_train,y_train)
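The node above mentions prediction as well; a minimal sketch using the fitted model (standard scikit-learn predict, with X_test named as in the train/test split under Prepare data):
yhat = neigh.predict(X_test)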
Logistic regression
hypothesis function: logistic/sigmoid function
\( h_{\theta}(x) = g(\theta^{T}x) \)
\( z = \theta^{T}x\)
\( g(z) = \frac{1}{1+e^{-z}} \)
decision boundary:
when \( \theta^{T}x \geq 0\), \(g(z) \geq 0.5\), so predict y = 1
and vice versa: when \( \theta^{T}x < 0\), predict y = 0
The cost function for logistic regression is log-based: the cost becomes very large when the prediction differs from the actual label.
\( J(\theta) = \frac{1}{m} \sum_{i=1}^{m} Cost(h_{\theta}(x^{(i)}),y^{(i)})\)
\(Cost(h_{\theta}(x),y) = -log(h_{\theta}(x))\) if y =1;
\(Cost(h_{\theta}(x),y) = -log(1-h_{\theta}(x))\) if y =0
With the two expressions above, \(cost \to \infty\) as \(h_{\theta}(x) \to (1-y) \)
The two cost expressions above are equivalent to:
\(Cost(h_{\theta}(x),y) = -(y)log(h_{\theta}(x)) -(1-y)log(1-h_{\theta}(x))\)
\( J(\theta) = -\frac{1}{m} \sum^{m}_{i=1}[y^{(i)}log(h_{\theta}(x^{(i)})) + (1-y^{(i)})log(1-h_{\theta}(x^{(i)}))] \)
with \( h = g(X\theta)\):
\( J(\theta) = \frac{1}{m} \left( -y^{T}log(h) - (1-y)^{T}log(1-h) \right) \)
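A minimal NumPy sketch of this vectorized cost; X is assumed to include the intercept column and y to be a 0/1 vector (names are illustrative):
import numpy as np
def logistic_cost(theta, X, y):
    h = 1 / (1 + np.exp(-X @ theta))  # h = g(X theta)
    m = len(y)
    return (-y @ np.log(h) - (1 - y) @ np.log(1 - h)) / m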
Gradient descent
\( \theta_{j} := \theta_{j} - \alpha\frac{1}{m} \sum_{i=1}^{m}(h_{\theta }(x^{(i)})-y^{(i)})x^{(i)}_{j} \)
\(\theta := \theta - \frac{\alpha}{m} X^{T}(g(X\theta) - \vec{y}) \)
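A sketch of this vectorized update loop, under the same assumptions as the cost sketch above; alpha and n_iters are illustrative choices:
def logistic_gradient_descent(X, y, alpha=0.1, n_iters=1000):
    theta = np.zeros(X.shape[1])
    m = len(y)
    for _ in range(n_iters):
        h = 1 / (1 + np.exp(-X @ theta))        # g(X theta)
        theta -= (alpha / m) * (X.T @ (h - y))  # theta := theta - (alpha/m) X^T (g(X theta) - y)
    return theta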
Regression
Simple Regression
Linear
For polynomial regression, first transform the features:
from sklearn.preprocessing import PolynomialFeatures
poly = PolynomialFeatures(degree=2)
train_x = poly.fit_transform(train_x)
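When predicting later, apply the same transform to the test features first; a small reminder sketch (test_x as built in the predict node below):
test_x = poly.transform(test_x)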
train
from sklearn import linear_model
regr = linear_model.LinearRegression()
train_x = np.asanyarray(train[['ENGINESIZE']])
train_y = np.asanyarray(train[['CO2EMISSIONS']])
regr.fit(train_x, train_y)
print ('Coefficients: ', regr.coef_)
print ('Intercept: ',regr.intercept_)
predict
test_x = np.asanyarray(test[['ENGINESIZE']])
test_y = np.asanyarray(test[['CO2EMISSIONS']])
test_y_ = regr.predict(test_x)
Linear regression analysis is used to predict the value of a variable based on the value of another variable.
Non-Linear
Cubic
y = 1*(x**3) + 1*(x**2) + 1*x + 3
Quadratic
y = np.power(x,2)
Exponential
Y= a + b*np.exp(X)
Logarithmic
Y = np.log(X)
Sigmoidal/Logistic
Y = 1-4/(1+np.power(3, X-2))
curve_fit
from scipy.optimize import curve_fit
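curve_fit needs the model function; the map does not define sigmoid, so here is a minimal sketch of one common logistic parameterization (beta_1, beta_2 are illustrative names matching the printout below):
import numpy as np
def sigmoid(x, beta_1, beta_2):
    # logistic curve with slope beta_1 and midpoint beta_2 (assumed form)
    return 1 / (1 + np.exp(-beta_1 * (x - beta_2)))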
popt, pcov = curve_fit(sigmoid, xdata, ydata)
print the final parameters
print(" beta_1 = %f, beta_2 = %f" % (popt[0], popt[1]))
predict using test set
y_hat = sigmoid(test_x, *popt)
Multiple Regression
Linear
from sklearn import linear_model
regr = linear_model.LinearRegression()
x = np.asanyarray(train[['ENGINESIZE','CYLINDERS','FUELCONSUMPTION_COMB']])
y = np.asanyarray(train[['CO2EMISSIONS']])
regr.fit (x, y)
print ('Coefficients: ', regr.coef_)
Theory
Cost function:
\( J(\theta_{0} ,\theta_{1} ) = \frac{1}{2m} \sum_{i=1}^{m}(\hat{y_{i}}-y_{i})^{2}= \frac{1}{2m} \sum_{i=1}^{m}(h_{\theta }(x_{i})-y_{i})^{2} \)
gradient descent
theta := theta - learningRate * (partial derivative of the cost function with respect to theta):
\( \theta_{j} := \theta_{j} - \alpha\frac{\partial }{\partial\theta_{j}}J(\theta_{0} ,\theta_{1} ) \)
\( \theta_{0} := \theta_{0} - \alpha\frac{1}{m} \sum_{i=1}^{m}(h_{\theta }(x_{i})-y_{i})\)
\( \theta_{1} := \theta_{1} - \alpha\frac{1}{m} \sum_{i=1}^{m}(h_{\theta }(x_{i})-y_{i})x_{i} \)
where \( x^{(i)}_{j} \) is the value of feature j in the \(i^{th}\) training example
\( \theta_{j} := \theta_{j} - \alpha\frac{1}{m} \sum_{i=1}^{m}(h_{\theta }(x^{(i)})-y^{(i)})x^{(i)}_{j} \)
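A compact NumPy sketch of this batch update for linear regression; X is assumed to already carry a leading column of ones for \(\theta_{0}\):
import numpy as np
def gradient_descent(X, y, alpha=0.01, n_iters=1000):
    theta = np.zeros(X.shape[1])
    m = len(y)
    for _ in range(n_iters):
        error = X @ theta - y                 # h_theta(x) - y for every example
        theta -= (alpha / m) * (X.T @ error)  # simultaneous update of all theta_j
    return theta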
hypothesis function (linear)
\(h_{\theta }(x) = \theta_{0} + \theta_{1}x_{1}\)
\(h_{\theta }(x) = \theta_{0} + \theta_{1}x_{1} + \theta_{2}x_{2} + ... + \theta_{n}x_{n}\)
\(h_{\theta }(x) = \theta^{T}X\)
Feature scaling
\(x_{i}= \frac{x_{i}-\mu_{i}}{s_{i}}\)
where \( \mu_{i}\) is avg(x) and \(s_{i}\) is the range (max(x) - min(x))
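A one-line NumPy sketch of this scaling for a feature matrix x, applied column-wise with the range as \(s_{i}\):
x_scaled = (x - x.mean(axis=0)) / (x.max(axis=0) - x.min(axis=0))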
Learning rate \( \alpha \): too small makes training slow; too large can keep gradient descent from converging
Normal equation: instead of using gradient descent to search for \(\theta\) iteratively, it can be computed directly:
\(\theta = (X^{T}X)^{-1}X^{T}y\)
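A direct NumPy sketch of the normal equation; pinv keeps it safe when \(X^{T}X\) is singular:
theta = np.linalg.pinv(X.T @ X) @ X.T @ y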
Avoid overfitting with regularization (adds the term \(\frac{\lambda}{m}\theta_{j}\) inside the update):
\( \theta_{j} := \theta_{j} - \alpha\left[\frac{1}{m} \sum_{i=1}^{m}(h_{\theta }(x^{(i)})-y^{(i)})x^{(i)}_{j} + \frac{\lambda}{m}\theta_{j}\right]\)
the normal equation then becomes:
\(\theta = (X^{T}X +\lambda L)^{-1}X^{T}y \)
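A sketch of the regularized version, assuming L is the identity with a 0 in the top-left so the intercept is not penalized (lam is an illustrative value):
n = X.shape[1]
L = np.eye(n)
L[0, 0] = 0   # do not regularize the intercept term
lam = 1.0     # illustrative regularization strength
theta = np.linalg.inv(X.T @ X + lam * L) @ X.T @ y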
Non-Linear
Prepare data
Split into train and test sets
from sklearn.model_selection import train_test_split
X_train, X_test, y_train, y_test = train_test_split( X, y, test_size=0.2, random_state=4)
print ('Train set:', X_train.shape, y_train.shape)
print ('Test set:', X_test.shape, y_test.shape)
msk = np.random.rand(len(df)) < 0.8
train = cdf[msk]
test = cdf[~msk]
Normalize Data
from sklearn import preprocessing
X = preprocessing.StandardScaler().fit(X).transform(X.astype(float))
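In practice the scaler is usually fit on the training split only and then applied to both splits; a sketch with the variables from the split above:
scaler = preprocessing.StandardScaler().fit(X_train.astype(float))
X_train = scaler.transform(X_train.astype(float))
X_test = scaler.transform(X_test.astype(float))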
Unsupervised learning
Clustering