Machine learning
Evaluation metrics
Mean Absolute Error
\( MAE = \frac{1}{n}\sum_{i=1}^{n} \left | y_{i}-\hat{y}_{i}\right |\)
np.mean(np.absolute(test_y_ - test_y))
\(\hat{y}_{i}\): predicted value of \(y_{i}\)
Root Mean Squared Error
\(RMSE= \sqrt{\frac{1}{n}\sum_{i=1}^{n} (y_{i}-\hat{y}_{i})^2}\)
np.sqrt(np.mean((test_y_ - test_y) ** 2))
Mean Squared Error
\(MSE= \frac{1}{n}\sum_{i=1}^{n} (y_{i}-\hat{y}_{i})^2\)
np.mean((test_y_ - test_y) ** 2)
Relative Absolute Error
\( RAE = \frac{\sum_{i=1}^{n} \left | y_{i}-\hat{y}_{i}\right |}{\sum_{i=1}^{n} \left | y_{i}-\bar{y}\right |}\), where \(\bar{y}\) is the mean of the actual values
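A minimal NumPy sketch of RAE, assuming test_y holds the actual values and test_y_ the predictions as in the metrics above:
np.sum(np.absolute(test_y_ - test_y)) / np.sum(np.absolute(test_y - np.mean(test_y)))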
Relative Squared Error
\( RSE = \frac{\sum_{i=1}^{n} \left ( y_{i}-\hat{y}_{i}\right )^{2}}{\sum_{i=1}^{n} \left (y_{i}-\bar{y}\right )^{2}}\)
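A minimal NumPy sketch of RSE under the same assumptions:
np.sum((test_y_ - test_y) ** 2) / np.sum((test_y - np.mean(test_y)) ** 2)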
\(R^2\)
\( R^{2} = 1 - RSE\)
from sklearn.metrics import r2_score
r2_score(test_y, test_y_)
Supervised learning
Classification
K-Nearest neighbors
from sklearn.neighbors import KNeighborsClassifier
k = 4
Train Model and Predict
neigh = KNeighborsClassifier(n_neighbors = k).fit(X_train,y_train)
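A minimal sketch of the prediction and accuracy-check step, assuming X_test and y_test come from a train_test_split as in the Prepare data branch:
from sklearn import metrics
yhat = neigh.predict(X_test)  # predict labels for the held-out samples
print("Test set accuracy:", metrics.accuracy_score(y_test, yhat))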
Regression
Simple Regression
Linear
If using polynomial regression, expand the features first:
from sklearn.preprocessing import PolynomialFeatures
poly = PolynomialFeatures(degree=2)
train_x = poly.fit_transform(train_x)
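A minimal sketch of fitting a linear model on the expanded features above, assuming train_x and train_y are prepared as in the Linear node of this branch (poly_regr is a hypothetical name):
from sklearn import linear_model
poly_regr = linear_model.LinearRegression()
poly_regr.fit(train_x, train_y)  # after fit_transform, train_x holds the degree-2 feature columns
print ('Coefficients: ', poly_regr.coef_)
print ('Intercept: ', poly_regr.intercept_)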
train
from sklearn import linear_model
regr = linear_model.LinearRegression()
train_x = np.asanyarray(train[['ENGINESIZE']])
train_y = np.asanyarray(train[['CO2EMISSIONS']])
regr.fit(train_x, train_y)
print ('Coefficients: ', regr.coef_)
print ('Intercept: ',regr.intercept_)
predict
test_x = np.asanyarray(test[['ENGINESIZE']])
test_y = np.asanyarray(test[['CO2EMISSIONS']])
test_y_ = regr.predict(test_x)
Linear regression analysis is used to predict the value of a variable based on the value of another variable.
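A short sketch tying the predictions above back to the evaluation metrics defined earlier, using the test_y and test_y_ arrays from this node:
import numpy as np
from sklearn.metrics import r2_score
print("Mean absolute error: %.2f" % np.mean(np.absolute(test_y_ - test_y)))
print("Mean squared error: %.2f" % np.mean((test_y_ - test_y) ** 2))
print("R2-score: %.2f" % r2_score(test_y, test_y_))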
Non-Linear
Cubic
y = 1*(x**3) + 1*(x**2) + 1*x + 3
Quadratic
y = np.power(x,2)
Exponential
Y = a + b*np.exp(X)
Logarithmic
Y = np.log(X)
Sigmoidal/Logistic
Y = 1-4/(1+np.power(3, X-2))
curve_fit
from scipy.optimize import curve_fit
popt, pcov = curve_fit(sigmoid, xdata, ydata)
print the final parameters
print(" beta_1 = %f, beta_2 = %f" % (popt[0], popt[1]))
predict using test set
y_hat = sigmoid(test_x, *popt)
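curve_fit needs the model function itself; a minimal sketch of a sigmoid definition following the logistic form in this branch (the parameter names beta_1 and beta_2 are assumptions):
import numpy as np
def sigmoid(x, beta_1, beta_2):
    # logistic curve: beta_1 sets the steepness, beta_2 the midpoint
    return 1 / (1 + np.exp(-beta_1 * (x - beta_2)))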
Multiple Regression
Linear
from sklearn import linear_model
regr = linear_model.LinearRegression()
x = np.asanyarray(train[['ENGINESIZE','CYLINDERS','FUELCONSUMPTION_COMB']])
y = np.asanyarray(train[['CO2EMISSIONS']])
regr.fit (x, y)
print ('Coefficients: ', regr.coef_)
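A minimal prediction and evaluation sketch for the multiple-regression fit above, assuming a test DataFrame with the same columns:
import numpy as np
test_x = np.asanyarray(test[['ENGINESIZE','CYLINDERS','FUELCONSUMPTION_COMB']])
test_y = np.asanyarray(test[['CO2EMISSIONS']])
y_hat = regr.predict(test_x)
print("Mean squared error: %.2f" % np.mean((y_hat - test_y) ** 2))
print("Variance score (R2): %.2f" % regr.score(test_x, test_y))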
Non-Linear
Prepare data
Split into train and test sets
from sklearn.model_selection import train_test_split
X_train, X_test, y_train, y_test = train_test_split( X, y, test_size=0.2, random_state=4)
print ('Train set:', X_train.shape, y_train.shape)
print ('Test set:', X_test.shape, y_test.shape)
msk = np.random.rand(len(cdf)) < 0.8  # random ~80/20 split mask
train = cdf[msk]
test = cdf[~msk]
Normalize Data
from sklearn import preprocessing
X = preprocessing.StandardScaler().fit(X).transform(X.astype(float))
Unsupervised learning
Clustering
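The Clustering branch has no code yet; a minimal k-means sketch in the same scikit-learn style as the rest of the map (the feature matrix X and n_clusters=3 are assumptions):
from sklearn.cluster import KMeans
k_means = KMeans(init="k-means++", n_clusters=3, n_init=12)
k_means.fit(X)          # assign each sample to one of the 3 clusters
labels = k_means.labels_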