- Simple Linear Regression:
* Predictor (independent) variable: X
* Target (dependent) variable: Y
The Equation:
\[y = b_{0} + b_{1}X\]
b_{0}: intercept
b_{1}: slope
The Code: fitting a simple linear model estimator (X: predictor, Y: target)
# Import linear_model from scikit-learn
from sklearn.linear_model import LinearRegression
# Create a linear regression object using the constructor
lm = LinearRegression()
# Define the predictor and target variables
X = df[['highway-mpg']]
Y = df['price']
# Use lm.fit(X, Y) to fit the model
lm.fit(X, Y)
lm is now a fitted SLR (simple linear regression) estimator, and we can obtain a prediction.
# To view the intercept (b0): lm.intercept_
# To view the slope (b1): lm.coef_
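Putting these together, a minimal sketch (assuming the df, X, and Y defined above):
# Generate predictions for the training inputs
Yhat = lm.predict(X)
# Inspect the fitted parameters
print(lm.intercept_)   # b0
print(lm.coef_)        # b1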
- Multiple Linear Regression:
The equation:
\[y = b_{0} + b_{1}X_{1} + b_{2}X_{2} + b_{3}X_{3} + b_{4}X_{4}\]
b_{1} is the coefficient of X_{1}, and so on.
The Code: fitting a multiple linear regression:
# Extract the 4 predictor variables and store them in the variable z
z = df[['horsepower', 'curb-weight', 'engine-size', 'highway-mpg']]
# Train the model
lm.fit(z, df['price'])
# Obtain a prediction
Yhat = lm.predict(z)
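As with SLR, the fitted parameters can be inspected after training; a small sketch (assuming the z and lm above):
# b0: the intercept
print(lm.intercept_)
# b1..b4: one coefficient per predictor, in the column order of z
print(lm.coef_)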
Model Evaluation using Visualization
The main benefit of using a regression plot is that it gives a good estimate of:
- the relationship between two variables
- the strength of the correlation
- the direction of the relationship (positive or negative)
The horizontal axis is the independent variable, while the vertical axis is the dependent variable.
Regression Plot
The Code:
import matplotlib.pyplot as plt
import seaborn as sns
sns.regplot(x="highway-mpg", y="price", data=df)
plt.ylim(0,)
Polynomial Regression:
Quadratic-2nd order
\[y = b_{0} + b_{1}X_{1} + b_{2}X_{1}^{2}\]
Cubic-3rd order
\[y = b_{0} + b_{1}X_{1}+ b_{2}X_{1}^{2}+ b_{3}X_{1}^{3}\]
Higher order
\[y = b_{0} + b_{1}X_{1}+ b_{2}X_{1}^{2}+ b_{3}X_{1}^{3}+...\]
The Code:
- Calculate a polynomial fit of 3rd order:
import numpy as np
f = np.polyfit(x, y, 3)
p = np.poly1d(f)
- Print out the model:
print(p)
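The object p returned by np.poly1d is callable, so it can be evaluated directly to get predictions; a minimal sketch (assuming the x above):
# Evaluate the fitted 3rd-order polynomial at the observed x values
Yhat = p(x)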
- Polynomial Regression with more than one dimension:
from sklearn.preprocessing import PolynomialFeatures
pr = PolynomialFeatures(degree=2, include_bias=False)
x_poly = pr.fit_transform(x[['horsepower', 'curb-weight']])
# A small example on one sample, with and without the bias term:
pr = PolynomialFeatures(degree=2)
pr.fit_transform([[1, 2]])
pr = PolynomialFeatures(degree=2, include_bias=False)
pr.fit_transform([[1, 2]])
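For reference, with a sample [x1, x2] at degree 2 the generated features are 1, x1, x2, x1^2, x1*x2, x2^2 (the leading 1 is dropped when include_bias=False), so the two calls above return:
# With bias:    [[1. 1. 2. 1. 2. 4.]]
# Without bias: [[1. 2. 1. 2. 4.]]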
- Pre-processing
1- Normalize each feature:
from sklearn.preprocessing import StandardScaler
# Create the scaler object
SCALE = StandardScaler()
# Fit the scaler on the selected columns
SCALE.fit(x_data[['horsepower', 'highway-mpg']])
# Transform the data
x_scale = SCALE.transform(x_data[['horsepower', 'highway-mpg']])
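The two steps can also be combined with fit_transform; a minimal sketch (assuming the x_data above):
# Fit the scaler and transform the data in one call
x_scale = SCALE.fit_transform(x_data[['horsepower', 'highway-mpg']])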
Pipelines
There are many steps to getting a prediction:
[Normalization ----> Polynomial Transform] -----> Linear Regression
        transformations                              prediction
# Import everything the pipeline needs
from sklearn.preprocessing import PolynomialFeatures
from sklearn.linear_model import LinearRegression
from sklearn.preprocessing import StandardScaler
from sklearn.pipeline import Pipeline
# Create the list of (name, estimator) pipeline steps
Input = [('scale', StandardScaler()), ('polynomial', PolynomialFeatures(degree=2)), ('model', LinearRegression())]
# Pipeline constructor
pipe = Pipeline(Input)
# Train the pipeline
pipe.fit(df[['horsepower', 'curb-weight', 'engine-size', 'highway-mpg']], y)
# Obtain predictions from the trained pipeline
yhat = pipe.predict(df[['horsepower', 'curb-weight', 'engine-size', 'highway-mpg']])
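At both fit and predict time the pipeline applies each step in order, so scaling and the polynomial transform happen automatically before the regression; a quick sanity check (assuming the fit above):
# Inspect the first few predictions
print(yhat[0:4])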