Model development

 



- Simple Linear Regression:

* Predictor (independent) variable: X

* Target (dependent) variable: Y

The Equation:

\[Y = b_{0} + b_{1}X\]

b_{0}: the intercept

b_{1}: the slope
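As a quick numeric illustration (the intercept and slope values below are invented for the example, not fitted to any data), a prediction is just the line evaluated at a value of X:

```python
# Hypothetical fitted parameters (illustrative values only)
b0 = 38.0   # intercept
b1 = -0.8   # slope

x = 30.0              # a predictor value, e.g. highway-mpg = 30
y_hat = b0 + b1 * x   # evaluate the regression line
print(y_hat)          # 14.0
```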

The Code: fitting a simple linear model estimator (X: predictor, Y: target)

# Import LinearRegression from scikit-learn
from sklearn.linear_model import LinearRegression

# Create a linear regression object using the constructor
lm = LinearRegression()

# Define the predictor and target variables
X = df[['highway-mpg']]
Y = df['price']

# Use lm.fit(X, Y) to fit the model
lm.fit(X, Y)

SLR: the fitted simple linear regression estimator

Now we can obtain a prediction:

Yhat = lm.predict(X)

# To view the intercept (b0): lm.intercept_
# To view the slope (b1): lm.coef_
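To see these pieces working end to end, here is a self-contained sketch on synthetic data (random numbers stand in for df[['highway-mpg']] and df['price'], which are not defined in this snippet):

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Synthetic stand-in for df[['highway-mpg']] and df['price']
rng = np.random.default_rng(0)
X = rng.uniform(15, 45, size=(50, 1))                     # predictor
Y = 38000.0 - 800.0 * X[:, 0] + rng.normal(0, 500, 50)    # noisy target

lm = LinearRegression()
lm.fit(X, Y)

print(lm.intercept_)   # close to 38000 (the true b0 above)
print(lm.coef_)        # close to [-800] (the true b1 above)
Yhat = lm.predict(X)   # one prediction per sample
```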


- Multiple Linear Regression

The equation:
\[y = b_{0} + b_{1}X_{1} + b_{2}X_{2} + b_{3}X_{3} + b_{4}X_{4}\]

b_{1} is the coefficient of X_{1}, b_{2} of X_{2}, and so on.

The Code: fitting a multiple linear regression:

# Extract the 4 predictor variables and store them in the variable z
z = df[['horsepower', 'curb-weight', 'engine-size', 'highway-mpg']]

# Train the model
lm.fit(z, df['price'])

# Obtain a prediction
Yhat = lm.predict(z)
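After fitting, lm.intercept_ holds b0 and lm.coef_ holds one coefficient per predictor column, in the same order as the columns. A minimal sketch, using four random synthetic columns in place of df (the coefficient values are chosen for the example):

```python
import numpy as np
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(1)
Z = rng.normal(size=(100, 4))                     # 4 synthetic predictor columns
y = 10.0 + Z @ np.array([2.0, -3.0, 0.5, 1.5])   # exact linear target, no noise

lm = LinearRegression()
lm.fit(Z, y)

print(lm.intercept_)   # ~10.0  (b0)
print(lm.coef_)        # ~[ 2.  -3.   0.5  1.5], one coefficient per column
```

Because the target is exactly linear in the predictors, ordinary least squares recovers the coefficients to numerical precision.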

Model Evaluation using Visualization

The main benefit of using a regression plot is that it gives us a good estimate of:
- the relationship between two variables
- the strength of the correlation
- the direction of the relationship (positive or negative)

The horizontal axis is the independent variable, while the vertical axis is the dependent variable.

Regression Plot
The code:
import matplotlib.pyplot as plt
import seaborn as sns

sns.regplot(x="highway-mpg", y="price", data=df)
plt.ylim(0,)

Polynomial Regression: 

Quadratic (2nd order)
\[y = b_{0} + b_{1}X_{1} + b_{2}X_{1}^{2}\]




Cubic (3rd order)
\[y = b_{0} + b_{1}X_{1}+ b_{2}X_{1}^{2}+ b_{3}X_{1}^{3}\]




Higher order
\[y = b_{0} + b_{1}X_{1}+ b_{2}X_{1}^{2}+ b_{3}X_{1}^{3}+...\]


The Code:
- Calculate a polynomial of 3rd order:
import numpy as np

f = np.polyfit(x, y, 3)
p = np.poly1d(f)

- Print out the model:
print(p)
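Since p is a np.poly1d object, it can also be called directly to evaluate the fitted polynomial at any point. A minimal sketch on exact (noise-free) cubic data:

```python
import numpy as np

x = np.linspace(-3, 3, 20)
y = 1 + 2 * x - x**3       # exact cubic, no noise

f = np.polyfit(x, y, 3)    # coefficients, highest degree first
p = np.poly1d(f)

print(p)                   # pretty-printed polynomial
print(p(2.0))              # evaluate at x = 2: 1 + 4 - 8 = -3
```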

- Polynomial regression with more than one dimension:

from sklearn.preprocessing import PolynomialFeatures

pr = PolynomialFeatures(degree=2, include_bias=False)
x_poly = pr.fit_transform(x[['horsepower', 'curb-weight']])

# Transforming a single sample:
pr.fit_transform([[1, 2]])
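To see exactly what features are generated: with degree=2 and include_bias=False, an input [x1, x2] is expanded to [x1, x2, x1², x1·x2, x2²]. For the sample [1, 2]:

```python
from sklearn.preprocessing import PolynomialFeatures

pr = PolynomialFeatures(degree=2, include_bias=False)
out = pr.fit_transform([[1, 2]])
print(out)   # [[1. 2. 1. 2. 4.]]  ->  x1, x2, x1^2, x1*x2, x2^2
```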


- Pre-processing

1- Normalize each feature:

from sklearn.preprocessing import StandardScaler

# Create the scaler
SCALE = StandardScaler()
# Fit it to the data
SCALE.fit(x_data[['horsepower', 'highway-mpg']])
# Transform the data
x_scale = SCALE.transform(x_data[['horsepower', 'highway-mpg']])
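StandardScaler subtracts each column's mean and divides by its standard deviation, so every scaled column ends up with mean ≈ 0 and standard deviation ≈ 1. A minimal sketch, with random synthetic columns standing in for x_data:

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(2)
x_data = rng.uniform(50, 250, size=(100, 2))   # synthetic stand-in columns

scale = StandardScaler()
scale.fit(x_data)                  # learn per-column mean and std
x_scaled = scale.transform(x_data)

print(x_scaled.mean(axis=0))   # ~[0. 0.]
print(x_scaled.std(axis=0))    # ~[1. 1.]
```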

Pipelines

There are many steps to getting a prediction; a pipeline chains them together:

[Normalization ----> Polynomial Transform] -----> Linear Regression
            Transformations                           Predictions

# Import everything the pipeline needs
from sklearn.preprocessing import PolynomialFeatures
from sklearn.preprocessing import StandardScaler
from sklearn.linear_model import LinearRegression
from sklearn.pipeline import Pipeline

# Create a list of (name, estimator) tuples for the pipeline constructor
Input = [('scale', StandardScaler()), ('polynomial', PolynomialFeatures(degree=2)), ('model', LinearRegression())]

# Pipeline constructor
pipe = Pipeline(Input)

# Train the pipeline
pipe.fit(df[['horsepower', 'curb-weight', 'engine-size', 'highway-mpg']], y)
yhat = pipe.predict(df[['horsepower', 'curb-weight', 'engine-size', 'highway-mpg']])
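Putting the whole pipeline together in a runnable form (df is not defined in these notes, so four random synthetic columns stand in for the predictors):

```python
import numpy as np
from sklearn.preprocessing import StandardScaler, PolynomialFeatures
from sklearn.linear_model import LinearRegression
from sklearn.pipeline import Pipeline

rng = np.random.default_rng(3)
Z = rng.normal(size=(80, 4))               # stand-in for the 4 predictor columns
y = 5.0 + Z[:, 0] + 0.5 * Z[:, 1] ** 2     # a mildly nonlinear target

pipe = Pipeline([('scale', StandardScaler()),
                 ('polynomial', PolynomialFeatures(degree=2)),
                 ('model', LinearRegression())])

pipe.fit(Z, y)            # scale -> polynomial features -> regression
yhat = pipe.predict(Z)
print(yhat.shape)         # (80,)
```

Since the target is a degree-2 polynomial of the inputs, the pipeline can fit it essentially exactly.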


About Inas AL-Kamachy