Regularization: Ridge and Lasso - Linear Regression

Today's Agenda

  1. What is Regularization?
  2. Quick recap of linear regression
  3. Ridge Regression Details
  4. Lasso Regression Details
  5. Implementation of Lasso and Ridge Regression
  6. When to Use Lasso and Ridge Regression?

What is Regularization?

Regularization is a method of reducing your chances of overfitting your model. Ridge regression and lasso regression are two ways you can take a regression model and reduce overfitting. Regularization does this by adding an additional constraint (a penalty on large coefficients) to the model.

Quick recap of linear regression

Remember, linear regression fits a line of the form y = mx + b by choosing the slope and intercept that minimize the sum of squared errors (ordinary least squares).

Ridge Regression Details

Ridge regression uses the same ordinary least squares objective, but with an additional constraint: it chooses coefficients that fit the data well while also keeping the coefficients as small as possible. In other words, the slope (or coefficient) of every independent variable is pulled toward zero. The strength of this penalty is controlled by the regularization parameter, denoted alpha, which can be any non-negative number. Ridge regression is also called L2 regularization because the penalty is the sum of the squared coefficients. Increasing alpha pushes the coefficients closer to zero, and an alpha of zero makes the model essentially an ordinary linear regression. You should choose your alpha value by experimenting with different values and seeing which one gives you the highest r-squared or adjusted r-squared value on held-out data.
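The alpha-selection procedure described above can be sketched as a simple loop. This is a minimal illustration on synthetic data (the dataset and the candidate alpha values here are assumptions for the example, not part of the original notebook):

```python
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.model_selection import train_test_split

# Synthetic data (assumed for illustration): y depends on two features plus noise.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 2))
y = 3 * X[:, 0] + 0.5 * X[:, 1] + rng.normal(scale=0.5, size=200)

X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Try several alpha values and keep the one with the best r-squared on the test set.
best_alpha, best_score = None, -np.inf
for alpha in [0.01, 0.1, 1, 10, 100]:
    score = Ridge(alpha=alpha).fit(X_train, y_train).score(X_test, y_test)
    if score > best_score:
        best_alpha, best_score = alpha, score

print(best_alpha, best_score)
```

With real data you would typically use cross-validation rather than a single train/test split, but the idea is the same: alpha is tuned empirically.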

Lasso Regression Details

An alternative to ridge regression is lasso regression, also called L1 regularization. Lasso regression differs from ridge regression in that, whereas ridge regression only pushes coefficients toward zero, lasso regression can push coefficients EXACTLY TO ZERO. When a coefficient is pushed to zero, that independent variable has no effect on the model. This can show you which variables should be in your model. Lasso also has a regularization parameter called alpha: the higher the alpha, the more variables are pushed to zero, and an alpha of 0 is the same as not having regularization at all. You should choose your alpha value by experimenting with different values of alpha and seeing which one gives you the highest r-squared or adjusted r-squared value.
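The zeroing-out behavior described above is easy to see on synthetic data. This is a minimal sketch (the data and the alpha value are assumptions chosen so the effect is visible):

```python
import numpy as np
from sklearn.linear_model import Lasso

# Synthetic data (assumed for illustration): only the first of three
# features actually drives y; the other two are pure noise.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))
y = 5 * X[:, 0] + rng.normal(scale=0.5, size=200)

model = Lasso(alpha=0.5).fit(X, y)
print(model.coef_)  # coefficients of the irrelevant features are driven to zero
```

The fitted coefficient vector keeps a large value for the informative feature and exact zeros for the irrelevant ones, which is how lasso doubles as a variable-selection tool.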

Implementation of Lasso and Ridge Regression in Python

In [62]:
import pandas as pd
from sklearn.linear_model import Lasso
from sklearn.linear_model import Ridge
from sklearn.model_selection import train_test_split
In [64]:
data = pd.read_csv("MOCK_Income_Data.csv")
In [65]:
data.head()
Out[65]:
Experience Bachelors Masters PhD Age Income
0 0 0 0 0 21 21000
1 0 0 0 0 22 23450
2 1 0 0 0 24 84000
3 1 0 0 0 25 29000
4 1 0 0 0 26 35000
In [66]:
X = data[["Experience","Bachelors","Masters","PhD","Age"]]
Y = data[["Income"]]
In [67]:
x_train, x_test, y_train, y_test = train_test_split(X,Y)
In [68]:
x_train
Out[68]:
Experience Bachelors Masters PhD Age
11 21 1 1 1 40
15 21 1 0 0 45
19 39 0 0 0 57
3 1 0 0 0 25
13 19 1 1 0 41
... ... ... ... ... ...
16 25 1 0 0 52
12 18 1 1 0 40
8 12 0 0 0 32
17 26 0 0 0 54
0 0 0 0 0 21
In [69]:
ridge_model = Ridge(alpha=1).fit(x_train,y_train)
In [70]:
ridge_model.score(x_test,y_test)
Out[70]:
0.7099549559117013
In [71]:
lasso_model = Lasso(alpha=1).fit(x_train,y_train)
In [72]:
lasso_model.score(x_test,y_test)
Out[72]:
0.5359029485524133

When to Use Lasso and Ridge Regression?

When you have a lot of data points (or rows of data), regularization becomes less important because your model has enough examples to learn to generalize well. However, if you have a lot of variables, then you may want to use lasso regularization to see which ones affect the dependent variable the most.

Lasso is easier to interpret than ridge because it reduces the number of variables. You can also use scikit-learn's ElasticNet class, which combines the lasso and ridge penalties, but then you have two regularization parameters to tune.
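As a quick sketch of the ElasticNet option mentioned above (synthetic data assumed for illustration): ElasticNet takes both an overall penalty strength and an l1_ratio that sets the mix between the lasso (L1) and ridge (L2) penalties.

```python
import numpy as np
from sklearn.linear_model import ElasticNet
from sklearn.model_selection import train_test_split

# Synthetic data (assumed for illustration): two informative features plus noise.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 4))
y = 2 * X[:, 0] - 3 * X[:, 1] + rng.normal(scale=0.5, size=200)

X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# alpha sets the overall penalty strength; l1_ratio=0.5 means an even
# mix of the lasso and ridge penalties (1.0 is pure lasso, 0.0 pure ridge).
model = ElasticNet(alpha=0.1, l1_ratio=0.5).fit(X_train, y_train)
print(model.score(X_test, y_test))
```

In practice you would tune alpha and l1_ratio together, for example by trying a grid of values the same way alpha was tuned for ridge and lasso above.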