
Regularization is a method of reducing your chances of overfitting your model. Ridge regression and lasso regression are two ways to take a regression model and reduce overfitting. Regularization does this by adding additional constraints to the model.

Remember, linear regression fits a line of the form y = mx + b (with one slope, or coefficient, per independent variable when there are several).
Ridge regression uses the same ordinary least squares method, but with an additional constraint: it chooses coefficients that fit the data well, but it also penalizes the sum of the squared coefficients, so it chooses the smallest coefficients possible. In other words, the slope (or coefficient) of every independent variable should be as close to zero as possible. Ridge regression uses the symbol alpha to denote the regularization parameter; alpha can be any non-negative number. Ridge regression is also called L2 regularization. Increasing alpha pushes the coefficients closer to zero, and an alpha of zero means the model is essentially just a regular linear regression. You should choose your alpha value by experimenting with different values and seeing which one gives you the highest r-squared or adjusted r-squared value.
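To see the shrinkage effect, here is a minimal sketch on made-up synthetic data (not the income data used later): as alpha grows, the fitted coefficients move toward zero.

import numpy as np
from sklearn.linear_model import Ridge

# Synthetic data: y depends on two features, plus a little noise
rng = np.random.default_rng(0)
X_demo = rng.normal(size=(100, 2))
y_demo = 3 * X_demo[:, 0] + 5 * X_demo[:, 1] + rng.normal(scale=0.5, size=100)

# As alpha grows, the fitted coefficients shrink toward zero
for alpha in [0.1, 1, 10, 100]:
    print(alpha, Ridge(alpha=alpha).fit(X_demo, y_demo).coef_)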
An alternative to ridge regression is lasso regression. Lasso regression differs from ridge regression in that ridge regression pushes coefficients toward zero, while lasso regression, which penalizes the sum of the absolute values of the coefficients (and is also called L1 regularization), can push coefficients exactly to zero. When a coefficient is pushed to zero, that independent variable has no effect on the model, which can show you which variables should be in your model. Lasso also has a regularization parameter called alpha: the higher the alpha, the more coefficients are pushed to zero, and an alpha of 0 is the same as having no regularization at all. As with ridge, you should choose your alpha value by experimenting with different values and seeing which one gives you the highest r-squared or adjusted r-squared value.
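Here is a similar minimal sketch, again on made-up synthetic data, where only the first feature actually drives the target; with a large enough alpha, lasso zeroes out the irrelevant coefficients.

import numpy as np
from sklearn.linear_model import Lasso

# Synthetic data: only the first of three features matters
rng = np.random.default_rng(0)
X_demo = rng.normal(size=(100, 3))
y_demo = 4 * X_demo[:, 0] + rng.normal(scale=0.5, size=100)

# The coefficients for the two irrelevant features typically land exactly at zero
print(Lasso(alpha=1).fit(X_demo, y_demo).coef_)

The example below applies both techniques to a mock income dataset.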
import pandas as pd
from sklearn.linear_model import Lasso, Ridge
from sklearn.model_selection import train_test_split

# Load the mock income dataset and preview the first few rows
data = pd.read_csv("MOCK_Income_Data.csv")
data.head()

# Independent variables (features) and the dependent variable (target)
X = data[["Experience", "Bachelors", "Masters", "PhD", "Age"]]
y = data["Income"]

# Hold out a test set so we can check how well each model generalizes
x_train, x_test, y_train, y_test = train_test_split(X, y, random_state=42)
x_train  # preview the training split

# Fit a ridge model with alpha=1 and score it (r-squared) on the test set
ridge_model = Ridge(alpha=1).fit(x_train, y_train)
ridge_model.score(x_test, y_test)

# Fit a lasso model with the same alpha for comparison
lasso_model = Lasso(alpha=1).fit(x_train, y_train)
lasso_model.score(x_test, y_test)
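Since the right alpha depends on your data, a simple way to experiment is to loop over candidate values and compare test-set r-squared scores. The loop below reuses the train/test split from the example above; scikit-learn's RidgeCV and LassoCV classes can also pick alpha for you via cross-validation.

# Try several alpha values and compare test-set r-squared scores
for alpha in [0.01, 0.1, 1, 10, 100]:
    ridge = Ridge(alpha=alpha).fit(x_train, y_train)
    lasso = Lasso(alpha=alpha).fit(x_train, y_train)
    print(alpha, ridge.score(x_test, y_test), lasso.score(x_test, y_test))

# Inspect which variables lasso pushed exactly to zero
print(lasso_model.coef_)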
When you have a lot of data points (or rows of data), regularization becomes less important because your model has enough data to learn patterns that generalize well. However, if you have a lot of variables, then you may want to use lasso regularization to see which ones affect the dependent variable the most.
Lasso is easier to interpret than ridge because it reduces the number of variables in the model. You can also use scikit-learn's ElasticNet class, which combines lasso and ridge, but then you have two regularization parameters to tune: alpha and l1_ratio.
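For completeness, here is a minimal ElasticNet sketch, reusing the train/test split from the example above (the alpha and l1_ratio values are only illustrative):

from sklearn.linear_model import ElasticNet

# alpha sets the overall penalty strength; l1_ratio balances the
# lasso (L1) and ridge (L2) parts of the combined penalty
enet_model = ElasticNet(alpha=1, l1_ratio=0.5).fit(x_train, y_train)
enet_model.score(x_test, y_test)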