Lasso Regression

A Python implementation of the Lasso Regression algorithm.


Prerequisites:

Linear Regression 

Gradient Descent


Introduction:

Lasso Regression is another linear model derived from Linear Regression, and it uses the same hypothesis function for prediction. The cost function of linear regression is represented by J:

J = (1/m) * Σ_{i=1}^{m} ( h(x(i)) − y(i) )²

Here, m denotes the total number of training examples in the dataset.

h(x(i)) represents the hypothesis function's prediction for the ith training example.

y(i) represents the value of the target variable for the ith training example.

The Linear Regression model treats all the features as equally relevant for prediction, even when a dataset contains many features and some of them are irrelevant to the predictive model. The model then becomes overly complex (it overfits) and predicts poorly on the test set. Such a high-variance model does not generalize to new data. This is where Lasso Regression steps in: it modifies the cost function of linear regression by adding an L1 penalty, equivalent to the absolute value of the magnitude of the weights. The adjusted cost function for Lasso Regression is given below.

J = (1/m) * Σ_{i=1}^{m} ( h(x(i)) − y(i) )² + λ * Σ_{j=1}^{n} |w(j)|

Here, w(j) represents the weight for the jth feature.

n is the number of features in the dataset.

lambda is the regularization strength.
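
Continuing the sketch above, the adjusted cost simply adds the L1 term; lam stands in for lambda, and again the names are illustrative.

import numpy as np

# Sketch of the Lasso cost: the squared-error term plus the L1 penalty
# lambda * sum(|w(j)|). Note that the intercept b is not penalized.
def lasso_cost(X, y, w, b, lam):
    m = len(y)
    predictions = X @ w + b
    mse = np.sum((predictions - y) ** 2) / m
    return mse + lam * np.sum(np.abs(w))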



Lasso Regression: What is it?

Lasso regression is a machine learning technique that performs linear regression while also minimizing the number of features included in the model. LASSO stands for Least Absolute Shrinkage and Selection Operator; keep an eye on the words "shrinkage" and "selection", as we will return to them shortly. Lasso regression is used in machine learning to prevent overfitting, and it can also be used to select features by shrinking their coefficients to exactly zero. Lasso regression is also known as L1-norm regularization.
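
As a quick illustration of the "selection" behaviour, here is a hedged sketch using scikit-learn's Lasso on synthetic data; the dataset and the alpha value (scikit-learn's name for the regularization strength) are arbitrary choices made for this example.

import numpy as np
from sklearn.datasets import make_regression
from sklearn.linear_model import Lasso

# Synthetic data where only 3 of 10 features actually matter.
X, y = make_regression(n_samples=100, n_features=10, n_informative=3,
                       noise=5.0, random_state=0)

model = Lasso(alpha=1.0).fit(X, y)
# Coefficients of irrelevant features are typically shrunk exactly to zero.
print("coefficients:", np.round(model.coef_, 2))
print("features selected out:", int(np.sum(model.coef_ == 0)))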


Lasso regression extends linear regression by adding a regularization term, equal to a regularization parameter multiplied by the sum of the absolute values of the weights, to the ordinary least-squares loss function. For this reason, lasso regression is also called regularized linear regression. The idea is to introduce a penalty against complexity while keeping the overall objective of minimizing the sum of squares: as the value of the regularization parameter rises, the weights shrink further. As in linear regression, the hypothesis or mathematical model (equation) for lasso regression can be written as follows; only the loss function differs.

h(x) = b + w(1)x1 + w(2)x2 + … + w(n)xn

Mathematical intuition:

During gradient descent optimization, the L1 penalty shrinks the weights to values close to zero or exactly zero. When a weight is reduced to zero, the corresponding feature is eliminated from the hypothesis function, so irrelevant features are excluded from the prediction model. Penalizing the weights makes the hypothesis simpler and promotes sparsity (a model with few non-zero parameters). The intercept is not penalized and is left unchanged. The hyperparameter lambda lets us regulate the strength of the regularization, and the same shrinkage factor is applied to all weights. If lambda is set to 0, lasso regression is equivalent to linear regression; as lambda grows toward infinity, all weights are shrunk to zero.
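
Both limiting cases can be checked empirically. The sketch below uses scikit-learn, where alpha plays the role of lambda; since Lasso does not accept alpha = 0, a very small value stands in for it, and the data is synthetic.

import numpy as np
from sklearn.datasets import make_regression
from sklearn.linear_model import Lasso, LinearRegression

X, y = make_regression(n_samples=100, n_features=5, noise=5.0, random_state=1)

# lambda close to 0: Lasso behaves like plain linear regression.
tiny = Lasso(alpha=1e-6, max_iter=100000).fit(X, y)
ols = LinearRegression().fit(X, y)
print("max coefficient difference:", np.max(np.abs(tiny.coef_ - ols.coef_)))

# Very large lambda: every weight is shrunk to zero.
huge = Lasso(alpha=1e6).fit(X, y)
print("all weights zero:", bool(np.all(huge.coef_ == 0)))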


LASSO: When to Use It:

Do you have any questions about when a LASSO model is preferable to a ridge regression model? Or perhaps you're debating between using a LASSO model and a typical regression model? In any case, we have you covered!

Everything you need to know to understand when to use a LASSO model is covered in this article. We begin by discussing the kinds of outcomes that LASSO models can handle. We then look in more detail at the primary benefits and drawbacks of LASSO models. Finally, we give specific examples of situations where using a LASSO model is and is not appropriate.



What kinds of results can LASSO manage?

What kinds of outcome variables do LASSO models support? When considering the kinds of outcomes that LASSO models can manage, it is important to keep in mind that the term "LASSO" does not refer to one specific model. Instead, it describes a family of models that arises when an L1 penalty is added to a standard regression model.

What does all of this mean? It means that there are several distinct LASSO model types, each of which handles a different type of outcome variable.


Can a LASSO model be used with a continuous outcome?

Can a LASSO model be used with a continuous outcome? Yes, and it is safe to say that this is the most commonly used type of LASSO model. It results from adding an L1 penalty to a conventional linear regression model.

When the word "LASSO" is used to describe a single type of model, it almost always means the model derived from ordinary linear regression.


Can a LASSO model be used with a binary outcome?

Can a LASSO model be used with a binary outcome? Given the prevalence of binary classification problems, this is another frequent question. Yes, there is a LASSO model that can be applied to problems with a binary outcome.


The LASSO model for binary outcomes is frequently referred to as a "logistic LASSO" regression model, just as the traditional regression model for binary outcomes is called a "logistic" regression model.
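
In scikit-learn, for instance, a logistic LASSO corresponds to logistic regression with an L1 penalty, which requires a solver that supports it (such as 'liblinear' or 'saga'). The sketch below uses synthetic data and arbitrary parameter values.

from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

X, y = make_classification(n_samples=200, n_features=10, n_informative=4,
                           random_state=0)

# L1-penalized ("logistic LASSO") classifier; C is the inverse of lambda,
# so smaller C means stronger shrinkage.
clf = LogisticRegression(penalty="l1", solver="liblinear", C=0.5).fit(X, y)
print("coefficients shrunk to zero:", int((clf.coef_ == 0).sum()))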


Benefits and drawbacks of LASSO:

What are some of the LASSO regression model's primary benefits and drawbacks? The main LASSO model benefits and drawbacks are listed below.


Benefits of LASSO regression:

Automatic feature selection. A LASSO regression model's key benefit is its ability to shrink the coefficients of uninteresting features to zero. This means that the model makes some automatic feature selection decisions about which features should be included and which should not.

Decrease in overfitting. The L1 penalty introduced to the model in LASSO regression also helps prevent the model from overfitting. This makes sense: model complexity decreases when feature coefficients are set to zero, removing those features from the model.




Problems with LASSO regression:

Coefficients with bias. One of the key drawbacks of LASSO regression is that the coefficients it produces are biased. The L1 penalty applied to the model artificially shrinks the coefficients, sometimes all the way to zero. This means that the coefficients from a LASSO model understate the true magnitude of the relationship between the features and the outcome.

Estimating standard errors is challenging. It is difficult to calculate precise standard errors for the biased coefficient estimates in a LASSO model. Because of this, tasks such as running statistical tests on the coefficients or building confidence intervals around them are hard to perform.

Trouble with correlated features. When trained on data containing correlated features, LASSO models are also notoriously unstable. Typically, one feature is chosen more or less at random, and all of the other features that are strongly correlated with it are effectively eliminated from the model. This could mislead someone into thinking that only the retained feature is significant, whereas, in fact, some of the dropped features might be equally or even more significant.

Typically unstable estimates. LASSO models are known to produce relatively unstable estimates: training them on slightly different datasets can cause significant changes in the fitted coefficients. For instance, if you bootstrap the data several times to produce a number of distinct sample datasets, you can expect to see different features removed from the model for each dataset, even when all of the training datasets are quite similar.

A hyperparameter is introduced. Although this drawback is less significant, LASSO models introduce a hyperparameter that controls the severity of the L1 penalty. As a result, you must perform hyperparameter tuning that is not necessary with a regular regression model; one way to do this is sketched after this list.

Problems shared with conventional regression models. The same issues that affect ordinary regression models also affect LASSO regression models: this family of models is equally subject to problems with interactions, outliers, and strict model assumptions.
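
As promised above, one common way to handle the extra hyperparameter is to let cross-validation choose it, for example with scikit-learn's LassoCV; the data here is synthetic and used purely for illustration.

from sklearn.datasets import make_regression
from sklearn.linear_model import LassoCV

X, y = make_regression(n_samples=100, n_features=10, noise=5.0, random_state=0)

# LassoCV searches over a path of alpha values and keeps the one that
# minimizes cross-validated error, removing the manual tuning burden.
model = LassoCV(cv=5, random_state=0).fit(X, y)
print("selected regularization strength:", model.alpha_)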


Implementation:

The dataset contains two columns, "YearsExperience" and "Salary", for each of the 30 employees in a company. We will train a Lasso Regression model on it to learn the relationship between an employee's salary and years of experience. Once the model has been trained, we will be able to predict an employee's salary from his or her years of experience, as sketched below.
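
Below is a from-scratch sketch of the training loop, using gradient descent with the L1 subgradient as described in the mathematical intuition above. The file name salary_data.csv, its column names YearsExperience and Salary, and the learning-rate and penalty values are all assumptions made for this sketch.

import numpy as np
import pandas as pd

class LassoRegression:
    """Lasso Regression trained with plain gradient descent (a sketch)."""

    def __init__(self, learning_rate=0.01, iterations=5000, l1_penalty=500.0):
        self.learning_rate = learning_rate
        self.iterations = iterations
        self.l1_penalty = l1_penalty  # the lambda from the cost function

    def fit(self, X, y):
        m, n = X.shape                 # training examples, features
        self.w = np.zeros(n)           # one weight per feature
        self.b = 0.0                   # intercept, not penalized
        for _ in range(self.iterations):
            y_pred = self.predict(X)
            # Gradient of the squared-error term plus the L1 subgradient
            # sign(w); note that the intercept receives no penalty.
            dw = (-2.0 * X.T @ (y - y_pred) + self.l1_penalty * np.sign(self.w)) / m
            db = -2.0 * np.sum(y - y_pred) / m
            self.w -= self.learning_rate * dw
            self.b -= self.learning_rate * db
        return self

    def predict(self, X):
        return X @ self.w + self.b

if __name__ == "__main__":
    # Assumed file and column names for the 30-employee dataset.
    df = pd.read_csv("salary_data.csv")
    X = df[["YearsExperience"]].values
    y = df["Salary"].values

    model = LassoRegression().fit(X, y)
    print("Predicted salary for 5 years of experience:",
          model.predict(np.array([[5.0]]))[0])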








