Linear Regression with added constraints on coefficients

Linear regression is used to estimate the linear relationship between a dependent variable and one or more independent variables. For example, we may want to find the relation between an individual's age and weight, where age is the independent variable and weight is the dependent variable.

Simple linear regression has the form

Y=a + b*X

where X is the independent variable and Y is the dependent variable. The slope of this line is b and the y-intercept is a. Our task is to find the coefficients a and b that minimize the error between the predicted data and the actual data. The error is

Error = Actual - Predicted
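As a quick illustration of fitting a and b, here is a minimal sketch in Python. It assumes ordinary least squares (minimizing the sum of squared errors) via NumPy's `polyfit`, which is only one way of measuring the error; the age and weight values are made up for the example.

```python
import numpy as np

# Made-up illustrative data: age in years (X) and weight in kg (Y).
age = np.array([20.0, 30.0, 40.0, 50.0, 60.0])
weight = np.array([58.0, 63.0, 70.0, 74.0, 80.0])

# Degree-1 polynomial fit; polyfit returns the highest power first,
# so the slope b comes before the intercept a.
b, a = np.polyfit(age, weight, deg=1)

predicted = a + b * age
error = weight - predicted  # Error = Actual - Predicted, per observation

print("intercept a:", a)
print("slope b:", b)
print("errors:", error)
```

Nothing restricts a or b here: the fit is free to return any sign or magnitude for either coefficient.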

In the above formulation there are no conditions on the coefficients of the independent variables. In real-life scenarios, however, you may face problems where you need to impose certain constraints on these coefficients.

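To handle such requirements, linear programming can be used: the regression is written as an optimization problem and the conditions on the coefficients are added as constraints.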

Some real-life situations are explained below:

1. Coefficients need to be positive:

Suppose you want to forecast the sales of a company based on its previous demand together with the effect of promotions. The coefficients for the promotions should not be negative, because promotions are assumed never to decrease sales, even if they do not boost them. Hence the relationship between sales and promotions should be positive. This can be handled by adding the following constraints:
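One way to write this as a linear program, consistent with the constraint numbering used in this section, is sketched here. The notation is assumed for illustration: b_0 is the intercept, b_1 to b_4 are the coefficients, e_i is the absolute error of observation i, and n is the number of observations.

```latex
\begin{aligned}
\min_{b,\,e}\quad & \sum_{i=1}^{n} e_i \\
\text{subject to}\quad
& y_i - \left(b_0 + b_1 x_{1i} + b_2 x_{2i} + b_3 x_{3i} + b_4 x_{4i}\right) \le e_i,
  \quad i = 1,\dots,n && (1) \\
& \left(b_0 + b_1 x_{1i} + b_2 x_{2i} + b_3 x_{3i} + b_4 x_{4i}\right) - y_i \le e_i,
  \quad i = 1,\dots,n && (2) \\
& b_1 \ge 0,\quad b_2 \ge 0 && (3)
\end{aligned}
```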

where y represents the sales, x1 and x2 are two different types of promotions, x3 and x4 are two other independent variables.

By adding constraint (3), we make sure that the requirement of positive coefficients is met.

Constraints (1) and (2) are used to linearize the objective function of the linear regression problem, i.e. the absolute value of the error.
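The positive-coefficient case can be sketched in Python as follows. This is a minimal illustration rather than the original implementation: it assumes SciPy's `linprog` as the solver, and the function name `lad_nonnegative`, the coefficient values, and the randomly generated data are all hypothetical.

```python
import numpy as np
from scipy.optimize import linprog


def lad_nonnegative(X, y, nonneg_idx):
    """Minimize the sum of absolute errors with b_j >= 0 for j in nonneg_idx.

    nonneg_idx holds column indices of X (0-based, intercept excluded).
    Returns the array [b0, b1, ..., bp].
    """
    n, p = X.shape
    A = np.hstack([np.ones((n, 1)), X])   # leading column of ones for the intercept
    I = np.eye(n)
    # Decision vector is [b0, b1..bp, e_1..e_n]; the objective is sum(e_i).
    c = np.concatenate([np.zeros(p + 1), np.ones(n)])
    A_ub = np.vstack([
        np.hstack([-A, -I]),              # (1)  y - A b <= e
        np.hstack([A, -I]),               # (2)  A b - y <= e
    ])
    b_ub = np.concatenate([-y, y])
    # Coefficients are free by default; the error variables are non-negative.
    bounds = [(None, None)] * (p + 1) + [(0, None)] * n
    for j in nonneg_idx:
        bounds[j + 1] = (0, None)         # (3); the +1 skips the intercept
    res = linprog(c, A_ub=A_ub, b_ub=b_ub, bounds=bounds, method="highs")
    if not res.success:
        raise RuntimeError(res.message)
    return res.x[: p + 1]


# Synthetic data (assumption): x1, x2 play the role of promotions, x3, x4 of
# the other variables. The effect of x2 is set negative on purpose so that
# constraint (3) becomes active.
rng = np.random.default_rng(0)
X = rng.uniform(0, 10, size=(60, 4))
noise = rng.normal(0, 0.5, size=60)
y = 5 + 2.0 * X[:, 0] - 0.3 * X[:, 1] + 1.5 * X[:, 2] - 1.0 * X[:, 3] + noise

coef = lad_nonnegative(X, y, nonneg_idx=[0, 1])
print("b0..b4:", coef)
```

The coefficients of x3 and x4 are left unbounded, so they can still come out negative; only the indices passed in `nonneg_idx` are restricted.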

2. The sum of the coefficients should be less than or equal to one:

In one of our use cases, we forecasted the total inventory required for a warehouse on a monthly basis, which the client wanted to break down further to the level of individual merchandise according to historical demand. So our task was ultimately to forecast the inventory required for each merchandise item, and we observed a linear relationship between the total inventory and the individual inventories. Clearly, the sum of the individual inventories had to be less than or equal to the total inventory. We solved this problem using linear regression with an added constraint: the sum of the coefficients should be less than or equal to one.

The problem was formulated as:
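A formulation matching this description can be sketched as the following linear program. The notation is assumed for illustration: b_1, b_2, b_3 are the coefficients, e_i is the absolute error of observation i, and n is the number of observations. Including an intercept b_0 and leaving it out of the sum constraint is also an assumption.

```latex
\begin{aligned}
\min_{b,\,e}\quad & \sum_{i=1}^{n} e_i \\
\text{subject to}\quad
& y_i - \left(b_0 + b_1 x_{1i} + b_2 x_{2i} + b_3 x_{3i}\right) \le e_i,
  \quad i = 1,\dots,n \\
& \left(b_0 + b_1 x_{1i} + b_2 x_{2i} + b_3 x_{3i}\right) - y_i \le e_i,
  \quad i = 1,\dots,n \\
& b_1 + b_2 + b_3 \le 1
\end{aligned}
```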

where y represents the total inventory and x1, x2, x3 are the individual inventories.

The following is the Python code for this problem:
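A minimal sketch of such a solver, assuming SciPy's `linprog` and synthetic data. The function name `lad_sum_constrained`, the intercept term, and all numeric values are hypothetical and serve only to show how the sum constraint is added.

```python
import numpy as np
from scipy.optimize import linprog


def lad_sum_constrained(X, y, max_sum=1.0):
    """Minimize the sum of absolute errors subject to b1 + ... + bp <= max_sum.

    Returns the array [b0, b1, ..., bp]; the intercept b0 is not part of the sum.
    """
    n, p = X.shape
    A = np.hstack([np.ones((n, 1)), X])   # leading column of ones for the intercept
    I = np.eye(n)
    # Decision vector is [b0, b1..bp, e_1..e_n]; the objective is sum(e_i).
    c = np.concatenate([np.zeros(p + 1), np.ones(n)])
    # Extra row: 0*b0 + b1 + ... + bp + 0*e <= max_sum
    sum_row = np.concatenate([[0.0], np.ones(p), np.zeros(n)])
    A_ub = np.vstack([
        np.hstack([-A, -I]),              # y - A b <= e
        np.hstack([A, -I]),               # A b - y <= e
        sum_row,                          # sum of coefficients <= max_sum
    ])
    b_ub = np.concatenate([-y, y, [max_sum]])
    # Coefficients are free; the error variables are non-negative.
    bounds = [(None, None)] * (p + 1) + [(0, None)] * n
    res = linprog(c, A_ub=A_ub, b_ub=b_ub, bounds=bounds, method="highs")
    if not res.success:
        raise RuntimeError(res.message)
    return res.x[: p + 1]


# Synthetic data (assumption): the generating coefficients sum to 1.2, so an
# unconstrained fit would violate the requirement and the constraint matters.
rng = np.random.default_rng(1)
X = rng.uniform(10, 100, size=(50, 3))
y = 2 + 0.5 * X[:, 0] + 0.4 * X[:, 1] + 0.3 * X[:, 2] + rng.normal(0, 1.0, size=50)

coef = lad_sum_constrained(X, y)
print("b0..b3:", coef)
print("sum of b1..b3:", coef[1:].sum())
```

Other linear conditions on the coefficients can be added the same way, as additional rows of `A_ub` and `b_ub` or as entries in `bounds`.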

The use cases above are only some examples of real-life problems. There are more possibilities, for example requiring the coefficients of some variables to be negative, or requiring some coefficients to dominate others. In all such situations, linear regression with added constraints can be really helpful.