How is multiple linear regression used in machine learning?

Multiple linear regression and polynomial regression

In this article, we examine the case where there are several input (independent) variables. This is, by the way, the most realistic and typical case: as a rule, several factors influence the final result at the same time. For example, blood pressure is better predicted by taking into account not only a person's age but also his or her weight. Or, for instance, the price of a house may depend both on its floor area and on the neighborhood where it is located. Several input variables affect the output quantity, and it is often important to take all of them into account to achieve better forecast accuracy.

Recall that we started with one-dimensional linear regression, where the input data were {(x1, y1), (x2, y2), …, (xN, yN)}.

Defining the multidimensional problem and deriving the solution

Now that several factors are considered at once, each input xi becomes a vector, also called a feature vector. The number of elements in this vector is called its dimension and is usually denoted D. Our model has the form ŷ = wᵀx + b. Since x is multiplied by the transposed parameter vector w, the vector w must also have dimension D.

Notice that we can always absorb the free term b into the parameter vector w. We rename b to w0 and introduce an extra component x0 set to 1:

ŷ = b + w1x1 + … + wDxD,
ŷ = w0 + w1x1 + … + wDxD,
ŷ = w0x0 + w1x1 + … + wDxD = w′ᵀx′, where x0 = 1.

This is equivalent to adding a column of ones to our data matrix X, which originally has dimension N×D (after the extra column it becomes N×(D+1)).

Why does the data matrix have dimension N×D? Remember that N is the number of experiments (observations) and D is the number of input factors. If we take one row of X, we have the result of a single observation: a row vector xi of dimension 1×D, which is a feature vector. But since in linear algebra vectors are usually treated as column vectors of dimension D×1, for a single observation the model is written as ŷ = wᵀx.

If we want to calculate the output variable ŷ for all N observations simultaneously, we write

ŷ = Xw.

This is because the column vector w has dimension D×1 and the data matrix X has dimension N×D; as you know, to multiply matrices correctly, their inner dimensions must match. As a result, we obtain a column vector of predictions with the correct dimension N×1. At first this may look a bit strange, since w now stands on the right.

To clarify this idea, let's take a simple example. Let N = 4, meaning we have made four observations, and D = 3, meaning three input factors are being studied. Then our data matrix X has dimension 4×3. Let w be a vector with three elements, 1, 2, 3:
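As a minimal sketch of this computation in Python (NumPy): the shapes N = 4, D = 3 and the vector w = (1, 2, 3) are fixed above, while the entries of X below are hypothetical, chosen only for illustration.

```python
import numpy as np

# Hypothetical data: the text fixes only N = 4, D = 3 and w = (1, 2, 3);
# the entries of X are made up for this illustration.
X = np.array([
    [1.0, 0.0, 2.0],
    [0.5, 1.5, 1.0],
    [2.0, 1.0, 0.0],
    [1.0, 1.0, 1.0],
])                               # data matrix, shape (N, D) = (4, 3)
w = np.array([1.0, 2.0, 3.0])    # parameter vector, shape (D,) = (3,)

# y_hat = Xw: a single matrix-vector product gives the predictions
# for all N observations at once.
y_hat = X.dot(w)
print(y_hat)        # four predictions, one per row of X
print(y_hat.shape)  # (4,) -- the N x 1 column vector of predictions
```

Note that the inner dimensions match exactly as described: (4×3) times (3×1) yields a (4×1) result.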
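The same sketch can be used to check the bias trick from the derivation above: renaming b to w0 and prepending a column of ones to X gives exactly the same predictions as computing Xw + b directly. The value b = 0.5 here is an arbitrary choice for the demonstration.

```python
import numpy as np

X = np.array([
    [1.0, 0.0, 2.0],
    [0.5, 1.5, 1.0],
    [2.0, 1.0, 0.0],
    [1.0, 1.0, 1.0],
])                               # shape (N, D) = (4, 3), as above
w = np.array([1.0, 2.0, 3.0])
b = 0.5                          # arbitrary free term for the demo

# Absorb b into the parameters: w' = (b, w1, ..., wD), and add the
# component x0 = 1 as a column of ones, so X' has shape (N, D + 1).
X_aug = np.hstack([np.ones((X.shape[0], 1)), X])   # shape (4, 4)
w_aug = np.concatenate([[b], w])                   # shape (4,)

print(np.allclose(X_aug.dot(w_aug), X.dot(w) + b))  # True
```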