Linear Regression is a fundamental concept in statistics and machine learning, widely used for modeling relationships between variables. Whether you're analyzing trends, making predictions, or understanding data patterns, linear regression provides a straightforward approach to uncovering insights.
Linear Regression is a statistical method used to model the relationship between a dependent variable (target) and one or more independent variables (predictors). The goal is to fit a linear equation to observed data, allowing for predictions and trend analysis.
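In its simplest form, Simple Linear Regression involves two variables: an independent variable x (the predictor) and a dependent variable y (the target).
The relationship is modeled by the equation:
$$y = mx + c$$
Where y is the dependent variable, x is the independent variable, m is the slope of the line (the change in y for a one-unit change in x), and c is the y-intercept (the value of y when x = 0).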
This method assumes a straight-line relationship between the variables and is typically used when there's a clear, linear association between them.
Linear Regression Analysis involves several key steps:
This process helps in understanding the strength and nature of the relationship between variables and in making informed predictions.
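One common way to put a number on the strength of a linear relationship is the coefficient of determination, R². The sketch below is a minimal illustration under stated assumptions: the function name `r_squared` and the five data points are hypothetical and do not come from this article.

```python
from statistics import mean


def r_squared(xs, ys):
    """Fit y = m*x + c by least squares and return R^2 for that fit."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("all x values are identical; no line can be fitted")
    m = sxy / sxx
    c = y_bar - m * x_bar

    # Residual sum of squares: variation the fitted line does not explain.
    ss_res = sum((y - (m * x + c)) ** 2 for x, y in zip(xs, ys))
    # Total sum of squares: variation of y around its own mean.
    ss_tot = sum((y - y_bar) ** 2 for y in ys)
    if ss_tot == 0:
        raise ValueError("all y values are identical; R^2 is undefined")
    return 1 - ss_res / ss_tot


# Hypothetical data, for illustration only.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.0, 4.0, 5.0, 4.0, 5.0]
score = r_squared(xs, ys)
print(f"R^2 = {score:.2f}")  # prints "R^2 = 0.60"
```

An R² near 1 means the line accounts for most of the variation in y, while a value near 0 means it accounts for very little.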
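Multiple Linear Regression (MLR) extends simple linear regression by modeling the relationship between a dependent variable and two or more independent variables. The equation for MLR is:
$$y = b_0 + b_1 x_1 + b_2 x_2 + \dots + b_n x_n$$
Where y is the dependent variable, $x_1, x_2, \dots, x_n$ are the independent variables, $b_0$ is the intercept, and $b_1, b_2, \dots, b_n$ are the coefficients that measure how much y changes for a one-unit change in each predictor while the others are held fixed.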
MLR allows for a more nuanced understanding of how multiple factors simultaneously influence the dependent variable. For example, predicting house prices might involve variables like square footage, number of bedrooms, and location.
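A multiple regression fit can be sketched in plain Python by solving the normal equations. This is an illustrative sketch rather than a production method: the function names and the house data are hypothetical, the categorical "location" predictor is left out for brevity, and the prices were constructed to lie exactly on a plane so that the recovered coefficients are easy to check.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    aug = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("predictors are linearly dependent; no unique fit")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for k in range(col, n + 1):
                aug[r][k] -= factor * aug[col][k]
    # Back-substitution on the upper-triangular system.
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(aug[i][k] * x[k] for k in range(i + 1, n))
        x[i] = (aug[i][n] - tail) / aug[i][i]
    return x


def fit_multiple_linear_regression(rows, ys):
    """Return [intercept, coef_1, ..., coef_n] minimizing squared error."""
    design = [[1.0] + list(row) for row in rows]  # leading 1 for the intercept
    p = len(design[0])
    xtx = [[sum(d[i] * d[j] for d in design) for j in range(p)] for i in range(p)]
    xty = [sum(d[i] * y for d, y in zip(design, ys)) for i in range(p)]
    return solve_linear_system(xtx, xty)


# Hypothetical data: (square footage in thousands, bedrooms) -> price in thousands.
features = [(1.0, 2), (1.5, 3), (2.0, 3), (2.5, 4), (1.2, 1)]
prices = [170.0, 230.0, 280.0, 340.0, 180.0]

coefficients = fit_multiple_linear_regression(features, prices)
intercept, per_sqft, per_bedroom = coefficients
predicted_price = intercept + per_sqft * 1.8 + per_bedroom * 3
print([round(value, 3) for value in coefficients])  # approximately [50.0, 100.0, 10.0]
print(f"predicted price: {predicted_price:.1f}")
```

Solving the normal equations directly is fine for a handful of predictors. For larger or nearly collinear data, a numerical library's least-squares routine is usually the safer choice.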
Example 1: A company wants to predict its sales based on the amount spent on advertising. The data collected for 5 months is as follows:
The task is to find the Linear Regression Equation that models the relationship between Advertising Spend and Sales, and use it to predict sales if the advertising spend is 6.0 thousand.
Solution:
Step 1: Calculate the Mean of X and Y
Step 2: Calculate the Slope (m) using the formula
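The formula to calculate the slope m is:
$$m = \frac{\sum (x_i - \bar{x})(y_i - \bar{y})}{\sum (x_i - \bar{x})^2}$$
where $\bar{x}$ and $\bar{y}$ are the means of X and Y from Step 1.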
Now, let’s compute the required values:
Now, calculate the slope m:
Step 3: Calculate the Y-Intercept (c)
The formula for the y-intercept c is:
$$c = \bar{y} - m\bar{x}$$
Substitute the known values:
Step 4: Write the Linear Regression Equation
The linear regression equation is:
Where:
Step 5: Predict Sales for an Advertising Spend of 6.0 Thousand
To predict the sales when the advertising spend is 6.0 thousand, substitute x = 6.0 into the regression equation:
Thus, the predicted sales for an advertising spend of 6.0 thousand are 5.65 thousand.
Final Answer: Predicted sales for an advertising spend of 6.0 thousand are 5.65 thousand.
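The five steps of Example 1 can be sketched in Python as follows. This is a minimal sketch, not the article's own code: the function name `fit_simple_linear_regression` and the five data points are hypothetical and chosen only for illustration, so the result differs from the 5.65 thousand obtained above.

```python
from statistics import mean


def fit_simple_linear_regression(xs, ys):
    """Return (m, c) for the least-squares line y = m*x + c."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two or more paired observations")

    # Step 1: means of X and Y.
    x_bar, y_bar = mean(xs), mean(ys)

    # Step 2: slope = sum of cross-deviations / sum of squared x-deviations.
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("all x values are identical; slope is undefined")
    m = sxy / sxx

    # Step 3: intercept, so that the line passes through (x_bar, y_bar).
    c = y_bar - m * x_bar
    return m, c


# Hypothetical monthly data in thousands (NOT the table from Example 1).
spend = [1.0, 2.0, 3.0, 4.0, 5.0]
sales = [2.0, 4.0, 5.0, 4.0, 5.0]

# Step 4: write the regression equation.
m, c = fit_simple_linear_regression(spend, sales)
print(f"y = {m:.2f}x + {c:.2f}")  # prints "y = 0.60x + 2.20"

# Step 5: predict sales for an advertising spend of 6.0 thousand.
predicted = m * 6.0 + c
print(f"predicted sales: {predicted:.2f}")  # prints "predicted sales: 5.80"
```

Replacing `spend` and `sales` with the values from any of the examples reproduces the corresponding hand calculation.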
Example 2: The following data shows the number of hours studied and the corresponding marks obtained by a student:
Find the linear regression equation and predict the marks when a student studies for 6 hours.
Solution:
Compute the values:
Therefore,
Answer: The predicted marks for 6 hours of study are 95.
Example 3: The following data shows the amount spent on advertising and the corresponding sales:
Find the linear regression equation and predict sales for an advertising spend of 6.
Solution:
Therefore,
Answer: Predicted sales for an advertising spend of 6 is 10.2.
Example 4: The following data shows the number of hours a person watches TV and their productivity at work:
Find the linear regression equation and predict productivity when a person watches 6 hours of TV.
Solution:
Therefore:
Answer: Predicted productivity for 6 hours of TV is 46.5.
Question 1: The data below shows the number of hours studied and the marks obtained by a student: Find the Linear Regression Equation (i.e., y = mx + c) and use it to predict the marks if a student studies for 6 hours.
Question 2: A company collects the following data on the amount spent on advertising and the sales generated. Find the Linear Regression Equation for predicting sales based on advertising spend, and predict the sales if the advertising spend is 6.
Question 3: The following data shows the number of years of experience and the corresponding salaries. Find the Linear Regression Equation to predict salary based on years of experience. Then, predict the salary for someone with 6 years of experience.
Question 4: The table below shows the number of books sold by a bookstore in a week and the number of advertisements they ran. Find the Linear Regression Equation for predicting the number of books sold based on the number of advertisements run, and predict the number of books sold if 3 ads are run.
Question 5: The data below shows the number of products produced by a factory and the number of workers employed. Find the Linear Regression Equation for predicting the number of products produced based on the number of workers employed. Predict the number of products produced when 25 workers are employed.
1. How do you calculate the slope and intercept?
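Ans: Use the formulas:
$$m = \frac{\sum (x_i - \bar{x})(y_i - \bar{y})}{\sum (x_i - \bar{x})^2}, \qquad c = \bar{y} - m\bar{x}$$
where $\bar{x}$ and $\bar{y}$ are the means of the x and y values.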