

Ever been to a shop and noticed how the size of an object directly affects its price? A relation is seen when two quantities are compared and both increase or decrease together, or when one increases while the other decreases. If the two quantities are plotted on a graph and the points fall roughly along a straight line, the relation between them is linear. The linear regression formula describes this linear relation and how the two quantities depend on each other.

    def linear_regression(x, y):
        N = len(x)
        x_mean = x.mean()
        y_mean = y.mean()

        B1_num = ((x - x_mean) * (y - y_mean)).sum()
        B1_den = ((x - x_mean)**2).sum()
        B1 = B1_num / B1_den

        B0 = y_mean - (B1*x_mean)

        reg_line = 'y = …

    …
    plt.title('How Experience Affects Salary')
    plt.xlabel('Years of Experience', fontsize=15)
    plt.ylabel('Salary', fontsize=15)
    plt.plot(x, B0 + B1*x, c = 'r', linewidth=5, alpha=.5, solid_capstyle='round')
    plt.scatter(x=x.mean(), y=y.mean(), marker='*', s=10**2.…
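Because the function above breaks off at the reg_line string, here is a minimal, self-contained sketch of how the same calculation could be completed and checked. The return value (a tuple of intercept and slope), the synthetic data points, and the comparison against np.polyfit are assumptions added for illustration; they are not taken from the original function.

```python
import numpy as np


def linear_regression(x, y):
    """Return (B0, B1) for the least-squares line y = B0 + B1*x."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)

    x_mean = x.mean()
    y_mean = y.mean()

    # Slope: co-variation of x and y divided by the variation of x
    B1 = ((x - x_mean) * (y - y_mean)).sum() / ((x - x_mean) ** 2).sum()

    # Intercept: forces the line through the point (x_mean, y_mean)
    B0 = y_mean - B1 * x_mean

    return B0, B1  # assumed return value; the original is cut off here


# Synthetic points that lie exactly on the line y = 2 + 0.5x
x = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y = 2 + 0.5 * x

B0, B1 = linear_regression(x, y)

# Cross-check against NumPy's built-in degree-1 polynomial fit,
# which returns the coefficients highest power first (slope, intercept)
slope, intercept = np.polyfit(x, y, 1)

print(B0, B1)
```

Since the points sit exactly on a line, the function should recover that line's intercept and slope, and np.polyfit should agree with it up to floating-point rounding.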

For this example, we will be using the years of experience to predict the salary, so the dependent variable will be the salary (y) and the independent variable will be the years of experience (x).

    data = pd.read_csv('Salary_Data.csv')
    x = data[…]  # years of experience
    y = data[…]  # salary

To get a look at the data we can use the head() function provided by Pandas, which will show us the first few rows of the data.

Now we will have to translate these two formulas to Python to calculate the regression line. First I will show the full function, then I will break it down further.
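For reference, the two formulas mentioned here can be written out to match the names used in the function (B1 for the slope, B0 for the intercept). These are the standard least-squares estimates for a single predictor; the bar notation for the means and the index i are notational assumptions of this sketch.

```latex
B_1 = \frac{\sum_{i=1}^{N} (x_i - \bar{x})(y_i - \bar{y})}{\sum_{i=1}^{N} (x_i - \bar{x})^2},
\qquad
B_0 = \bar{y} - B_1 \bar{x}
```

The numerator and denominator of the first formula correspond to B1_num and B1_den in the code, and the second formula is the line that computes B0.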

All we will need is NumPy, to help with the math calculations, Pandas, to store and manipulate the data, and Matplotlib (optional), to plot the data.

    import numpy as np
    import pandas as pd
    import matplotlib.pyplot as plt

Next, we will load in the data and then assign each column to its appropriate variable.
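The loading step can be sketched without the actual file. In the example below, the rows are made up purely for illustration and are not the Kaggle data, and the column names YearsExperience and Salary are assumptions about how the CSV is laid out; adjust them to match the real header.

```python
import io

import pandas as pd

# Made-up rows standing in for Salary_Data.csv (NOT the real Kaggle data).
# The column names are assumed for this sketch.
csv_text = """YearsExperience,Salary
1.0,40000
2.0,45000
3.0,52000
4.0,58000
5.0,66000
6.0,71000
"""

# pd.read_csv accepts a file path or any file-like object
data = pd.read_csv(io.StringIO(csv_text))

x = data["YearsExperience"]  # independent variable
y = data["Salary"]           # dependent variable

# head() returns the first five rows by default
preview = data.head()
print(preview)
```

With the real file, the io.StringIO wrapper would be replaced by the path to the CSV.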

For example, let's say we have a regression equation of y = 2 + 0.5x. For every 1-unit increase in the independent variable (x), there will be a 0.50 increase in the dependent variable (y).

Simple Linear Regression Using Python

For this example, we will be using salary data from Kaggle. The data consists of two columns, years of experience and the corresponding salary. The data can be found here.

First, we will import the Python packages that we will need for this analysis.
