Solving a linear regression problem mathematically with the least squares method: find the slope and intercept of the best-fit line

Linear regression is one of the basic supervised learning techniques and is used to predict an outcome. In a simple linear regression problem, we find the best-fit line through sample points that have one independent variable and one dependent variable. The basic idea is to find a linear function that predicts the value of the dependent variable as a function of the independent variable.
To find the best-fit line, which is the graph of that linear function, we can use the ordinary least squares method (minimizing the sum of squared residuals) or least absolute deviations (minimizing the sum of the absolute values of the residuals). Residuals are the vertical distances between the points of the data set and the fitted line (wiki).
Figure: linear best-fit line (blue) for the data points (red); the green lines indicate the errors/residuals (Source: wiki).

Least squares method: With this approach, the vertical distance between each data point and the fitted line is computed, and the line is chosen so that the sum of the squares of these distances over all points is minimum.

Dataset: (x, y) = (2,10), (4,9), (3,6), (6,6), (8,6), (8,3), (10,2)
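To make the least squares criterion concrete, here is a minimal sketch (written for Python 3) that scores a candidate line on the dataset above by its sum of squared residuals. The function name `sum_squared_residuals` and the two candidate lines are hypothetical, picked only for illustration; neither is the best-fit line.

```python
def sum_squared_residuals(xs, ys, m, c):
    """Sum of squared vertical distances between the points and y = m*x + c."""
    return sum((y - (m * x + c)) ** 2 for x, y in zip(xs, ys))

xs = [2, 4, 3, 6, 8, 8, 10]
ys = [10, 9, 6, 6, 6, 3, 2]

# Two hypothetical candidate lines, chosen only for illustration.
flat_line = sum_squared_residuals(xs, ys, m=0.0, c=6.0)     # y = 6
steep_line = sum_squared_residuals(xs, ys, m=-1.0, c=12.0)  # y = -x + 12

print(flat_line, steep_line)  # 50.0 15.0
```

The second candidate has the smaller sum, so under the least squares criterion it is the better of the two. The best-fit line is the one for which no other choice of m and c gives a smaller sum.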

Algorithm: To find the best-fit line (y = mx + c), we have to find the values of m (slope) and c (intercept). Follow the steps below to find the slope and intercept.

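1. Compute the mean of the x and y values. Here x̅ and ȳ are the means of the x and y data points, taken over n points:

   x̅ = (Σ xi) / n,   ȳ = (Σ yi) / n

2. Calculate the slope of the line:

   m = Σ (xi - x̅)(yi - ȳ) / Σ (xi - x̅)²

3. Calculate the intercept of the line:

   c = ȳ - m·x̅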
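The three steps above can be sketched as a small Python function. This is an illustrative sketch written for Python 3, not code from the original post; the name `fit_line` is an assumption.

```python
def fit_line(xs, ys):
    """Return (slope, intercept) of the least squares line through the points."""
    n = float(len(xs))

    # Step 1: means of the x and y values.
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n

    # Step 2: slope = sum of (xi - x_mean)(yi - y_mean) over sum of (xi - x_mean)^2.
    p_sum = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    q_sum = sum((x - x_mean) ** 2 for x in xs)
    m = p_sum / q_sum

    # Step 3: intercept from the fact that the line passes through (x_mean, y_mean).
    c = y_mean - m * x_mean
    return m, c

m, c = fit_line([2, 4, 3, 6, 8, 8, 10], [10, 9, 6, 6, 6, 3, 2])
print(round(m, 3), round(c, 3))  # -0.832 10.876
```

Note that the function divides by the sum of squared x deviations, so it assumes the x values are not all identical.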
---------------------------------------------------------

Use the Python terminal to find the means of the x and y data points.
Python 2.7.10 (default, Jul 15 2017, 17:16:57) 
[GCC 4.2.1 Compatible Apple LLVM 9.0.0 (clang-900.0.31)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> 
>>> import numpy
>>> a = [2,4,3,6,8,8,10]
>>> numpy.mean(a)
5.8571428571428568
>>> 
>>> 
>>> 
>>> b = [10,9,6,6,6,3,2]
>>> numpy.mean(b)
6.0

x̅ = 5.86
ȳ = 6.0

Find slope (m): Arrange the sample data in tabular form as below and compute the slope of the line.

i    xi    yi    xi - x̅    yi - ȳ    (xi - x̅)(yi - ȳ) = P    (xi - x̅)² = Q
1     2    10     -3.86         4                  -15.44            14.9
2     4     9     -1.86         3                   -5.58            3.46
3     3     6     -2.86         0                       0            8.18
4     6     6      0.14         0                       0            0.02
5     8     6      2.14         0                       0            4.58
6     8     3      2.14        -3                   -6.42            4.58
7    10     2      4.14        -4                  -16.56           17.14
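The table columns can also be generated programmatically. The sketch below (Python 3, illustrative only) rebuilds each row and the column sums of P and Q. It uses the unrounded mean of x, so a few entries differ in the last digit from the hand-computed table, which uses x̅ = 5.86.

```python
xs = [2, 4, 3, 6, 8, 8, 10]
ys = [10, 9, 6, 6, 6, 3, 2]
x_mean = sum(xs) / float(len(xs))
y_mean = sum(ys) / float(len(ys))

# Each row: i, xi, yi, xi - x_mean, yi - y_mean, P, Q
rows = []
for i, (x, y) in enumerate(zip(xs, ys), start=1):
    dx = x - x_mean
    dy = y - y_mean
    rows.append((i, x, y, dx, dy, dx * dy, dx ** 2))

for row in rows:
    print("{:>2} {:>3} {:>3} {:>6.2f} {:>6.2f} {:>7.2f} {:>6.2f}".format(*row))

# Column totals needed for the slope.
p_sum = sum(row[5] for row in rows)
q_sum = sum(row[6] for row in rows)
print("sum P = {:.2f}, sum Q = {:.2f}".format(p_sum, q_sum))
```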
                                                             
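Slope of line: with ΣP = -44.00 and ΣQ = 52.86 from the table,

   m = ΣP / ΣQ = -44.00 / 52.86 ≈ -0.832

Compute y-intercept:

   c = ȳ - m·x̅ = 6.0 - (-0.832)(5.86) ≈ 10.88

Now equation of line:

   y = -0.832x + 10.88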
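As a cross-check on the hand calculation, a library routine can fit the same data. The sketch below assumes NumPy is installed and uses `numpy.polyfit` with degree 1, which returns the coefficients highest power first (slope, then intercept). For this dataset it should give a slope near -0.83 and an intercept near 10.88.

```python
import numpy

xs = [2, 4, 3, 6, 8, 8, 10]
ys = [10, 9, 6, 6, 6, 3, 2]

# Degree-1 least squares polynomial fit: returns [slope, intercept].
slope, intercept = numpy.polyfit(xs, ys, 1)
print(round(slope, 3), round(intercept, 3))
```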


The best-fit line runs through the data points, and it can be used to predict the outcome for other (new) test data points.
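Prediction with the fitted line amounts to evaluating y = mx + c at a new x. A minimal sketch in Python 3 follows; the helper name `predict` and the new point x = 5 are hypothetical and not part of the original dataset.

```python
xs = [2, 4, 3, 6, 8, 8, 10]
ys = [10, 9, 6, 6, 6, 3, 2]

# Fit the line with the least squares formulas for slope and intercept.
n = float(len(xs))
x_mean, y_mean = sum(xs) / n, sum(ys) / n
m = (sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
     / sum((x - x_mean) ** 2 for x in xs))
c = y_mean - m * x_mean

def predict(x):
    """Predicted y on the fitted line for a given x."""
    return m * x + c

new_x = 5  # hypothetical new data point, not from the original dataset
print(round(predict(new_x), 2))  # 6.71
```

Predictions are most trustworthy for x values inside the range covered by the sample data (2 to 10 here); further out, the linear trend may no longer hold.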


