To find the best-fit line, which is represented by a linear function, we can use the **ordinary least squares method** (minimizing the sum of squared residuals) or **least absolute deviations** (minimizing the sum of absolute values of residuals). Residuals are the vertical distances between the points of the data set and the fitted line (wiki).

Figure: Linear best-fit line (blue) for data points (red); the green lines indicate the errors/residuals (Source: wiki).

**Least squares method**:- In this approach, the vertical distance between each data point and the fitted line is computed, and the line is chosen so that the sum of the squares of these distances is minimum.

**Dataset**:- (x, y) = (2,10) (4,9) (3,6) (6,6) (8,6) (8,3) (10,2)
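To make the two criteria concrete, the sketch below computes the residuals of the dataset above against one candidate line and then evaluates both objectives. The helper names and the candidate line (m = -1, c = 12) are assumptions chosen only for illustration; this is not the best-fit line.

```python
xs = [2, 4, 3, 6, 8, 8, 10]
ys = [10, 9, 6, 6, 6, 3, 2]

def residuals(m, c, xs, ys):
    """Vertical distances y_i - (m * x_i + c) for every data point."""
    return [y - (m * x + c) for x, y in zip(xs, ys)]

def sum_squared(res):
    """Objective minimized by ordinary least squares."""
    return sum(r * r for r in res)

def sum_absolute(res):
    """Objective minimized by least absolute deviations."""
    return sum(abs(r) for r in res)

# Hypothetical candidate line, picked only to show the calculation
res = residuals(-1.0, 12.0, xs, ys)
print(sum_squared(res), sum_absolute(res))  # 15.0 7.0
```

Each method searches for the m and c that make its own objective as small as possible, so the two methods can select different lines for the same data.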

**Algorithm**: To find the best-fit line (y = mx + c), we have to find the values of m (slope) and c (intercept). Follow the steps below to find the slope and intercept.

1. Compute the means of the x and y values. Here x̅ and ȳ are the means of the x and y data points.

2. Calculate the slope of the line: m = Σ(xi - x̅)(yi - ȳ) / Σ(xi - x̅)²

3. Calculate the intercept of the line: c = ȳ - m·x̅
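The three steps above can be sketched in plain Python. The function name `fit_line` is an assumption made for illustration, and the code assumes Python 3, where `/` is true division (under Python 2 the integer inputs would be truncated).

```python
def fit_line(xs, ys):
    """Return (m, c) of the least-squares line y = m*x + c."""
    n = len(xs)
    # Step 1: means of the x and y values
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    # Step 2: slope = sum((x - x_bar) * (y - y_bar)) / sum((x - x_bar)^2)
    p = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    q = sum((x - x_bar) ** 2 for x in xs)
    m = p / q
    # Step 3: intercept, which makes the line pass through (x_bar, y_bar)
    c = y_bar - m * x_bar
    return m, c

xs = [2, 4, 3, 6, 8, 8, 10]
ys = [10, 9, 6, 6, 6, 3, 2]
m, c = fit_line(xs, ys)
print(round(m, 3), round(c, 3))  # -0.832 10.876
```

The sketch does not guard against the case where every x value is identical, in which the denominator is zero and no unique slope exists.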

**---------------------------------------------------------**

**Use the Python terminal** to find the means of the x and y data points.

    Python 2.7.10 (default, Jul 15 2017, 17:16:57)
    [GCC 4.2.1 Compatible Apple LLVM 9.0.0 (clang-900.0.31)] on darwin
    Type "help", "copyright", "credits" or "license" for more information.
    >>> import numpy
    >>> a = [2,4,3,6,8,8,10]
    >>> numpy.mean(a)
    5.8571428571428568
    >>> b = [10,9,6,6,6,3,2]
    >>> numpy.mean(b)
    6.0

**x̅ =** 5.86

**ȳ =** 6.0

**Find slope (m):** Tabulate the sample data as shown below and compute the slope of the line.

| iteration# | xi | yi | xi - x̅ | yi - ȳ | (xi - x̅)(yi - ȳ) = P | (xi - x̅)² = Q |
|---|---|---|---|---|---|---|
| 1 | 2 | 10 | -3.86 | 4 | -15.44 | 14.9 |
| 2 | 4 | 9 | -1.86 | 3 | -5.58 | 3.46 |
| 3 | 3 | 6 | -2.86 | 0 | 0 | 8.18 |
| 4 | 6 | 6 | 0.14 | 0 | 0 | 0.02 |
| 5 | 8 | 6 | 2.14 | 0 | 0 | 4.58 |
| 6 | 8 | 3 | 2.14 | -3 | -6.42 | 4.58 |
| 7 | 10 | 2 | 4.14 | -4 | -16.56 | 17.14 |
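The table can also be generated programmatically. The sketch below is one way to do it, assuming Python 3; the variable names are illustrative. It uses the unrounded mean of x, so individual entries can differ in the second decimal place from the hand-computed values, which use x̅ = 5.86.

```python
xs = [2, 4, 3, 6, 8, 8, 10]
ys = [10, 9, 6, 6, 6, 3, 2]

x_bar = sum(xs) / len(xs)
y_bar = sum(ys) / len(ys)

# One tuple per data point: (iteration, x, y, x - x_bar, y - y_bar, P, Q)
rows = []
for i, (x, y) in enumerate(zip(xs, ys), start=1):
    dx = x - x_bar
    dy = y - y_bar
    rows.append((i, x, y, dx, dy, dx * dy, dx * dx))

for row in rows:
    print("{} | {} | {} | {:.2f} | {:.2f} | {:.2f} | {:.2f}".format(*row))

# Column totals used in the slope formula
sum_p = sum(r[5] for r in rows)
sum_q = sum(r[6] for r in rows)
print("sum P = {:.2f}, sum Q = {:.2f}".format(sum_p, sum_q))
```

The two totals are the numerator and denominator of the slope, so m = sum P / sum Q.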

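**Slope of line**: m = ΣP / ΣQ = -44.00 / 52.86 ≈ -0.832

**Compute y-intercept**: c = ȳ - m·x̅ = 6.0 - (-0.832 × 5.86) ≈ 10.88

**Now equation of line**: y = -0.832x + 10.88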

The best-fit line passes as close as possible to the data points in the least-squares sense, and it can be used to predict the outcome for new (test) data points.