% LaTeX source for Legendre on Least Squares

\documentclass{article}

\usepackage{amsmath}
\usepackage{times}

\newcommand{\inte}{\mbox{$\int$}}

\begin{document}

\begin{center}
LEGENDRE \\
\medskip
On Least Squares
\end{center}

\setcounter{page}{1}

\noindent
[Translated from the French by Professor Henry A Ruger and Professor Helen
M Walker, Teachers College, Columbia University, New York City.]

\medskip

The advances of astronomical science in the nineteenth century were due in no small part to the development of the
method of least squares.  The same method is the foundation for the calculus
of errors of observation now occupying a place of great importance in the
scientific study of social, economic, biological, and psychological problems.
Gauss says in his work on the \textit{Theory of Motions of the Heavenly Bodies}
(1809) that he had made use of this principle since 1795 but that it was first
published by Legendre.  The first statement of the method appeared as an
appendix entitled ``Sur la M\'ethode des moindres quarr\'es'' in Legendre's
\textit{Nouvelles m\'ethodes pour la d\'etermination des orbites des
com\`etes}, Paris 1805.  The portion of the work translated here is found on
pages 72--75.

Adrien-Marie Legendre (1752--1833) was for five years a professor of
mathematics in the \'Ecole Militaire at Paris, and his early studies on the
paths of projectiles provided a background for later work on the paths of
heavenly bodies.  He wrote on astronomy, the theory of numbers, elliptic
functions, the calculus, higher geometry, mechanics and physics.  His work
on geometry, in which he rearranged the propositions of Euclid, is one of
the most successful textbooks ever written.

\begin{center}
\textit{On the Method of Least Squares}
\end{center}

In the majority of investigations in which the problem is to get from
measures given by observation the most exact result which they can furnish,
there almost always arises a system of equations of the form
$E = a + bx + cy + fz + \text{\&c.}$
in which $a$, $b$, $c$, $f$, \&c. are the known coefficients which vary
from one equation to another, and $x$, $y$, $z$, \&c. are the unknowns which
must be determined in accordance with the condition that the value of $E$
shall for each equation reduce to a quantity which is either zero or very
small.

If there are the same number of equations as unknowns $x$, $y$, $z$, \&c.,
there is no difficulty in determining the unknowns, and the error $E$ can be
made absolutely zero.  But more often the number of equations is greater than
that of the unknowns, and it is impossible to do away with all the errors.

In a situation of this sort, which is the usual thing in physical and
astronomical problems, where there is an attempt to determine certain
important components, a degree of arbitrariness necessarily enters in the
distribution of the errors, and it is not to be expected that all the
hypotheses shall lead to exactly the same results; but it is particularly
important to proceed in such a way that extreme errors, whether positive
or negative, shall be confined within as narrow limits as possible.

Of all the principles which can be proposed for that purpose, I think
there is none more general, more exact, or easier to apply, than that
of which we made use in the preceding researches, and which consists of
rendering the sum of squares of the errors a minimum.  By this means, there
is established among the errors a sort of equilibrium which, preventing the
extremes from exerting an undue influence, is very well fitted to reveal
that state of the system which most nearly approaches the truth.

The sum of the squares of the errors
$E^2 + E^{\prime2}+ E^{\prime\prime2} + \text{\&c.}$
being
$\begin{array}{llllllllll}
& &(a &+ &bx &+ &cy &+ &fz + &\text{\&c.})^2 \\
&+ &(a' &+ &b'x &+ &c'y &+ &f'z + &\text{\&c.})^2 \\
&+ &(a''&+ &b''x &+ &c''y &+ &f''z + &\text{\&c.})^2 \\
&+ &\multicolumn{8}{l}{\text{\&c.,}}
\end{array}$
if its \textit{minimum} is desired, when $x$ alone varies, the resulting
equation will be
$o = \inte ab + x\inte b^2 + y\inte bc + z\inte bf +\text{\&c.,}$
in which by $\int ab$ we understand the sum of similar products, i.e.,
$ab + a'b' + a''b'' + \text{\&c.}$; by $\int b^2$ the sum of the squares of
the coefficients of $x$, namely
$b^2 + b^{\prime2} + b^{\prime\prime2} + \text{\&c.}$, and similarly for the
other terms.

Similarly the minimum with respect to $y$ will be
$o = \inte ac + x\inte bc + y\inte c^2 + z\inte fc +\text{\&c.,}$
and the minimum with respect to $z$,
$o = \inte af + x\inte bf + y\inte cf + z\inte f^2 +\text{\&c.,}$
in which it is apparent that the same coefficients $\int bc$, $\int bf$,
\&c. are common to two equations, a fact which facilitates the calculation.

In general, to form the equation of the minimum with respect to one of
the unknowns, it is necessary to multiply all the terms of each given
equation by the coefficient of the unknown in that equation, taken with
regard to its sign, and to find the sum of these products.

The number of equations of minimum derived in this manner will be equal
to the number of the unknowns, and these equations are then to be solved
by the established methods.  But it will be well to reduce the amount of
computation both in multiplication and in solution, by retaining in each
operation only so many significant figures, integer or decimal, as are
determined by the degree of approximation for which the inquiry calls.
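The rule described above can be sketched numerically. The following is a minimal illustration in modern terms, not Legendre's notation: for equations of the form $E_i = a_i + b_i x + c_i y$, each equation of minimum is formed by multiplying every term by the coefficient of the chosen unknown and summing, and the resulting pair of equations is then solved. All names and data are illustrative.

```python
# Sketch of Legendre's rule for two unknowns x, y, with given equations
# E_i = a_i + b_i*x + c_i*y.  Each "equation of minimum" is formed by
# multiplying all terms of each equation by the coefficient of the chosen
# unknown and summing the products over i.

def normal_equations(a, b, c):
    """Coefficients of the two equations of minimum:
    0 = sum(ab) + x*sum(b^2) + y*sum(bc)
    0 = sum(ac) + x*sum(bc) + y*sum(c^2)."""
    Sab = sum(ai * bi for ai, bi in zip(a, b))
    Sac = sum(ai * ci for ai, ci in zip(a, c))
    Sbb = sum(bi * bi for bi in b)
    Sbc = sum(bi * ci for bi, ci in zip(b, c))
    Scc = sum(ci * ci for ci in c)
    return (Sab, Sbb, Sbc), (Sac, Sbc, Scc)

def solve_least_squares(a, b, c):
    """Solve the two equations of minimum for x and y (Cramer's rule)."""
    (Sab, Sbb, Sbc), (Sac, _, Scc) = normal_equations(a, b, c)
    det = Sbb * Scc - Sbc * Sbc
    x = (-Sab * Scc + Sac * Sbc) / det
    y = (-Sac * Sbb + Sab * Sbc) / det
    return x, y
```

As Legendre remarks, the same coefficient $\int bc$ appears in both equations, so only the distinct sums of products need be accumulated.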

Even if by a rare chance it were possible to satisfy all the equations
at once by making all the errors zero, we could obtain the same result from
the equations of minimum; for if after having found the values of $x$, $y$,
$z$, \&c. which make $E$, $E'$ , \&c. equal to zero, we let $x$, $y$, $z$
vary by $\delta x$, $\delta y$, $\delta z$, \&c., it is evident that
$E^2$, which was zero, will become by that variation
$(b\delta x + c\delta y + f\delta z + \text{\&c.})^2$.  The same will be
true of $E^{\prime2}$, $E^{\prime\prime2}$, \&c.  Thus we see that the sum
of squares of the errors will by variation become a quantity of the second
order with respect to $\delta x$, $\delta y$, \&c., which is in accord with
the nature of a minimum.

If after having determined all the unknowns $x$, $y$, $z$, \&c., we
substitute their values in the given equations, we will find the value of
the different errors $E$, $E'$ , $E''$, \&c., to which the system gives
rise, and which cannot be reduced without increasing the sum of their
squares.  If among these errors there are some which appear too large to be
admissible, then those equations which produced these errors will be
rejected, as coming from too faulty experiments, and the unknowns will
be determined by means of the other equations, which will then give much
smaller errors.  It is further to be noted that one will not then be obliged
to begin the calculations anew, for since the equations of minimum are
formed by the addition of the products made in each of the given equations,
it will suffice to remove from the addition those products furnished by
the equations which would have led to errors that were too large.
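The observation that the equations of minimum are sums of products, so that a rejected equation's contribution can simply be subtracted rather than the whole calculation begun anew, can be illustrated for a single unknown (names and data illustrative, not from the text):

```python
# For one unknown x, with equations E_i = a_i + b_i*x, the equation of
# minimum is 0 = sum(ab) + x*sum(b^2).  Rejecting an equation only
# requires removing its products from the accumulated sums.

def accumulate(eqs):
    """Accumulate sum(ab) and sum(b^2) over equations (a_i, b_i)."""
    Sab = sum(a * b for a, b in eqs)
    Sbb = sum(b * b for _, b in eqs)
    return Sab, Sbb

def remove_equation(Sab, Sbb, a, b):
    """Subtract one rejected equation's products from the sums."""
    return Sab - a * b, Sbb - b * b

def solve(Sab, Sbb):
    """0 = sum(ab) + x*sum(b^2)  =>  x = -sum(ab)/sum(b^2)."""
    return -Sab / Sbb
```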

The rule by which one finds the mean among the results of different
observations is only a very simple consequence of our general method, which
we will call the method of least squares.

Indeed, if experiments have given different values $a'$, $a''$, $a'''$, \&c.
for a certain quantity $x$, the sum of squares of the errors will be
$(a' - x)^2 + (a'' - x)^2 + (a''' - x)^2 + \text{\&c.}$, and on making that
sum a minimum, we have
$o = (a' - x) + (a'' - x) + (a''' - x) + \text{\&c.},$
from which it follows that
$x = \frac{a' + a'' + a''' + \text{\&c.}}{n},$
$n$ being the number of the observations.
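This special case, that the arithmetic mean is the value minimizing the sum of squared errors, admits a quick numerical check (the data values below are illustrative):

```python
# The arithmetic mean minimizes the sum of squared errors (a_i - x)^2,
# as follows from setting the derivative 0 = sum(a_i - x) to zero.

def sum_sq(values, x):
    """Sum of squared errors of the observations about x."""
    return sum((v - x) ** 2 for v in values)

values = [3.0, 5.0, 10.0]
mean = sum(values) / len(values)  # the least-squares estimate
```

Evaluating `sum_sq` at the mean and at nearby points confirms that any displacement from the mean increases the sum of squares.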

In the same way, if to determine the position of a point in space, a
first experiment has given the coordinates $a'$, $b'$, $c'$; a second the
coordinates $a''$, $b''$, $c''$; and so on, and if the true coordinates
of the point are denoted by $x$, $y$, $z$; then the error in the first
experiment will be the distance from the point $(a', b', c')$ to the point
$(x,y,z)$.  The square of this distance is
$(a' - x)^2 + (b' - y)^2 + (c' - z)^2.$
If we make the sum of the squares of all such distances a minimum, we get
three equations which give
$x=\frac{\int a}{n},\quad y=\frac{\int b}{n},\quad z=\frac{\int c}{n},$
$n$ being the number of points given by the experiments.  These formulas
are precisely the ones by which one might find the common centre of gravity
of several equal masses situated at the given points, whence it is evident
that the centre of gravity of any body possesses this general property.

\textit{If we divide the mass of a body into particles which are equal
and sufficiently small to be treated as points, the sum of the squares of the
distances from the particles to the centre of gravity will be a minimum.}
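The centre-of-gravity property stated above can likewise be checked numerically for equal point masses (coordinates below are illustrative):

```python
# For equal masses at given points, the centroid (centre of gravity)
# x = sum(a)/n, y = sum(b)/n, z = sum(c)/n minimizes the sum of squared
# distances from the points.

def centroid(points):
    """Centre of gravity of equal masses at the given 3-D points."""
    n = len(points)
    return tuple(sum(p[k] for p in points) / n for k in range(3))

def sum_sq_dist(points, q):
    """Sum of squared distances from the points to q."""
    return sum(sum((p[k] - q[k]) ** 2 for k in range(3)) for p in points)
```

Perturbing the candidate point away from the centroid in any direction increases the sum of squared distances, in agreement with the theorem.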

We see then that the method of least squares reveals to us, in a fashion,
the centre about which all the results furnished by experiments tend to
distribute themselves, in such a manner as to make their deviations from it
as small as possible.  The application which we are now about to make of
this method to the measurement of the meridian will display most clearly
its simplicity and fertility.\footnote{An application of the method to an
astronomical problem follows.}

\bigskip

\noindent
From D E Smith, \textit{A Source Book in Mathematics}, McGraw-Hill 1929 and
Dover 1959, Volume II, pages 576--579.

\end{document}

%