The Cauchy-Schwarz inequality is not hard to prove, so there is not much reason for a page devoted to simplifying the usual proof, or rather simplifying the usual presentation of the usual proof. What is more, the idea that follows is so natural that it must be well known to a significant proportion of mathematicians. Hence the word `tiny' above.

Nevertheless, most textbooks and all analysis courses I have attended favour the approach where you write down a quadratic form, use the fact that it is non-negative everywhere, and observe that this implies the Cauchy-Schwarz inequality. No explanation is usually given of where the quadratic form comes from. This page is intended for those who happen not to have observed, or been shown, that more or less the same argument can be made to seem much more natural. Indeed, this is another example of a proof that a well-programmed computer could reasonably be expected to discover.

First, let us consider the basic, real-analysis version of the inequality, namely

a_{1}b_{1}+...+a_{n}b_{n} <= (a_{1}^{2}+...+a_{n}^{2})^{1/2}(b_{1}^{2}+...+b_{n}^{2})^{1/2}

with equality if and only if the sequences (a_{i}) and
(b_{i}) are proportional.

How might one go about proving this statement using no tricks?
One idea is to try to find a natural way to express the fact that
two sequences are proportional. Of course, we could say something
like `there exists a constant lambda such that a_{i}=
lambda b_{i} for every i', but this introduces an unknown
constant lambda, and it will make our proof harder later on if we
have to find this lambda.

This is not a serious problem though, as we can identify lambda
as something like a_{1}/b_{1}. And if we dislike the
lack of symmetry involved in choosing a_{1}/b_{1}
rather than some other a_{i}/b_{i}, we could
simply say that all the a_{i}/b_{i} are equal.

This still leaves a minor problem that some of the b_{i}
may be zero, and the related minor problem that we are not dealing
with the a_{i} and b_{i} symmetrically. To get
round these small difficulties, let us define (a_{i}) and
(b_{i}) to be proportional if a_{i}b_{j}=
a_{j}b_{i} for every pair i,j.
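
To see that this definition copes with zeros, here is a minimal sketch in Python of the cross-multiplication test. The function name are_proportional is my own label, not a standard one, and the exact comparison assumes integer or rational entries; with floating-point numbers one would want a tolerance instead.

```python
from itertools import combinations


def are_proportional(a, b):
    """Return True if a[i]*b[j] == a[j]*b[i] for every pair of indices i, j.

    Hypothetical helper: assumes exact arithmetic (ints or Fractions).
    """
    if len(a) != len(b):
        raise ValueError("sequences must have the same length")
    # The condition is symmetric in (i, j) and trivial when i == j,
    # so it is enough to check the pairs with i < j.
    return all(a[i] * b[j] == a[j] * b[i]
               for i, j in combinations(range(len(a)), 2))
```

For instance, are_proportional([0, 2, 0], [0, 5, 0]) is true even though several of the b_{i} vanish, which is exactly the case that the formulation in terms of ratios handled badly.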

Now we would like to express this fact * analytically *,
and for this there is a very standard idea. If you want lots of
real numbers to be zero then you can achieve this by insisting that
the sum of their squares is zero. In this case we want all the numbers
a_{i}b_{j}-a_{j}b_{i} to be zero,
so the sequences (a_{i}) and (b_{i}) are proportional
if and only if

Sum_{i,j}(a_{i}b_{j}-a_{j}b_{i})^{2}=0

and the expression on the left is trivially at least zero.

Expanding out the bracket on the left hand side we get

Sum_{i,j}(a_{i}^{2}b_{j}^{2}+a_{j}^{2}b_{i}^{2}-2a_{i}b_{j}a_{j}b_{i})

which equals

2(Sum_{i}a_{i}^{2})(Sum_{j}b_{j}^{2})-2(Sum_{i}a_{i}b_{i})^{2}

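Since this is at least zero, (Sum_{i}a_{i}b_{i})^{2} is at most (Sum_{i}a_{i}^{2})(Sum_{j}b_{j}^{2}), with equality if and only if the two sequences are proportional. Taking square roots, the inequality, together with the equality case, follows immediately, provided that the two sequences are non-negative, which we may clearly assume, since replacing every term by its absolute value leaves the right-hand side unchanged and does not decrease the left-hand side.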
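
As a sanity check on the expansion, the following sketch evaluates both sides of the identity for given sequences. The helper name lagrange_sides is hypothetical, and integer inputs are assumed so that the comparison is exact.

```python
def lagrange_sides(a, b):
    """Return (lhs, rhs) of the identity for two equal-length sequences.

    lhs = Sum_{i,j} (a_i b_j - a_j b_i)^2
    rhs = 2 (Sum a_i^2)(Sum b_j^2) - 2 (Sum a_i b_i)^2
    """
    n = len(a)
    # Double sum over all ordered pairs (i, j), exactly as in the text.
    lhs = sum((a[i] * b[j] - a[j] * b[i]) ** 2
              for i in range(n) for j in range(n))
    sum_a2 = sum(x * x for x in a)
    sum_b2 = sum(y * y for y in b)
    sum_ab = sum(x * y for x, y in zip(a, b))
    rhs = 2 * sum_a2 * sum_b2 - 2 * sum_ab ** 2
    return lhs, rhs
```

With a=(1,-2,3) and b=(4,0,-5) both sides come to 906; with a=(1,2,3) and b=(2,4,6) both are zero, as they should be for proportional sequences.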

Note that the only idea above was to write down the proportionality of the two sequences in a nice way. The rest of the argument was an entirely mechanical manipulation. Can we do something similar for the more abstract, inner-product-space version of the inequality?

For some reason the keyboard I am writing this on refuses to do vertical bars, so I shall write [x] for the norm of x and < x,y > for the inner product of x and y. Beginning with the real case, we would like to show that < x,y > is at most [x][y], with equality if and only if x and y are proportional with a positive constant. Can we express the proportionality of x and y without using coordinates?

A first attempt is to say that x and y are proportional if and only if x/[x] and y/[y] are equal. This is not quite accurate (for example, y might be -x), but the inaccuracy works in our favour as the condition is in fact equivalent to x and y being proportional with a positive constant. Bearing in mind that we eventually want a nice expression to deal with, let us rewrite this equality as x[y]-y[x]=0.

We now want some way of distinguishing zero amongst all vectors in an inner-product space. We need go no further than the axioms! Indeed, x[y]-y[x]=0 if and only if

[x[y]-y[x]]^{2}=0

I put the square in because one always likes to expand such an expression in terms of inner products. Indeed, let us do just that, obtaining that

2[x]^{2}[y]^{2}-2[x][y]< x,y >

is greater than or equal to zero, with equality if and only if x[y]-y[x]=0. If either [x] or [y] is zero then the Cauchy-Schwarz inequality is trivial. Otherwise, we can divide through by 2[x][y] and obtain the inequality in general, with equality if and only if x/[x] and y/[y] are equal, that is, if and only if x and y are proportional with a positive constant of proportionality.
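
The same expansion can be checked numerically. The sketch below assumes the ordinary dot product on R^n purely for illustration (the argument itself uses only the axioms), and the names dot, norm and gap are hypothetical helpers.

```python
import math


def dot(x, y):
    """Ordinary dot product, standing in for < x,y > (an assumption)."""
    return sum(xi * yi for xi, yi in zip(x, y))


def norm(x):
    """The norm [x] induced by the dot product."""
    return math.sqrt(dot(x, x))


def gap(x, y):
    """Return [x[y]-y[x]]^2 computed directly, and its expanded form."""
    nx, ny = norm(x), norm(y)
    # The vector x[y] - y[x], coordinate by coordinate.
    v = [xi * ny - yi * nx for xi, yi in zip(x, y)]
    direct = dot(v, v)
    expanded = 2 * nx**2 * ny**2 - 2 * nx * ny * dot(x, y)
    return direct, expanded
```

The two returned values should agree up to rounding error. They vanish when y is a positive multiple of x, and are strictly positive when y=-x, which matches the remark that only a positive constant of proportionality gives equality.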

The complex case is not much harder. This time [x[y]-y[x]]^{2}
expands out as

2[x]^{2}[y]^{2}-[x][y](< x,y > + < y,x >)

Let w be a complex number of modulus 1 with the property that < x,wy > is real and non-negative, and therefore equal to the modulus of < x,y >. Replacing y with wy we find that the modulus of < x,y > is at most [x][y], with equality if and only if x[y]-wy[x]=0. Thus, equality holds for the modulus of the inner product if and only if x and y are proportional, from which it is easy to see that it holds for the inner product itself if and only if the constant of proportionality is real and positive. (Choosing w above is, admittedly, a trick, but it is a very standard one.)
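
Here is a small sketch of how w can be chosen in practice. It assumes the inner product on C^n that is linear in the first argument and conjugate-linear in the second; with the opposite convention w should be replaced by its conjugate. The names inner and unit_rotation are hypothetical.

```python
def inner(x, y):
    """Inner product on C^n, linear in x and conjugate-linear in y (assumed)."""
    return sum(xi * yi.conjugate() for xi, yi in zip(x, y))


def unit_rotation(x, y):
    """Return w with |w| = 1 such that inner(x, w*y) is real and non-negative."""
    ip = inner(x, y)
    if ip == 0:
        # Any w of modulus 1 works when the inner product vanishes.
        return 1 + 0j
    # inner(x, w*y) = conj(w) * ip, so w = ip/|ip| gives |ip|.
    return ip / abs(ip)
```

Given w = unit_rotation(x, y), the value inner(x, [w * yi for yi in y]) should equal the modulus of inner(x, y) up to rounding error.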

The point of presenting the above arguments is to contrast them with
the usual, slightly less motivated approach of considering the
expression [x-cy]^{2}, which is real and non-negative,
and then choosing a `clever' value of c from which to deduce
the Cauchy-Schwarz inequality. Of course, c can be justified
as the value that minimizes the quadratic expression that results
from expanding [x-cy]^{2}, but even so the idea of
writing down [x-cy]^{2} in the first place is not an
obvious one.

Actually (this paragraph was added a day or two later)
it can be justified as follows. Two vectors x and y are
proportional if and only if 0 is a non-trivial linear
combination of the two. Moreover, if neither is zero, then
they are proportional if and only if x-cy=0 for some constant
c. If this does not happen, then the line of points of the
form x-cy has some positive distance from 0, which we can
calculate by minimizing [x-cy]. However, it seems perverse to
bother with this calculation when
we know that * if * x-cy is ever zero with c positive, then c must equal
[x]/[y].
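
For comparison, the minimizing calculation alluded to in the last two paragraphs can be sketched as follows, again assuming the ordinary real dot product for illustration. Expanding [x-cy]^{2} gives [x]^{2}-2c< x,y >+c^{2}[y]^{2}, and the standard calculation shows that this quadratic in c is minimized at c=< x,y >/[y]^{2}. The helper names are hypothetical and y is assumed to be non-zero.

```python
def dot(x, y):
    """Ordinary dot product, standing in for < x,y > (an assumption)."""
    return sum(xi * yi for xi, yi in zip(x, y))


def best_c(x, y):
    """The value of c minimizing [x-cy]^2; assumes y is non-zero."""
    return dot(x, y) / dot(y, y)


def residual(x, y, c):
    """Return [x-cy]^2 for a given constant c."""
    v = [xi - c * yi for xi, yi in zip(x, y)]
    return dot(v, v)
```

For x=(3,4) and y=(1,2) the minimizing c is 2.2 and the minimum of [x-cy]^{2} is 0.8 (up to rounding), which is positive because the two vectors are not proportional.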

Just for the record, I looked through my bookshelf for all proofs that I could find of the Cauchy-Schwarz inequality. Only Apostol (Mathematical Analysis, p.20 exercise 1-15) and Jeffreys and Jeffreys (Methods of Mathematical Physics, 3rd Ed. p.54) prove the inequality (for real numbers) this way. The identity that proves it is known as Lagrange's identity. Even they merely ask you to note that Lagrange's identity is true and that it implies the Cauchy-Schwarz inequality. My point above is that the identity is an obvious thing to write down.