This page is intended to be read after two others: one on what it means to solve an equation and the other on algebraic numbers, field extensions and related ideas.
Let us imagine ourselves faced with a cubic equation x^3 + ax^2 + bx + c = 0. To solve this equation means to write down a formula for its roots, where the formula should be an expression built out of the coefficients a, b and c and fixed real numbers (that is, numbers that do not depend on a, b and c) using only addition, subtraction, multiplication, division and the extraction of roots.
As I have done in other pages, I shall try to show that it is possible to derive such a formula by following standard mathematical instincts, without the need for mysterious flashes of inspiration. I am certainly not claiming that any sensible person should be able to derive the formula in an hour or two - finding the right 'standard mathematical instinct' normally involves trying several that do not work. Nevertheless, the list of suitable ones to try in any given situation is usually not too long. If you are young and ambitious and do not yet know how to solve cubics, I would recommend having a go, or perhaps reading a short way into this page and then having a go. Your chances of succeeding in a few hours are probably higher than you think.
Let us begin with one of the most useful (and obvious) general problem-solving principles in mathematics.
If you are trying to solve a problem, see if you can adapt a solution you know to a similar problem.
By using this principle, one can avoid starting from scratch with each new problem. What matters is not the difficulty of the problem itself but the difficulty of the difference between the problem and other problems whose solutions are known.
In this case, it is absolutely obvious that the similar problem we should take is that of finding a solution to the quadratic equation x^2 + 2ax + b = 0. (I have put in the factor 2 just for convenience - of course it makes no difference mathematically.) How do we do that? Well, we 'observe' that
x^2 + 2ax + b = (x + a)^2 + b - a^2
which leads quickly to the solution
x = -a ± (a^2 - b)^(1/2)
Was that observation clever? It will be useful to dwell on this more elementary question before continuing with the cubic. So let us imagine that we do not even know how to solve quadratics. One line of thought that might lead us to a solution is the following. After staring at the general equation x^2 + 2ax + b = 0 and having no ideas, we fall back on the following question.
Are there special cases that I do know how to solve?
Then, with some embarrassment, we note to ourselves that we can solve the equation when a = 0. That is, we can solve the equation x^2 + b = 0 (because we are allowed to take square roots). Next, we perhaps note that if b = a^2 then we have the equation x^2 + 2ax + a^2 = 0, which can be rewritten (x + a)^2 = 0. As soon as we have noticed this, we will realize that what helps is not that the right hand side is zero, but that the left hand side is a perfect square. We can therefore solve (x + a)^2 = b for any b. This gives us a whole family of quadratics that we can solve, so we would be mad not to ask the next question.
Are there quadratic equations that cannot be written in the form (x + a)^2 = b?
To answer that, we need to put it back in the original form, by multiplying out the bracket and taking b over to the left hand side. This gives us the equation x^2 + 2ax + a^2 - b = 0. It is then clear that we can make 2a any number we want, and that, having done so, we can make a^2 - b any other number we want. So the quadratic is solved.
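For readers who like to see such things checked, here is a minimal sketch of completing the square for x^2 + 2ax + b = 0. The function name solve_monic_quadratic is just an illustrative choice, and complex square roots are used so that the sketch does not have to worry about the sign of a^2 - b.

```python
import cmath

def solve_monic_quadratic(a, b):
    """Roots of x^2 + 2*a*x + b = 0, obtained from (x + a)^2 = a^2 - b."""
    root = cmath.sqrt(a * a - b)  # complex square root, so any sign is fine
    return (-a + root, -a - root)

# Check a few sample coefficients by substituting the roots back in.
for a, b in [(1, -3), (2, 5), (0.5, 0.25)]:
    for x in solve_monic_quadratic(a, b):
        assert abs(x * x + 2 * a * x + b) < 1e-12
```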
If you think that it was asking too much to notice that the equation x^2 + 2ax + a^2 = 0 could be solved, then here is another route. It doesn't take much curiosity to wonder whether 1 + 2^(1/2) is an algebraic number, or much talent to notice that if x = 1 + 2^(1/2) then (x - 1)^2 = 2. Generalizing this example leads quickly to the observation that equations of the form (x + a)^2 = b can be solved.
What would be the natural generalization to cubics of the process of completing the square? To answer a question of this kind, the following tactic is often useful.
Give a general description of what it is that one would like to generalize.
I shall try to illustrate what I mean just by doing it. To complete the square one notes that (x + a/2)^2 = x^2 + ax + a^2/4, so that we can write any quadratic that begins x^2 + ax as (x + a/2)^2 plus a constant. To put that another way, if we let y = x + a/2, then y satisfies a quadratic equation of the particularly simple form y^2 + C = 0. Of course, once we have solved the equation for y, it is easy to obtain a solution for x, since x is a very simple linear function of y.
What was simpler about the equation for y? There are two reasonable answers to this question, and it is worth looking at both of them. The first is to note that the equation for y involves only y^2 and a constant term - so replacing x by y allows us to assume that the coefficient of the linear term is zero. The second is more obvious - it is simpler because by allowing ourselves to take square roots we have declared that equations of the form y^2 + C = 0 can be solved at a stroke.
This line of thought leads to two questions.
1. Is there a similar way to simplify a cubic so that some of the coefficients become zero?
2. Is there a similar way to simplify a cubic so that it becomes of the form y^3 + C = 0?
The answer to question 1 is not hard to find. If y = x + t then y^3 = x^3 + 3tx^2 + 3t^2x + t^3. Therefore, if t = a/3 then the cubic x^3 + ax^2 + bx + c can be rewritten as y^3 + py + q, where (for what it is worth) p = b - 3t^2 and q = c - bt + 2t^3. Writing this in terms of a we have p = b - a^2/3 and q = c - ab/3 + 2a^3/27.
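The formulas for p and q are easy to get wrong by a sign, so here is a small sketch that checks them with exact rational arithmetic. The helper names depress_cubic and cubic are my own, chosen only for illustration.

```python
from fractions import Fraction

def depress_cubic(a, b, c):
    """Return (t, p, q) such that, with y = x + t, the cubic
    x^3 + a x^2 + b x + c equals y^3 + p y + q."""
    a, b, c = Fraction(a), Fraction(b), Fraction(c)
    t = a / 3
    p = b - 3 * t**2
    q = c - b * t + 2 * t**3
    return t, p, q

def cubic(x, a, b, c):
    """Evaluate x^3 + a x^2 + b x + c."""
    return x**3 + a * x**2 + b * x + c

# Sample cubic (x + 1)(x + 2)(x + 3) = x^3 + 6x^2 + 11x + 6.
a, b, c = 6, 11, 6
t, p, q = depress_cubic(a, b, c)
for x in (Fraction(-1), Fraction(-2), Fraction(-3), Fraction(5, 7)):
    y = x + t
    # Both sides are polynomials in x, so they must agree exactly.
    assert cubic(x, a, b, c) == y**3 + p * y + q
```

For this sample cubic the substitution gives y^3 - y, whose roots 0, 1 and -1 correspond to x = -2, -1 and -3.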
As for the second question, we can start to think about it by asking ourselves the following direct generalization of a question we asked about quadratics.
Are there cubic equations that cannot be written in the form (x + a)^3 = b, and if so, which ones can?
Expanding and subtracting, we find that we can easily solve equations of the form
x^3 + 3ax^2 + 3a^2x + a^3 - b = 0
When is the equation
x^3 + ax^2 + bx + c = 0
of this type? Comparing it with the preceding one we see that it is of the required form if the pair (a, b) is of the form (3s, 3s^2) for some s, which it is if and only if a^2 = 3b. So the following question arises naturally.
Can we replace x by some y = x + t in such a way that y satisfies a cubic with a'^2 = 3b' (where a' and b' are the coefficients of y^2 and y respectively)?
This approach looks promising, because t gives us one degree of freedom and all we want is one condition - that a'^2 - 3b' should be zero. It is obvious how to answer the question, so let us go ahead and do it. Writing x = y - t and substituting we obtain the equation
(y - t)^3 + a(y - t)^2 + b(y - t) + c = 0
which rearranges itself to
y^3 + (a - 3t)y^2 + (b - 2at + 3t^2)y + c - bt + at^2 - t^3 = 0
This gives us a' = a - 3t and b' = b - 2at + 3t^2. Therefore,
a'^2 - 3b' = (a - 3t)^2 - 3(b - 2at + 3t^2) = a^2 - 3b
What we have shown is that we cannot change the quantity a^2 - 3b by making a substitution of the form y = x + t. In other words, the answer to question 2 above is no, at least when 'a similar way' is taken to mean that we should use such a substitution. A slightly fancier way to say that a^2 - 3b doesn't change is to call it an invariant.
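The invariance can also be checked mechanically. The sketch below expands the substitution x = y - t (so that the constant term picks up -t^3 from (y - t)^3) and confirms that a^2 - 3b is unchanged for some arbitrary rational values. The names shift_cubic and invariant are hypothetical, introduced only for this check.

```python
from fractions import Fraction

def shift_cubic(a, b, c, t):
    """Coefficients (a', b', c') of the cubic in y obtained by
    substituting x = y - t into x^3 + a x^2 + b x + c."""
    a_new = a - 3 * t
    b_new = b - 2 * a * t + 3 * t**2
    c_new = c - b * t + a * t**2 - t**3
    return a_new, b_new, c_new

def invariant(a, b):
    """The quantity a^2 - 3b discussed in the text."""
    return a * a - 3 * b

a, b, c = Fraction(5), Fraction(-2), Fraction(7)
for t in (Fraction(1), Fraction(-3, 4), Fraction(22, 7)):
    a2, b2, c2 = shift_cubic(a, b, c, t)
    # The shifted cubic really is the original one evaluated at x = y - t ...
    y = Fraction(2)
    x = y - t
    assert x**3 + a * x**2 + b * x + c == y**3 + a2 * y**2 + b2 * y + c2
    # ... and the invariant has not moved.
    assert invariant(a2, b2) == invariant(a, b)
```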
Is it an unfortunate accident that a^2 - 3b is an invariant? Further reflection gives us a reason for this phenomenon, and shows that we were foolish ever to expect that the cubic could be solved so simply. You may already have noticed that a^2 - 3b = -3p, where p was the coefficient of the linear term we obtained when we converted the cubic x^3 + ax^2 + bx + c into the simpler cubic y^3 + py + q. We chose y to be x + a/3 and it is easy to see that no other choice would have led to the coefficient of y^2 being zero. Hence, the invariant we have discovered has an interpretation (as one should always expect): it is the coefficient of the linear term when the quadratic term has been removed by a substitution of the form y = x + t.
But it is now obvious that this quantity is an invariant. After all, if I substitute y=x+s (for any s) and then ask what further substitution z=y+r will remove the quadratic term, the answer is that z=x+r+s and r+s has to be a/3. Therefore, the p that I obtain for y is the same as the p that I obtain for x.
Actually, it was obvious in advance that the second approach to solving the cubic was doomed to fail, since if it were possible to 'complete the cube' then every cubic would be of the form (x + a)^3 + b. But if that were true, then why would we have been bothering to convert a cubic into that form? So, not only is completing the cube impossible, it is impossible for simple and compelling reasons. On the other hand, isn't completing the cube the natural generalization of completing the square? Now that we have tried it and failed, it is as though we have blown our main chance to solve the cubic (which was to see how we solved the quadratic and adapt our method).
However, this sort of defeatist attitude is often a mistake. Perhaps one can even express this view with another general principle.
There may be many ways to adapt or generalize a proof.
But how, one might ask, should one search for different generalizations? Let me modify an earlier suggestion.
Give a description of the argument that one would like to generalize. Explain why it worked. Make the explanation vaguer and more general, and then try to find different arguments that work for the same (vaguer) reasons.
So that we can put that into practice, let me once again describe how to solve the quadratic.
Let y = x + a/2. Then y satisfies a quadratic equation of the particularly simple form y^2 + C = 0. Once we have solved this equation for y, it is easy to obtain a solution of the original equation for x, since x is a very simple linear function of y.
Why, in general terms, did that work? We needed two properties of y. First, y should satisfy an equation that we knew how to solve, and secondly x should depend on y in a simple way - so that once we knew y we could work out x.
If we wish to carry this approach over to the cubic, then we should have clear answers to the following two questions.
(i) Which equations can we solve?
(ii) How are we prepared to allow y to depend on x?
The answer to the first question we more or less know already. We can solve linear and quadratic equations, and also cubic equations if they happen to have the nice form x^3 + C = 0. As for the second, so far we have considered substitutions of the form y = x + t. What other substitutions could there possibly be?
I shall answer this question by yet another time-honoured method, which occurs all over mathematics.
Do the most general thing you can possibly imagine. Then, when you find that you need certain properties, make what you have done more specific by introducing those properties.
Suppose then that we make the substitution y = f(x). (It is hard to see how we could be more general than that.) Let us suppose that this leads to an equation for y that we can solve. When will knowing y be of any use? The answer is obvious - when we can solve the equation y = f(x) for x in terms of y. But we know which equations we can solve - linear, quadratic and simple cubic equations. We have already tried linear substitutions and seen their limitations, so we are left with two reasonable possibilities for f(x). One is x^2 + ax + b (it is not hard to see that giving x^2 a different coefficient is not going to make a significant difference) and the other is x^3 + c.
Following some very general problem-solving techniques has led us to an idea that is definitely new. By standing back a little, we realized that the important thing about the substitution y=x+t in the solution of the quadratic was not that it magically worked, or that it was linear, but that it was invertible in the sense that we could give a formula for x in terms of y. The impasse is now broken in the sense that we have an approach to try with the cubic. It may not work, but having an approach that may or may not work is much better than having no approach at all.
If a linear substitution worked for quadratic equations, then which sounds more likely to work for cubic equations - a quadratic substitution or a particular kind of cubic substitution? Somehow the quadratic one is more promising, as it fits the general description of having degree one less than that of the equation one is trying to solve. This is not a particularly convincing argument, but the worst that can happen is that we try it and it doesn't work. So let us see where we can get with the substitution y = x^2 + ux + v.
We now run into a problem. We are hoping that y will satisfy a cubic of a particularly simple form. But is it obvious that it satisfies any cubic? If you do not find it obvious then this is the point where it will help to have read my page on algebraic numbers, because there I repeatedly used a trick which works here as well (and which, I stress, arose naturally in that context).
We know that x satisfies the equation
x^3 + ax^2 + bx + c = 0
But this means that every time we write down a polynomial in x, we can replace x^3 by -ax^2 - bx - c, x^4 by -ax^3 - bx^2 - cx and so on. That is, every polynomial expression in x is equal to some quadratic function of x. But y^2 and y^3 are polynomial functions of x, and hence equal to quadratic ones. This is trivially true of 1 and y as well. Hence, the numbers 1, y, y^2 and y^3 are all of the form rx^2 + sx + t. For y to satisfy a cubic we need a non-trivial linear combination of 1, y, y^2 and y^3 to be zero. To obtain this, we need to solve three homogeneous linear equations in four unknowns, which we can always do.
So now we can describe a possible method more precisely: let y = x^2 + ux + v, work out y^2 and y^3 in terms of x, reduce them to quadratics using the fact that x^3 = -ax^2 - bx - c, find a non-trivial linear relationship between 1, y, y^2 and y^3, write out the corresponding cubic y^3 + dy^2 + ey + f in y and finally (the most important part) cleverly choose u and v in such a way that d^2 = 3e.
We have no guarantee that this will work, because it may be that, as happened with linear substitutions, there simply is no choice of u and v that makes d^2 equal to 3e, or it may be that, although such a choice exists, the dependence of u and v on a, b and c is so complicated that we don't know how to solve the resulting equations. It is reasonable not to worry too much about the first potential difficulty, because we now have an extra degree of freedom, and there doesn't seem to be an argument telling us that it cannot possibly help. However, if you now go away and try to work out the details of the argument outlined above, you will see that complication is something one should definitely worry about. Indeed, it may seem after a while that in order to work out u and v you will have to solve quintics.
Let me take it as read that just plunging in is a bad idea. In any case, it is another good problem-solving strategy to try simpler (but less general) approaches first, just in case they work. So how might we make the above calculations manageable?
One obvious idea is to use the simplification we obtained earlier: we might as well assume that a = 0. This will allow us to replace x^3 by -px - q. In fact, it is a little nicer to say that x^3 = px + q, which we can do by changing the definitions of p and q. And what about the substitution y = x^2 + ux + v? Well, remembering the invariant we discovered earlier, we should realize that y satisfies a cubic that we can easily solve if and only if y - v does. So we might as well save on algebra by setting v = 0. In other words, not only will we simplify calculations by setting y = x^2 + ux, we will not even lose any generality.
Here are some calculations that arise when one starts with the equation x^3 = px + q, sets y = x^2 + ux, and tries to find a cubic satisfied by y. Doing it directly (that is, by working out y^2 and y^3 and solving some simultaneous equations) still gets unpleasant, but the calculations can be kept manageable by simplifying as we go along, as is done below. I shall also save time by writing C to mean a constant (depending on p, q and u) which may vary from line to line.
xy = x^3 + ux^2 = ux^2 + px + q = uy + (p - u^2)x + q
y^2 = x^4 + 2ux^3 + u^2x^2 = (u^2 + p)x^2 + (2up + q)x + 2uq = (u^2 + p)y + (up + q - u^3)x + 2uq
y^3 = (u^2 + p)y^2 + (up + q - u^3)xy + 2uqy = (u^2 + p)y^2 + (up + q - u^3)uy + (up + q - u^3)(p - u^2)x + 2uqy + C
From an earlier line we have that
(up + q - u^3)x = y^2 - (u^2 + p)y + C
so this is equal to
(u^2 + p)y^2 + (up + q - u^3)uy + (p - u^2)(y^2 - (u^2 + p)y) + 2uqy + C
Thus, y satisfies the cubic equation
y^3 - 2py^2 + (p^2 - pu^2 - 3qu)y + C = 0
and all that is left to decide is whether u can be chosen in such a way that
(2p)^2 = 3(p^2 - pu^2 - 3qu)
that is, such that
3pu^2 + 9qu + p^2 = 0
This, being a quadratic in u, can be solved. Using this value of u one obtains a cubic in y that can be 'completed'. This gives a solution for y. Then x can be worked out from y by solving a further quadratic.
Of course, the resulting formula, if one worked it out, would be pretty unpleasant, and I should now say that better methods have been discovered (involving different substitutions) that lead to easier calculations and neater answers. They are easy to find on the internet, but all tend to have a 'magic' quality about them. I should also say that I haven't discussed the annoying problem that not all the 'solutions' that arise by the above method will necessarily be solutions, since knowledge of y doesn't uniquely determine x.
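Rather than write out the unpleasant formula, one can let a computer follow the recipe numerically. The sketch below is my own working and should be read with some caution. It takes the cubic in y to be y^3 - 2py^2 + (p^2 - pu^2 - 3qu)y + f = 0, chooses u so that 3pu^2 + 9qu + p^2 = 0, and uses f = -q(u^3 - pu + q) for the constant term; that value comes from multiplying together the three values of y and is an assumption of this sketch, not something stated in the text. Candidates for x that fail to satisfy x^3 = px + q are simply thrown away, which deals crudely with the problem of spurious solutions. The function name and tolerance are arbitrary, and p is assumed to be non-zero.

```python
import cmath

def solve_depressed_cubic(p, q, tol=1e-8):
    """Distinct roots of x^3 = p*x + q via y = x^2 + u*x (sketch; needs p != 0)."""
    if p == 0:
        raise ValueError("p = 0: the equation is already x^3 = q")
    # Choose u with 3*p*u^2 + 9*q*u + p^2 = 0.
    disc = cmath.sqrt(81 * q * q - 12 * p**3)
    u = (-9 * q + disc) / (6 * p)
    # Then y satisfies y^3 + d*y^2 + e*y + f = 0 with d^2 = 3*e.
    d = -2 * p
    f = -q * (u**3 - p * u + q)  # assumed constant term, see the lead-in
    # Complete the cube: (y + d/3)^3 = d^3/27 - f.
    k = complex(d**3 / 27 - f)
    r = k ** (1 / 3)
    omega = cmath.exp(2j * cmath.pi / 3)  # primitive cube root of unity
    roots = []
    for j in range(3):
        y = -d / 3 + r * omega**j
        # Recover x from x^2 + u*x - y = 0 and keep only genuine roots.
        s = cmath.sqrt(u * u + 4 * y)
        for x in ((-u + s) / 2, (-u - s) / 2):
            genuine = abs(x**3 - p * x - q) < tol
            if genuine and all(abs(x - z) > tol for z in roots):
                roots.append(x)
    return roots
```

Trying this on x^3 = 7x - 6, whose roots are 1, 2 and -3, is instructive: u turns out to be complex even though all three roots are real, which gives some idea of why the resulting formula is not pleasant.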
Just to see how there might be other sensible substitutions, let us return to the question of which ones allow us to calculate x. We noted that we could do so if y was a quadratic function of x. But this was not absolutely necessary, even if we could only solve quadratic equations. For example, if y were defined implicitly by x^2 + uxy + v = 0, then knowing y would still allow us to determine x.
As a matter of fact, the simplest method works for a completely different reason. It involves substituting the other way - by setting x = w + p/(3w) (when the equation is x^3 = px + q) and finding that the resulting equation in w is a quadratic in w^3. This method is described in more detail here. I do not have a plausible explanation for how it was discovered.
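To see why a quadratic in w^3 appears, expand x^3 - px with x = w + p/(3w): the terms pw and p^2/(3w) cancel, leaving w^3 + p^3/(27w^3) = q, that is, w^6 - qw^3 + p^3/27 = 0. Here is a short numerical sketch of the resulting method; the function name is hypothetical and complex arithmetic is used throughout.

```python
import cmath

def solve_by_w_substitution(p, q):
    """Roots of x^3 = p*x + q using the substitution x = w + p/(3*w)."""
    # W = w^3 satisfies W^2 - q*W + p^3/27 = 0.
    root = cmath.sqrt(q * q - 4 * p**3 / 27)
    big_w = (q + root) / 2
    if big_w == 0:
        big_w = (q - root) / 2  # use the other solution if this one vanishes
    if big_w == 0:
        return [0j, 0j, 0j]  # only possible when p = q = 0
    w0 = big_w ** (1 / 3)  # one cube root; the others differ by cube roots of unity
    omega = cmath.exp(2j * cmath.pi / 3)
    return [w + p / (3 * w) for w in (w0 * omega**j for j in range(3))]

# Each value returned should satisfy the original equation.
for x in solve_by_w_substitution(7, -6):
    assert abs(x**3 - 7 * x + 6) < 1e-9
```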