Why the concept of a field extension is a natural one

Prerequisites

Basic linear algebra.

Warning

This page is not yet finished.

Discovering the notion of field extensions.

It is not the aim of this page to get at all far with Galois theory. All I shall demonstrate is that one can be led very naturally to think about field extensions, even if one does not have the mentality of an algebraist.

A well known elementary exercise is to show that 2^{1/2}+3^{1/2} is algebraic. The answer is as follows: if x = 2^{1/2}+3^{1/2}, then x^2 = 2+3+2·6^{1/2} = 5+2·6^{1/2}. It follows that (x^2-5)^2 = 24, which shows that x is algebraic.

It is now very natural to look at other examples, or even to try to prove a general theorem. What can we say about 2^{1/3}+3^{1/3}, for example? If we square it, we obtain 2^{2/3}+2·6^{1/3}+3^{2/3}, which doesn't look very nice. On the other hand, cubing would turn at least some of the terms into integers, so let's try that. We obtain 2+3·2^{2/3}3^{1/3}+3·2^{1/3}3^{2/3}+3. Unfortunately, this is still not very nice, because there are two irrational terms. In other words, we don't seem to be any better off than we were when we started.

Let us now apply the following general principle.

Principle

Don't forget what it is you are trying to prove.

We are trying to show that x = 2^{1/3}+3^{1/3} is algebraic. What does this mean? It means that there is some true equation of the form a_0+a_1x+...+a_nx^n = 0, with rational coefficients not all zero. Does this remind us of anything? Yes it does - linear algebra. We are trying to show that the numbers 1, x, x^2, ... are not linearly independent over the rationals.

We have written out a few of these numbers. Let us write another one.

x^4 = 2·2^{1/3} + 4·2·3^{1/3} + 6·2^{2/3}3^{2/3} + 4·3·2^{1/3} + 3·3^{1/3}

Is any general pattern emerging? Yes it is: all our numbers are integer combinations of things like 2^{1/3}3^{2/3}. To be precise, they are integer combinations of the nine numbers 1, 2^{1/3}, 2^{2/3}, 3^{1/3}, 3^{2/3}, 2^{1/3}3^{1/3}, 2^{1/3}3^{2/3}, 2^{2/3}3^{1/3} and 2^{2/3}3^{2/3}. Thus, they all live in the subspace of R (considered as a vector space over Q) generated by these nine numbers. You can't have ten independent vectors in a nine-dimensional space, so there must be a linear relationship between 1, x, x^2, ..., x^9. This proves that 2^{1/3}+3^{1/3} is algebraic.

Note a very interesting feature of the above proof - we did not actually calculate the polynomial of which 2^{1/3}+3^{1/3} is a root. Of course, we could have gone ahead and solved a nasty system of simultaneous equations, but would the answer have been enlightening? Surely we can be satisfied with the knowledge that such a polynomial exists, and can be calculated if necessary.
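If we ever did want the polynomial, a computer algebra system will happily find it for us. Here is a minimal sketch in Python using the sympy library (my choice of tool, not anything intrinsic to the argument); the proof above only guarantees a non-trivial polynomial of degree at most nine, and the computation then tells us the polynomial of smallest degree that works.

    from sympy import Rational, Symbol, minimal_polynomial

    x = Symbol('x')
    a = 2**Rational(1, 3) + 3**Rational(1, 3)   # the number considered above

    # sympy computes the rational polynomial of smallest degree with a as a root
    print(minimal_polynomial(a, x))             # degree at most 9, as the dimension count predicts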

It is now very natural to ask whether the above result can be generalized. For example, is it true that the sum of two algebraic numbers is always algebraic? That is, suppose that a_0+a_1x+...+a_nx^n = 0 and b_0+b_1y+...+b_my^m = 0. Can we find a polynomial with x+y as a root? Inspired by the previous argument, we consider the sequence of powers (x+y)^k, k = 0, 1, 2, ... . By the binomial theorem, each of these is an integer combination of terms of the form x^ry^s.

Our success in the example of 2^{1/3}+3^{1/3} came down to the fact that we had to consider only finitely many terms of this kind. (Notice that in that example x was 2^{1/3} and y was 3^{1/3}.) Why was that? It was because as soon as we cubed x or y it became an integer. In general this is not the case, but notice that we can still simplify x^n, since the polynomial satisfied by x gives us a linear expression for x^n in terms of lower powers. (I am assuming of course that a_n is not zero.) Therefore, we only have to consider powers x^i with i < n and y^j with j < m. Thus, we need to consider only mn products x^ry^s, and the same argument goes through, showing that x+y must be the root of some non-zero polynomial of degree at most mn.
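For the computationally minded, here is a minimal sketch of exactly this argument in Python with sympy, for the hypothetical example x = 2^{1/2} and y = 3^{1/3}: reduce each power (x+y)^k to a rational combination of the mn products x^ry^s, stack the coordinate vectors, and read a dependence off the kernel.

    from sympy import symbols, Matrix, Poly, rem

    x, y, t = symbols('x y t')
    p, q = x**2 - 2, y**3 - 3       # hypothetical example: x = 2^{1/2}, y = 3^{1/3}
    n, m = 2, 3

    def reduce_mod(expr):
        # rewrite expr using x**2 = 2 and y**3 = 3
        return rem(rem(expr, p, x), q, y)

    def coords(expr):
        # coordinates of expr in the basis x**r * y**s with r < n, s < m
        P = Poly(expr, x, y)
        return [P.coeff_monomial(x**r * y**s) for r in range(n) for s in range(m)]

    rows, power = [], 1
    for k in range(n*m + 1):        # n*m + 1 vectors in an n*m-dimensional space
        rows.append(coords(power))
        power = reduce_mod(power * (x + y))

    c = Matrix(rows).T.nullspace()[0]                 # a dependence: sum over k of c[k]*(x+y)**k = 0
    print(sum(c[k] * t**k for k in range(n*m + 1)))   # a rational polynomial with x+y as a root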

It is not hard to give a similar argument that shows that xy is also algebraic.

Now let us analyse the above argument a little. We see that what made it work was that the vector space of all rational combinations of powers of x+y was finite-dimensional. We deduced this from the fact that the corresponding vector spaces for powers of x and for powers of y were finite-dimensional - and also from the fact that we could build a spanning set for the (x+y)-space out of the spanning sets for the x-space and the y-space.

We have thus arrived at the following idea. Given an algebraic number a, why not consider the vector space of all rational linear combinations of powers of a? The dimension of this space will be the degree of a (that is, the smallest degree of a non-trivial polynomial with rational coefficients and with a as a root).

Now a very important fact in Galois theory is that this vector space is also a field. Since it is easy to see that it is closed under multiplication, this statement boils down to the statement that the reciprocal of a non-zero element of the vector space also belongs to the space. How might one be led to ask whether this was true?

One possible answer is simple curiosity: one notices that elements of the vector space can be multiplied (since they can be thought of as ordinary real numbers) and that the space is closed under multiplication (since every element of the space is a polynomial in a, so the product of two of them is also a polynomial in a, which can be reduced to a polynomial of degree less than the degree of a by rewriting a^n, where n is the degree of a, whenever it occurs - we have basically seen this already). If one knows at least the definition of a field, then it is surely quite natural to ask whether elements of the vector space have multiplicative inverses.

However, I personally find a piece of theory more illuminating if it arises in response to an actual problem. (Not all mathematicians would feel the same way.) Do we know all there is to know about the algebraic numbers if we know that they are closed under sums and products? We could ask whether they are closed under taking reciprocals, but this is easy: if a_0+a_1x+...+a_nx^n = 0 and x is not zero, then dividing through by x^n shows that y = 1/x satisfies a_n+a_{n-1}y+...+a_0y^n = 0.
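As a quick sanity check on that reversal of coefficients, here is a minimal sketch in Python with sympy, for the hypothetical example a = 1+2^{1/2}.

    from sympy import sqrt, symbols, minimal_polynomial, Poly, simplify

    x = symbols('x')
    a = 1 + sqrt(2)                              # hypothetical example of an algebraic number
    p = Poly(minimal_polynomial(a, x), x)        # here: x**2 - 2*x - 1
    q = Poly(list(reversed(p.all_coeffs())), x)  # reverse the coefficients: -x**2 - 2*x + 1
    print(simplify(q.as_expr().subs(x, 1/a)))    # prints 0, so 1/a is a root of the reversed polynomial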

Thus, the algebraic numbers form a field. (I have not proved that -a is algebraic if a is. I leave this as an exercise.) What more could we ask for? Well, the rationals form a field, but, as the Greeks discovered, this field is inadequate for some purposes because the square root of a rational number need not be rational. On the other hand, it is pretty easy to see that the square root of an algebraic number is algebraic. However, this leads to an interesting line of thought. The algebraic numbers are obtained from the rational numbers by including all solutions to polynomials with rational coefficients. What if we repeat the process? That is, why don't we consider roots of polynomials with algebraic coefficients? Do we get anything new?

One piece of evidence that this is a natural question is that a few years ago I was actually asked it by an undergraduate in a supervision. Taking the advice of Doron Zeilberger's Opinion 34, let me confess that I did not then know the answer, though I was pretty sure of my guess - which was correct - that we do not get anything new. That is, a root of a polynomial with algebraic coefficients is itself algebraic.

How are we to prove this? One lesson we should have learnt from considering the sum of two algebraic numbers is that we probably don't want to take a polynomial with algebraic coefficients and construct a new polynomial with rational coefficients which has amongst its roots all the roots of the original polynomial. We would be much happier if we could prove the existence of this polynomial indirectly, perhaps using linear algebra as above.

Once again let us consider a particular example, such as the equation x^2 - 2^{1/2}x + 3 = 0. This turns out to be too simple an example, because we can use the formula for a quadratic and we find that the answer is the sum of two square roots. So let us change the equation to x^3 - 2^{1/2}x + 3 = 0. We are not very keen to apply the formula for the solution of the cubic, so what can we do instead? We are trying to find a rational linear combination of powers of x that gives zero, and all we have is a single combination with one of its coefficients irrational.

What we did before was to show that all powers of x+y lived in some finite-dimensional vector space, and therefore were linearly dependent. That is, we found a way of writing all the powers as rational linear combinations of a few numbers, and deduced dependence from this. It is in that sense that we used linear algebra - it saved us from solving lots of simultaneous equations. So, for our new problem, is there any way of writing the powers of x as rational linear combinations of just a few numbers? Let us try.

We know that x^3 = 2^{1/2}x - 3. Then x^4 = 2^{1/2}x^2 - 3x, which can't obviously be simplified further. Next, x^5 = 2^{1/2}x^3 - 3x^2. We can simplify the x^3, which tells us that

x^5 = 2^{1/2}(2^{1/2}x - 3) - 3x^2 = -3x^2 + 2x - 3·2^{1/2}.

It is beginning to look as though every power of x is a rational combination of 1, x, x^2, 2^{1/2}, 2^{1/2}x and 2^{1/2}x^2. In fact, this is easy to prove by induction. Therefore, we have proved that the powers of x live in a six-dimensional space. It follows that x is a root of a non-zero polynomial of degree at most six with rational coefficients.
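If we do want to see such a polynomial explicitly, one convenient trick - a sketch only, and a different technique from the dimension count above - is to eliminate the irrationality with a resultant. In Python with sympy, writing s for 2^{1/2}:

    from sympy import symbols, resultant, Poly

    x, s = symbols('x s')
    f = x**3 - s*x + 3          # the given equation, with s standing for 2^{1/2}
    g = s**2 - 2                # the rational polynomial satisfied by s

    # eliminating s gives a rational polynomial having our x among its roots
    h = resultant(f, g, s)
    print(Poly(h, x))           # a degree-six polynomial: x**6 + 6*x**3 - 2*x**2 + 9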

Before we try to generalize the above example, let us just note why the number six arose. One notices that it is 2·3, that is, the degree of 2^{1/2} times the degree of the polynomial satisfied by x. Moreover, looking at the proof one sees that this is not a coincidence. Whenever x^3 appeared, we were able to simplify it, and whenever 2^{1/2} was squared, it became an integer. (Had it been a more complicated quadratic irrational, we could still have simplified it.) Therefore, the number of products that we need to consider is 2·3.

And now, what about the generalization? We have a number x, satisfying an equation

a_0 + a_1x + ... + a_nx^n = 0

where all of the a_i are algebraic. Since the quotient of two algebraic numbers is algebraic, we can assume that a_n = 1, which is a convenient thing to do because then x^n has a simpler expression in terms of lower powers of x, something we have already seen is useful. If we use the equation

x^n = a_0 + a_1x + ... + a_{n-1}x^{n-1}

to write every power of x in terms of 1, x, ..., x^{n-1}, then what can we say about the coefficients? They will depend on a_0, a_1, ..., a_{n-1}, and it is not hard to notice and then prove inductively that they will be polynomials in these n variables. Moreover, the power of a_j in one of these polynomials can always be kept below the degree of a_j, since any higher power can be simplified using the polynomial satisfied by a_j. If the degree of a_j is d_j, then the number of products of powers of the a_j that will occur is at most d_0d_1...d_{n-1}, which means that the coefficients we are talking about live in a vector space over the rationals of dimension at most d_0d_1...d_{n-1}. Combining this with the fact that the largest power of x that arises (after simplification) is n-1, we find that all powers of x live in the rational linear span of the nd_0d_1...d_{n-1} numbers x^r a_0^{r_0} a_1^{r_1} ... a_{n-1}^{r_{n-1}}, where r < n and r_j < d_j. Therefore, x is the root of a non-trivial polynomial of degree at most nd_0d_1...d_{n-1}, with rational coefficients.
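As a sanity check on the bound nd_0d_1...d_{n-1} - again only a sketch, and again using resultants rather than the dimension count - here is a Python/sympy computation for the hypothetical example x^3 + 2^{1/2}x + 3^{1/3} = 0, where n = 3 and the non-zero coefficients have degrees 2 and 3.

    from sympy import symbols, resultant, Poly

    x, s, c = symbols('x s c')
    f = x**3 + s*x + c              # hypothetical example: s stands for 2^{1/2}, c for 3^{1/3}

    f = resultant(f, s**2 - 2, s)   # eliminate the coefficient of degree 2
    f = resultant(f, c**3 - 3, c)   # eliminate the coefficient of degree 3
    print(Poly(f, x).degree())      # 18 = 3*2*3, within the bound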

I don't seem to have proved that the vector space generated by the rationals and the powers of an algebraic number is a field, which I must admit I was expecting to be forced to do. Nevertheless, if we think a bit about the above arguments and try to systematize them, we will be led quickly to this assertion.

One of the first things we noticed was that the degree of an algebraic number a is the same as the dimension of the vector space (over the rationals) Q(a) generated by the powers of a. This is, of course, virtually true by definition, but nevertheless it is a nice way of thinking about things, and has proved itself useful. What we then discovered was that if we take a new number b, given as a root of a polynomial with its coefficients in Q(a) (in such a way that no polynomial of smaller degree with coefficients in Q(a) will do) then the degree of b is the degree of a times the degree of the polynomial giving b. Now let us try to rewrite this assertion in terms of dimensions of vector spaces. It says that the dimension of Q(b) over Q equals the dimension of Q(a) over Q times the dimension of Q(b) over Q(a).

But what is the dimension of Q(b) over Q(a)? It means the dimension of the vector space of all linear combinations of powers of b, with coefficients in Q(a). But for this to be a vector space we need Q(a) to be a field. Is it? We have asked ourselves the question at last.

The proof that it is a field is simple but instructive. For notational reasons, let me talk about x instead of a, and suppose that x satisfies the equation

x^n = a_0 + a_1x + ... + a_{n-1}x^{n-1}

once again. (Suppose also that x is of degree n.) How does that allow us to write 1/x in terms of 1, x, ..., x^{n-1}? Well, since the polynomial is of minimal degree, a_0 is not zero, so dividing through by x gives us an expression for 1/x of exactly the kind we want.

This does not prove that Q(x) is a field, since all we have done is give an inverse for x. How would we invert x+x^2, say? Well, we were able to invert x by examining a suitable polynomial equation in x. Can we find a polynomial equation satisfied by x+x^2? Yes, by the usual trick. We simply work out powers of x+x^2 and simplify them to show that they are all rational combinations of 1, x, ..., x^{n-1}. This shows that x+x^2 satisfies a non-zero polynomial of degree at most n, which we may take to have non-zero constant term: if the constant term were zero, we could divide the equation by the non-zero real number x+x^2 and obtain a polynomial of smaller degree that x+x^2 still satisfies. We can then invert x+x^2 in exactly the way we inverted x.

This argument is clearly sufficiently general to prove the whole result - that every non-zero element of Q(x) has a multiplicative inverse in Q(x). Can we express it more neatly? What we did when inverting x was to write 1 as a linear combination of x, x^2, ..., x^n. Why did we want to do that? Because then we simply divided by x. So what we were really trying to do was to find an element z of Q(x) such that xz = 1. Could we somehow prove that this element exists rather than 'finding' it? To do so we must describe our situation more abstractly. We notice that multiplication by x is a map from Q(x) to Q(x), and in fact it is a linear map. We want to show that 1 belongs to its image. If we want to do this indirectly (that is, using abstract linear algebra) then instead of finding z that maps to 1 we can instead simply prove that this map is non-singular - that is, has zero kernel. But this is obvious, since x, thought of as a complex number, is not zero. A nicer proof that does not involve the real or complex numbers is to say that if z is a non-zero element of Q(x) and xz = 0, then x^n has been written in terms of x, x^2, ..., x^{n-1} (we may assume that the coefficient of x^{n-1} in z is non-zero, since otherwise xz = 0 would already exhibit a linear dependence among x, ..., x^{n-1}). Since x^n can also be written in terms of 1, x, x^2, ..., x^{n-1} with a non-zero coefficient of 1, we deduce that 1, x, x^2, ..., x^{n-1} are linearly dependent, contrary to our hypothesis that x was of degree n.

Thus, multiplication by x is a non-singular linear map, so there exists z such that xz = 1, as desired. This argument generalizes straightforwardly: multiplication by any non-zero element y of Q(x) is a non-singular linear map. Therefore, we can always find z in Q(x) such that yz = 1.
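To make this concrete, here is a minimal Python/sympy sketch of that last argument for the hypothetical example x = 2^{1/3} (so n = 3), inverting y = x+x^2 by writing down the matrix of 'multiplication by y' on Q(x) in the basis 1, x, x^2 and solving a linear system.

    from sympy import symbols, Matrix, Poly, rem

    x = symbols('x')
    p, n = x**3 - 2, 3       # hypothetical example: x = 2^{1/3}, which has degree 3
    y = x + x**2             # the element of Q(x) we want to invert

    def coords(expr):
        # coordinates in the basis 1, x, ..., x**(n-1), after reducing modulo p
        expr = rem(expr, p, x)
        return [Poly(expr, x).coeff_monomial(x**i) for i in range(n)]

    # columns of M are the images of the basis vectors under multiplication by y
    M = Matrix([coords(y * x**j) for j in range(n)]).T

    # the map is non-singular, so we can invert M and apply it to e_1 to get the coordinates of 1/y
    z = M.inv() * Matrix([1] + [0]*(n - 1))
    inverse = sum(z[i] * x**i for i in range(n))
    print(inverse)                   # 1/(x + x**2) as a rational combination of 1, x, x**2
    print(rem(y * inverse, p, x))    # prints 1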