How to find an unusual proof that continuous functions on the closed interval [0,1] are bounded.

Prerequisites.

The definitions of continuity (in terms of epsilons and deltas) and boundedness. The least upper bound axiom.

Sketches of the usual proofs.

One proof starts with the Heine-Borel theorem: however you write [0,1] as a union of a collection of open sets, you can also write it as the union of a finite subcollection. To deduce the statement, for each x let U_x be the set of all y such that |f(y)-f(x)| < 1. The sets U_x are open and their union is obviously all of [0,1]. By the Heine-Borel theorem there are x₁,...x_n such that the union of the sets U_{x_i} is all of [0,1]. But then f is bounded above by 1+max f(x_i).

Another proof starts with the Bolzano-Weierstrass theorem: every sequence in [0,1] has a convergent subsequence. Then, if the theorem is false, we can find an infinite sequence (x_n) such that f(x_n) > n for every n. Pick a subsequence converging to x, and check that f cannot possibly be continuous at x.

Discovering a proof if you don't know about compactness.

Both the above proofs depend, in one way or another, on the statement that the closed interval [0,1] is compact . (Roughly speaking, a set is compact if it satisfies the conclusion of the Heine-Borel theorem. For subsets of Rⁿ one can use the Bolzano-Weierstrass theorem instead.) What if one had never met the idea of compactness in any form, even implicitly?

Let us imagine ourselves faced with a function on [0,1] about which we know nothing except that it is continuous. How can we pin it down? One very small observation is that f(x) is finite for every x, since that is what we mean by a real-valued function defined on [0,1]. Can we say anything more? Well, we ought to use the definition of continuity. Let us remind ourselves of it. A function f is continuous at x if

(for every e > 0) (there exists d > 0) such that |y-x| < d => |f(y)-f(x)| < e.

Since we know that f is continuous at x, we can pick any old e - for definiteness let us go for e=1 - and find d such that f(y) < f(x)+1 for every y in the interval (x-d,x+d).

This looks like a small amount of progress, since we have shown that f is bounded in the whole of an open interval. However, we have no control on the width of the interval, so we should not get too excited.

Is there anything else we can do? Well, we might notice that if we have two overlapping intervals in each of which f is bounded, then f is bounded in their union. Is there some way of finding an interval that overlaps with (x-d,x+d) on which f is bounded? The only way we know, so far, of constructing such an interval, is to pick a point y and choose c such that |f(z)-f(y)| < 1 for every z in the open interval (y-c,y+c). How can we get such an interval to overlap with (x-d,x+d)? Well, if y itself belongs to (x-d,x+d) then we get some overlap, but there is the risk that (y-c,y+c) is entirely contained within (x-d,x+d) in which case we have learnt nothing. On the other hand, if y does not belong to [x-d,x+d] then c might be so small that there is no overlap at all. This suggests that we should try y=x+d or y=x-d. If we go for the first, then we find c > 0 such that f is bounded on the interval (x-d,x+d+c).

Now let us try to be a bit more systematic. We'll begin with x=0 and try to build up a larger and larger interval [0,t) on which f is bounded. With luck, we'll be able to get t all the way up to 1.

The first step is to find t₁ such that f(x) < 1 for every x in [0,t₁) (using the definition of continuity again). Next, we apply the definition of continuity to the point t₁ and obtain, as above, a t₂ > t₁ such that f(x) < 2 for every x in [0,t₂). Then we keep going. In this way we produce a sequence t₁ < t₂ < t₃ < .... such that f(x) < n for every x in [0,t_n).

Are we done? No, because it might be that the sequence (t_n) converges to a number s < 1. Then all we would have done is show that the restriction of f to [0,t) is bounded for every t < s. That is an interesting statement, but much weaker than what we are hoping to prove.

Are we completely stuck? Yes in the sense that we have gone on for ever and still not reached our target. But no in the sense that there may still be things to try. We have, in a certain sense `reached' s. Can we get any further?

An obvious thing to try is the technique we have used up to now, namely applying the definition of continuity, with e=1, to some carefully chosen point. Is there any point we could go for now? Yes indeed there is - the point s. This tells us that there exists d such that |f(x)-f(s)| < 1 for every x in the interval (s-d,s+d). Now this interval must contain some t_n, from which it is easy to deduce that f(s) < n+1 and hence that f(x) < n+2 for every x < s+d. Thus, we have managed to continue the sequence (t_n) `beyond infinity'.

To those who know about ordinals, the above observation immediately suggests an idea for a proof - just continue the above argument, obtaining a transfinite sequence. This idea works, as the resulting sequence, if constructed properly, is a well-ordered set. If the theorem were false, it would be possible to construct an uncountable well-ordered subset of the reals, which is easy to prove does not exist.

Somebody who does not know about ordinals will probably have thoughts of the following kind. I can go on and on constructing larger and larger intervals [0,t) on which f is bounded. Given any such interval, I can always extend it, and given a countable sequence ([0,t_n)) of such intervals, I can always find another such interval containing all of them. This seems to be enough, but I can't think how to describe what happens. I start with t₁ < t₂ < t₃ < ...., then find a new s bigger than all of them, which I could call s₁. Then I could continue with s₁ < s₂ < s₃ < .... and so on, but then I would construct an r bigger than all of those, and construct a new sequence from that. I could then do this whole process infinitely many times, but would still not necessarily have reached 1. On the other hand I could still continue.

The problem seems to be a difficulty of describing a sequence that goes beyond infinity in this way. How can we get round this? (As I said earlier, actually there is a way of describing this `transfinite' process rigorously and constructing an argument along these lines, but it is not necessary.) Let us think carefully about what we are doing. We are constructing a sequence, initially by an inductive argument (once we have constructed t_n we go ahead and construct t_n+1), and later by something that resembles an inductive argument (once we have constructed all the t_n we go ahead and construct s₁). The generalized induction is harder because we don't have a nice indexing set like the natural numbers. Is there a way of doing induction without the crutch of an indexing set? Yes there is: you look for a minimal counterexample.

Can we make sense of the idea of a minimal counterexample for our problem? What we would be asking is the following. We have a process for gradually extending the interval [0,t) on which we know that f is bounded. If we can get t all the way to 1 then we are done. If we can't, then there must be some sort of barrier - that is, a point beyond which we cannot go. Why not look for the first such point?

It is still not absolutely clear that we have an idea that makes sense. However, let us try it. Must there be a minimal t with the property that f is unbounded on [0,t)? Well, if f is unbounded, then it is unbounded on [0,1) (by applying continuity at 1), so at least there exists t such that f is unbounded on [0,t). Then, by the least upper bound axiom (though here we find a greatest lower bound) we choose the infimum s of all t with this property. If anything is going to be our minimal t, then s will.

We know that f is unbounded on [0,t) for every t > s. Does that imply that f is unbounded on [0,s)? Yes it does, because if f were bounded on [0,s) then by continuity at s (the usual argument) it would be bounded on [0,s+c) for sufficiently small c.

Having found the minimal s such that f is unbounded on [0,s), we now hope for a contradiction. By now it is a reflex to apply continuity at s. Then f is bounded on (s-c,s+c) for some suitable c. It is also bounded on [0,s-c/2). Hence, it is bounded on [0,s) and we have our contradiction.

A small variant of this argument.

One can also argue as follows. Let X be the set of all t such that f is bounded on the interval [0,t]. What can we say about X? Well, if s belongs to X and r < s then r certainly belongs to X. It follows that X is an interval of the form [0,u) or [0,u]. In the first case, apply continuity at u to show that [0,u) is not the whole of X after all. In the second case, apply continuity at u to show that [0,u] is not the whole of X unless u=1.

Finding a more conventional proof?

I find that the above proof is natural in the sense that the idea of extending the interval on which f is bounded is in a sense `the first thing to try'. However, it is not as neat (at least in its first version) as the usual proofs via the Heine-Borel theorem or Bolzano-Weierstrass theorem. It is not impossible to imagine thinking of those theorems without going through the above argument first. When we first applied the definition of continuity, we constructed, for an arbitrary x, an open interval U_x containing x, on which f was bounded. We observed that we would like it if these intervals overlapped. It perhaps doesn't take too much imagination to notice that if we can cover [0,1] with finitely many of these intervals, then we are done. As for the Bolzano-Weierstrass theorem, it arises naturally if one tries to prove the result by contradiction. What does it mean for f to be unbounded? It means that for every n there is an x with f(x) > n. Then we try to do something with that sequence. We can't get a contradiction just from an individual f(x) being large - they have to collect together somewhere.

Another way to arrive at the Heine-Borel theorem is to think about the extending-intervals argument and work out what we actually used. The main lemma that we used over and over again is that every x is contained in an open interval U_x on which f was bounded. We tried to put those intervals together, running into problems if we had infinitely many of them, which we then managed to solve by getting back down to finitely many. It is clear that what we were really trying to do was cover [0,1] with finitely many of the U_x. It is but a short step from here to thinking of the statement of the Heine-Borel theorem.