Let X, Y and Z be subsets of R, let f be a continuous function from X to Y and let g be a continuous function from Y to Z. Then the composition gf is continuous.

1. The conclusion we are asked to show is

(i) (for all a > 0) (for all x in X) (there exists b> 0) (for all y in X) |x-y|< b implies |g(f(x))-g(f(y))|< a.

The information we are given is

(ii) (for all c > 0) (for all z in X) (there exists d> 0) (for all w in X) |z-w|< d implies |f(z)-f(w)|< c.

(iii) (for all r > 0) (for all u in Y) (there exists s> 0) (for all v in Y) |u-v|< s implies |g(u)-g(v)|< r.

2. We begin by removing "for all"s from (i). That is, we write

* Let * a> 0 * and let * x * be in
* X.

leaving ourselves with the task of proving

(i)' (there exists b> 0) (for all y in X) |x-y|< b implies |g(f(x))-g(f(y))|< a.

3. Now we search the ends of (ii) and (iii) for a potential match to the statement [|g(f(x))-g(f(y))|< a]. Since f(x) and f(y) belong to Y, it is clear that (iii) is the better choice.

4. If we want to convert [|g(u)-g(v)|< r] into [|g(f(x))-g(f(y))|< a] then we should specialize (iii) to the case r=a, u=f(x). In order to convert v into f(y), we need to change a statement of the form "(for all v in Y) P(v)" into one of the form "(for all y in X) P(f(y))" (a similar move to ones that have been made in earlier examples). Making these specializations and substitutions, we arrive at

(iii)' (there exists s> 0) (for all y in X) |f(x)-f(y)|< s implies |g(f(x))-g(f(y))|< a.

5. Let's remove "there exists s> 0" by fixing such an s. So now we have as a hypothesis

(iii)'' (for all y in X) |f(x)-f(y)|< s implies |g(f(x))-g(f(y))|< a.

6. This is looking good, but we want to convert [|f(x)-f(y)|< s]
into [|x-y|< b]. This we cannot do by matching (for example, we
obviously can't change f(x) into x by specializing or renaming), so we
must use the other technique which has appeared several times in earlier
examples: fix it so that |x-y|< b * implies * |f(x)-f(y)|< s.
To be precise, we will be done if we can show that

(iv) (there exists b> 0) (for all y in X) |y-x|< b implies |f(y)-f(x)|< s.

7. But (iv) obviously comes from (ii) by matching: specialize (ii) by setting z=x and c=s, and renaming w as y and d as b.

So far I have been imaginining the most basic sort of algorithm - one that tries to match chunks of text and do a few elementary logical operations. But one can imagine making such an algorithm more sophisticated in various simple ways. For example, given that there are several characterizations of continuity, perhaps one could search through them to see if they could be used to shorten the proof.

Suppose, for example, that we try to prove that a composition of continuous functions is continuous, using the result, already discovered and proved, that a function is continuous if and only if the inverse image of every open set is open. The problem becomes almost trivial, but let us see what happens with matching techniques.

1. The conclusion we are asked to show is

(i) gf is continuous

The information we are given is

(ii) f is continuous

and

(iii) g is continuous

2. We are now given as a hint that instead of expanding
out a sentence of the form "h is continuous" in terms of
epsilons and deltas, we should replace it by "h^{-1}(U)
is open for every open set U". If we do this to our desired
conclusion and hypotheses, we obtain

(i)' (gf)^{-1}(U) is open for every
open set U

(ii)' f^{-1}(V) is open for every
open set V

and

(iii)' g^{-1}(W) is open for every
open set W

3. Let us remove "for every" from (i) by writing

* Let * U * be an open set. *

Now our task is to prove

(i)'' (gf)^{-1}(U) is open

4. We have two statements telling us that certain sets are
open. Since (gf)^{-1}(U) is a subset of the domain of
gf, which is also the domain of f, it is more appropriate to
try to use (ii)'. In other words, we would like to specialize
(ii)' by choosing an appropriate set V and obtaining (i)''
as a result. For this we would need

(iv) f^{-1}(V)=(gf)^{-1}(U)

5. Expanding out the definition of inverse image, this requirement becomes

(iv)' {x:f(x) is in V}={x:g(f(x)) is in U}

Using the definition of equality of sets, this turns into

(iv)'' (for all x) f(x) is in V if and only if g(f(x)) is in U.

6. A slightly more sophisticated point: because we know next to nothing about the image of f, it is likely that the only reasonable way to choose such a V will be so that it satisfies the stronger property

(iv)'' (for all y) y is in V if and only if g(y) is in U.

7. Now we can work backwards through the definitions again, obtaining first

(iv)''' {y:y is in V}={y:g(y) is in U}

and then

(iv)'''' V=g^{-1}(U)

8. We chose V so that f^{-1}(V) would equal
(gf)^{-1}(U). We were hoping to specialize (ii)'
to this V, thereby proving (i)'. This is allowed, as
long as V is open. So we are left wishing to prove

(v) g^{-1}(U) is open

9. It is clear that matching works - (iii)' is the appropriate statement to use, specialized to U, which is open by hypothesis.

The above discovery would have been considerably simplified if it had been obvious to the imagined algorithm that

(gf)^{-1}(U)=f^{-1}(g^{-1}(U))

Then the choice of V would have been extremely natural, and the entire proof could have taken place at a higher level. It is not hard to imagine equipping an algorithm with the knowledge of basic facts of this kind.