Why is multiplication commutative?

The main point of this page is that the above question can be interpreted in many different ways and that different interpretations lead to different answers. A secondary goal is to discuss whether the fact that we can `just see' that multiplication is commutative means that we have some ability that computers could never have (as, for example, Penrose has suggested).

mn=nm for any pair of positive integers m and n.

Why does this statement seem obvious? After all, if you decide to multiply together two numbers such as 395 and 428 using long multiplication, then the calculations you do will depend very much on whether you work out 395 428s or 428 395s. In the first case you will find yourself adding 128400, 38520 and 2140 while in the second you will add 158000, 7900 and 3160. Why isn't it a miracle that both triples of numbers add to 169060?

Most people would probably say that 395x428 and 428x395 both count the number of points in a 395-by-428 rectangular grid. If you rotate such a grid through 90 degrees then you turn it into a 428-by-395 rectangular grid but you clearly leave the number of points unchanged. (They might not express the argument in exactly this form, but this would be the rough line of reasoning.)

This argument, compelling as it is, doesn't quite qualify as a mathematical proof. Can we turn it into one? Here is an attempt.

Attempted Proof

If A is a set of cardinality m and B is a set of cardinality n, then the Cartesian product AxB has cardinality mn. But the map (a,b)-->(b,a) is easily seen to be a bijection between AxB and BxA, from which it follows that BxA has cardinality mn. But we already know that it has cardinality nm, so mn=nm.

Was that a proof?

It was certainly a valid argument, as long as one was prepared to accept the initial assertion about the sizes of Cartesian products. But what is the status of that assertion? Is it a theorem of set theory? Is it the definition of multiplication? In fact, what is the definition of multiplication?

I have discussed definitions of familiar concepts elsewhere . Even if you have not read that, you are probably aware that there is often a considerable arbitrariness in what one regards as a `basic' definition and what one thinks of as an easy theorem. So instead of asking about "the definition" of multiplication, it may be better to ask, "Could one, in principle, define mn to be the cardinality of AxB when A has cardinality m and B has cardinality n?"

If we want to adopt this as a definition, then we must prove that it is well-defined - that is, that the cardinality of AxB depends only on the cardinalities of A and B and not on the sets themselves. This is easily done. If C has the same cardinality as A and D has the same cardinality as B, then there are (by the definition of `has the same cardinality as') bijections f:A-->C and g:B-->D. It is then easy to check that (a,b)-->(f(a),g(b)) is a bijection between AxB and CxD, which shows that AxB and CxD have the same cardinality.

We now have a serviceable definition of multiplication, at least for positive integers, and a 99% rigorous proof that multiplication, thus defined, is commutative. The extra 1% consists in observing that, for any positive integer m, there is some set A of cardinality m: the set {1,2,...,m} will do, for example.

Is there any possible objection to the above definition and argument? As I have said, it is completely rigorous, but one might choose to object on the aesthetic grounds that multiplication is a very basic arithmetical operation and it should not be necessary to use set theory, however simple, to define it. What about our intuition that multiplication is "repeated addition", an idea that can be understood by a child who has never heard of sets? (One indication that children do not initially think of multiplication as the size of something like a Cartesian product is that both my sons went through a phase of understanding it reasonably well without seeing that it was obviously commutative. Now they do. I don't yet know about my daughter.)

A rough definition of mxn that does justice to the repeated-addition idea is that mn=n+n+....+n (m times). The roughness comes in the not entirely standard notation, but there is a standard way of making such definitions precise, which is to use induction. A precise definition of multiplication as repeated addition is that 1n=n and mn=(m-1)n+n when m> 1. Hence, for example, 5n=4n+n=3n+n+n=2n+n+n+n=n+n+n+n+n, a series of equalities that corresponds very closely to our intuitive idea, especially when we look at them in the reverse order.

Notice that with this definition it is no longer quite so obvious that mn=nm. Here is a proof.

A second proof.

First I shall show, by induction, that 1n=n1 for all positive integers n. This is certainly true when n=1. For larger n we know, using the inductive hypothesis, that n1=(n-1)1+1=1(n-1)+1=n-1+1=n.

Now I shall show, this time by induction on m+n, that mn=nm for all positive integers m and n. We have proved the result when either m or n is 1. For m,n> 1 we have

nm=(n-1)m+m=m(n-1)+m=(m-1)(n-1)+(n-1)+m= (m-1)(n-1)+(m-1)+n=(n-1)(m-1)+(m-1)+n=n(m-1)+n=(m-1)n+n=mn.

In what sense was that a proof?

In the course of the above argument, I assumed various "obvious" facts, such as the commutativity of addition and the principle of induction. Given that the commutativity of multiplication is itself very basic, why was it legitimate to make these assumptions?

One possible answer to this question is that addition is more basic than multiplication since multiplication is defined using addition rather than the other way round. Another is that the commutativity of addition can itself be proved, again in terms of more basic concepts. (If you want to see this done, look here. See below for what is meant by s(m) on that page.) Of course, one must eventually stop asking "Yes, but why is that true?" and take certain statements as axioms. The most standard system of axioms for the natural numbers is Peano's:

The last axiom is a form of the principle of induction, and it allows one to define addition as follows: if y=1 then x+y=s(x). If y> 1 then y=s(z) for some z (this is easy to prove by induction) and x+y=s(x+z). One can prove inductively that addition, thus defined, is commutative, and this proof naturally appears well before a proof that multiplication is commutative.

A third proof.

Somehow, although the inductive definition of multiplication was more basic than the Cartesian-products definition, the inductive proof of commutativity was much less natural than the proof using Cartesian products. Here is a way to get the best of both worlds.

I would like to prove, by induction, that if A and B are sets of cardinality m and n respectively, then AxB has cardinality mn. Here, I am defining mn in the inductive way. Since I have already remarked that the cardinality of AxB depends only on the cardinalities of A and B, I am free to let A={1,2,...,m} and B={1,2,...,n}. If m=1 then the result is obvious, since the map (1,x)-->x is a bijection between AxB and B, B has cardinality n, and 1n is defined to be n. If m> 1, then let C={1,2,...,m-1}. By induction, CxB has cardinality (m-1)n. We also know that CxB is a subset of AxB, and that the set-theoretic difference AxB-CxB consists of all ordered pairs (m,x) such that x is in B. The map (m,x)-->x is therefore a bijection between AxB-CxB and B, from which it follows that AxB-CxB has cardinality n. Therefore, AxB has cardinality (m-1)n+n=mn. It follows that the inductive definition and the Cartesian-products definition are equivalent, and hence that multiplication (defined inductively) is commutative.

One way to think of the above argument.

The Cartesian-product definition of mn does justice to our idea that mn counts the number of points in an m-by-n rectangular grid. The picture suggested by the inductive definition of mn is of a long line consisting of n points, then another n, then another n, and so on. The above proof makes rigorous the idea that we could take the points n at a time from the long line and rearrange them into an m-by-n rectangle, the first n points forming the bottom row, the next ones the second row etc.

mn=nm for any pair of integers m and n.

In order to justify this statement we must say what we mean by zero and the negative integers, and explain how they are multiplied together. Here is one way of doing it. First, I shall regard 0 and -n as mere symbols. I then regard the integers as the set of all symbols of the form 0, n or -n, where n is a positive integer. These I add according to the following rules.

For the last two definitions to make sense I need a lemma (concerning the positive integers) that tells me that if m> n then there is a unique positive integer t such that n+t=m. This integer is defined to be m-n. This can easily be proved by induction.

As for multiplication, I define 00=0, 0n=n0=0, 0(-n)=(-n)0=0, mn to be what it usually is, m(-n)=(-n)m=-mn and (-m)(-n)=mn.

There is now a rather ugly proof that multiplication is commutative: if x and y are integers, then they are both of the form 0, n or -n for some positive integer n. This gives nine possibilities and for each one it follows directly from the definition and the commutativity of multiplication for positive integers that xy=yx.

What I have just done is to `construct' the negative numbers. This can be done more neatly - the usual approach is to define an integer to be an equivalence class of ordered pairs (m,n) of positive integers, with (m,n) equivalent to (k,l) if and only if m+l=k+n. Addition and multiplication can be defined by thinking of (m,n) as m-n and seeing what you ought to get, so for example (m,n)(k,l) is defined to be (the equivalence class of ) (mk+nl,ml+nk), and this can be shown not to change if (m,n) and (k,l) are replaced by equivalent pairs.

Do these constructions "prove" that multiplication of integers is commutative? In a way they do, but there is something slightly back to front about the proof. What is it that makes us think, with the above constructions, that what we have constructed is basically the same as what we ordinarily think of as the integers? What is natural about the constructions? Surely we do not have total freedom to construct them, or it would be a simple matter to define multiplication in such a way as to make it commutative.

In order to think about this question, let us focus on a more specific one: why do we believe that (-2)x(-3)=6? Surely it is not because we arbitrarily constructed multiplication of negative integers in order to make it so. It feels like a statement that one can justify.

Here are three different, but related, justifications of the statement.

  1. Suppose Alice owes Bob and Charles three pounds each. Then Alice has a debt of 6 pounds, which can be thought of as a credit of -6 pounds. Now suppose that Bob `takes away' his portion of the debt by writing it off, and that Charles then does the same. Then Alice has had -3 pounds of credit removed twice. In other words, her credit has had -3 subtracted from it twice, or, to put it another way, has had -3 added to it -2 times. In other words, her credit has increased by (-2)x(-3) pounds. But we already know that her credit has increased by 6 pounds (since she was 6 pounds in debt and now isn't) so (-2)x(-3) must equal 6.
  2. 1x(-3)=-3 since one times anything is that thing. 2x(-3)=(-3)+(-3)=-6 since two times anything is that thing plus itself. By this sort of reasoning we can form the following pattern: 4x(-3)=-12, 3x(-3)=-9, 2x(-3)=-6, 1x(-3)=-3. Obviously the pattern should continue with 0x(-3). What will this equal? Well, the right hand sides have been increasing by 3 every time, so it ought to be (-3)+3=0. After that will come (-1)x(-3) and that should then be 0+3=3. Next is (-2)x(-3) and that will be 3+3=6.
  3. 0=2x0=2x(3+(-3))=2x3+2x(-3)=6+2x(-3). Hence, 2x(-3)=-6. Now, 0=(2+(-2))x(-3)=2x(-3)+(-2)x(-3)=-6+(-2)x(-3). Hence, (-2)x(-3)=6.

Notice that the third of these proofs depends on the distributivity of multiplication over addition. How do we know that that holds for negative integers? It won't do to say that they can be constructed, and addition and multiplication defined, in such a way that the distributive law can be checked to hold, because we were trying to examine the "direct intuition" that this construction is meant to capture. The second argument suffers from a similar defect: how do we know that the pattern continues? Obviously it will be very tidy if it does, but how do we know that we have the tidiness? It follows from the distributive law, or even from the special case (x+1)y=xy+y and induction (assuming we know how to add), but this gets us back to our earlier difficulty.

Is the first argument any better? Let us try to write it out more mathematically. Then the middle of the argument claims that 0=(-6)-(-3)-(-3)=(-6)-2x(-3)=(-6)+(-2)x(-3). Adding 6 to both sides then gives that 6=(-2)x(-3), but the steps of the calculation are still not really based on a prior intuition about negative numbers, but more on familiar rules for manipulating them.

These difficulties arise from not thinking about negative numbers in the correct way. Multiplication of negative numbers is not something we understand directly - we do not ask how many apples there are if -2 people have -3 apples each - but is rather a definition that is forced on us if we want it to obey certain rules. Instead of being grateful that multiplication of negative numbers happens to be commutative, we are grateful that there happens to be a unique binary operation on the negative numbers that is commutative, associative, distributive over addition, with 1 as an identity element. Even this way of explaining it assumes that the negative numbers themselves are just there, waiting to be manipulated. Better still is to say this: we are grateful that the natural numbers, with their familiar operations, can be embedded into a commutative ring. Since this can be done in many ways, we can go further and say that one such ring, which we call Z, is the smallest, in the sense that all it contains is N (or strictly speaking the embedded copy of N), 0 and additive inverses of every n in N and that these can all be shown to be necessarily distinct.

A more algebraic way to say something similar is that if R is a commutative ring and f:N --> R is an embedding (that is, an injection that preserves addition and multiplication) then there are unique embeddings g:N --> Z and h:Z --> R such that hg=f. (Exercise: this "universal property" of Z makes it unique up to isomorphism.)

To be continued ...

This page has been sitting around for some time, waiting for me to go on to discuss larger number systems. However, it is so closely relevant to what I am lecturing that I have decided to put it up in unfinished form.