What does it mean to define a mathematical concept? As any experienced mathematician is aware, there is a small subtlety about this, but it is not always highlighted, for which reason I wish to highlight it here.

Here are a few examples of mathematical definitions.

1. A positive integer is * prime * if it has
exactly two factors.

2. A * group * is a set together with an associative
binary operation such that there is an identity element and
every element has an inverse.

3. Let X be a metric space. A subset Y of X is * open *
if for every y in Y there exists delta>0 such that x is in Y for
every x with d(x,y)<delta.

4. A function f:X--> Y is an * injection * if no element
of Y has more than one preimage. That is, f(x)=f(y) ==> x=y.

5. Let A be a set of positive integers and for each n let
d_{n}=n^{-1}|A intersect {1,2,...,n}|. That is,
d_{n} is the proportion of numbers up to n that belong
to A. If d_{n} tends to a limit d as n tends to infinity
then A is said to have * density * d. The lim sup of the
d_{n} is called the * upper density * of A.

6. For html reasons write E for the empty set. Then
* the number 4 * is the set {E,{E},{E,{E}},{E,{E},{E,{E}}}}.

7. Let x and y be mathematical objects. The * ordered pair *
(x,y) is the set {x,{x,y}}.

8. A * real number * is a partition of the rational
numbers into two sets A and B such that every element of A is
less than every element of B.

9. A * function * from A to B is a subset F of the
Cartesian product AxB with the property that for every a in A
there is exactly one b in B such that (a,b) is in F.

10. The * function f(x)=sin(x) * is the function
f:R--> R defined by f(x)=x-x^{3}/3!+x^{5}/5!-... .

Notice that these definitions are of two kinds. Definitions 1-5 are all straightforward, in the sense that a mathematical word is introduced and the definition tells us what it means. Definitions 6-10 take words we thought we understood already and redefine them in peculiar ways. If somebody says, `A function from A to B is a subset of the Cartesian product AxB with certain properties,' it is tempting to reply, `No it isn't.' Surely a function is more like a procedure for taking elements of A and unambiguously associating with them elements of B. Similarly, a real number isn't a partition of the rationals - it's more like a single object, a position on the number line. As for the number 4, it's just the positive integer that comes after 3, which comes after 2, which comes after 1, which is the first one. And isn't sin(x) something to do with trigonometry? Maybe it so happens that its value is given by a nice power series, but that is surely a theorem, rather than the definition, which is the ratio of the lengths of the opposite side and the hypotenuse of an appropriate triangle.
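To see how mechanical definition 6 is, here is a minimal sketch in Python that builds the set it calls `the number 4', using frozensets to stand in for sets. The names successor and numeral are choices made for this illustration only and do not come from the definitions above.

```python
# E stands for the empty set, following definition 6.
E = frozenset()

def successor(n):
    """Return n together with n itself as an extra element (n union {n})."""
    return n | frozenset({n})

def numeral(k):
    """Build the set that definition 6 would call 'the number k'."""
    n = E
    for _ in range(k):
        n = successor(n)
    return n

# The set {E,{E},{E,{E}},{E,{E},{E,{E}}}} written out by hand.
one = frozenset({E})
two = frozenset({E, one})
three = frozenset({E, one, two})
four_by_hand = frozenset({E, one, two, three})

four = numeral(4)
print(four == four_by_hand)  # do the two descriptions agree?
print(len(four))             # how many elements does the set have?
print(numeral(3) in four)    # is 'the number 3' an element of 'the number 4'?
```

The code confirms that the set has four elements and contains the earlier numerals, which is all that is ever used about it. It says nothing about what 4 `really is'.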

What is the point of definitions like 6-10? The answer is that
they are artificial constructions which are useful from the point of
view of providing mathematics with solid foundations. However, that
is largely where their usefulness ends, and one should not make the
mistake of thinking that they somehow reveal the `true essence'
of the concept being defined. This would be too obvious to be worth
mentioning were it not for the fashion of introducing these definitions
as though they * were * at last uncovering such an essence.
When a lecturer says, `log(x) is the integral of t^{-1}
from 1 to x,' and follows it up with a proof that log(x), thus
defined, has all the familiar properties, this should be understood
as an abbreviated way of saying something like, `For every positive
real number c there is at most one continuous function L from the
positive reals to R such that L(ab)=L(a)+L(b) for every a,b and such
that L(c)=1. The integral I(x) of t^{-1} from 1 to x is continuous, has the
first property, and increases strictly and without bound as x increases.
Hence by the intermediate value
theorem there is a unique number e such that I(e)=1.'
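The lecturer's expanded statement can be checked numerically. The following is a rough sketch, not a proof: it approximates I(x) by the midpoint rule, tests the property I(ab)=I(a)+I(b) at one pair of values, and then finds the number e with I(e)=1 by bisection. The function name solve_I_equals_one, the step count and the bracketing interval [1,4] are assumptions made for this illustration.

```python
import math

def I(x, steps=10_000):
    """Midpoint-rule approximation to the integral of 1/t from 1 to x (x > 0)."""
    h = (x - 1.0) / steps
    return sum(h / (1.0 + (k + 0.5) * h) for k in range(steps))

def solve_I_equals_one(lo=1.0, hi=4.0, iterations=40):
    """Bisection for I(e) = 1, relying on I being continuous and increasing."""
    for _ in range(iterations):
        mid = (lo + hi) / 2
        if I(mid) < 1.0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

# The 'first property': I(ab) = I(a) + I(b), up to numerical error.
print(I(2.0) + I(3.0), I(6.0))

# The unique solution of I(e) = 1, compared with the library constant.
print(solve_I_equals_one(), math.e)
```

Bisection is used here because it needs only the two facts quoted above, continuity and monotonicity, which are exactly what the intermediate value theorem argument uses.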

To return to the examples above, what would be a natural way to define the ordered pair (x,y)? It isn't all that easy to say. We would like to say that it is the object you get by putting x first and y second, or that it is the set {x,y} `except that order matters'. Unfortunately, these attempts are a little vague. Another possibility, which is perhaps a little artificial, but less so than {x,{x,y}}, is to say that (x,y) is the function from {1,2} to {x,y} defined by f(1)=x and f(2)=y. The trouble with that is that later we may want to define functions in terms of Cartesian products and Cartesian products in terms of ordered pairs.

Despite these difficulties, we have no difficulty thinking about ordered pairs, and moreover had no difficulty thinking about them long before anybody told us the `correct' definition (at least if my experience is anything to go by). Why is this? The answer is that to use ordered pairs all one needs to know about them is the following axiom:

(x,y)=(z,w) * if and only if * x=z
* and * y=w.

In a sense, this axiom is the true definition of an ordered
pair. It would be nice if one could use it to produce a definition
of the first type - something like that an * ordered pairing *
is a class of objects (x,y) where x and y can be anything, such
that the ordered-pair axiom holds. But mathematical etiquette
demands that we say what those objects are, or at least prove
that the axiom doesn't lead to inconsistencies. In the end,
in order to do this one is forced to come up with some
non-canonical * construction * that satisfies the axiom.
But others could have done just as well. If I felt like it, I
could define the ordered pair (x,y) to be {x,{x,{y}}}. (Proof:
Let {x,{x,{y}}}={z,{z,{w}}}. x doesn't equal {x,{y}} or it
would be an element of itself. Similarly, z doesn't equal {z,{w}}. It
is not possible for x to equal {z,{w}} and z to equal {x,{y}}
or x would be an element of an element of itself. So x must equal
z and {x,{y}} must equal {z,{w}}={x,{w}}. It follows that
either {y}={w} and hence y=w, or x={w}. In the second case,
{x,{w}} has only one element, which then forces x to equal
{y} as well, so again {y}={w} and hence y=w.)
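The claim that several constructions satisfy the ordered-pair axiom can be tested by brute force on a small collection of sets. The sketch below is only a finite sanity check, not a substitute for the proof. It uses Python frozensets, and the names universe, pair_def7, pair_alt, unordered and satisfies_axiom are invented for this illustration. It checks the construction {x,{x,y}} of definition 7, the alternative {x,{x,{y}}}, and, for contrast, the plain set {x,y}.

```python
from itertools import product

E = frozenset()

# A small, arbitrarily chosen universe of sets to test on.
universe = [
    E,
    frozenset({E}),
    frozenset({frozenset({E})}),
    frozenset({E, frozenset({E})}),
]

def pair_def7(x, y):
    """The construction {x,{x,y}} from definition 7."""
    return frozenset({x, frozenset({x, y})})

def pair_alt(x, y):
    """The alternative construction {x,{x,{y}}}."""
    return frozenset({x, frozenset({x, frozenset({y})})})

def unordered(x, y):
    """The plain set {x,y}, which forgets the order."""
    return frozenset({x, y})

def satisfies_axiom(pair):
    """Check (x,y)=(z,w) iff x=z and y=w for all x,y,z,w in the universe."""
    return all(
        (pair(x, y) == pair(z, w)) == (x == z and y == w)
        for x, y, z, w in product(universe, repeat=4)
    )

print(satisfies_axiom(pair_def7))  # construction from definition 7
print(satisfies_axiom(pair_alt))   # the alternative construction
print(satisfies_axiom(unordered))  # the unordered set, for contrast
```

Both constructions pass and the unordered set fails, as expected. Nothing later in the code would need to know which of the two passing constructions was chosen.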

Similar remarks can be made about all of definitions 6-10. In each case there are some properties with which we are familiar from our `naive' experience of some concept. In order to reassure ourselves that we are being rigorous, we look for a construction that has those properties and which is isomorphic to any other construction that has the properties. Usually this can be done in many ways, and no way is more correct than any other - though it may have advantages of efficiency and mathematical neatness. Sometimes there are several natural constructions. For example, sin(x) can be defined as a power series, or as the inverse function of arcsin (itself defined via an appropriate integral) or as the unique solution to a certain differential equation etc. etc. Whichever definition one chooses, one must then prove that it has the properties one wants - such as being given by a power series, getting back to itself after you differentiate four times, being periodic with period some number (defined to be 2 pi), satisfying the addition law sin(a+b)=sin(a)cos(b)+cos(a)sin(b) (I assume that one defines cos at the same time) and so on.
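As a small numerical illustration of `proving that the definition has the properties one wants', here is a sketch that takes the power series as the definition of sin (and the analogous series for cos) and then checks the addition law at one pair of values. The function names sin_series and cos_series and the number of terms are assumptions made for this example, and a numerical check at a few points is of course not a proof.

```python
def sin_series(x, terms=30):
    """Partial sum of x - x^3/3! + x^5/5! - ... with the given number of terms."""
    term, total = x, 0.0
    for k in range(terms):
        total += term
        # Ratio of consecutive terms: -x^2 / ((2k+2)(2k+3)).
        term *= -x * x / ((2 * k + 2) * (2 * k + 3))
    return total

def cos_series(x, terms=30):
    """Partial sum of 1 - x^2/2! + x^4/4! - ... with the given number of terms."""
    term, total = 1.0, 0.0
    for k in range(terms):
        total += term
        # Ratio of consecutive terms: -x^2 / ((2k+1)(2k+2)).
        term *= -x * x / ((2 * k + 1) * (2 * k + 2))
    return total

a, b = 0.7, 1.9
lhs = sin_series(a + b)
rhs = sin_series(a) * cos_series(b) + cos_series(a) * sin_series(b)
print(lhs, rhs)  # the two sides of the addition law

# A basic identity one would also want: sin^2 + cos^2 = 1.
print(sin_series(a) ** 2 + cos_series(a) ** 2)
```

Starting instead from the differential-equation or arcsin definition would change the bodies of these two functions but not the checks that follow them.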

It would be asking too much to suggest a change to normal mathematical parlance, but it might have been better if there had been different words for the two kinds of definition. However, since they are usually easily distinguishable, this is not a serious problem.

When meeting a definition of the second, artificial kind, the kind that one might call a `construction-definition', one should not accept it passively. Instead, one should pay careful attention to the basic properties possessed by whatever has just been constructed/defined, because it is these that are interesting rather than the definition itself. In a typical lecture course or textbook these properties will be found in the easyish propositions that immediately follow the definition.

I said that the two kinds of definition are `usually' easily distinguishable, but sometimes the distinction is blurred. A good example of this is the definition of homology groups. We do not start an algebraic topology course with a preconceived notion of homology, and are therefore inclined to accept whatever definition is thrown at us, however complicated it might seem. As a result, the subject seems to many people to be difficult.

It may not solve everything, but one way to navigate through these difficulties is to follow the suggestion above and focus on the properties that homology groups are supposed to have. In particular, when calculating homology groups, far more useful than the definition will usually be the cluster of theorems that follow it - things like the Mayer-Vietoris sequence - that tell you the homology groups of an object X if it is built out of other objects whose homology groups you already know.

In fact, there is a system of axioms, known as the Eilenberg-Steenrod axioms, which tell you exactly what the properties are that are needed from homology, and, just as with definitions 6-10 above, there are many equivalent ways of satisfying those axioms. What is interesting about this situation is that the discovery of these axioms was a relatively late development in topology (it took place in the 1940s) and had a major clarifying effect on the subject.

Here are three more examples of artificial definitions, which are better thought of as constructions with certain properties.

11. An * ordinal * is a transitive set that is
well-ordered by inclusion. (See
here for a discussion of ordinals.)

12. A * complex number * is an ordered pair
(a,b) of real numbers. (a,b)+(c,d)=(a+c,b+d) and
(a,b)(c,d)=(ac-bd,ad+bc).

13. The * hyperbolic plane * is the set of all
complex numbers with positive real part, with the Riemannian
metric |dz|/x, where x is the real part of z. (The main
property of interest here is the large group of symmetries.
It is usually made clear that this is a construction rather
than a definition - one would normally call it the half-plane
model of the hyperbolic plane rather than the hyperbolic plane
itself. See here for a further
discussion of this point.)
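Definition 12 is easy to try out directly. The sketch below represents a complex number as a Python tuple (a,b) with the two operations given in that definition, and checks the properties one actually cares about: that (0,1) squares to (-1,0), and that the arithmetic agrees with Python's built-in complex type on a sample. The names add, mul and i are chosen for this illustration, and integer entries are used so that the comparisons are exact.

```python
def add(p, q):
    """(a,b)+(c,d) = (a+c, b+d), as in definition 12."""
    (a, b), (c, d) = p, q
    return (a + c, b + d)

def mul(p, q):
    """(a,b)(c,d) = (ac-bd, ad+bc), as in definition 12."""
    (a, b), (c, d) = p, q
    return (a * c - b * d, a * d + b * c)

i = (0, 1)
print(mul(i, i))  # the pair playing the role of i squared

# Compare with Python's built-in complex numbers on one sample.
p, q = (2, 3), (4, -5)
builtin = complex(*p) * complex(*q)
print(mul(p, q), (builtin.real, builtin.imag))

# Distributivity, one of the field axioms, on a sample triple.
r = (-1, 7)
print(mul(p, add(q, r)) == add(mul(p, q), mul(p, r)))
```

Once a few such properties are established, the fact that the objects are ordered pairs plays no further role.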

As has probably occurred to most people who have read this far, it is possible to classify construction-definitions further. One could defend the power-series definition of sin(x) by saying that, for each x, it picks out the unique real number that it is supposed to pick out. If you do that in a different way you don't get a different object, but just a different way of describing the same object. By contrast, if you decided to define a real number as an equivalence class of Cauchy sequences of rationals, you would be giving a genuinely different definition.

This distinction, valid though it may be, does not alter the point that when we work with functions like sin(x) and log(x) it is their basic properties that we tend to use. I don't care that sin(2) is approximately 0.90929743, but I do care that sin(a+b)=sin(a)cos(b)+cos(a)sin(b). It is because the power series has these properties that we know that it picks out the values it is meant to.

A logician might explain the distinction between the two sorts of definition as follows: the first kind specifies a system of axioms whereas the second provides a model for a system of axioms that is already given.