Chapter 2

Operations on machines and languages

Languages are sets, and there are operations one can perform on them simply in virtue of their being sets. If K and L are languages, so obviously are K∪L, K∩L and K \ L. There is one further operation that we need which is defined only because languages are sets of strings and there are operations we can perform on strings, specifically concatenation. This concatenation of strings gives us a notion of concatenation of languages. KL = {wu : w ∊ K ∧ u ∊ L}, and the reader can probably guess what K* is going to be. It's the union of K, KK, KKK . . .

Exercise: If K = {aa, ab, bc} and L = {bb, ac, ab} what are (i) KL, (ii) LK, (iii) KK, (iv) LL?

Click for answer

Exercise: Can you express |KL| in terms of |K| and |L|?

Click for answer

The following fact is fundamental.
If K and L are regular languages so are K ∩ L, K \ L, KL and K*.

So every language you can obtain from regular languages by means of any of the operations we have just seen is likewise regular.

Let's talk through this result.

K ∩ L

The thought-experiment shows very clearly that the intersection of two regular languages is regular : if I have a machine ℳ₁ that recognises L₁ and a machine ℳ₂ that recognises L₂ I obviously run them in parallel, giving each new incoming character to both machines and accept a string if they both accept it. Using the imagery of the thought-experiment, the clipboard of information that I hand on to you when I go for my coffee break has become a pair of clipboards, one for ℳ₁ and one for ℳ₂.

But although this makes it clear that the intersection of two regular lan- guages is regular, it doesn’t make clear what the machine is that recognises the intersection. We want to cook up a machine ℳ₃ such that we can see running-ℳ₁-and-ℳ₂-in-parallel as merely running ℳ₃. ℳ₃ is a sort of composite of ℳ₁ and ℳ₂. What are the states of this new composite machine?

The way to see this is to recall the idea that a state of a machine is a state-of-knowledge about the string-seen-so-far. What state-of- knowledge is encoded by a state of the composite machine? Obviously a state of the composite machine must encode your knowledge of the states of the two machines that have been composed to make the new (composite) machine. So states of the composite machine are ordered pairs of states of ℳ₁ and ℳ₂.

What is the state transition function for the new machine? Suppose ℳ₁ has a transition function δ₁ and ℳ₂ has a transition function δ₂, then the new machine has the transition function that takes the new state 〈s₁, s₂〉 and a character c and returns the new state 〈δ₁(s₁, c), δ₂(s₂, c)〉.

K∪L and K \ L

The proofs that a union or difference of two regular languages is regular is precisely analogous.

People sometimes talk of the complement of a language L. L is a language over an alphabet Σ, and its complement, relative to Σ, is Σ* \ L. It's easy to see that the complement of a regular language L is regular: if we have a machine ℳ that recognises L, then we can obtain from it a machine that recognises the complement of L just by turning all accepting states into non-accepting states and vice versa.

(A common mistake is to assume that every subset of a regular language is regular. I think there must be a temptation to assume that a subset of a language recognised by ℳ will be recognised by a version of ℳ obtained by throwing away some accepting states.)

KL

It's not obvious that the concatenation of two regular languages is regular, but it's plausible. We will explain this later. For the moment we will take it as read and press ahead. This leads us to . . .

Next: 2.1 Regular Expressions
Back: 1.4.2 A few more corollaries