Discrete Structures

# A Recursive Definition of Strings

 Last Update: 30 March 2009 Note: or material is highlighted

1. Many objects (such as sets and "strings") can be defined recursively:

Base Case:
Show a few members of the class of objects you are defining.
(I.e., you "give away" some "free samples".)

Recursive Case:

Show how to construct more (i.e., "new") members from current (i.e., "old") ones.

Compare building blocks (e.g., Legos):

• Each block is a "base case".
• Each structure that can be built out of several blocks put together—or out of previously-put-together constructions—is a recursively constructed case.

* The recursive case tells you how to make more of something if you already have some.
* The base case tells you how to get some in the first place.

2. In what follows, we will see how to recursively define "strings": sequences of symbols that can be used to model languages, mathematical codes, etc.

3. #### Definition:

An atomic symbol isdef a (single) mark on paper (or multiple marks that are considered to be a single object).
E.g.:

– is an atomic symbol
+ is an atomic symbol, even though it consists of 2 marks

4. #### Definition:

Concatenation isdef an abstract operation on symbols that:
• takes as input any two symbols (starting with atomic ones)
• and constructs as output a new, molecular symbol consisting of the two others.

1. #### Remark:

Concatenation is usually implemented (on paper) by the physical act of "juxtaposition",
i.e., writing down a symbol and then writing down a(nother) symbol immediately to the right of the first one, with no blank spaces in between.

2. #### Remark:

If the blank (" ", or "_" in order to make it visible) is an atomic symbol,
then two atomic symbols with a blank in between them is a molecular symbol consisting of 3 atomic symbols, one of which is the blank.

5. #### Definition:

The empty symbol is an atomic symbol.

• #### Remarks:

1. The empty symbol is an analogue of the empty set.

2. It looks like the symbol between these quotes: ""

3. We can't make it visible, but we can give it a name: λ
(i.e., lower-case Greek letter lambda (= lower-case L).

4. It can be thought of as a location on the paper where you will start writing.

5. It is not the blank symbol:

• There is a difference between a blank space and no space at all.

• Cf.: There is a difference between saying nothing (which is like the blank letter)
and not saying anything (which is like the empty letter).

6. #### Remark:

Let Σ be a finite set of symbols.

1. This has nothing to do with summations!

2. λ ∈ Σ

7. #### Recursive Definition of "string":

Notation: If Σ is a set of symbols, then Σ* denotes the set of strings "over" Σ

#### Base Case:

The empty symbol is a string. (It is the "empty string".)
I.e., λ ∈ Σ*.

#### Recursive Case:

If w is a string and s is a symbol, then ws is a string.

• I.e., ( (w ∈ Σ* ∧ (s ∈ Σ) ) → ( (w concatenated with s) ∈ Σ* ).

• #### Remarks:

1. If we interpret the symbols as letters, then strings are words that are spelled with those letters.

E.g., Let Σ = {a, b, c, …, z}
Then Σ* = {s | s is a string "spelled" with letters from Σ}
= {λ, a, b, …, z, aa, bb, …, zz, …, abc, …, cat, dog, chat, chien, gato, perro, … ewiucjmgvjrjhtugvhshcuej, …}

2. We could define ΣEng* = the subset of Σ* consisting of all and only correctly spelled words of English.

We might even have a set of spelling rules that tell us which these are. So, "spelled" would be in the set, but "speled" probably wouldn't be.
Would "lurf" be in the set? (It's pronounceable, but not a real word (so far).)
Would "lrxf" be in the set? (It's not even pronounceable, so it couldn't be a real word.)

3. We could define ΣFr* = the subset of Σ* consisting of all and only correctly spelled words of French
(we might need to add a few more letters, in order to handle the accents).

4. Note that the string "chat" ∈ ΣEng* ∩ ΣFr*.

If we have a meaning function (called a "semantic interpretation function")

• from the set of words (of some language)
• to the set of objects, properties, and relations in the world,

then these two words would have different meanings.
(In English, "chat" means "talk"; in French, it means "cat".)

8. #### Remark:

If we interpret the symbols as words (instead of letters)—possibly including the blank, and possibly including these symbols:

",",   ";",    ".",    "!",    "?"

—then strings are sentences consisting of those words, possibly separated by spaces and possibly with punctuation.

We could also have a "grammar" that tells us which sentences are grammatically correct (like "this sentence is grammatically correct")
and which ones aren't (like "this sentence grammatically incorrect").

9. #### Remark:

If we interpret the symbols as sentences, then strings are paragraphs consisting of those sentences.

This is how we can mathematically (and computationally!) model languages.