ECE-374, Lecture 2
Regular Languages
Slides by Pavel Loskot
Fall 2025
ZJUI
2025-09-15-17-57
Lecture brain teaser
Consider an n-input AND function. The input (x) is a string of
n digits from the alphabet Σ = {0, 1}, and the output (y) is the
logical AND of all the elements of x.
Formulate a language that describes the AND function (problem).
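\[
L_{\mathrm{AND}_N} = \left\{
\begin{array}{l}
0|0,\; 1|1, \\
0 \cdot 0|0,\; 0 \cdot 1|0,\; 1 \cdot 0|0,\; 1 \cdot 1|1, \\
\quad\vdots \\
(0\cdot)^{n}|0,\; (0\cdot)^{n-1}1|0,\; \ldots,\; (1\cdot)^{n}|1, \\
\quad\vdots
\end{array}
\right\}
\]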
This is an example of a regular language, the topic of today's lecture.
If n is fixed, then the language is a single row of the matrix above.
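As a rough sketch of the language above, the following Python snippet enumerates the row for a fixed n. The function name `and_language` is hypothetical, and treating `·` and `|` as literal separator characters placed as in the first two rows of the matrix is an assumption made for illustration.

```python
from itertools import product

def and_language(n: int) -> set[str]:
    """Return all strings 'x1·x2·...·xn|y' where y is the AND of the n input bits."""
    words = set()
    for bits in product("01", repeat=n):
        # The output is 1 only when every input bit is 1.
        y = "1" if all(b == "1" for b in bits) else "0"
        words.add("·".join(bits) + "|" + y)
    return words

print(sorted(and_language(2)))
# → ['0·0|0', '0·1|0', '1·0|0', '1·1|1']
```

For a fixed n the set has exactly 2^n strings, so it is a finite language.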
Chomsky Hierarchy
Regular Languages
Theorem (Kleene’s Theorem)
A language is regular if and only if it can be obtained from finite
languages by applying the three basic operations:
• Union
• Concatenation
• Repetition
a finite number of times.
• Stephen Kleene: famous 20th-century mathematician (pronounced Klay-nee)
• student of Alonzo Church (the same advisor as Alan Turing)
• a founder of computability theory (which Turing took and really ran
with)
Regular Languages
A class of simple but useful languages.
The set of regular languages over some alphabet Σ is defined
inductively.
Base Case
• ∅ is a regular language
• {ϵ} is a regular language
• {a} is a regular language for each a ∈ Σ, interpreting a as a
string of length 1
Regular Languages
Inductive step:
We can build up languages using a few basic operations:
• If L1, L2 are regular, then L1 ∪ L2 is regular.
• If L1, L2 are regular, then L1L2 is regular.
• If L is regular, then L∗ = ⋃_{n≥0} L^n is regular.
The [·]∗ operator is called the Kleene star.
• If L is regular, then so is its complement L̄ = Σ∗ \ L.
Regular languages are closed under operations of union,
concatenation, Kleene star, and language complement.
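The operations above can be sketched on finite sets of Python strings. This is a minimal illustration only: the function names are hypothetical, and since L∗ and complements are generally infinite, the sketch truncates them at an assumed length bound `max_len`.

```python
from itertools import product

def union(l1, l2):
    return l1 | l2

def concat(l1, l2):
    # L1 L2 = { uv : u in L1, v in L2 }
    return {u + v for u in l1 for v in l2}

def power(lang, n):
    # L^0 = {ε}, L^n = L^(n-1) L
    result = {""}
    for _ in range(n):
        result = concat(result, lang)
    return result

def star_up_to(lang, max_len):
    """Strings of L* of length at most max_len."""
    result, frontier = {""}, {""}
    while frontier:
        # Extend the newest strings by one more factor from L, keep only new short ones.
        frontier = {w for w in concat(frontier, lang) if len(w) <= max_len} - result
        result |= frontier
    return result

def complement_up_to(lang, alphabet, max_len):
    """Strings over the alphabet of length at most max_len that are not in lang."""
    universe = {"".join(p) for n in range(max_len + 1)
                for p in product(alphabet, repeat=n)}
    return universe - lang

L0, L1 = {"0"}, {"1"}
print(sorted(star_up_to(union(L0, L1), 2), key=lambda w: (len(w), w)))
```

The bound is a design choice that keeps every set finite; it does not change the definitions, only how much of each language is listed.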
Some simple regular languages
Lemma
If w is a string, then L = {w} is regular.
Example: {aba} or {abbabbab}. Why?
Lemma
Every finite language L is regular.
Examples: L = {a, abaab, aba}. L = {w | |w| ≤ 100}. Why?
Because any finite string can be constructed using a finite number of
operations, e.g., La = {a}, Lb = {b}, {aba} = La · Lb · La, and a finite
language is a finite union of such single-string languages.
Note the distinction: finite string vs. finite language.
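The argument behind both lemmas can be sketched in Python: a single string is a finite concatenation of one-symbol languages, and a finite language is a finite union of such singletons. The helper names `singleton` and `finite_language` are hypothetical.

```python
from functools import reduce

def concat(l1, l2):
    return {u + v for u in l1 for v in l2}

def singleton(w):
    """Build {w} by concatenating one single-symbol language per character of w."""
    # Start from {ε}; each step appends the language {c}.
    return reduce(concat, ({c} for c in w), {""})

def finite_language(words):
    """Build a finite language as a finite union of singleton languages."""
    result = set()
    for w in words:
        result |= singleton(w)
    return result

print(singleton("aba"))
print(finite_language(["a", "abaab", "aba"]))
```

Each call performs finitely many unions and concatenations, which is exactly what the inductive definition permits.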
Regular Languages
Define basic operations to build regular languages.
Important: Any language generated by a finite sequence of such
operations is regular.
Lemma
Let L1, L2, . . . be regular languages over alphabet Σ. However,
the language ⋃_{i=1}^{∞} L_i is not necessarily regular.
• Kleene star (repetition) is a single operation!
• If infinite unions were allowed, any language with infinitely many
strings could be split into infinitely many languages, each with
one string (hence regular), and recombined by an infinite union.
Every language would then be regular, which is absurd!
• An infinite union of different strings is different from Kleene star,
which is a union of powers of one repeated language!
Regular Languages - Example
Example
The language L_{01} = {0^i 1^j | i, j ≥ 0} is regular:
• L_0 = {0} and L_1 = {1}
• L_{0−} = (L_0)∗ and L_{1−} = (L_1)∗
• L_{01} = L_{0−} · L_{1−}
Rapid-fire questions - regular languages
1. L_1 = {0^i | i = 0, 1, 2, . . .}. Language L_1 is regular. T/F?
True. L_0 = {0}, L_1 = (L_0)∗
2. L_2 = {0^{17i} | i = 0, 1, 2, . . .}. Language L_2 is regular. T/F?
True. L_0 = {0}, L_2 = ((L_0)^{17})∗
3. L_3 = {0^i | i is divisible by 2, 3, or 5}. L_3 is regular. T/F?
True. L_0 = {0}, L_{i/2} = ((L_0)^2)∗, L_{i/3} = ((L_0)^3)∗, L_{i/5} = ((L_0)^5)∗,
L_3 = L_{i/2} ∪ L_{i/3} ∪ L_{i/5}
4. L_4 = {w ∈ {0, 1}∗ | w has at most two 1s}. L_4 is regular. T/F?
True. L_0 = {0}, L_1 = {1}, L_{00} = (L_0)∗, L_{10} = (L_0)∗ L_1 (L_0)∗,
L_{20} = (L_0)∗ L_1 (L_0)∗ L_1 (L_0)∗, L_4 = L_{00} ∪ L_{10} ∪ L_{20}
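The four constructions above can be spot-checked with Python's `re` module. This is only a sketch: the dictionary `PATTERNS` and the helper `in_language` are hypothetical names, and the patterns are one possible translation into Python syntax, where `|` is union and `{k}` means exactly k repetitions.

```python
import re

PATTERNS = {
    "L1": r"0*",                      # (L_0)*
    "L2": r"(0{17})*",                # ((L_0)^17)*
    "L3": r"(00)*|(000)*|(00000)*",   # union of the three starred powers
    "L4": r"0*|0*10*|0*10*10*",       # zero, one, or two 1s
}

def in_language(name, w):
    # fullmatch requires the whole string to match the pattern.
    return re.fullmatch(PATTERNS[name], w) is not None

print(in_language("L2", "0" * 34))   # 34 = 17 * 2
print(in_language("L3", "0" * 7))    # 7 is not divisible by 2, 3, or 5
print(in_language("L4", "01001"))    # exactly two 1s
```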
Regular Expressions
Regular expressions (regexes) are one way to describe regular
languages.
• simple patterns to describe related strings
• useful in
  • text search (editors, Unix/grep, emacs)
  • compilers: lexical analysis
• compact way to represent interesting/useful languages
• date back to the 1950s: Stephen Kleene,
who has a star operator named after him
Kleene, Stephen C. (1956). “Representation of Events in Nerve Nets
and Finite Automata”. In Shannon, C. E.; McCarthy, J. (eds.),
Automata Studies, Princeton Univ. Press, pp. 3–42.
Inductive Definition
A regular expression r over an alphabet Σ is one of the following:
Base cases:
• ∅ denotes the language ∅
• ϵ denotes the language {ϵ}
• a denotes the language {a}, for each a ∈ Σ
Inductive cases: If r1 and r2 are regular expressions denoting
languages R1 and R2, respectively, then
• (r1 + r2) denotes the language R1 ∪ R2
• (r1 · r2) = r1 · r2 = (r1 r2) denotes the language R1 R2
• (r1)∗ denotes the language R1∗
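The inductive definition maps naturally onto a small syntax tree with one node type per case. The sketch below is an illustration under assumptions: the class names and the function `language` are hypothetical, and because L(r) may be infinite, only strings of length at most `max_len` are listed.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Empty:            # ∅
    pass

@dataclass(frozen=True)
class Eps:              # ϵ
    pass

@dataclass(frozen=True)
class Sym:              # a single symbol a
    a: str

@dataclass(frozen=True)
class Union:            # (r1 + r2)
    r1: object
    r2: object

@dataclass(frozen=True)
class Concat:           # (r1 r2)
    r1: object
    r2: object

@dataclass(frozen=True)
class Star:             # (r)*
    r: object

def language(r, max_len):
    """Return the strings of L(r) with length at most max_len."""
    if isinstance(r, Empty):
        return set()
    if isinstance(r, Eps):
        return {""}
    if isinstance(r, Sym):
        return {r.a} if len(r.a) <= max_len else set()
    if isinstance(r, Union):
        return language(r.r1, max_len) | language(r.r2, max_len)
    if isinstance(r, Concat):
        left, right = language(r.r1, max_len), language(r.r2, max_len)
        return {u + v for u in left for v in right if len(u + v) <= max_len}
    if isinstance(r, Star):
        base = language(r.r, max_len)
        result, frontier = {""}, {""}
        while frontier:
            # Append one more factor to the newest strings; stop when nothing new fits.
            frontier = {u + v for u in frontier for v in base
                        if len(u + v) <= max_len} - result
            result |= frontier
        return result
    raise TypeError(f"not a regular expression node: {r!r}")

# 0*1* written as a tree
r = Concat(Star(Sym("0")), Star(Sym("1")))
print(sorted(language(r, 2), key=lambda w: (len(w), w)))
```

Each branch of `language` mirrors one line of the definition, so the recursion terminates for the same reason the definition is well-founded.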
Regular Languages vs Regular Expressions
Regular Language      Regular Expression
∅                     ∅
{ϵ}                   ϵ
{a}, for a ∈ Σ        a
R1 ∪ R2               r1 + r2
R1 R2                 r1 · r2
R∗                    r∗
Regular expressions denote regular languages — they explicitly
show the operations that were used to form the language.
Notation and Parentheses
• For a regular expression r, L(r) is the language defined by r.
Multiple regular expressions can define the same language!
Example: (0 + 1) and (1 + 0) denote the same language {0, 1}
• Regular expressions r1 and r2 are equivalent if L(r1) = L(r2)
• Parentheses can be omitted by adopting the precedence order:
∗, concatenation, +
Example: r∗s + t = ((r∗)s) + t
• Parentheses can also be omitted by associativity of operations:
Examples:
rst = (rs)t = r(st), r + s + t = r + (s + t) = (r + s) + t
• Superscript +. For convenience, define r^+ = rr∗. Hence,
if L(r) = R, then L(r^+) = R^+
• Notations: r + s, r ∪ s, r | s all denote union, and
rs (concatenation) is sometimes written as r · s
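Equivalence and precedence claims like these can be spot-checked by brute force. The sketch below uses Python's `re` syntax, in which union is written `|` rather than +; the helper names and the default bound `max_len=8` are assumptions. Agreement on all strings up to a bounded length is evidence of equivalence, not a proof.

```python
import re
from itertools import product

def strings_up_to(alphabet, max_len):
    """Yield every string over the alphabet of length 0..max_len."""
    for n in range(max_len + 1):
        for p in product(alphabet, repeat=n):
            yield "".join(p)

def agree_up_to(p1, p2, alphabet="01", max_len=8):
    """True if both patterns accept exactly the same strings up to max_len."""
    return all(
        (re.fullmatch(p1, w) is None) == (re.fullmatch(p2, w) is None)
        for w in strings_up_to(alphabet, max_len)
    )

print(agree_up_to(r"0|1", r"1|0"))            # union is commutative
print(agree_up_to(r"(01)+", r"01(01)*"))      # r^+ = r r*
print(agree_up_to(r"0*1|0", r"((0*)1)|0"))    # precedence: *, concatenation, union
print(agree_up_to(r"0*1|0", r"0*(1|0)"))      # a different grouping changes the language
```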
Examples of regular expressions
Interpreting regular expressions
1. (0 + 1)∗ = {ϵ, 0, 1, 00, 11, 01, 10, . . .}: all binary strings
for long strings, the starred subexpression is ‘restarted’ for each symbol
2. (0 + 1)∗ 001(0 + 1)∗ :
all strings with "001" as a substring
3. 0∗ + (0∗ 10∗ 10∗ 10∗ )∗ :
all strings in which the number of 1s is divisible by 3
4. (ϵ + 1)(01)∗ (ϵ + 0):
alternating 0s and 1s; alternatively, no two consecutive 0s and
no two consecutive 1s
Remember:
0∗ means zero or more repetitions of 0
0+ means one or more repetitions of 0
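The four readings above can be compared against direct Python predicates on all short binary strings. This is a sketch under assumptions: the names `CHECKS` and `mismatches` are hypothetical, the patterns are translated into Python `re` syntax (`|` for +, an empty alternative for ϵ), and the check covers only strings up to an assumed length bound.

```python
import re
from itertools import product

CHECKS = [
    (r"(0|1)*",            lambda w: True),                             # all binary strings
    (r"(0|1)*001(0|1)*",   lambda w: "001" in w),                       # contains 001
    (r"0*|(0*10*10*10*)*", lambda w: w.count("1") % 3 == 0),            # number of 1s divisible by 3
    (r"(|1)(01)*(|0)",     lambda w: "00" not in w and "11" not in w),  # alternating
]

def mismatches(pattern, predicate, max_len=8):
    """Return the strings (up to max_len) where pattern and predicate disagree."""
    bad = []
    for n in range(max_len + 1):
        for p in product("01", repeat=n):
            w = "".join(p)
            if (re.fullmatch(pattern, w) is not None) != predicate(w):
                bad.append(w)
    return bad

for pattern, predicate in CHECKS:
    print(pattern, mismatches(pattern, predicate))
```

An empty list for a pattern means no disagreement was found within the bound.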
Creating regular expressions
1. All strings that end in 1011?
→ (0 + 1)∗ 1011
2. All strings except 11?
→ ε + 0 + 1 + 00 + 01 + 10 + (0 + 1)(0 + 1)(0 + 1)+
3. All strings that do not contain 000 as a subsequence?
(equivalently, strings with fewer than three 0s in total, not necessarily consecutive)
→ 1∗ (ε + 0)1∗ (ε + 0)1∗
4. All strings that do not contain the substring 10?
→ all zeros must come before all 1s i.e., 0∗ 1∗
Regular expressions for ‘does not contain’ conditions are usually more difficult to find.
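The four answers above can be tested the same way, by comparing each expression with a direct predicate on all short binary strings. The helper names are hypothetical, the patterns are Python `re` translations of the lecture notation, and the check is bounded by an assumed maximum length.

```python
import re
from itertools import product

def is_subsequence(s, w):
    """True if s can be obtained from w by deleting characters."""
    it = iter(w)
    # Each `c in it` consumes the iterator up to the first match of c.
    return all(c in it for c in s)

CHECKS = [
    (r"(0|1)*1011",                       lambda w: w.endswith("1011")),
    (r"(|0|1|00|01|10|(0|1)(0|1)(0|1)+)", lambda w: w != "11"),
    (r"1*(|0)1*(|0)1*",                   lambda w: not is_subsequence("000", w)),
    (r"0*1*",                             lambda w: "10" not in w),
]

def mismatches(pattern, predicate, max_len=8):
    """Return the strings (up to max_len) where pattern and predicate disagree."""
    bad = []
    for n in range(max_len + 1):
        for p in product("01", repeat=n):
            w = "".join(p)
            if (re.fullmatch(pattern, w) is not None) != predicate(w):
                bad.append(w)
    return bad

for pattern, predicate in CHECKS:
    print(pattern, mismatches(pattern, predicate))
```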
Tying everything together
Consider an n-input AND function. The input (x) is a string of
n digits from the input alphabet Σ_i = {0, 1}, and the output (y) is the
logical AND of all the elements of x.
The language describing this function (problem) is:
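\[
L_{\mathrm{AND}_N} = \left\{
\begin{array}{l}
0|0,\; 1|1, \\
0 \cdot 0|0,\; 0 \cdot 1|0,\; 1 \cdot 0|0,\; 1 \cdot 1|1, \\
\quad\vdots \\
(0\cdot)^{n}|0,\; (0\cdot)^{n-1}1|0,\; \ldots,\; (1\cdot)^{n}|1, \\
\quad\vdots
\end{array}
\right\}
\]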
Formulate the regular expression, which describes this language.
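Σ = {0, 1, ‘·’, ‘|’}
\[
r_{\mathrm{AND}_N} =
\underbrace{(\text{‘0·’} + \text{‘1·’})^{*}\; 0\; (\text{‘0·’} + \text{‘1·’})^{*}\; \text{‘|0’}}_{\text{all output 0 instances}}
\;+\;
\overbrace{(\text{‘1·’})^{*}\; \text{‘|1’}}^{\text{all output 1 instances}}
\]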
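A Python `re` pattern in the same spirit can be tried against generated instances. The sketch below is a variant, not the expression above verbatim: it assumes the separator ‘·’ appears only between digits, as in the first two rows of the language (e.g. `0·1|0`), so the separator placement is adjusted accordingly. The names `AND_PATTERN`, `is_and_instance`, and `instance` are hypothetical.

```python
import re
from itertools import product

# Output 0: some digit is 0, with any digits before and after it.
# Output 1: every digit is 1.
AND_PATTERN = re.compile(r"(?:[01]·)*0(?:·[01])*\|0|(?:1·)*1\|1")

def is_and_instance(s):
    return AND_PATTERN.fullmatch(s) is not None

def instance(bits, y):
    """Format input bits and an output bit as 'x1·x2·...·xn|y'."""
    return "·".join(bits) + "|" + y

# Compare the pattern with the AND function on all inputs of length 1..4.
for n in range(1, 5):
    for bits in product("01", repeat=n):
        correct = "1" if all(b == "1" for b in bits) else "0"
        wrong = "0" if correct == "1" else "1"
        assert is_and_instance(instance(bits, correct))
        assert not is_and_instance(instance(bits, wrong))

print(is_and_instance("1·0·1|0"))
```

Because the pattern places no bound on n, it describes the union of all rows of the language rather than a single row.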
Regular expressions in programming
One last expression ...
Bit strings with an odd number of 0s and an odd number of 1s
The corresponding regular expression:
\[
(00 + 11)^{*}\,(01 + 10)\,\bigl(00 + 11 + (01 + 10)(00 + 11)^{*}(01 + 10)\bigr)^{*}
\]
(solved by techniques to be presented in the following lectures)
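Even without those techniques, the expression can be sanity-checked by brute force. The sketch below reads it as "(00 + 11)∗ (01 + 10) (00 + 11 + (01 + 10)(00 + 11)∗(01 + 10))∗", translated into Python `re` syntax; the names `ODD_ODD` and `matches` are hypothetical, and the comparison covers only strings up to an assumed length bound.

```python
import re
from itertools import product

ODD_ODD = r"(00|11)*(01|10)(00|11|(01|10)(00|11)*(01|10))*"

def matches(w):
    return re.fullmatch(ODD_ODD, w) is not None

def odd_odd(w):
    """Direct definition: odd number of 0s and odd number of 1s."""
    return w.count("0") % 2 == 1 and w.count("1") % 2 == 1

# Compare the expression with the definition on all strings of length 0..10.
disagreements = [
    "".join(p)
    for n in range(11)
    for p in product("01", repeat=n)
    if matches("".join(p)) != odd_odd("".join(p))
]
print(disagreements)
```

Reading the string two symbols at a time explains the shape: the pairs 00 and 11 preserve both parities, while 01 and 10 flip both.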