Theory of Computation

TagsComputer scienceMathScienceautomata theory




Machines, Languages, and Computation at MITPeter J. DenningTheoretical Models of Computationanecdotes
Lectures on computationRichard P. Feynman


What is the Theory of computation?

mathematical cybernetics

Why does the Theory of computation matter to you?





Standards, jobs, industry, roles, …



Worked examples

Chapter Name




Worked examples

  1. Mathematics or Code. Automatic Verification such as Testing or Lean Proven?
  1. Languages in Anki.


Further resources

Strings and Languages


Alphabet. Any non-empty finite set. We generally use either capital Greek letters Σ,Γ\Sigma ,\Gamma , or uppercase letters A,B,...A,B,... to designate alphabets.

Symbol. The members of an Alphabet.

Examples: Σ1={0,1};Σ2={a,b,c,d,e,f,g,h};Γ={0,1,x,y,z}\Sigma_1=\{0,1\};\Sigma_2=\{a,b,c,d,e,f,g,h\};\Gamma=\{0,1,x,y,z\}

String over an alphabet. A finite sequence of symbols from that alphabet.

Length of a string. If ww is a string over Σ\Sigma, the length of ww, written w|w|, is the number of symbols that it contains. Some authors prefer to use na(w)n_a(w) for indicating some aa characteristic applied on ww.

Empty string. w=ϵ=λ=0|w|=\epsilon=\lambda=0

Language. Set of strings. We generally use upper-case letters A,B,...A,B,... to designate languages.

Lexicographic ordering.

1. Shorter strings precede longer strings.

2. The first symbol in the alphabet precedes the next symbol.

Regular operations.

Let A,BA,B be languages:

Union: AB={wwA or wB}A\cup B=\{w|w\in A\text{ or }w\in B\}

Concatenation: AB={xyxA or yB}=ABA\cdot B=\{xy|x\in A\text{ or }y\in B\}=AB

Ak=AA...Ak times,A1=A,A0={ϵ}A^k=A \overset{\text{k times}}{ A...A},A^1=A,A^0=\{\epsilon\}

Kleene star: A={x1...xk each xiA for k0}=A0A1A2A3...A^*=\{x_1...x_k|\text{ each }x_i\in A \text{ for } k\ge 0\}=A^0\cup A^1\cup A^2 \cup A^3 \cup..., note ϵA\epsilon\in A^* always.

Kleene plus: A+=A1A2A3...=VVA^+=A^1\cup A^2\cup A^3\cup...=V^*V

Example. Let A={good,bad}A=\{good,bad\} and B={boy,girl}B=\{boy,girl\}

AB={good,bad,boy,girl}A\cup B=\{good,bad,boy,girl\}




Operator precedence

aSigmaa\in Sigma1


ax|b → (ax)|b

ab*|c → (a(b*))|c

ab?* → a((b?)*)

ab*? → a((b*)?)

Note: L(a((b?)*)) = L(a((b*)?))

If two operators have the same precedence, then they generate the same language. ??




Worked examples




Worked examples

Steps to convert regular expressions directly to regular grammars and vice versa


Free Grammars. LL.

Deterministic context-free languages.


Are axioms in math equivalent to production rules in unrestricted grammars?

how could a CFG hold an unrestricted grammar (UG), which is turing complete?


Notes and references

[1] M. Johnson, “Formal Grammars Handout written,” 2012 [Online]. Available: https://web.stanford.edu/class/archive/cs/cs143/cs143.1128/handouts/080 Formal Grammars.pdf. [Accessed: 28-Feb-2022]

Finite state machine


Moore and Mealy. Responses are all kinds of values.

DFA and NFA. Responses are 0 or 1.


Worked examples

  1. Make a program in C++ that simple emails by argument (Hint: use a library).

Moore and Mealy automata

Boolean Algebra and Digital Logic. Circuit conversion. From Moore to Circuit.

Finite automata

Finite automata and their probabilistic counterpart Markov chains. Finite Automaton (FA) or Finite-State Machine. An automaton (Automata in plural).

State diagram of a finite automaton M1. It has two states.
OOP uses a State Pattern, which is a design pattern.

We feed the input string 0101 to the machine M1M_1, the processing proceeds as follows:

  1. Start in state q1q_1.
  1. Read 1, follow the transition from q1q_1 to q2q_2.
  1. Read 1, follow the transition from q2q_2 to q1q_1.
  1. Accept because M1M_1 is in an accept state q2q_2 at the end of the input.

The formal definition of a deterministic finite automaton (DFA)

A finite automaton is a 5-tuple (Q,Σ,δ,q0,F)(Q,\Sigma,\delta,q_0,F) where

  1. QQ is a finite set called the states,
  1. Σ\Sigma is a finite set called the alphabet,
  1. δ:Q×ΣQ\delta:Q\times \Sigma\rarr Q is the transition function,
  1. q0Qq_0 \in Q is the start state, and
  1. FQF\subseteq Q is the set accept states, they are sometimes called final states.

The formal definition of a nondeterministic finite automaton (NFA)

A finite automaton is a 5-tuple (Q,Σ,δ,q0,F)(Q,\Sigma,\delta,q_0,F) where

  1. QQ is a finite set called the states,
  1. Σ\Sigma is a finite set called the alphabet,
  1. δ:Q×(Σ{λ})P(Q)={RRQ}\delta:Q\times (\Sigma\cup\{\lambda\})\rarr P(Q)=\{R|R \subseteq Q\} is the transition function (P is a power set),
  1. q0Qq_0 \in Q is the start state, and
  1. FQF\subseteq Q is the set accept states, they are sometimes called final states.

Converting NFAs to DFAs

delta = {
  'state1': {
     'symbol1': {'state1', 'state2',...}


What is the difference between finite automata and finite state machines?

Pattern Matching

Extended finite-state machine


Automata and Grammar

Regular expressions. Regex.

Stephen Cole Kleene invented regex.

Regular expressions, in short regex, are expressions that generate regular languages.

Although some modern regex engines support regular expressions, they support nonregular languages (Regular Expression Recursion) too. It can be misleading, but we use the above definition forward.

POSIX regular expressions. BRE (Basic Regular Expressions), ERE (Extended Regular Expressions), and SRE

Flavors. Perl and PCRE.


Regex engine.

perl vs awk


Accepters are equivalent to regular expressions

Are traducers equivalent to regular expressions?

Worked example





Roman numbers: https://wiki.imperivm-romanvm.com/wiki/Roman_Numerals


Check equivalence of regular expressions



Goyvaerts, J. (2022, December 02). Regular Expressions Reference. Retrieved from https://www.regular-expressions.info/reference.html


“We tend to forget that every problem we solve is a special case of some recursively unsolvable problem!” Knuth, D. E. (1973). The dangers of computer-science theory. In Studies in Logic and the Foundations of Mathematics (Vol. 74, pp. 189-195). Elsevier.

21] D.E. Knuth, On the translation of languages from left to right, Information and Control 8 (1965) 607–639.

[22] D.E. Knuth, A characterization of parenthesis languages, Information and Control 11 (1967) 269–289.

Rice’s theorem


We describe the syntax of language as grammar but not semantics.


A grammar is a 4-tuple (V,T,S,P),(V,T,S,P) ,

where VV is a finite set of objects called variables or nonterminals,

TT is a finite set of objects, disjoint from VV, called terminal symbols, in short terminals, and sometimes tokens because it is a terminal identifier.

SVS\in V is a special symbol called the start variable,

PP is a finite set of rules, called production rules, with each rule being a variable a string of variables and terminals.

If production rules are of the form

xyx\to y

where x(VT)+x \in (V \cup T)^+, y(VT)y \in (V \cup T)^*. Some authors call x and y Head and Body respectively. Others call x and y left-hand side (LHS) and right-hand side (RHS).

The production rules are applied in the following manner: Given a string w1=uxvw_1=uxv

xyx \to yw1=uxvw_1=u\bold{x}vw2=uyvw_2=u\bold{y}v

We say that w1w_1 derives w2w_2 o that w2w_2 is derived from w1w_1.

A rule can be used whenever it’s applicable.

If w1w2...wn,w_1\to w_2\to ...\to w_n,

we say that w1w_1 derives wnw_n and writes

w1wnw_1 \to^* w_n . The * indicates an unspecified number of steps (including zero).

Informally, grammar consists of terminals, nonterminals, a start symbol, and productions.

  1. Terminals.
    class TerminalSymbol
    TerminalSymbol : token // id or name
    TerminalSymbol : lexeme // regular expression
  1. Nonterminals are syntactic variables that can be replaced, so they denote a set of strings.
  1. Start symbol.
  1. Productions. We may follow a notation for production rules, which is a particular form.

BNF. A particular form of notation for grammar (Backus-Naur form).

CNF. A particular form of notation for grammar (Chomsky normal form).

Bottom up

graph LR
    subgraph scanning
        W -. find the body X that match w=uXv .-> Body
    subgraph substitution
        Body -. substitute body X to head Y, w=uYv.-> Head
        Head -- search if it is not StartSymbol --> W


Chomsky normal form (CNF) example

V={P,S,M}V=\{P,S,M\}, T={+,,1,2,3,4}T=\{+,*,1,2,3,4\}, SVS\in V

PSSS+MMMMTTT1234P \to S \\ S \to S+M|M\\ M \to M*T|T\\ T \to 1|2|3|4

Backus–Naur form (BNF) example

V={P,S,M}V=\{P,S,M\}, T={+,,1,2,3,4}T=\{+,*,1,2,3,4\}, SVS\in V

<P>::=<S><S>::=<S>+<M><M><M>::=<M><T><T><T>::=1234<P> ::= <S> \\ <S> ::= <S> + <M> | <M>\\ <M> ::= <M> * <T> | <T>\\ <T> ::= 1 | 2 | 3 | 4


<P>a<Q>::=1Q1<P>::=b<Q>::=2<P>a<Q> ::= 1Q1\\ <P> ::= b \\ <Q> ::= 2

Parse trees and derivations


Writing a grammar

Verifying the language generated by a grammar

The language generated by GG. Let G=(V,T,S,P),G=(V,T,S,P) , be a grammar. Then the set

L(G)={wT:Sw}L(G)=\{w\in T^*:S\to^* w\}

is the language developed by G.

Grammar Hierarchy

Set view


Table view

Grammar typeGrammar acceptedLanguage AcceptedAutomatonProduction rule form
Type 0. These are the most general.Free or unrestricted grammarRecursively enumerable languageTuring Machineuvu\to v, where both u and v are arbitrary strings of symbols in V, with u non-null.
Type 1Context-sensitive grammarsContext-sensitive languageLinear-bounded automatonuXwuvw,uXw\to uvw, where uu,vv, and ww are arbitrary strings of symbols in VV, with vv non-null, and XX a single nonterminal. In other words, in a particular context.
Type 2Context-free grammarsContext-free languagePushdown automatonXvX\to v, where vv is an arbitrary string of symbols in VV, and XX is a single nonterminal. i.e. regardless of context.
Type 3. These grammars are the most limited in terms of expressive power.Regular grammarsRegular languageFinite-state automatonXaX \to a, XaYX \to aY (right-regular grammar), XYaX \to Ya (left-regular grammar), XϵX \to \epsilon, where XX and YY are nonterminals and aa is a terminal. Regular grammar is either right-linear grammar or left-linear grammar, but not both. Mixing left-regular and right-regular rules are the context-free grammar.


Augmented Grammar

Definite clause grammars

Context-free grammars

TODO: Context-Free Grammars Versus Regular expressions

Formal definition


From Berkley Prof. Sen CS 164 Lecture 8-9

Notional conventions



  1. Conversión de gramáticas ambiguas a no ambiguas.
  1. Estrategias para el manejo de la ambigüedad en los lenguajes.
  1. Problemas equivalentes a determinar si una gramática es ambigua.
  1. Complejidad de determinar si una gramática es ambigua.

Writing a grammar

Eliminating Ambiguity

Elimination of Left Recursion

A left recursive grammar has a nonterminal AA such that there is a

derivation for some string aa

A+AaA\to^+ Aa

Top-down parsing methods cannot handle left-recursive, so


Given a grammar, you replace each left-recursive pair of productions AAaβA\to Aa|\beta

AAaβA\to Aa|\beta


AβAAaAϵA\to \beta A'\\ A'\to aA'|\epsilon



We replace recursive pair of productions following the method by

ETEE+TEϵE\to TE'\\ E'\to +TE'|\epsilon

General Algorithm.


Elimination of Left Recursion. (2021, October 31). Retrieved from https://cyberzhg.github.io/toolbox/left_rec

Left Factoring

Non-Context-Free Language Context

Worked examples

Models of computation



Model of computation

Automaton (Mathematics model)Paradigm
Turing MachineImperative (procedural) programming
Lambda-calculusFunctional programming
Pi-calculusEvent-based and distributed programming

Regular expressions to Automata. Generalized nondeterministic finite automaton. Arden’s Rule.

Regular expressions to Automata

  1. Generalized nondeterministic finite automaton.
def convert_to_regular_rxpression(G):
   if G.number_of_states == 2:
  1. Solving right-linear equation by Arden’s rule.



Turing Machine