Theory of Computation

Requisites

Resources

Foundations

Name	Tags
Sipser, Michael. Introduction to the Theory of Computation. 3rd ed. Cengage Learning, 2012. ISBN: 9781133187790.
Hopcroft, J., Motwani, R. and Ullman, J., 2007. Introduction to automata theory, languages, and computation . Boston: Pearson Addison-Wesley.
Denning, P. J., Dennis, J. B., & Qualitz, J. E. (1978). Machines, Languages and Computation. Prentice Hall Professional Technical Reference. doi: 10.5555/578597
The Language of Machines: an Introduction to Computability and Formal Languages. Robert Floyd and Richard Beigel
Michael Sipser. 18.404J Theory of Computation. Fall 2020. Massachusetts Institute of Technology: MIT OpenCourseWare, https://ocw.mit.edu . License: Creative Commons BY-NC-SA
Automata Theory by Jeffrey D. Ullman	course, coursera
A. V. Aho (ed.) Currents in the Theory of Computing. Prentice Hall, 1973. ISBN 0-13-195651-5
COMS W3261 Computer Science Theory Section 001 Fall 2017 by Alfred Aho	course
CS 332 at Boston University. Fundamentals of Computing by Leonid A. Levin	course
Discrete Mathematics course at Calvin College.
The Programmer’s Guide To Theory: Great ideas explained by Mike James	book, gentle introduction
J. Martin, Introduction to Languages and the Theory of Computation, Third Edition, TMH, 2007
Knuth, The Art of Computer Programming, Volume 6: The Theory of Context-Free Languages
https://www.nesoacademy.org/cs/04-theory-of-computation

Olivia Gutú, Lenguajes formales y autómatas, Universidad de Sonora (PDF)

Domingo Gómez, Luis M. Pardo, Teoría de Autómatas y Lenguajes Formales (para Ingenieros Informáticos) (PDF)

Elena Jurado, Teorías de Autómatas y Lenguajes Formales (PDF)

Leopoldo Altamirano, Miguel Arias, Jesús González, Eduardo Morales, Gustavo Rodrı́guez, Teorı́a de Autómatas y Lenguajes Formales (PDF)

Sergio Balari, TEORÍA DE LENGUAJES FORMALES Una Introducción para Lingüistas (PDF)

Course: Autómatas y Lenguajes Formales (2021), materials by Dr. Francisco Hernández, FC/UNAM

Course: Autómatas y Lenguajes Formales (2022), materials by Mtro. Noé Hernández, FC/UNAM

Course: Lenguajes Formales y Autómatas, materials by Ing. Almicar Monterrosa

Course: Teoría de Autómatas y Lenguajes Formales (2010), materials by Domingo Gómez, Luis M. Pardo

Course: Teoría de Autómatas y Lenguajes Formales, Eduardo Morales and Leopoldo Altamirano

Materials: Teoría de Autómatas y Lenguajes Formales, materials by Dr. Luis Pineda, PCIC/UNAM

Videos from the previous course

Course: Autómatas y Lenguajes Formales, materials by Dr. Favio Miranda, PCIC/UNAM

http://ivanvladimir.tumblr.com/search/lfya

Q&A site on the theory of computation

http://turing.iimas.unam.mx/~ivanvladimir/page/curso_lfya

Lenguajes Formales y Autómatas. (2020, January 19). Retrieved from https://turing.iimas.unam.mx/~ivanvladimir/page/curso_lfya_2022i

Welcome to maquinas’s documentation! — ⚙️ maquinas 0.1.5.14 documentation. (2023, January 23). Retrieved from https://maquinas.readthedocs.io/en/stable

History

Name	Tags
Machines, Languages, and Computation at MIT	Peter J. Denning, Theoretical Models of Computation, anecdotes
Lectures on computation	Richard P. Feynman

References

Name	Tags
McCarthy, J. A basis for a mathematical theory of computation. In Computer Programming and Formal Systems, P. Braffort and D. Hirschberg, Eds. North-Holland Publishing Company, Amsterdam, The Netherlands, 1963, 33–70.

Introduction

What is the Theory of computation?

mathematical cybernetics

Why does the Theory of computation matter to you?

https://twitter.com/Plinz/status/1541937546006933504

Research

ACM SIGACT

Ecosystem

Standards, jobs, industry, roles, …

Story

FAQ

Worked examples

Models of computation

https://gist.github.com/sanchezcarlosjr/333b9a774c0fa57f55532d772b13c946

https://cs.brown.edu/people/jsavage/book/pdfs/ModelsOfComputation_Chapter4.pdf

Model of computation

Automaton (Mathematical model)	Paradigm
Turing Machine	Imperative (procedural) programming
Lambda-calculus	Functional programming
Pi-calculus	Event-based and distributed programming
OOP
Logic

Mind and computation

https://en.wikipedia.org/wiki/Talk:Theory_of_computation

Strings and Languages

Definitions.

Alphabet. Any non-empty finite set. We generally use either capital Greek letters $\Sigma ,\Gamma$ , or uppercase letters $A,B,...$ to designate alphabets.

Symbol. A member of an alphabet.

Examples: $\Sigma_1=\{0,1\};\Sigma_2=\{a,b,c,d,e,f,g,h\};\Gamma=\{0,1,x,y,z\}$

String over an alphabet. A finite sequence of symbols from that alphabet.

Length of a string. If $w$ is a string over $\Sigma$ , the length of $w$ , written $|w|$ , is the number of symbols that it contains. Some authors write $n_a(w)$ for the number of occurrences of the symbol $a$ in $w$ .

Empty string. The string of length zero, written $\epsilon$ (some authors write $\lambda$ ); $|\epsilon|=0$ .

Language. Set of strings. We generally use upper-case letters $A,B,...$ to designate languages.

Lexicographic (shortlex) ordering.

1. Shorter strings precede longer strings.

2. Strings of the same length are ordered like dictionary entries, following the order of the symbols in the alphabet.
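The two rules above can be sketched in a few lines of Python. This is only an illustration; the function name `shortlex` and the choice of the alphabet `"01"` are assumptions, not part of the notes.

```python
from itertools import count, islice, product

def shortlex(alphabet):
    """Yield every string over `alphabet`: shorter strings first,
    ties broken by the order of the symbols in `alphabet`."""
    for length in count(0):
        for symbols in product(alphabet, repeat=length):
            yield "".join(symbols)

first = list(islice(shortlex("01"), 7))
print(first)  # ['', '0', '1', '00', '01', '10', '11']
```

The generator is infinite, so `islice` is used to take a finite prefix of the ordering.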

Regular operations.

Let $A,B$ be languages:

Union: $A\cup B=\{w|w\in A\text{ or }w\in B\}$

Concatenation: $A\cdot B=\{xy|x\in A\text{ and }y\in B\}=AB$ 

$A^k=\underbrace{AA\cdots A}_{k\text{ times}},\ A^1=A,\ A^0=\{\epsilon\}$

Kleene star: $A^*=\{x_1...x_k|\text{ each }x_i\in A \text{ for } k\ge 0\}=A^0\cup A^1\cup A^2 \cup A^3 \cup...$ , note $\epsilon\in A^*$ always.

Kleene plus: $A^+=A^1\cup A^2\cup A^3\cup...=A^*A$ 

Example. Let $A=\{good,bad\}$ and $B=\{boy,girl\}$

$A\cup B=\{good,bad,boy,girl\}$

$AB=\{goodboy,goodgirl,badboy,badgirl\}$

$A^2=\{goodgood,goodbad,badgood,badbad\}$

$A^*=\{\epsilon,good,bad,goodgood,goodbad,badgood,badbad,goodgoodgood,goodgoodbad,...\}$
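For finite languages, the regular operations can be tried out directly with Python sets. The sketch below is illustrative: the helper names `concat`, `power` and `star_up_to` are made up here, and since $A^*$ is infinite the last helper only builds the finite union $A^0\cup\dots\cup A^k$.

```python
def concat(A, B):
    """AB: every string of A followed by every string of B."""
    return {x + y for x in A for y in B}

def power(A, k):
    """A^k, with A^0 = {""} (the set holding only the empty string)."""
    result = {""}
    for _ in range(k):
        result = concat(result, A)
    return result

def star_up_to(A, max_k):
    """Finite approximation of A*: A^0 ∪ A^1 ∪ ... ∪ A^max_k."""
    result = set()
    for k in range(max_k + 1):
        result |= power(A, k)
    return result

A = {"good", "bad"}
B = {"boy", "girl"}
print(sorted(A | B))            # union
print(sorted(concat(A, B)))     # concatenation
print(sorted(power(A, 2)))      # A^2
print(len(star_up_to(A, 2)))    # 1 + 2 + 4 = 7 strings
```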

Chomsky hierarchy

Regular languages

Operator precedence

Operator	Precedence
()	6
*+?	5
concatenation	3
\|	2
$a\in \Sigma$	1

Examples:

ax|b → (ax)|b

ab*|c → (a(b*))|c

ab?* → a((b?)*)

ab*? → a((b*)?)

Note: L(a((b?)*)) = L(a((b*)?))

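Postfix operators with the same precedence are applied left to right. For $*$ , $+$ and $?$ the order of two stacked operators does not change the language: any two different ones applied to $x$ generate $L(x^*)$ .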

https://webcache.googleusercontent.com/search?q=cache:UnlM5CsShGAJ:www.cs.columbia.edu/~aho/cs3261/Lectures/L3-Regular_Expressions.html&cd=1&hl=en&ct=clnk&gl=mx&client=firefox-b-d

https://www.oreilly.com/library/view/learning-awk-programming/9781788391030/ceafcbc7-fe9e-4c33-b899-ed2f48e7e659.xhtml

https://www.boost.org/doc/libs/1_56_0/libs/regex/doc/html/boost_regex/syntax/basic_extended.html#boost_regex.syntax.basic_extended.operator_precedence

TODO: 'aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaac'.match(/^(a+)+b/)

Regular Expressions

Pathological regular expressions can take exponential time on backtracking engines (catastrophic backtracking). For example 'aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaac'.match(/^(a+)+b/) will effectively hang a JavaScript runtime. JS-Interpreter has a variety of methods for safely dealing with regular expressions depending on the platform. https://neil.fraser.name/software/JS-Interpreter/docs.html

https://gist.github.com/sanchezcarlosjr/747e1e2c099e7fe1e70b5d5c7b6e2abe

Worked examples

Homework

If $x$ is a string, then $x^R$ denotes the reversal of $x$ . If $x$ and $y$ are strings, then $(xy)^R=y^R x^R$ .
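Before attempting a proof, the reversal identity can be sanity-checked by brute force. The snippet below is a check on short strings over an assumed alphabet $\{a,b\}$, not a proof.

```python
from itertools import product

def reverse(s):
    return s[::-1]

# All strings over {a, b} of length 0 to 3.
strings = ["".join(p) for n in range(4) for p in product("ab", repeat=n)]

holds = all(
    reverse(x + y) == reverse(y) + reverse(x)
    for x in strings
    for y in strings
)
print(holds)  # True
```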

Finite state machine

Representations.

Moore and Mealy. Responses are all kinds of values.

DFA and NFA. Responses are 0 or 1.

Programming.

Image from Domain Specific Languages by Martin Fowler, with Rebecca Parsons

Worked examples

Write a C++ program that validates an email address passed as a command-line argument (hint: use the regex library).

Moore and Mealy automata

Boolean Algebra and Digital Logic. Circuit conversion. From Moore to Circuit.

https://www-igm.univ-mlv.fr/~berstel/Exposes/2009-06-08MinimisationLiege.pdf

Finite automata

Finite automata and their probabilistic counterpart Markov chains. Finite Automaton (FA) or Finite-State Machine. An automaton (Automata in plural).

A real computer is too complex to reason about directly; instead, we use an idealized computer called a computational model.

A finite automaton:

State diagram of a finite automaton M1. It has two states.

🔥

OOP uses a State Pattern, which is a design pattern.

We feed the input string $01$ to the machine $M_1$ ; the processing proceeds as follows:

Start in state $q_1$ .

Read 0, follow the transition from $q_1$ to $q_1$ .

Read 1, follow the transition from $q_1$ to $q_2$ .

Accept because $M_1$ is in an accept state $q_2$ at the end of the input.

The formal definition of a deterministic finite automaton (DFA)

A finite automaton is a 5-tuple $(Q,\Sigma,\delta,q_0,F)$ where

$Q$ is a finite set called the states,

$\Sigma$ is a finite set called the alphabet,

$\delta:Q\times \Sigma\rarr Q$ is the transition function,

$q_0 \in Q$ is the start state, and

$F\subseteq Q$ is the set of accept states (sometimes called final states).
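The 5-tuple translates almost literally into code. The following is a minimal sketch: the class `DFA` and the example machine `ends_in_one` are hypothetical, chosen only for illustration, and are not necessarily the machine in the state diagram.

```python
from dataclasses import dataclass

@dataclass
class DFA:
    states: frozenset    # Q
    alphabet: frozenset  # Sigma
    delta: dict          # maps (state, symbol) -> state
    start: str           # q0
    accept: frozenset    # F

    def accepts(self, w):
        """Run the machine on w and report whether it ends in F."""
        state = self.start
        for symbol in w:
            if symbol not in self.alphabet:
                raise ValueError(f"symbol {symbol!r} is not in the alphabet")
            state = self.delta[(state, symbol)]
        return state in self.accept

# Hypothetical two-state machine: accepts binary strings that end in 1.
ends_in_one = DFA(
    states=frozenset({"q1", "q2"}),
    alphabet=frozenset({"0", "1"}),
    delta={("q1", "0"): "q1", ("q1", "1"): "q2",
           ("q2", "0"): "q1", ("q2", "1"): "q2"},
    start="q1",
    accept=frozenset({"q2"}),
)
print(ends_in_one.accepts("01"))  # True
print(ends_in_one.accepts("10"))  # False
```

Because $\delta$ is a total function, the dictionary needs an entry for every (state, symbol) pair.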

The formal definition of a nondeterministic finite automaton (NFA)

A nondeterministic finite automaton is a 5-tuple $(Q,\Sigma,\delta,q_0,F)$ where

$Q$ is a finite set called the states,

$\Sigma$ is a finite set called the alphabet,

$\delta:Q\times (\Sigma\cup\{\lambda\})\rarr P(Q)=\{R|R \subseteq Q\}$ is the transition function (P is a power set),

$q_0 \in Q$ is the start state, and

$F\subseteq Q$ is the set of accept states (sometimes called final states).

Converting NFAs to DFAs

Rabin-Scott Theorem

https://twitter.com/getjonwithit/status/1720832283026854098

delta = {
  'state1': {
     'symbol1': {'state1', 'state2',...}
   }
}
# BFS
NFAToDFA(NFA)
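One way to flesh out the outline above is the subset construction, exploring reachable subsets breadth-first. This is a sketch under some assumptions: the transition table uses the same nested-dictionary shape as the outline, the empty string `""` is used as the key for λ-moves, and the function names and the example NFA are made up for illustration.

```python
from collections import deque

def epsilon_closure(delta, states):
    """All states reachable from `states` using only λ-moves (key "")."""
    stack, closure = list(states), set(states)
    while stack:
        q = stack.pop()
        for r in delta.get(q, {}).get("", set()):
            if r not in closure:
                closure.add(r)
                stack.append(r)
    return frozenset(closure)

def nfa_to_dfa(delta, start, accept, alphabet):
    """Subset construction; each DFA state is a frozenset of NFA states."""
    dfa_start = epsilon_closure(delta, {start})
    dfa_delta, seen, queue = {}, {dfa_start}, deque([dfa_start])
    while queue:
        S = queue.popleft()
        for a in alphabet:
            moved = set()
            for q in S:
                moved |= delta.get(q, {}).get(a, set())
            T = epsilon_closure(delta, moved)
            dfa_delta[(S, a)] = T
            if T not in seen:
                seen.add(T)
                queue.append(T)
    dfa_accept = {S for S in seen if S & set(accept)}
    return dfa_delta, dfa_start, dfa_accept

def dfa_accepts(dfa_delta, dfa_start, dfa_accept, w):
    S = dfa_start
    for a in w:
        S = dfa_delta[(S, a)]
    return S in dfa_accept

# Example NFA: "the second-to-last symbol is 1" over {0, 1}.
nfa_delta = {
    "p": {"0": {"p"}, "1": {"p", "q"}},
    "q": {"0": {"r"}, "1": {"r"}},
    "r": {},
}
dfa = nfa_to_dfa(nfa_delta, "p", {"r"}, "01")
print(dfa_accepts(*dfa, "0110"))  # True
print(dfa_accepts(*dfa, "0101"))  # False
```

Only subsets reachable from the start state are built, so the result is often much smaller than the full power set $P(Q)$.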

Classification

Acceptors

Transducers

Generators (also called sequencers)

https://static.googleusercontent.com/media/research.google.com/en//pubs/archive/40342.pdf

Worked examples

Cellular automaton

FAQ

What is the difference between finite automata and finite state machines?

Pattern Matching

Extended finite-state machine

https://www.seas.upenn.edu/~lee/10cis541/lecs/EFSM-Code-Generation-v4-1x2.pdf

Regular expressions. Regex.

💡

Stephen Cole Kleene introduced regular expressions.

Regular expressions (regex for short) are expressions that generate regular languages.

Modern regex engines support regular expressions, but many also support features, such as recursion and backreferences, that describe nonregular languages. The name can therefore be misleading; from here on we use the definition above.

POSIX regular expressions. BRE (Basic Regular Expressions), ERE (Extended Regular Expressions), and SRE

Flavors. Perl and PCRE.

https://gist.github.com/CMCDragonkai/6c933f4a7d713ef712145c5eb94a1816

Regex engine.

perl vs awk

https://unix.stackexchange.com/questions/114591/is-perl-ever-a-better-tool-than-awk-for-text-processing

Acceptors are equivalent to regular expressions

Are transducers equivalent to regular expressions?

/^1?$|^(11+?)\1+$/
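The pattern above matches a unary string $1^n$ exactly when $n$ is not prime: the first alternative covers $n=0$ and $n=1$, and the second uses a backreference to find a block of two or more 1s that repeats at least twice. A short Python check follows; the names `NOT_PRIME` and `is_prime` are made up here. Since it relies on a backreference, the pattern goes beyond regular expressions in the formal sense.

```python
import re

NOT_PRIME = re.compile(r"^1?$|^(11+?)\1+$")

def is_prime(n):
    """n is prime iff the unary string 1^n does NOT match the pattern."""
    return NOT_PRIME.match("1" * n) is None

primes = [n for n in range(30) if is_prime(n)]
print(primes)  # [2, 3, 5, 7, 11, 13, 17, 19, 23, 29]
```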

Worked example

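Given the regular expression $a^*b^*z^*$ , write a new regular expression that generates the same language without $\lambda$ . Writing $X=a^+$ , $Y=b^+$ , $Z=z^+$ , take every non-empty combination of the three blocks:
XYZ|XY|XZ|YZ|X|Y|Z
There are $\sum_{i=1}^n C(n,i)=2^n-1$ alternatives for $n$ blocks; here $2^3-1=7$ .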
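The answer can be checked by brute force. The sketch below reads X, Y, Z as $a^+$, $b^+$, $z^+$ (an assumption about the notation) and compares the seven-way alternation against $a^*b^*z^*$ with the empty string removed, on every string over $\{a,b,z\}$ of length at most 5.

```python
import re
from itertools import product

original = re.compile(r"a*b*z*")
non_empty = re.compile(r"a+b+z+|a+b+|a+z+|b+z+|a+|b+|z+")

words = ["".join(p) for n in range(6) for p in product("abz", repeat=n)]

agree = all(
    bool(non_empty.fullmatch(w)) == (w != "" and bool(original.fullmatch(w)))
    for w in words
)
print(agree)  # True
```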

Write an algorithm that generates a regex to detect a number $n \in N.$

Generate the language L from a given regex.

Write a PCRE (4.0 or above) expression that checks whether a string is a palindrome.

Tools

https://regex101.com/

https://regexr.com/

AWK

Roman numbers: https://wiki.imperivm-romanvm.com/wiki/Roman_Numerals

Write a regular expression that captures Roman numerals.
https://github.com/sanchezcarlosjr/compilers-uabc/blob/main/lexer-awk/roman_to_arabig.awk

Write a program that converts Roman numerals to Arabic numerals using regular expressions.
https://github.com/sanchezcarlosjr/compilers-uabc/blob/main/lexer-awk/roman_to_arabig.awk
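For comparison with the AWK script linked above, here is a minimal Python sketch of the same idea. It is not a port of that script: the pattern is a commonly used one for standard numerals from 1 to 3999, and the names `ROMAN`, `VALUES` and `roman_to_arabic` are made up for this example.

```python
import re

# Thousands, hundreds, tens, units; subtractive forms listed explicitly.
ROMAN = re.compile(
    r"^M{0,3}(CM|CD|D?C{0,3})(XC|XL|L?X{0,3})(IX|IV|V?I{0,3})$"
)
VALUES = {"I": 1, "V": 5, "X": 10, "L": 50, "C": 100, "D": 500, "M": 1000}

def roman_to_arabic(numeral):
    # The pattern also matches the empty string, so reject it explicitly.
    if not numeral or not ROMAN.match(numeral):
        raise ValueError(f"not a Roman numeral: {numeral!r}")
    total = 0
    for symbol, following in zip(numeral, numeral[1:] + " "):
        value = VALUES[symbol]
        # A smaller symbol before a larger one is subtracted (IV = 4).
        if following != " " and value < VALUES[following]:
            total -= value
        else:
            total += value
    return total

print(roman_to_arabic("MCMXCIV"))  # 1994
```

The regular expression does the validation; the arithmetic is a separate pass over the already validated string.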

RegEq

Check equivalence of regular expressions

https://bakkot.github.io/dfa-lib/regeq.html

https://regex-equality.herokuapp.com/

Goyvaerts, J. (2022, December 02). Regular Expressions Reference. Retrieved from https://www.regular-expressions.info/reference.html

Regular expressions to Automata. Generalized nondeterministic finite automaton. Arden’s Rule.

Regular expressions to Automata

Generalized nondeterministic finite automaton.

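def convert_to_regular_expression(G):
   # G is a GNFA; label() and rip_state() are hypothetical helper methods
   if G.number_of_states == 2:
      # Only start and accept remain: the label on the single arrow is the answer
      return G.label(G.start, G.accept)
   return convert_to_regular_expression(G.rip_state())  # remove one state, relabel arrows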

Solving right-linear equation by Arden’s rule.
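Arden’s rule can be stated and applied in a few lines. The two-variable system below is a made-up example for illustration, not one taken from the homework.

$X = AX \cup B,\ \epsilon\notin A \implies X = A^*B$ (the unique solution)

Example system, for the right-linear grammar $S\to aS\mid bT$ , $T\to bT\mid\epsilon$ :

$S = aS \cup bT$

$T = bT \cup \{\epsilon\}$

Applying the rule to the second equation gives $T=b^*$ . Substituting, $S = aS \cup bb^*$ , and applying the rule again gives $S=a^*bb^*$ .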

Homework 7

https://assets.press.princeton.edu/chapters/i11348.pdf

https://www.gwern.net/docs/math/1973-knuth.pdf

Steps to convert regular expressions directly to regular grammars and vice versa

https://cs.stackexchange.com/questions/68575/steps-to-convert-regular-expressions-directly-to-regular-grammars-and-vice-versa

Free Grammars. LL.

Deterministic context-free languages.

FAQ

Are axioms in math equivalent to production rules in unrestricted grammars?

How could a CFG hold an unrestricted grammar (UG), which is Turing complete?

https://stackoverflow.com/questions/61952877/chomsky-hierarchy-examples-with-real-languages?rq=1

Notes and references

[1] M. Johnson, “Formal Grammars Handout written,” 2012 [Online]. Available: https://web.stanford.edu/class/archive/cs/cs143/cs143.1128/handouts/080 Formal Grammars.pdf. [Accessed: 28-Feb-2022]

Grammar

A grammar describes the syntax of a language, not its semantics.

Definition

A grammar is a 4-tuple $(V,T,S,P) ,$

where $V$ is a finite set of objects called variables or nonterminals,

$T$ is a finite set of objects, disjoint from $V$ , called terminal symbols (terminals for short); they are sometimes called tokens, because a token identifies a terminal,

$S\in V$ is a special symbol called the start variable,

$P$ is a finite set of rules, called production rules, each of which rewrites one string of variables and terminals into another.

Production rules have the form

$x\to y$

where $x \in (V \cup T)^+$ and $y \in (V \cup T)^*$ . Some authors call $x$ and $y$ the head and the body, respectively; others call them the left-hand side (LHS) and right-hand side (RHS).

The production rules are applied in the following manner: Given a string $w_1=uxv$

Rule	Application	Result
$x \to y$	$w_1=u\bold{x}v$	$w_2=u\bold{y}v$

We say that $w_1$ derives $w_2$ , or that $w_2$ is derived from $w_1$ .

A rule can be used whenever it’s applicable.

If $w_1\to w_2\to ...\to w_n,$

we say that $w_1$ derives $w_n$ and write

$w_1 \to^* w_n$ . The * indicates an unspecified number of steps (including zero).

Informally, grammar consists of terminals, nonterminals, a start symbol, and productions.

Terminals.

classDiagram
    class TerminalSymbol
    TerminalSymbol : token // id or name
    TerminalSymbol : lexeme // regular expression

Nonterminals are syntactic variables that can be replaced, so they denote a set of strings.

Start symbol.

Productions. We may follow a notation for production rules, which is a particular form.

BNF. A particular form of notation for grammar (Backus-Naur form).

CNF. Chomsky normal form: a restricted form of context-free grammar in which every rule is $A\to BC$ or $A\to a$ (plus $S\to\epsilon$ for the start variable).

Bottom-up

graph LR
    subgraph scanning
        W -. find the body X that match w=uXv .-> Body
    end
    subgraph substitution
        Body -. substitute body X to head Y, w=uYv.-> Head
        Head -- search if it is not StartSymbol --> W
    end

Backtracking.

Production notation example (this grammar is not in Chomsky normal form)

$V=\{P,S,M,T\}$ , terminals $\{+,*,1,2,3,4\}$ , start variable $P$

$P \to S$

$S \to S+M|M$

$M \to M*T|T$

$T \to 1|2|3|4$

Backus–Naur form (BNF) example

$V=\{P,S,M,T\}$ , terminals $\{+,*,1,2,3,4\}$ , start variable $P$

<P> ::= <S>
<S> ::= <S> + <M> | <M>
<M> ::= <M> * <T> | <T>
<T> ::= 1 | 2 | 3 | 4

Derivations

<P>a<Q> ::= 1Q1\\ <P> ::= b \\ <Q> ::= 2

Parse trees and derivations

Ambiguity

Writing a grammar

Verifying the language generated by a grammar

The language generated by $G$ . Let $G=(V,T,S,P)$ be a grammar. Then the set

$L(G)=\{w\in T^*:S\to^* w\}$

is the language generated by $G$ .
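The definition of $L(G)$ suggests a direct, if naive, way to list short members of the language: search breadth-first over everything derivable from the start variable. The sketch below assumes single-character symbols and a grammar in which no rule shortens a string, so forms longer than the bound can be dropped; the function name `generate` and the small expression grammar are made up for illustration.

```python
from collections import deque

def generate(productions, start, terminals, max_len):
    """Return the terminal strings of length <= max_len derivable from
    `start`. Assumes no rule shortens a string, so any sentential form
    longer than max_len can be discarded safely."""
    seen, queue, language = {start}, deque([start]), set()
    while queue:
        form = queue.popleft()
        if all(symbol in terminals for symbol in form):
            language.add(form)
        for head, body in productions:
            # Apply the rule head -> body at every position where head occurs.
            index = form.find(head)
            while index != -1:
                new_form = form[:index] + body + form[index + len(head):]
                if len(new_form) <= max_len and new_form not in seen:
                    seen.add(new_form)
                    queue.append(new_form)
                index = form.find(head, index + 1)
    return language

productions = [("S", "S+M"), ("S", "M"), ("M", "M*T"), ("M", "T"),
               ("T", "1"), ("T", "2")]
words = generate(productions, "S", set("+*12"), max_len=3)
print(len(words))     # 10: two digits, plus 2 * 2 * 2 strings "d op d"
print(sorted(words))
```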

Grammar Hierarchy

Set view

https://www.geeksforgeeks.org/chomsky-hierarchy-in-theory-of-computation/

Table view

Grammar type	Grammar	Language	Automaton	Production rule form
Type 0. These are the most general.	Free or unrestricted grammar	Recursively enumerable language	Turing Machine	$u\to v$ , where both u and v are arbitrary strings of symbols in V, with u non-null.
Type 1	Context-sensitive grammars	Context-sensitive language	Linear-bounded automaton	$uXw\to uvw,$ where $u$ , $v$ , and $w$ are arbitrary strings of symbols in $V$ , with $v$ non-null, and $X$ a single nonterminal. In other words, in a particular context.
Type 2	Context-free grammars	Context-free language	Pushdown automaton	$X\to v$ , where $v$ is an arbitrary string of symbols in $V$ , and $X$ is a single nonterminal. i.e. regardless of context.
Type 3. These grammars are the most limited in terms of expressive power.	Regular grammars	Regular language	Finite-state automaton	$X \to a$ , $X \to aY$ (right-regular grammar), $X \to Ya$ (left-regular grammar), $X \to \epsilon$ , where $X$ and $Y$ are nonterminals and $a$ is a terminal. A regular grammar is either right-linear or left-linear, but not both: a grammar that mixes left-regular and right-regular rules is in general only context-free.

https://www.cs.utexas.edu/users/novak/cs343283.html

Augmented Grammar

Definite clause grammars

Worked examples

Automata and Grammar

Homework 6 Automata & Grammars

Context-free grammars

TODO: Context-Free Grammars Versus Regular expressions

Classification

From Berkeley Prof. Sen CS 164 Lecture 8-9

Notational conventions

Derivations

Ambiguity

Converting ambiguous grammars into unambiguous ones.

Strategies for handling ambiguity in languages.

Problems equivalent to deciding whether a grammar is ambiguous.

Complexity of deciding whether a grammar is ambiguous.

Writing a grammar

Eliminating Ambiguity

Elimination of Left Recursion

A left-recursive grammar has a nonterminal $A$ such that there is a derivation, for some string $\alpha$ ,

$A\to^+ A\alpha$

Top-down parsing methods cannot handle left-recursive grammars, so the left recursion must be eliminated first.

Method.

Given a grammar, replace each left-recursive pair of productions

$A\to A\alpha|\beta$

with

$A\to \beta A'$

$A'\to \alpha A'|\epsilon$

Example.

$E\to E+T|T$

Following the method, we replace the left-recursive pair of productions by

$E\to TE'$

$E'\to +TE'|\epsilon$
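The method for immediate left recursion is mechanical enough to code directly. The sketch below handles one nonterminal at a time and is not the general algorithm; the function name and the list-of-symbols representation are assumptions made for this example, and the new nonterminal is named by appending a prime, assuming that name is not already in use.

```python
def eliminate_immediate_left_recursion(head, bodies):
    """bodies: the productions for `head`, each a list of symbols.
    Returns a dict of new productions; [] stands for the empty string."""
    # Bodies of the form head + alpha: keep only alpha.
    recursive = [body[1:] for body in bodies if body and body[0] == head]
    # Bodies beta that do not start with head.
    others = [body for body in bodies if not body or body[0] != head]
    if not recursive:
        return {head: bodies}
    new_head = head + "'"
    return {
        head: [beta + [new_head] for beta in others],
        new_head: [alpha + [new_head] for alpha in recursive] + [[]],
    }

grammar = eliminate_immediate_left_recursion("E", [["E", "+", "T"], ["T"]])
print(grammar)
# {'E': [['T', "E'"]], "E'": [['+', 'T', "E'"], []]}
```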

General Algorithm.

References

Elimination of Left Recursion. (2021, October 31). Retrieved from https://cyberzhg.github.io/toolbox/left_rec

L-system or Lindenmayer system

https://runestone.academy/ns/books/published/thinkcspy/Strings/TurtlesandStringsandLSystems.html

Left Factoring

Non-Context-Free Language Context

Computability theory

Turing Machine

Undecidable problems

Reductions

Reduce test questions of the form “what’s the next number in this sequence?” to polynomial interpolation.
https://twitter.com/b_subercaseaux/status/1622433295807172609

FAQ

Continuous "recursive iteration"

Self-replicating programs. Quine functions (computing).

intron functions computing

function twice(x) {
   console.log(x); // Action 
   console.log("'" + x + "'"); // Template
}
// If the language can read a function's own source code
twice(twice.toString())
// If the language cannot read a function's source code
twice("function twice(x) {console.log(x);console.log(\"'\" + x + \"'\")}")

// Full Self replication
function self_reply() {
   function twice(x) {
      console.log(x); // Action
      console.log("'" + x + "'") // Template
   }
   twice(twice.toString()+"\ntwice(twice.toString())") // Execution
}

// Another option
function self_reply() {
    A = f.toString() // Template
    function f() {
      console.log("MY Instructions"); // Payload 
      console.log(A+"f()"); // Reflection, Reproduction, and mutation
    }
    f()
}

// Another option
(function quine() {
  console.log(quine.toString());
})();

// Other option
$=_=>`$=${$};$();$()`

Conferences, N. (2020, February 26). The Art of Code - Dylan Beattie. Youtube. Retrieved from https://www.youtube.com/watch?v=6avJHaC3C2U&ab_channel=NDCConferences

Fridman, L. (2020, July 11). Self-replicating Python code | Quine. Youtube. Retrieved from https://www.youtube.com/watch?v=a-zEbokJAgY&ab_channel=LexFridman

Computational complexity theory

A high-level overview of NP

NP

NP is the class of all languages where one can verify membership quickly.

P is the class of all languages where one can test membership quickly.

Certificate concept.
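To make the certificate idea concrete, here is a minimal sketch of a polynomial-time verifier for Boolean satisfiability. The encoding is an assumption made for this example: a formula is a list of clauses, each clause a list of nonzero integers, with `-v` standing for the negation of variable `v`, and the certificate is a truth assignment.

```python
def verify_sat(clauses, assignment):
    """Check a certificate: does `assignment` (dict var -> bool) make
    every clause true? Runs in time linear in the formula size."""
    def literal_true(lit):
        value = assignment[abs(lit)]
        return value if lit > 0 else not value
    return all(any(literal_true(lit) for lit in clause) for clause in clauses)

# (x1 OR NOT x2) AND (NOT x1 OR x2 OR x3) AND (NOT x3)
phi = [[1, -2], [-1, 2, 3], [-3]]
print(verify_sat(phi, {1: True, 2: True, 3: False}))   # True
print(verify_sat(phi, {1: True, 2: False, 3: False}))  # False
```

Verifying one assignment is fast; what is not known is how to find a satisfying assignment quickly, since there are $2^n$ candidates for $n$ variables.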

Does $NP=P?$

Cook-Levin Theorem: $SAT \in P \implies P=NP$ . Proof later.

Def. $A$

Theorem.

A “good” (efficient) algorithm is not always known for a problem. But many problems we think about can be checked in polynomial time or solved by brute force in exponential time.

NP-Completeness

Schema of reductions.

TODO: A Boolean formula $\phi$ is in conjunctive normal form (CNF) if it is an AND of clauses, where each clause is an OR of literals (a literal is a variable or a negated variable).

$3SAT=\{<\phi>|\phi \text{ is a satisfiable 3CNF formula} \}$ , a special case of SAT.

A clique in a graph is a collection of points that are all pairwise connected by lines. k-clique in a graph is a subset of k nodes all directly connected by edges.

Theorem. $3SAT\le_P CLIQUE$

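We’re going to reduce the 3SAT problem to CLIQUE.

Reduction. Given a 3CNF formula $\phi$ with $k$ clauses, build a graph $G$ and ask whether it has a $k$-clique.

Nodes are literals: one node for each literal occurrence in each clause.

Edges. $G$ has all non-forbidden edges: no edge joins two nodes from the same clause, and no edge joins contradictory literals ( $x$ and $\bar{x}$ ).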
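The reduction can be sketched in Python as follows. The encoding is an assumption for this example: clauses are 3-tuples of nonzero integers with `-v` for the negation of `v`, one node is created per literal occurrence, and every pair of nodes is joined unless they share a clause or are contradictory literals. The brute-force clique test is exponential and is only there to check tiny instances.

```python
from itertools import combinations

def three_sat_to_clique(clauses):
    """Return (nodes, edges, k): the formula is satisfiable
    iff the graph has a k-clique, where k = number of clauses."""
    # A node is (clause index, position in clause, literal).
    nodes = [(i, j, lit)
             for i, clause in enumerate(clauses)
             for j, lit in enumerate(clause)]
    edges = {
        frozenset((u, v))
        for u, v in combinations(nodes, 2)
        if u[0] != v[0]       # not in the same clause
        and u[2] != -v[2]     # not contradictory literals
    }
    return nodes, edges, len(clauses)

def has_clique(nodes, edges, k):
    """Brute force over all k-subsets; for tiny instances only."""
    return any(
        all(frozenset(pair) in edges for pair in combinations(group, 2))
        for group in combinations(nodes, k)
    )

# (x1 OR x1 OR x2) AND (NOT x1 OR NOT x2 OR NOT x2) AND (NOT x1 OR x2 OR x2)
phi = [(1, 1, 2), (-1, -2, -2), (-1, 2, 2)]
print(has_clique(*three_sat_to_clique(phi)))    # True

# (x1 OR x1 OR x1) AND (NOT x1 OR NOT x1 OR NOT x1)
unsat = [(1, 1, 1), (-1, -1, -1)]
print(has_clique(*three_sat_to_clique(unsat)))  # False
```

A $k$-clique picks one literal per clause with no contradictions, which is exactly a consistent way to make every clause true.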

Algorithms

WalkSAT

Cook-Levin Theorem

References

https://www.claymath.org/sites/default/files/pvsnp.pdf

References

https://www.cs.toronto.edu/~sacook/math_teachers.pdf

FAQ

If the data an algorithm must handle has size $S(n)$ , must its running time $T(n)$ be at least that size, formally $S(n)=O(T(n))$ ? For example, a string of length $n$ has $S(n)=n!$ permutations, so an algorithm that checks all of them satisfies $n!=O(T(n))$ .

No. Sometimes you can write an algorithm with $T(n)=O(S(n))$ .

Graph grammars

Introduction to graph grammars with applications to semantic networks

P system

The Rough P System: Simulation of Logic Gates and Basic DB Tasks using Rough P System

TODO

“We tend to forget that every problem we solve is a special case of some recursively unsolvable problem!” Knuth, D. E. (1973). The dangers of computer-science theory. In Studies in Logic and the Foundations of Mathematics (Vol. 74, pp. 189-195). Elsevier.

[21] D.E. Knuth, On the translation of languages from left to right, Information and Control 8 (1965) 607–639.

[22] D.E. Knuth, A characterization of parenthesis languages, Information and Control 11 (1967) 269–289.

Rice’s theorem