# Theory of Computation

Tags | Computer scienceMathScienceautomata theory |
---|---|

Created | |

Updated |

# Requisites

# Resources

#### Foundations

#### History

Name | Tags |
---|---|

Machines, Languages, and Computation at MIT | Peter J. DenningTheoretical Models of Computationanecdotes |

Lectures on computation | Richard P. Feynman |

#### References

# Introduction

## What is the Theory of computation?

mathematical cybernetics

## Why does the Theory of computation matter to you?

## Research

## Ecosystem

Standards, jobs, industry, roles, …

## Story

## FAQ

## Worked examples

# Chapter Name

Prove

## Notes

## Worked examples

- Mathematics or Code. Automatic Verification such as Testing or Lean Proven?

- Languages in Anki.

## FAQ

## Further resources

# Strings and Languages

## Definitions.

**Alphabet.** Any non-empty finite set. We generally use either capital Greek letters $\Sigma ,\Gamma$ , or uppercase letters $A,B,...$ to designate alphabets.

**Symbol.** The members of an Alphabet.

Examples: $\Sigma_1=\{0,1\};\Sigma_2=\{a,b,c,d,e,f,g,h\};\Gamma=\{0,1,x,y,z\}$

**String over an alphabet.** A finite sequence of symbols from that alphabet.** **

**Length of a string. **If $w$ is a string over $\Sigma$, the length of $w$, written $|w|$, is the number of symbols that it contains. Some authors prefer to use $n_a(w)$ for indicating some $a$ characteristic applied on $w$.

**Empty string. ****$|w|=\epsilon=\lambda=0$**

**Language.** Set of strings. We generally use upper-case letters $A,B,...$ to designate languages.

**Lexicographic ordering.**

1. Shorter strings precede longer strings.

2. The first symbol in the alphabet precedes the next symbol.

## Regular operations.

Let $A,B$ be languages:

**Union:** $A\cup B=\{w|w\in A\text{ or }w\in B\}$

**Concatenation: ****$A\cdot B=\{xy|x\in A\text{ or }y\in B\}=AB$**

$A^k=A \overset{\text{k times}}{ A...A},A^1=A,A^0=\{\epsilon\}$

**Kleene star: ****$A^*=\{x_1...x_k|\text{ each }x_i\in A \text{ for } k\ge 0\}=A^0\cup A^1\cup A^2 \cup A^3 \cup...$****, **note $\epsilon\in A^*$ always.

**Kleene plus: ****$A^+=A^1\cup A^2\cup A^3\cup...=V^*V$**

Example. Let $A=\{good,bad\}$ and $B=\{boy,girl\}$

$A\cup B=\{good,bad,boy,girl\}$

$AB=\{goodboy,goodgirl,badboy,badgirl\}$

$A^2=\{goodgood,goodbad,badgood,badbad\}$

$A^*=\{\epsilon,good,bad,goodgood,bgoodbad,badgood,badbad,goodgood,good,goodgoodbad,...\}$

## Operator precedence

Operator | Precedence |

() | 6 |

*+? | 5 |

concatenation | 3 |

| | 2 |

$a\in Sigma$ | 1 |

Examples:

ax|b → (ax)|b

ab*|c → (a(b*))|c

ab?* → a((b?)*)

ab*? → a((b*)?)

Note: L(a((b?)*)) = L(a((b*)?))

If two operators have the same precedence, then they generate the same language. ??

## Worked examples

**Classification**

**Acceptors**

**Transducers**

**Generators**(also called s**equencers**)

https://static.googleusercontent.com/media/research.google.com/en//pubs/archive/40342.pdf

## Worked examples

**Steps to convert regular expressions directly to regular grammars and vice versa**

Free Grammars. LL.

Deterministic context-free languages.

# FAQ

**Are axioms in math equivalent to production rules in unrestricted grammars?**

how could a CFG hold an unrestricted grammar (UG), which is turing complete?

https://stackoverflow.com/questions/61952877/chomsky-hierarchy-examples-with-real-languages?rq=1

## Notes and references

[1] M. Johnson, “Formal Grammars Handout written,” 2012 [Online]. Available: https://web.stanford.edu/class/archive/cs/cs143/cs143.1128/handouts/080 Formal Grammars.pdf. [Accessed: 28-Feb-2022]

# Finite state machine

Representations.

Moore and Mealy. Responses are all kinds of values.

DFA and NFA. Responses are 0 or 1.

Programming.

## Worked examples

- Make a program in C++ that simple emails by argument (Hint: use a library).

# Moore and Mealy automata

## Boolean Algebra and Digital Logic. Circuit conversion. From Moore to Circuit.

# Finite automata

Finite automata and their probabilistic counterpart **Markov chains**. Finite Automaton (FA) or Finite-State Machine. An automaton (Automata in plural).

- The computer is complex, instead, we use an idealized computer called a computational model.

- A finite automaton:

We feed the input string $01$ to the machine $M_1$, the processing proceeds as follows:

- Start in state $q_1$.

- Read 1, follow the transition from $q_1$ to $q_2$.

- Read 1, follow the transition from $q_2$ to $q_1$.

- Accept because $M_1$ is in an accept state $q_2$ at the end of the input.

## The formal definition of a deterministic finite automaton (DFA)

A finite automaton is a 5-tuple $(Q,\Sigma,\delta,q_0,F)$ where

- $Q$ is a finite set called the
**states**,

- $\Sigma$ is a finite set called the
**alphabet**,

- $\delta:Q\times \Sigma\rarr Q$ is the
**transition function**,

- $q_0 \in Q$ is the
**start state**, and

- $F\subseteq Q$ is the
**set accept states,**they are sometimes called**final states.**

## The formal definition of a nondeterministic finite automaton (NFA)

A finite automaton is a 5-tuple $(Q,\Sigma,\delta,q_0,F)$ where

- $Q$ is a finite set called the
**states**,

- $\Sigma$ is a finite set called the
**alphabet**,

- $\delta:Q\times (\Sigma\cup\{\lambda\})\rarr P(Q)=\{R|R \subseteq Q\}$ is the
**transition function**(P is a power set),

- $q_0 \in Q$ is the
**start state**, and

- $F\subseteq Q$ is the
**set accept states,**they are sometimes called**final states.**

## Converting NFAs to DFAs

```
delta = {
'state1': {
'symbol1': {'state1', 'state2',...}
}
}
# BFS
NFAToDFA(NFA)
```

## FAQ

**What is the difference between finite automata and finite state machines?**

Pattern Matching

https://www.seas.upenn.edu/~lee/10cis541/lecs/EFSM-Code-Generation-v4-1x2.pdf

# Automata and Grammar

# Regular expressions. Regex.

Regular expressions, in short regex, are expressions that generate regular languages.

Although some modern regex engines support regular expressions, they support nonregular languages (Regular Expression Recursion**) **too. It can be misleading, but we use the above definition forward.

POSIX regular expressions. BRE (Basic Regular Expressions), ERE (Extended Regular Expressions), and SRE

Flavors. **Perl and PCRE.**

https://gist.github.com/CMCDragonkai/6c933f4a7d713ef712145c5eb94a1816

Regex engine.

perl vs awk

Accepters are equivalent to regular expressions

Are traducers equivalent to regular expressions?

## Worked example

## Let a regular expression $a^*b^*z^*$, make a new regular expression that removes $\lambda$ from generated language.

XYZ|XZ|YZ|XZ|X|Y|Z

$\sum_i^n C(n,i)=2^n-1$

## Make an expression on PCRE 4.0 or above that checks if a string is a palindrome.

## Tools

AWK

Roman numbers: https://wiki.imperivm-romanvm.com/wiki/Roman_Numerals

## Make a regular expression to capture roman numbers.

## Make a program that converts roman numbers to Arabia numbers using regular expressions.

https://github.com/sanchezcarlosjr/compilers-uabc/blob/main/lexer-awk/roman_to_arabig.awk

## RegEq

Check equivalence of regular expressions

https://bakkot.github.io/dfa-lib/regeq.html

https://regex-equality.herokuapp.com/

Goyvaerts, J. (2022, December 02). Regular Expressions Reference. Retrieved from https://www.regular-expressions.info/reference.html

# TODO

“We tend to forget that every problem we solve is a special case of some recursively unsolvable problem!” Knuth, D. E. (1973). The dangers of computer-science theory. In *Studies in Logic and the Foundations of Mathematics* (Vol. 74, pp. 189-195). Elsevier.

21] D.E. Knuth, On the translation of languages from left to right, Information and Control 8 (1965) 607–639.

[22] D.E. Knuth, A characterization of parenthesis languages, Information and Control 11 (1967) 269–289.

**Rice’s theorem**

# Grammar

We describe the syntax of language as grammar but not *semantics*.

## Definition

A grammar is a 4-tuple $(V,T,S,P) ,$

where $V$ is a finite set of objects called **variables **or nonterminals,

$T$ is a finite set of objects, disjoint from $V$, called **terminal symbols**, in short terminals, and sometimes tokens because it is a terminal identifier.

$S\in V$ is a special symbol called the **start variable**,

$P$ is a finite set of **rules, **called** production rules**, with each rule being a variable a string of variables and terminals.

If production rules are of the form

where $x \in (V \cup T)^+$, $y \in (V \cup T)^*$. Some authors call x and y Head and Body respectively. Others call x and y left-hand side (LHS) and right-hand side (RHS).

The production rules are applied in the following manner: Given a string $w_1=uxv$

Rule | Application | Result |

$x \to y$ | $w_1=u\bold{x}v$ | $w_2=u\bold{y}v$ |

We say that $w_1$ **derives** $w_2$ o that $w_2$ is derived from $w_1$.

A rule can be used whenever it’s applicable.

If $w_1\to w_2\to ...\to w_n,$

we say that $w_1$ derives $w_n$ and writes

$w_1 \to^* w_n$. The * indicates an unspecified number of steps (including zero).

Informally, grammar consists of terminals, nonterminals, a start symbol, and productions.

- Terminals.

```
classDiagram
class TerminalSymbol
TerminalSymbol : token // id or name
TerminalSymbol : lexeme // regular expression
```

- Nonterminals are syntactic variables that can be replaced, so they denote a set of strings.

- Start symbol.

- Productions. We may follow a notation for production rules, which is a particular form.

**BNF. **A particular form of notation for grammar (Backus-Naur form).

**CNF. **A particular form of notation for grammar (Chomsky normal form).

## Bottom up

```
graph LR
subgraph scanning
W -. find the body X that match w=uXv .-> Body
end
subgraph substitution
Body -. substitute body X to head Y, w=uYv.-> Head
Head -- search if it is not StartSymbol --> W
end
```

Backtracking.

## Chomsky normal form (CNF) example

$V=\{P,S,M\}$, $T=\{+,*,1,2,3,4\}$, $S\in V$

**Backus–Naur form (BNF) example**

$V=\{P,S,M\}$, $T=\{+,*,1,2,3,4\}$, $S\in V$

## Derivations

## Parse trees and derivations

## Ambiguity

## Writing a grammar

## Verifying the language generated by a grammar

The **language generated by ****$G$****. **Let $G=(V,T,S,P)
,$ be a grammar. Then the set

is the language developed by G.

## Grammar Hierarchy

### Set view

### Table view

Grammar type | Grammar accepted | Language Accepted | Automaton | Production rule form |

Type 0. These are the most general. | Free or unrestricted grammar | Recursively enumerable language | Turing Machine | $u\to v$, where both u and v are arbitrary strings of symbols in V, with u non-null. |

Type 1 | Context-sensitive grammars | Context-sensitive language | Linear-bounded automaton | $uXw\to uvw,$ where $u$,$v$, and $w$ are arbitrary strings of symbols in $V$, with $v$ non-null, and $X$ a single nonterminal. In other words, in a particular context. |

Type 2 | Context-free grammars | Context-free language | Pushdown automaton | $X\to v$, where $v$ is an arbitrary string of symbols in $V$, and $X$ is a single nonterminal. i.e. regardless of context. |

Type 3. These grammars are the most limited in terms of expressive power. | Regular grammars | Regular language | Finite-state automaton | $X \to a$, $X \to aY$ (right-regular grammar), $X \to Ya$ (left-regular grammar), $X \to \epsilon$, where $X$ and $Y$ are nonterminals and $a$ is a terminal. Regular grammar is either right-linear grammar or left-linear grammar, but not both. Mixing left-regular and right-regular rules are the context-free grammar. |

https://www.cs.utexas.edu/users/novak/cs343283.html

## Augmented Grammar

## Definite clause grammars

# Context-free grammars

TODO: Context-Free Grammars Versus Regular expressions

## Formal definition

## Classification

## Notional conventions

## Derivations

## Ambiguity

- Conversión de gramáticas ambiguas a no ambiguas.

- Estrategias para el manejo de la ambigüedad en los lenguajes.

- Problemas equivalentes a determinar si una gramática es ambigua.

- Complejidad de determinar si una gramática es ambigua.

## Writing a grammar

### Eliminating Ambiguity

### Elimination of Left Recursion

A left recursive grammar has a nonterminal $A$ such that there is a

derivation for some string $a$

Top-down parsing methods cannot handle left-recursive, so

**Method.**

Given a grammar, you replace each left-recursive pair of productions $A\to Aa|\beta$

by

**Example.**

We replace recursive pair of productions following the method by

**General Algorithm.**

**References**

Elimination of Left Recursion. (2021, October 31). Retrieved from https://cyberzhg.github.io/toolbox/left_rec

### Left Factoring

### Non-Context-Free Language Context

## Worked examples

# Models of computation

https://gist.github.com/sanchezcarlosjr/333b9a774c0fa57f55532d772b13c946

https://cs.brown.edu/people/jsavage/book/pdfs/ModelsOfComputation_Chapter4.pdf

## Model of computation

Automaton (Mathematics model) | Paradigm |

Turing Machine | Imperative (procedural) programming |

Lambda-calculus | Functional programming |

Pi-calculus | Event-based and distributed programming |

OOP | |

Logic |

# Regular expressions to Automata. Generalized nondeterministic finite automaton. Arden’s Rule.

# Regular expressions to Automata

- Generalized nondeterministic finite automaton.

```
def convert_to_regular_rxpression(G):
if G.number_of_states == 2:
return
```

- Solving right-linear equation by Arden’s rule.

https://assets.press.princeton.edu/chapters/i11348.pdf

https://www.gwern.net/docs/math/1973-knuth.pdf