 ⚛️

# Theory of Computation

Tags Computer scienceMathScienceautomata theory @September 4, 2022 8:36 AM @December 9, 2022 11:47 AM

# Resources

#### History

NameTags
Machines, Languages, and Computation at MITPeter J. DenningTheoretical Models of Computationanecdotes
Lectures on computationRichard P. Feynman

# Introduction

## What is the Theory of computation?

mathematical cybernetics

ACM SIGACT

## Ecosystem

Standards, jobs, industry, roles, …

# Chapter Name

https://kroki.io/#try

Prove

## Worked examples

1. Mathematics or Code. Automatic Verification such as Testing or Lean Proven?
1. Languages in Anki.

# Strings and Languages

## Definitions.

Alphabet. Any non-empty finite set. We generally use either capital Greek letters $\Sigma ,\Gamma$﻿ , or uppercase letters $A,B,...$﻿ to designate alphabets.

Symbol. The members of an Alphabet.

Examples: $\Sigma_1=\{0,1\};\Sigma_2=\{a,b,c,d,e,f,g,h\};\Gamma=\{0,1,x,y,z\}$﻿

String over an alphabet. A finite sequence of symbols from that alphabet.

Length of a string. If $w$﻿ is a string over $\Sigma$﻿, the length of $w$﻿, written $|w|$﻿, is the number of symbols that it contains. Some authors prefer to use $n_a(w)$﻿ for indicating some $a$﻿ characteristic applied on $w$﻿.

Empty string. $|w|=\epsilon=\lambda=0$﻿

Language. Set of strings. We generally use upper-case letters $A,B,...$﻿ to designate languages.

Lexicographic ordering.

1. Shorter strings precede longer strings.

2. The first symbol in the alphabet precedes the next symbol.

## Regular operations.

Let $A,B$﻿ be languages:

Union: $A\cup B=\{w|w\in A\text{ or }w\in B\}$﻿

Concatenation: $A\cdot B=\{xy|x\in A\text{ or }y\in B\}=AB$﻿

$A^k=A \overset{\text{k times}}{ A...A},A^1=A,A^0=\{\epsilon\}$﻿

Kleene star: $A^*=\{x_1...x_k|\text{ each }x_i\in A \text{ for } k\ge 0\}=A^0\cup A^1\cup A^2 \cup A^3 \cup...$﻿, note $\epsilon\in A^*$﻿ always.

Kleene plus: $A^+=A^1\cup A^2\cup A^3\cup...=V^*V$﻿

Example. Let $A=\{good,bad\}$﻿ and $B=\{boy,girl\}$﻿

$A\cup B=\{good,bad,boy,girl\}$﻿

$AB=\{goodboy,goodgirl,badboy,badgirl\}$﻿

$A^2=\{goodgood,goodbad,badgood,badbad\}$﻿

$A^*=\{\epsilon,good,bad,goodgood,bgoodbad,badgood,badbad,goodgood,good,goodgoodbad,...\}$﻿

## Operator precedence

 Operator Precedence () 6 *+? 5 concatenation 3 | 2 @import url('https://cdnjs.cloudflare.com/ajax/libs/KaTeX/0.13.2/katex.min.css')$a\in Sigma$﻿ 1

Examples:

ax|b → (ax)|b

ab*|c → (a(b*))|c

ab?* → a((b?)*)

ab*? → a((b*)?)

Note: L(a((b?)*)) = L(a((b*)?))

If two operators have the same precedence, then they generate the same language. ??

https://www.oreilly.com/library/view/learning-awk-programming/9781788391030/ceafcbc7-fe9e-4c33-b899-ed2f48e7e659.xhtml

https://www.boost.org/doc/libs/1_56_0/libs/regex/doc/html/boost_regex/syntax/basic_extended.html#boost_regex.syntax.basic_extended.operator_precedence

Homework

# Classification

• Acceptors
• Transducers
• Generators (also called sequencers)

## Steps to convert regular expressions directly to regular grammars and vice versa

https://cs.stackexchange.com/questions/68575/steps-to-convert-regular-expressions-directly-to-regular-grammars-and-vice-versa

Free Grammars. LL.

Deterministic context-free languages.

# FAQ

Are axioms in math equivalent to production rules in unrestricted grammars?

how could a CFG hold an unrestricted grammar (UG), which is turing complete?

https://stackoverflow.com/questions/61952877/chomsky-hierarchy-examples-with-real-languages?rq=1

## Notes and references

 M. Johnson, “Formal Grammars Handout written,” 2012 [Online]. Available: https://web.stanford.edu/class/archive/cs/cs143/cs143.1128/handouts/080 Formal Grammars.pdf. [Accessed: 28-Feb-2022]

# Finite state machine

Representations.

Moore and Mealy. Responses are all kinds of values.

DFA and NFA. Responses are 0 or 1.

Programming.

## Worked examples

1. Make a program in C++ that simple emails by argument (Hint: use a library).

# Finite automata

Finite automata and their probabilistic counterpart Markov chains. Finite Automaton (FA) or Finite-State Machine. An automaton (Automata in plural).

• The computer is complex, instead, we use an idealized computer called a computational model.
• A finite automaton: State diagram of a finite automaton M1. It has two states.

We feed the input string $01$﻿ to the machine $M_1$﻿, the processing proceeds as follows:

1. Start in state $q_1$﻿.
1. Read 1, follow the transition from $q_1$﻿ to $q_2$﻿.
1. Read 1, follow the transition from $q_2$﻿ to $q_1$﻿.
1. Accept because $M_1$﻿ is in an accept state $q_2$﻿ at the end of the input.

## The formal definition of a deterministic finite automaton (DFA)

A finite automaton is a 5-tuple $(Q,\Sigma,\delta,q_0,F)$﻿ where

1. $Q$﻿ is a finite set called the states,
1. $\Sigma$﻿ is a finite set called the alphabet,
1. $\delta:Q\times \Sigma\rarr Q$﻿ is the transition function,
1. $q_0 \in Q$﻿ is the start state, and
1. $F\subseteq Q$﻿ is the set accept states, they are sometimes called final states.

## The formal definition of a nondeterministic finite automaton (NFA)

A finite automaton is a 5-tuple $(Q,\Sigma,\delta,q_0,F)$﻿ where

1. $Q$﻿ is a finite set called the states,
1. $\Sigma$﻿ is a finite set called the alphabet,
1. $\delta:Q\times (\Sigma\cup\{\lambda\})\rarr P(Q)=\{R|R \subseteq Q\}$﻿ is the transition function (P is a power set),
1. $q_0 \in Q$﻿ is the start state, and
1. $F\subseteq Q$﻿ is the set accept states, they are sometimes called final states.

## Converting NFAs to DFAs

delta = {
'state1': {
'symbol1': {'state1', 'state2',...}
}
}
# BFS
NFAToDFA(NFA)



## FAQ

What is the difference between finite automata and finite state machines?

Pattern Matching

Extended finite-state machine

https://www.seas.upenn.edu/~lee/10cis541/lecs/EFSM-Code-Generation-v4-1x2.pdf

# Regular expressions. Regex.

Regular expressions, in short regex, are expressions that generate regular languages.

Although some modern regex engines support regular expressions, they support nonregular languages (Regular Expression Recursion) too. It can be misleading, but we use the above definition forward.

POSIX regular expressions. BRE (Basic Regular Expressions), ERE (Extended Regular Expressions), and SRE

Flavors. Perl and PCRE.

https://gist.github.com/CMCDragonkai/6c933f4a7d713ef712145c5eb94a1816

Regex engine.

perl vs awk

https://unix.stackexchange.com/questions/114591/is-perl-ever-a-better-tool-than-awk-for-text-processing

Accepters are equivalent to regular expressions

Are traducers equivalent to regular expressions?

## Worked example

• Let a regular expression $a^*b^*z^*$﻿, make a new regular expression that removes $\lambda$﻿ from generated language.

XYZ|XZ|YZ|XZ|X|Y|Z

$\sum_i^n C(n,i)=2^n-1$﻿

• Make an expression on PCRE 4.0 or above that checks if a string is a palindrome.

## Tools

https://regex101.com/

https://regexr.com/

AWK

Roman numbers: https://wiki.imperivm-romanvm.com/wiki/Roman_Numerals

## RegEq

Check equivalence of regular expressions

https://bakkot.github.io/dfa-lib/regeq.html

https://regex-equality.herokuapp.com/

# TODO

“We tend to forget that every problem we solve is a special case of some recursively unsolvable problem!” Knuth, D. E. (1973). The dangers of computer-science theory. In Studies in Logic and the Foundations of Mathematics (Vol. 74, pp. 189-195). Elsevier.

21] D.E. Knuth, On the translation of languages from left to right, Information and Control 8 (1965) 607–639.

 D.E. Knuth, A characterization of parenthesis languages, Information and Control 11 (1967) 269–289.

Rice’s theorem

# Grammar

We describe the syntax of language as grammar but not semantics.

## Definition

A grammar is a 4-tuple $(V,T,S,P) ,$﻿

where $V$﻿ is a finite set of objects called variables or nonterminals,

$T$﻿ is a finite set of objects, disjoint from $V$﻿, called terminal symbols, in short terminals, and sometimes tokens because it is a terminal identifier.

$S\in V$﻿ is a special symbol called the start variable,

$P$﻿ is a finite set of rules, called production rules, with each rule being a variable a string of variables and terminals.

If production rules are of the form

where $x \in (V \cup T)^+$﻿, $y \in (V \cup T)^*$﻿. Some authors call x and y Head and Body respectively. Others call x and y left-hand side (LHS) and right-hand side (RHS).

The production rules are applied in the following manner: Given a string $w_1=uxv$﻿

 Rule Application Result @import url('https://cdnjs.cloudflare.com/ajax/libs/KaTeX/0.13.2/katex.min.css')$x \to y$﻿ @import url('https://cdnjs.cloudflare.com/ajax/libs/KaTeX/0.13.2/katex.min.css')$w_1=u\bold{x}v$﻿ @import url('https://cdnjs.cloudflare.com/ajax/libs/KaTeX/0.13.2/katex.min.css')$w_2=u\bold{y}v$﻿

We say that $w_1$﻿ derives $w_2$﻿ o that $w_2$﻿ is derived from $w_1$﻿.

A rule can be used whenever it’s applicable.

If $w_1\to w_2\to ...\to w_n,$﻿

we say that $w_1$﻿ derives $w_n$﻿ and writes

$w_1 \to^* w_n$﻿. The * indicates an unspecified number of steps (including zero).

Informally, grammar consists of terminals, nonterminals, a start symbol, and productions.

1. Terminals.
classDiagram
class TerminalSymbol
TerminalSymbol : token // id or name
TerminalSymbol : lexeme // regular expression
1. Nonterminals are syntactic variables that can be replaced, so they denote a set of strings.
1. Start symbol.
1. Productions. We may follow a notation for production rules, which is a particular form.

BNF. A particular form of notation for grammar (Backus-Naur form).

CNF. A particular form of notation for grammar (Chomsky normal form).

## Bottom up

graph LR
subgraph scanning
W -. find the body X that match w=uXv .-> Body
end
subgraph substitution
Head -- search if it is not StartSymbol --> W
end


Backtracking.

## Chomsky normal form (CNF) example

$V=\{P,S,M\}$﻿, $T=\{+,*,1,2,3,4\}$﻿, $S\in V$﻿

## Backus–Naur form (BNF) example

$V=\{P,S,M\}$﻿, $T=\{+,*,1,2,3,4\}$﻿, $S\in V$﻿

## Verifying the language generated by a grammar

The language generated by $G$﻿. Let $G=(V,T,S,P) ,$﻿ be a grammar. Then the set

is the language developed by G.

## Grammar Hierarchy

### Table view

 Grammar type Grammar accepted Language Accepted Automaton Production rule form Type 0. These are the most general. Free or unrestricted grammar Recursively enumerable language Turing Machine @import url('https://cdnjs.cloudflare.com/ajax/libs/KaTeX/0.13.2/katex.min.css')$u\to v$﻿, where both u and v are arbitrary strings of symbols in V, with u non-null. Type 1 Context-sensitive grammars Context-sensitive language Linear-bounded automaton @import url('https://cdnjs.cloudflare.com/ajax/libs/KaTeX/0.13.2/katex.min.css')$uXw\to uvw,$﻿ where @import url('https://cdnjs.cloudflare.com/ajax/libs/KaTeX/0.13.2/katex.min.css')$u$﻿,@import url('https://cdnjs.cloudflare.com/ajax/libs/KaTeX/0.13.2/katex.min.css')$v$﻿, and @import url('https://cdnjs.cloudflare.com/ajax/libs/KaTeX/0.13.2/katex.min.css')$w$﻿ are arbitrary strings of symbols in @import url('https://cdnjs.cloudflare.com/ajax/libs/KaTeX/0.13.2/katex.min.css')$V$﻿, with @import url('https://cdnjs.cloudflare.com/ajax/libs/KaTeX/0.13.2/katex.min.css')$v$﻿ non-null, and @import url('https://cdnjs.cloudflare.com/ajax/libs/KaTeX/0.13.2/katex.min.css')$X$﻿ a single nonterminal. In other words, in a particular context. Type 2 Context-free grammars Context-free language Pushdown automaton @import url('https://cdnjs.cloudflare.com/ajax/libs/KaTeX/0.13.2/katex.min.css')$X\to v$﻿, where @import url('https://cdnjs.cloudflare.com/ajax/libs/KaTeX/0.13.2/katex.min.css')$v$﻿ is an arbitrary string of symbols in @import url('https://cdnjs.cloudflare.com/ajax/libs/KaTeX/0.13.2/katex.min.css')$V$﻿, and @import url('https://cdnjs.cloudflare.com/ajax/libs/KaTeX/0.13.2/katex.min.css')$X$﻿ is a single nonterminal. i.e. regardless of context. Type 3. These grammars are the most limited in terms of expressive power. Regular grammars Regular language Finite-state automaton @import url('https://cdnjs.cloudflare.com/ajax/libs/KaTeX/0.13.2/katex.min.css')$X \to a$﻿, @import url('https://cdnjs.cloudflare.com/ajax/libs/KaTeX/0.13.2/katex.min.css')$X \to aY$﻿ (right-regular grammar), @import url('https://cdnjs.cloudflare.com/ajax/libs/KaTeX/0.13.2/katex.min.css')$X \to Ya$﻿ (left-regular grammar), @import url('https://cdnjs.cloudflare.com/ajax/libs/KaTeX/0.13.2/katex.min.css')$X \to \epsilon$﻿, where @import url('https://cdnjs.cloudflare.com/ajax/libs/KaTeX/0.13.2/katex.min.css')$X$﻿ and @import url('https://cdnjs.cloudflare.com/ajax/libs/KaTeX/0.13.2/katex.min.css')$Y$﻿ are nonterminals and @import url('https://cdnjs.cloudflare.com/ajax/libs/KaTeX/0.13.2/katex.min.css')$a$﻿ is a terminal. Regular grammar is either right-linear grammar or left-linear grammar, but not both. Mixing left-regular and right-regular rules are the context-free grammar.

https://www.cs.utexas.edu/users/novak/cs343283.html

# Context-free grammars

TODO: Context-Free Grammars Versus Regular expressions

## Ambiguity

1. Conversión de gramáticas ambiguas a no ambiguas.
1. Estrategias para el manejo de la ambigüedad en los lenguajes.
1. Problemas equivalentes a determinar si una gramática es ambigua.
1. Complejidad de determinar si una gramática es ambigua.

## Writing a grammar

### Elimination of Left Recursion

A left recursive grammar has a nonterminal $A$﻿ such that there is a

derivation for some string $a$﻿

Top-down parsing methods cannot handle left-recursive, so

Method.

Given a grammar, you replace each left-recursive pair of productions $A\to Aa|\beta$﻿

by

Example.

We replace recursive pair of productions following the method by

General Algorithm.

References

# Models of computation

https://gist.github.com/sanchezcarlosjr/333b9a774c0fa57f55532d772b13c946

https://cs.brown.edu/people/jsavage/book/pdfs/ModelsOfComputation_Chapter4.pdf

## Model of computation

 Automaton (Mathematics model) Paradigm Turing Machine Imperative (procedural) programming Lambda-calculus Functional programming Pi-calculus Event-based and distributed programming OOP Logic

# Regular expressions to Automata

1. Generalized nondeterministic finite automaton.
def convert_to_regular_rxpression(G):
if G.number_of_states == 2:
return 
1. Solving right-linear equation by Arden’s rule.

https://assets.press.princeton.edu/chapters/i11348.pdf

https://www.gwern.net/docs/math/1973-knuth.pdf