
Artificial Intelligence

compressive sensing (sparse coding),

information theory,

control theory,



operations research,

game theory,

and optimization.


What is AI?

We struggle to define what "intelligent" means, not what "artificial" means.


We use rational agents as our approach. An agent is an entity that can perceive an environment X and can act on it where X can be virtual or physical. How an agent decides to act given all previous considerations is a black box.

So, intelligence or rationality is when an agent makes the best decision in a given environment and goal with constraints and acts in accordance with it. Therefore, we evaluate an agent by its results, not by its mental state.

If an agent makes the best decision in all environments and goals, it is called artificial general intelligence.

If an agent makes the best decision in a specific environment, it is called weak AI.

From a scientific point of view, the goal is to understand complete AI; from an engineering point of view, the goal is to build incomplete, imperfect, weak artificial intelligence.

Some constraints are lack of knowledge, time to learn, time to execute, money, actuators, and sensors.

But, now black boxes become white boxes. If you are here, then you are interested in white boxes which are computational procedures. You’re going to learn them!

The goals of AI
Agents. How can we create intelligence?
Tools. How can we use AI techniques to solve problems?

Intelligent agents


How can we make systems that behave as humans?
Are we there yet?
Today, machines do narrow tasks; humans do broad tasks.

AI agents:
Achieving human-level intelligence.

AI tools:
Understanding and solving real problems.
Predicting poverty.
Self-driving cars.







Computing Machinery and Intelligence.

The Imitation Game (Turing test).

Player A is a computer who claims to be a man.

Player B is a man.


Total Turing Test.

Loebner Prize.

The Argument from Extrasensory Perception. During the Cold War, people were interested in extrasensory perception, viz. telepathy and precognition, so Turing prepared an argument for that situation.







State-based models: search problems, MDPs, games

Variable-based models: CSPs, Bayesian networks

Logic-based models: propositional logic, first-order logic

Related work

Cognitive science

Philosophy of mind

Worked examples


I learned in Theory of Computation that some problems are undecidable, yet I see those problems tackled with Artificial Intelligence. How is that?

AI Framework.

Intelligent Agents. Rational agents.


Agent class

Agent = Architecture + Program.
Artificial intelligence is concerned with the program, an autonomous computational program.
graph TD
  subgraph Architecture
    Information --> Program
    Program --> Decision
  end

Worked examples

Further readings

Economic agents.


Autonomic computing

Solving problems by Searching.

Searching problems are optimization problems.
graph TD
  S["S, environment"] -->|"cost(S,action1, A)"| A["A, new environment"]
  S -->|"cost(S,action2, B)"| B["B, new environment"]
  S -->|"cost(S,action3, C)"| C["C, new environment"]
  subgraph possible_solutions3
    subgraph possible_solutions2
      subgraph possible_solutions1
        D --> Goal["Goal, new environment == goal"]
      end
    end
  end

Searching problem model

    class SearchProblem {
      action_cost(state, action, next_state)
      next_state(state, action)
    }
    class State {
    }
    class SearchingStrategy {
    }
    class Agent {
      searchingStrategy: SearchingStrategy
      problem: SearchProblem
    }
    Agent *-- SearchingStrategy
    Agent *-- SearchProblem
    SearchingStrategy <|-- BFS
    SearchingStrategy <|-- DFS
    SearchingStrategy <|-- AStar
    SearchingStrategy <|-- Dijkstra
    SearchingStrategy <|-- LinearProgramming

Searching strategies


Searching for solutions


The search strategy builds a tree or graph on demand.

We must avoid cycles to prevent infinite loops.

Each new state stores a reference to its previous state, and we only choose valid actions, so when the search algorithm reaches the goal it can reconstruct the path from the start state.
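The points above (build the graph on demand, avoid cycles, keep a reference to the previous state) can be sketched with breadth-first search. This is a minimal illustration, not the notes' own implementation: the function name `breadth_first_search`, the callbacks `is_goal` and `successors`, and the toy `GRAPH` are all assumptions made for the example.

```python
from collections import deque


def breadth_first_search(start, is_goal, successors):
    """Return the list of states from start to a goal, or None if unreachable."""
    parent = {start: None}      # previous-state references; doubles as the visited set
    frontier = deque([start])
    while frontier:
        state = frontier.popleft()
        if is_goal(state):
            # Walk the previous-state references back to the start.
            path = []
            while state is not None:
                path.append(state)
                state = parent[state]
            return path[::-1]
        for nxt in successors(state):
            if nxt not in parent:   # skip visited states, so cycles cannot loop forever
                parent[nxt] = state
                frontier.append(nxt)
    return None


# Hypothetical toy graph with a cycle between S and A.
GRAPH = {"S": ["A", "B"], "A": ["S", "C"], "B": ["C"], "C": ["Goal"], "Goal": []}
path = breadth_first_search("S", lambda s: s == "Goal", lambda s: GRAPH[s])
print(path)
```

States are only expanded when they are popped from the frontier, so the graph is never materialized beyond what the search actually visits.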

$S \to A \to \dots \to Goal$

Codification matters.

Uninformed search strategies



Informed Search Strategies

A*, Greedy Search, Hill Climbing, Simulated Annealing, Best-First Search

Heuristic Functions

A heuristic is a function that estimates the distance from the current state to the goal.

$f(n)$ is the real or estimated cost of the solution through $n$.

$g(n)$ is the cost to reach $n$ from the start state: $g(n) = \sum_{i=0}^{n-1} cost(i, action, i+1)$

$h(n)$ is the estimated cost to reach the goal state from state $n$, so it uses the available information from the problem or environment state in order to estimate the cost.

$h^*(n)$ is the real cost to reach the goal state $g$ from state $n$:

$h^*(n) = \sum_{i=n}^{g-1} cost(i, action, i+1)$

Note that $h(n) \to 0$ and $h^*(n) \to 0$ as $n$ approaches the goal.


Main idea: the estimated heuristic cost $\le$ the actual cost.


Admissibility: $0 \le h(x) \le \text{cost to goal}$


Consistency: $h(x) - h(y) \le cost(x, y)$



How do you find a heuristic function?

Relax your problems, use available information about the current state or the goal, use min, and max functions, or use some distance functions such as Manhattan distance, Euclidean distance, Hamming distance, … and norms.
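As an illustration of a relaxed-problem heuristic, the sketch below runs A* on a small grid using the Manhattan distance, which ignores walls and therefore never overestimates the true cost. The grid encoding (`#` for walls), the unit step cost, and the names `a_star` and `manhattan` are assumptions made for this example only.

```python
import heapq


def manhattan(a, b):
    """Manhattan distance between two (row, col) cells."""
    return abs(a[0] - b[0]) + abs(a[1] - b[1])


def a_star(grid, start, goal):
    """Return the cost of the cheapest 4-connected path, or None if there is none."""
    rows, cols = len(grid), len(grid[0])
    g = {start: 0}                                   # best known cost from start
    frontier = [(manhattan(start, goal), start)]     # ordered by f = g + h
    while frontier:
        f, node = heapq.heappop(frontier)
        if node == goal:
            return g[node]
        if f > g[node] + manhattan(node, goal):
            continue                                 # stale queue entry, a cheaper one exists
        r, c = node
        for nr, nc in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            if 0 <= nr < rows and 0 <= nc < cols and grid[nr][nc] != "#":
                new_g = g[node] + 1                  # every step costs 1 (assumption)
                if new_g < g.get((nr, nc), float("inf")):
                    g[(nr, nc)] = new_g
                    heapq.heappush(frontier, (new_g + manhattan((nr, nc), goal), (nr, nc)))
    return None


GRID = ["..#.",
        "..#.",
        "...."]
cost = a_star(GRID, (0, 0), (0, 3))
print(cost)
```

Setting the heuristic to zero everywhere would turn this into uniform-cost search; the Manhattan estimate only prunes work, it does not change the returned cost.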

Beyond Classical Search.

Heuristic. Safe steps.

Offline, Online.

Algorithms that solve problems by searching are graph algorithms: they generate new nodes using heuristics and safe steps, then test them, and wrong answers are rejected.

Hands-on Projects

8-Queen solver

Hanoi tower

Maze solver

Worked examples



How to solve it: Modern Heuristics by Zbigniew Michalewicz, David B. Fogel.

Adversarial Search




matchmaking algorithms

Hands-on Projects

Tic Tac Toe


Pacman v2

Notion – The all-in-one workspace for your notes, tasks, wikis, and databases. (2023, April 20). Retrieved from https://www.chessengines.org

Project 2. (2022, October 13). Retrieved from https://inst.eecs.berkeley.edu/~cs188/fa22/projects/proj2

Variable-based models with Factor Graphs

Now we embark on our journey through variable-based models, in which we will think in terms of variables, factors, and weights. In particular, in this section we explore factor graphs and their special cases: Constraint Satisfaction Problems (CSPs), Markov networks, and Bayesian networks. We no longer rely on searching through all possible solutions. Instead, we assign values to variables and, in this way, allow algorithms to infer the variable ordering, etc.

graph TD
    subgraph Variables
      X1
      X2
      X3
    end
    subgraph Factors
      f1
      f2
      f3
      f4
    end
    X1 --- f1
    X1 --- f2
    X2 --- f2
    X2 --- f3
    X3 --- f3
    X3 --- f4

Formal definition

Constraint satisfaction problems are defined by a set of variables $X_i$, each with a domain $D_i$ of possible values, and a set of constraints $C$. The aim is to find an assignment of the variables $X_i$ from the domains $D_i$ such that none of the constraints $C$ are violated. Informally, our goal is to find the best assignment of values to the variables.

Variables-based models

A constraint satisfaction problem consists of three components X, D, and C:

X is a set of variables

D is a set of domains

C is a set of constraints that specify allowable combinations of values
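The three components can be made concrete with a small backtracking solver. This is only a sketch under assumptions: the map-coloring instance, the helper `different`, and the representation of a constraint as a function over a partial assignment are illustrative choices, not something the notes prescribe.

```python
def backtrack(assignment, variables, domains, constraints):
    """Depth-first search over partial assignments; returns a full assignment or None."""
    if len(assignment) == len(variables):
        return assignment
    var = next(v for v in variables if v not in assignment)   # first unassigned variable
    for value in domains[var]:
        candidate = {**assignment, var: value}
        # Keep the value only if no constraint is violated so far.
        if all(check(candidate) for check in constraints):
            result = backtrack(candidate, variables, domains, constraints)
            if result is not None:
                return result
    return None


def different(a, b):
    """Constraint: a and b must take different values once both are assigned."""
    return lambda asg: a not in asg or b not in asg or asg[a] != asg[b]


X = ["WA", "NT", "SA"]                                  # variables
D = {v: ["red", "green", "blue"] for v in X}            # domains
C = [different("WA", "NT"), different("WA", "SA"), different("NT", "SA")]  # constraints
solution = backtrack({}, X, D, C)
print(solution)
```

With only two colors in every domain the same call returns `None`, because three mutually adjacent regions cannot be colored with two values.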


Each assignment x

Objective: $\arg\max_x Weight(x)$

The best-known category of continuous-domain CSPs is linear programming problems.

Message Passing

Exercises and Projects


Key decisions


Reference Notes




Knowledge, reasoning, and planning

Although traditional logical agents provide us expressiveness in a compact way, they are inherently deterministic and struggle to handle unstructured data. These systems follow predefined rules, making it difficult to manage uncertainty and ambiguity across diverse domains. Additionally, representing and processing unstructured data (e.g., text, images, time series, video) is challenging and often requires significant manual effort and expense. This lack of flexibility limits their ability to generalize across different domains as effectively as modern Deep Learning models.

Logical Agents

Knowledge-based agents

Different syntax, same semantics: $2+3 \iff 3+2$

Same syntax, different semantics: $3/2 \text{ (Python 2.7)} \not\iff 3/2 \text{ (Python 3)}$. In Python 2.7 the expression is integer division and yields 1; in Python 3 it yields 1.5.

A knowledge base is a set of sentences. Each sentence is an assertion about the world, given a representation language for a specific domain. Logic consists of syntax, semantics, and inference rules. The formulas by themselves are just symbols (syntax); they don’t provide meaning.

A knowledge-based agent is composed of a knowledge base which depends on domain-specific content and an inference mechanism. They can represent states, actions, and weights, incorporate new percepts, update internal representations of the world, and deduce hidden properties of the world.

Semantics is the interpretation function.

In a declarative approach to building a logical agent, we add new sentences to tell the agent what it needs to know, and we query it to ask what is known.

Natural Language?

We can save knowledge in different data models and apply different inference mechanisms, for example knowledge graphs.

Entailment. It adds trivial information to KB.


Contingency. It adds non-trivial information to KB.


Learning formulas.

A language needs syntax, semantics, and implementation level.

The syntax of a language defines a set of valid formulas.

Prolog, Relational databases, SQL, Datalog?

Intelligent agents need knowledge about the world to choose good actions.

A model or world $w$ in propositional logic is an assignment of truth values to propositional symbols.

Modeling and inference

Propositional logic with only Horn clauses

Propositional logic

Modal logic

First-order logic

Second-order logic

Tell[f] → KB

Possible responses:

Ask[f] → KB

Possible responses:

A knowledge base KB is satisfiable if $M(KB) \neq \emptyset$, that is, if at least one model makes every sentence in KB true.
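For propositional logic, the set of models $M(KB)$ can be enumerated directly, which gives a brute-force check for satisfiability and for how a new formula relates to the knowledge base. The sketch below assumes formulas are represented as Python functions from a world (a dict of truth values) to a boolean; the names `models`, `satisfiable`, and `tell_response`, and the rain/wet example, are hypothetical.

```python
from itertools import product


def models(formula, symbols):
    """Return M(formula): the set of truth assignments (as tuples) that satisfy it."""
    result = set()
    for values in product([False, True], repeat=len(symbols)):
        world = dict(zip(symbols, values))
        if formula(world):
            result.add(values)
    return result


def satisfiable(kb, symbols):
    """KB is satisfiable when M(KB) is not empty."""
    return len(models(kb, symbols)) > 0


def tell_response(kb, f, symbols):
    """Classify a new formula f against KB by comparing sets of models."""
    m_kb, m_f = models(kb, symbols), models(f, symbols)
    if m_kb <= m_f:
        return "entailment"       # every model of KB already satisfies f
    if not (m_kb & m_f):
        return "contradiction"    # no model of KB satisfies f
    return "contingency"          # f rules out some, but not all, models of KB


SYMBOLS = ["rain", "wet"]
rule = lambda w: (not w["rain"]) or w["wet"]     # rain -> wet
kb = lambda w: w["rain"] and rule(w)             # rain, and rain -> wet
print(tell_response(kb, lambda w: w["wet"], SYMBOLS))
```

Enumeration is exponential in the number of symbols, so this only illustrates the definitions; it is not a practical inference engine.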

Execution engine.

Knowledge base (domain-specific facts) + inference engine.

Syntax, set of possible worlds, truth condition.

Sound Algorithm.

Complete Algorithm.








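class KnowledgeAgent:
    """Generic knowledge-based agent; the make_* helpers are assumed to be defined elsewhere."""

    def __init__(self, kb):
        self.kb = kb   # knowledge base offering tell() and ask()
        self.t = 0     # time step counter

    def step(self, environment):
        # Record the percept, ask for an action, then record the chosen action.
        self.kb.tell(make_percept_sentence(environment, self.t))
        action = self.kb.ask(make_action_query(self.t))
        self.kb.tell(make_action_sentence(action, self.t))
        self.t += 1
        return action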

Inference machine

First-Order Logic

Inference in First-Order Logic

Worked examples

Knowledge graph




SQL and openCypher (Apache AGE)



Card fraud detector

Make an online quiz system about Artificial Intelligence

8-queens

Pacman Finder


Wordle Solver




You may also consider picking up some of the following books

  • Clocksin - Mellish: Programming in Prolog


What are real-world projects where people use PROLOG?




Classical Planning

Planning and Acting in the Real World

Knowledge representation

Uncertain knowledge and reasoning

Quantifying Uncertainty

Probabilistic Reasoning

Probabilistic Reasoning over Time

Making Simple Decisions

Making Complex Decisions

Machine Learning

Machine learning is the “field of study that gives computers the ability to learn without being explicitly programmed” (Arthur Samuel, 1959). Traditional programming and classic artificial intelligence involve writing rules that act on data to produce answers. But if you flip this approach, you get machine learning. In this case, we gather a large amount of data and answers, apply a learning algorithm, and as an output, we acquire rules or models. These models can then make predictions without being specifically programmed to perform the task.

Machine Learning.

The main driver of recent successes in AI. Move from “code” to “data” to manage information complexity. The goal is generalization.

Reflex-based models. Linear classifiers, deep neural networks.
Modeling. Simplify the real world into a well-defined mathematical model. Example: planning a route from A to B in a city.
Inference. Develop algorithms that answer questions with respect to the model.
Learning. Start from a model without parameters and use data to learn those parameters by applying an algorithm.

We classify machine learning into supervised learning, unsupervised learning, and reinforcement learning. The table below gives you an overview of learning algorithms.

Created by https://solclover.com with Plotly
| Learning algorithm | When to use | Relevant metrics |
| --- | --- | --- |
| Linear Regression | When there's a linear relationship between the input and output. | Mean Squared Error (MSE), R-squared, Adjusted R-squared |
| Logistic Regression | For binary classification problems. | Accuracy, Precision, Recall, AUC-ROC |
| Decision Trees | When there's a need to understand the decision-making process. Useful for both classification and regression. | Gini Index, Information Gain for model construction; Accuracy, Precision, Recall for evaluation |
| Random Forest | When model interpretability is less important and you need higher performance. | Out-of-bag (OOB) error, Accuracy, Precision, Recall |
| K-Nearest Neighbors | When instances of the same class are generally close to each other in the feature space. | Accuracy, Precision, Recall, F1 Score |
| Support Vector Machines | When there's a clear margin of separation between classes. | Accuracy, Precision, Recall, F1 Score |
| Neural Networks | For complex problems like image recognition, speech recognition, and natural language processing. | Depends on the task, but often includes Accuracy, Precision, Recall, AUC-ROC, and loss metrics like Cross-Entropy Loss |
| XGBoost | Best for heterogeneous structured datasets. | |


1.1 Introduction
1.2 Applications
1.3 Main approaches to machine learning
1.4 Machine learning paradigms
1.5 Basic concepts
1.6 Fundamental problems
1.7 Evaluation of learned models

2.1 Introduction
2.2 Historical development of the paradigm
2.3 Decision trees
2.4 Rule induction
2.5 Applications
2.6 Selected topics

3.1 Introduction
3.2 Historical development of the paradigm
3.3 Genetic algorithms
3.4 Genetic programming
3.5 Applications
3.6 Bio-inspired algorithms

4.1 Introduction
4.2 Historical development of the paradigm
4.3 Bayes' theorem
4.4 Naive Bayes
4.5 Applications
4.6 Probabilistic graphical models

5.1 Introduction
5.2 Historical development of the paradigm
5.3 Artificial Neural Networks (ANN)
5.4 Backpropagation algorithm
5.5 Applications
5.6 Review of ANN architectures

6.1 Introduction
6.2 Historical development of the paradigm
6.3 K-nearest neighbors
6.4 Support vector machines
6.5 Applications
6.6 Selected topics







Learning from Examples

When you have a dataset with features ($X$) and labels ($Y$), supervised learning means finding the mapping from $X$ to $Y$.

How does such an algorithm work? It learns from examples: you have a training set, you model the task (for example, as linear regression), and the algorithm returns a hypothesis, a model that maps input features to the target. The learner is an optimization algorithm that needs an optimization problem, so our task splits into finding the right optimization model and then employing the right optimization algorithm.

The optimization problem relies on $\min_w Loss(x, y, w)$.

Loss minimization tasks: $\min_w TrainLoss(w)$

The score is how confident we are.
The margin is how correct we are.

You might ask how to get the training set, how to deploy the hypothesis, and how to know which method to apply. The answers are in the development cycle.

Development cycle

- Split data into train, val, test
- Exploratory data analysis
- Repeat:
  - Implement feature/tune hyperparameters
  - Run learning algorithm
  - Sanity check train and val error rates
  - Look at errors to brainstorm improvements
  - Log as much as you can (reports)
- Run on test set to get final error rates

Most of the time, the test metric does not decrease.


Discrete optimization: find the best discrete object

$\min_{p \in Paths} Cost(p)$

Algorithmic tool: dynamic programming

Continuous optimization: find the best vector of real numbers

$\min_{w \in \mathbb{R}^d} TrainingError(w)$

Algorithmic tool: gradient descent

Stanford CS221: Artificial Intelligence: Principles and techniques

Ground truth.
It refers to the expected (correct) label associated with each example in a dataset.

graph TD
  TrainingSet --> LearningAlgorithm
  LearningAlgorithm -->  Hypothesis

Given the features $\chi$, the hypothesis $h$ is a predictor of the target $y$. The features $\chi$ denote the space of input values, and the target $y$ the space of output values. So the supervised learning goal is to find a good predictor $h: \chi \to y$.

class Model
     apply a learning algorithm to training examples
     generates a hypothesis
      apply the hypothesis to instances

When $y$ is continuous, we call it a regression problem; when $y$ is discrete, we call it a classification problem.

Suppose you have a linear regression problem. You may represent $h$ as $h(x) = mx + b$, an affine function. More generally, $h(\mathbf{x}) = \boldsymbol{\theta} \mathbf{x}^T$ with $\boldsymbol{\theta} = [\theta_0, \theta_1, \theta_2, \dots, \theta_n]$ and $\mathbf{x} = [x_0, x_1, \dots, 1]$, where $\boldsymbol{\theta}$ are the “parameters” that allow us to make good predictions.

Binary classification

Loss functions

A. Chadha, V. Jain, Distilled Notes for Machine Learning , https://www.vinija.ai, 2022, Accessed: July 1 2022.

Distance metrics

Computing edit distance

Input: two strings, s and t
Output: minimum number of character insertions, deletions, and substitutions between s and t.


s: a cat
t: the cats!
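The edit distance for the example above can be computed with dynamic programming over prefixes of the two strings. This is a minimal memoized sketch; the function name `edit_distance` and the inner helper `d` are choices made for the example.

```python
from functools import lru_cache


def edit_distance(s, t):
    """Minimum number of insertions, deletions, and substitutions turning s into t."""

    @lru_cache(maxsize=None)
    def d(i, j):
        # d(i, j) is the edit distance between the prefixes s[:i] and t[:j].
        if i == 0:
            return j                      # insert the remaining j characters
        if j == 0:
            return i                      # delete the remaining i characters
        if s[i - 1] == t[j - 1]:
            return d(i - 1, j - 1)        # last characters match, no edit needed
        return 1 + min(
            d(i - 1, j - 1),              # substitute
            d(i - 1, j),                  # delete from s
            d(i, j - 1),                  # insert into s
        )

    return d(len(s), len(t))


print(edit_distance("a cat", "the cats!"))
```

Each subproblem depends only on shorter prefixes, which is the "reduce the problem" principle: there are at most `(len(s) + 1) * (len(t) + 1)` distinct subproblems to solve.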

General principles: reduce the problem to smaller subproblems and abstract away details.

Linear prediction

Linear Regression

$\min_w F(w) = \sum_{i=1}^n (w x_i - y_i)^2$

Input: a set of pairs $(x_i, y_i)$.
Output: $w \in \mathbb{R}$ that minimizes the squared error $F(w) = \sum_{i=1}^n (x_i w - y_i)^2$.

Algorithm: gradient descent.
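A minimal sketch of gradient descent on this one-dimensional objective follows. The derivative of the squared error with respect to $w$ is $\sum_i 2 (w x_i - y_i) x_i$; the step size, the iteration count, and the toy data are assumptions chosen so that the example converges.

```python
def gradient_descent(points, step_size=0.01, iterations=500):
    """Minimize F(w) = sum((w * x - y) ** 2) over a single real weight w."""
    w = 0.0
    for _ in range(iterations):
        # dF/dw = sum of 2 * (w * x - y) * x over all training pairs
        gradient = sum(2 * (w * x - y) * x for x, y in points)
        w -= step_size * gradient        # move against the gradient
    return w


points = [(1, 2), (2, 4), (3, 6)]        # toy data generated by y = 2x
w = gradient_descent(points)
print(round(w, 4))
```

If the step size is too large the updates overshoot and diverge; for this data any value below roughly 0.07 converges.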

Linear prediction.
Score: a weighted combination of features.
Weight vector w.



Binary linear classifier

Decision boundary

Separate the space into different subspaces to classify.


Case analysis

$\nabla_w Loss_{hinge}(x, y, w) = \begin{cases} 0 & \text{if } (w \cdot \phi(x))\, y > 1 \\ -\phi(x)\, y & \text{otherwise} \end{cases}$

A step against this gradient increases the margin.


Gradient descent and Stochastic gradient descent



The gradient $\nabla f$ gives the direction and rate of fastest increase of $f$ at a point.
The goal is to move in the opposite direction of the gradient.
Least squares regression.
Objective function:

$TrainLoss(w) = \dfrac{1}{|D_{train}|} \sum_{(x,y) \in D_{train}} Loss(x, y, w)$
$\nabla_w TrainLoss(w) = \dfrac{1}{|D_{train}|} \sum_{(x,y) \in D_{train}} \nabla_w Loss(x, y, w)$

For each (x, y) in D_train:
    w ← w - step_size · ∇_w Loss(x, y, w)
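The per-example update can be sketched with the hinge loss of a binary linear classifier. Everything named here is an assumption for illustration: the feature vectors stand in for $\phi(x)$, the labels are +1 or -1, and the examples are visited in a fixed order so that the run is reproducible (a practical implementation would usually shuffle them).

```python
def dot(a, b):
    """Dot product of two equal-length lists."""
    return sum(ai * bi for ai, bi in zip(a, b))


def hinge_gradient(w, phi, y):
    """Gradient of max(0, 1 - (w . phi) * y) with respect to w."""
    if dot(w, phi) * y > 1:
        return [0.0] * len(w)            # margin already large enough
    # At margin exactly 1 the loss is not differentiable; this sketch uses -phi * y there.
    return [-p * y for p in phi]


def sgd(train, step_size=0.1, epochs=20):
    """Stochastic gradient descent: one weight update per training example."""
    w = [0.0] * len(train[0][0])
    for _ in range(epochs):
        for phi, y in train:
            gradient = hinge_gradient(w, phi, y)
            w = [wi - step_size * gi for wi, gi in zip(w, gradient)]
    return w


TRAIN = [([2.0, 1.0], 1), ([1.0, 2.0], 1), ([-1.0, -2.0], -1), ([-2.0, -1.0], -1)]
w = sgd(TRAIN)
print(all(dot(w, phi) * y > 0 for phi, y in TRAIN))
```

Batch gradient descent would average the gradient over all of `TRAIN` before each update; the stochastic version trades that accuracy for many cheap updates.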

It’s about quality.

SGD can be worse than GD if the dataset has noise.

Step size


Maximum likelihood estimation

Maximum likelihood estimation is the goal of training classifiers; that is, we find the parameters $\theta$ that maximize the probability of the actually observed data. $p(y=1 \mid x; \theta)$ refers to the conditional probability that the output is the class $y=1$ given the input variables $x$ and the parameters of the model $\theta$.

independently and identically distributed

Types of problem

Pattern recognition

Recommendation systems

Types of data

  1. Vector Data: This is the most common and simplest form of data in machine learning. The dataset is a 2D tensor where each data point can be encoded as a vector. Examples can be anything from housing price prediction data (features being the number of rooms, location, size of the house, etc.) to text data (after applying some sort of vectorization like bag-of-words or TF-IDF).
  1. Natural language.
  1. Time series or Sequence Data: Time series data captures a series of data points recorded over regular time intervals. The order of data points is important here because the same set of data points in a different time order might mean something entirely different. Sequence data is very similar, but time isn't necessarily a factor here. Examples of these are stock price data, weather data, or any type of data where time plays a crucial role. For sequence data, a sentence or a DNA sequence would be a good example as the order of words or genes is important.
  1. Image Data: Images are represented as 3D tensors (height, width, color_depth). However, a batch of images used for training a model is stored in a 4D tensor (batch_size, height, width, color_depth). Deep learning models like Convolutional Neural Networks (CNNs) are designed to extract features from these 4D tensors and use them to classify images, detect objects, and more. Applications range from medical imaging (detecting diseases) to self-driving cars (identifying pedestrians, signs, etc.).
  1. Video Data: Video data can be thought of as a series of images, so naturally, this extends the image data tensor by one more dimension, the frame dimension. So a video dataset would be a 5D tensor (batch_size, frames, height, width, color_depth). Video data is used in various applications like activity recognition, video synthesis, and object tracking in videos.
  1. Graph data.

Data labeling



What is Transfer Learning?




Natural Computing



black hole algorithm

Mandelbrot set from scratch, Markov text-generation, and John Conway’s Game of Life ar

Pattern Recognition

Deep Learning


François Chollet - Deep Learning with Python-Manning Publications (2021)


Automatic differentiation

The purpose of the activation function is to introduce nonlinearities into the model. Artificial neural networks are inspired by biological neural networks.

What is a gradient? Optimization method.

The chain rule lets us pass gradients from the output of a neuron back to its input.
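The chain rule can be illustrated with a single neuron. The sketch below assumes a sigmoid activation and a squared-error loss, neither of which is fixed by the notes; the function name `forward_backward` is also hypothetical. The forward pass computes the loss, and the backward pass multiplies the local derivatives from the output back to the weight and bias.

```python
import math


def forward_backward(w, b, x, y):
    """Return (loss, dloss/dw, dloss/db) for one sigmoid neuron and squared error."""
    # Forward propagation
    z = w * x + b
    a = 1 / (1 + math.exp(-z))           # sigmoid activation
    loss = (a - y) ** 2

    # Backward propagation: chain rule, one local derivative per step
    dloss_da = 2 * (a - y)
    da_dz = a * (1 - a)                  # derivative of the sigmoid
    dloss_dz = dloss_da * da_dz
    dloss_dw = dloss_dz * x              # dz/dw = x
    dloss_db = dloss_dz                  # dz/db = 1
    return loss, dloss_dw, dloss_db


loss, dw, db = forward_backward(w=0.5, b=-0.2, x=1.5, y=1.0)
print(loss, dw, db)
```

Automatic differentiation frameworks apply the same idea mechanically: every operation records its local derivative, and the products are accumulated from the output backwards.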


Forward propagation

back propagation



Batch

Input

Activations

Weights

Output


Optimizers. Credits.


In a Keras model, hyperparameters such as optimizer, loss, and metrics have crucial roles in defining how the model will be trained and evaluated. Let's discuss each of these and the context in which they should be used:

  1. Optimizer: Optimizers in Keras help to adjust the attributes of your neural network such as weights and learning rate to reduce the losses. Different optimizers suit different problems and can significantly affect the model's performance and convergence speed.
    • SGD: Stochastic Gradient Descent, which is the most basic optimizer. It's robust but can be slow and sensitive to the learning rate choice.
    • RMSprop: Usually a good choice for recurrent neural networks.
    • Adam: A good default choice for many problems, it combines the advantages of RMSprop and SGD with momentum.
    • Adagrad, Adadelta, Adamax, Nadam: Other variants of optimizers, each with its strengths, but in most cases, Adam should suffice.
  1. Loss: Loss function or cost function is a method to calculate the disparity between the predicted output and the actual output. This is the function that the model will strive to minimize.
    • Mean Squared Error (MSE): Used for regression problems (predicting a continuous value).
    • Binary Cross-Entropy: Used for binary classification problems (predicting a yes/no outcome).
    • Categorical Cross-Entropy: Used for multi-class classification problems, where the outputs are one-hot encoded.
    • Sparse Categorical Cross-Entropy: Like categorical cross-entropy, but for integer targets.
    • Weighted cross-entropy loss: Used for unbalanced multi-class classification problems, where the outputs are one-hot encoded.
  1. Metrics: Metrics are used to judge the performance of your model. Choosing the right metric is essential to judge your model accurately.
    • Accuracy: Suitable for classification problems, especially if the classes are balanced.
    • Precision, Recall, F1-score: These are more informative than accuracy for binary classification, especially if the classes are imbalanced.
    • MSE, RMSE, MAE (Mean Absolute Error): Suitable for regression problems.
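To see why accuracy can mislead on imbalanced classes, the sketch below computes the classification metrics by hand in plain Python. It does not use the Keras API; the function name `classification_metrics` and the toy labels are assumptions for illustration.

```python
def classification_metrics(y_true, y_pred):
    """Accuracy, precision, recall, and F1 for binary labels (1 = positive class)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)   # true positives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)   # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)   # false negatives
    accuracy = sum(1 for t, p in pairs if t == p) / len(pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# Imbalanced toy data: one positive among ten, and a model that always predicts 0.
y_true = [1] + [0] * 9
y_pred = [0] * 10
metrics = classification_metrics(y_true, y_pred)
print(metrics)
```

The always-negative model reaches high accuracy while its recall and F1 are zero, which is the situation the precision, recall, and F1 bullet warns about.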

Other hyperparameters include:

Choosing the right hyperparameters often involves trial and error and can be guided by experience, knowledge about the problem and the data, or hyperparameter tuning techniques such as grid search or random search.

Convolutional neural networks

Locality structure

Recurrent neural networks

  1. Neural networks are universal approximators for continuous functions: mathematical functions
  1. Recurrent neural networks are equivalent to Turing machines: algorithms
  1. The backpropagation algorithm will find a network configuration that imitates the behavior of the data

TensorFlow APIs for training

TensorFlow offers an input pipeline API (tf.data), Keras, and Estimator. Batch processing is done with tf.distribute.

Why an input pipeline? Because the data might not fit in memory, and a pipeline lets us use the hardware efficiently and decouple loading from preprocessing. ETL stages are typical for batch processing: the extract stage reads from memory or remote storage and parses the file format; the transform stage performs domain-specific transformations; the load stage transfers data to the accelerator.

There is a big gap between GPU/TPU processing power and CPU processing power.

A typical batch processing for Deep Learning looks like

dataset = read data from storage as stream

dataset = 
   apply distributed pipe operators to dataset which is executed as a dataflow graph
build the architecture of the model with high level APIs

Optimizations include software pipelining, parallel transformation, and parallel extraction.

TensorFlow Datasets is a project to onboard new users.

Unsupervised Learning - Clustering

Data has lots of rich latent structures. We want methods to discover these structures automatically.

Input: a training set of input points
Output: assignment of each point to a cluster


K-means algorithm
DBSCAN
Hierarchical clustering

K-Means algorithm

Even though K-means is not the best clustering algorithm, it illustrates the task and a simple solution. In K-means, each cluster $k = 1, \dots, K$ is represented by a centroid $\mu_k \in \mathbb{R}^d$, and the objective is to assign each vector $\phi(x_i)$ to the closest centroid. Formally, the objective is

$\min_z \min_\mu Loss_{kmeans}(z, \mu)$, where $Loss_{kmeans}(z, \mu) = \sum_{i=1}^n \|\phi(x_i) - \mu_{z_i}\|^2$

Algorithm: K-means

Initialize $\mu_1, \dots, \mu_K$ randomly
for t = 1, ..., T:
    Step 1: set assignments z given μ
    Step 2: set centroids μ given z
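The two alternating steps can be sketched in plain Python. One deliberate deviation from the algorithm above: the initial centroids are passed in explicitly instead of being drawn at random, so that the example is reproducible. The names `kmeans` and `sq_dist` and the toy points are assumptions.

```python
def sq_dist(p, q):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def kmeans(points, initial_centroids, iterations=10):
    """Alternate between assigning points and recomputing centroids."""
    centroids = list(initial_centroids)
    z = []
    for _ in range(iterations):
        # Step 1: assign each point to its closest centroid.
        z = [min(range(len(centroids)), key=lambda k: sq_dist(p, centroids[k]))
             for p in points]
        # Step 2: move each centroid to the mean of its assigned points.
        for k in range(len(centroids)):
            members = [p for p, zi in zip(points, z) if zi == k]
            if members:                  # an empty cluster keeps its old centroid
                centroids[k] = tuple(sum(c) / len(members) for c in zip(*members))
    return z, centroids


points = [(0.0, 0.0), (0.0, 1.0), (10.0, 10.0), (10.0, 11.0)]
z, centroids = kmeans(points, initial_centroids=[(0.0, 0.0), (10.0, 10.0)])
print(z, centroids)
```

Each step can only lower the loss or leave it unchanged, but the result depends on the initialization, so in practice the algorithm is often run from several random starts.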

Knowledge in Learning

Learning Probabilistic Models

Reinforcement Learning

* Communicating, perceiving, and acting

MACTI (Temporal)








Programming in Python



Explainability of Complex Machine Learning Models

graph TD
  ExplanationMethods --> ExplainableByDesign --> ExplanationsForGlassBoxes
  ExplainableByDesign --> EngineeredExplanations
  ExplanationMethods --> PostHocExplanationsForBlackBoxModels
  PostHocExplanationsForBlackBoxModels --> Local
  Local --> Counterfactuals
  Local --> FeatureImportance
  Local --> Prototypes
  PostHocExplanationsForBlackBoxModels --> Global
  Global--> Prototypes


Post-hoc local feature importance, rule-based, prototypes, counterfactuals



*Philosophical Foundations

Weak AI: Can Machines Act Intelligently?

Strong AI: Can Machines Really Think?

The Ethics and Risks of Developing Artificial Intelligence

AI: Present and Future

Build your product with Artificial Intelligence

Low level




Mid level


High level





