🧠

Artificial Intelligence

Image generated by Dalle2 https://twitter.com/Dalle2Pics/status/1545871607767384066

Prerequisites

  • compressive sensing (sparse coding)
  • information theory
  • control theory
  • economics
  • logic
  • operations research
  • game theory
  • optimization

Resources

Introduction

What is AI?

We struggle to define what is intelligent, but not what is artificial.

https://www.tor.com/2011/06/21/norvig-vs-chomsky-and-the-fight-for-the-future-of-ai/

We use rational agents as our approach. An agent is an entity that can perceive an environment X and act on it, where X can be virtual or physical. How an agent decides to act, given all previous considerations, is a black box.

So intelligence, or rationality, means that an agent makes the best decision for a given environment and goal under constraints, and acts accordingly. Therefore, we evaluate an agent by its results, not by its mental state.

If an agent makes the best decision in all environments and for all goals, it is called artificial general intelligence.

If an agent makes the best decision in a specific environment, it is called weak AI.

From a scientific point of view, the aim is to understand complete AI; from an engineering point of view, the aim is to build incomplete, imperfect, weak artificial intelligence.

Some constraints are lack of knowledge, time to learn, time to execute, money, actuators, and sensors.

But black boxes are now becoming white boxes. If you are here, you are interested in white boxes, which are computational procedures. You're going to learn them!

Ecosystem

https://inside.com/ai

https://www.csail.mit.edu/

https://bair.berkeley.edu/

http://ai.ucsd.edu/

https://www.microsoft.com/en-us/research/collaboration/bair/

https://people.eecs.berkeley.edu/~yima/ by https://twitter.com/YiMaTweets

Story

Computing Machinery and Intelligence.

The Imitation Game (Turing test).

Player A is a computer who claims to be a man.

Player B is a man.

Interrogator.

Total Turing Test.

Loebner Prize.

The Argument from Extrasensory Perception. During the Cold War, people were interested in clairvoyance, telepathy, and precognition, so Turing prepared an argument for that situation.

https://ai-timeline.sanchezcarlosjr.com/

https://courses.cs.washington.edu/courses/csep590a/06au/projects/history-ai.pdf

https://plato.stanford.edu/entries/artificial-intelligence/

https://journals.sagepub.com/doi/pdf/10.1177/0008125619864925

https://dl.acm.org/doi/fullHtml/10.1145/2063176.2063177

https://sitn.hms.harvard.edu/flash/2017/history-artificial-intelligence/

Association for Computing Machinery (ACM). (2022, December 22). January 2023 CACM: The End of Programming. Youtube. Retrieved from https://www.youtube.com/watch?v=OnYJXm9NvyA&ab_channel=AssociationforComputingMachinery(ACM)

💡
“We propose that a 2 month, 10 man study of artificial intelligence be carried out during the summer of 1956 at Dartmouth College in Hanover, New Hampshire. The study is to proceed on the basis of the conjecture that every aspect of learning or any other feature of intelligence can in principle be so precisely described that a machine can be made to simulate it. An attempt will be made to find how to make machines use language, form abstractions and concepts, solve kinds of problems now reserved for humans, and improve themselves. We think that a significant advance can be made in one or more of these problems if a carefully selected group of scientists work on it together for a summer.” McCarthy, John, Marvin L. Minsky, Nathaniel Rochester, and Claude E. Shannon. "A proposal for the dartmouth summer research project on artificial intelligence, august 31, 1955." AI magazine 27, no. 4 (2006): 12-12.

Related work

Cognitive science

Philosophy of mind

Worked examples

FAQ

I learned in the theory of computation that some problems are undecidable, but I see those problems solved with artificial intelligence. How is that?

In short: AI systems produce approximate or heuristic answers for particular instances; they do not give a general decision procedure that works on every input, so undecidability is not violated.

AI Framework.

Intelligent Agents. Rational agents.

https://ai.facebook.com/blog/yann-lecun-advances-in-ai-research/

Agent class

👉🏼
Agent = Architecture + Program.
Artificial intelligence's concern is the program, an autonomous computational program.
graph TD
subgraph Architecture
  Information --> Program
  Program --> Decision
end
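A minimal Python sketch of the loop above (the names and the placeholder decision rule are illustrative, not a standard API): the architecture feeds percepts to the program, and the program maps the percept history to a decision.

class SimpleAgent:
    def __init__(self):
        self.percepts = []  # percept history: "all previous considerations"

    def program(self, percept):
        # The agent program: map the percept history to an action.
        self.percepts.append(percept)
        return f"act-on-{percept}"  # placeholder decision rule

def run(agent, environment, steps=3):
    # The architecture: perceive the environment, then act on it.
    for _ in range(steps):
        percept = environment["state"]
        action = agent.program(percept)
        environment["state"] = action  # acting changes the environment

run(SimpleAgent(), {"state": "initial"})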

Worked examples

Further readings

Economic agents.

Solving problems by Searching.

💡
Searching problems are optimization problems.
graph TD
  S["S, environment"] -->|"cost(S,action1, A)"| A["A, new environment"]
  S -->|"cost(S,action2, B)"| B["B, new environment"]
  S -->|"cost(S,action3, C)"| C["C, new environment"]
  subgraph possible_solutions3
    C-->E["..."]
  end
 subgraph possible_solutions2
    B-->F["..."]
  end
  subgraph possible_solutions1
    A-->D["..."]
  end
  D-->Goal["Goal, new environment == goal"]

Searching problem model

classDiagram
    class SearchProblem {
      heuristic()
      start_state()
      is_goal(state)
      expand(state)
      valid_actions_from(state)
      action_cost(state, action, next_state)
      next_state(state, action)
    }
    class State {
       distance_from_start_state
       previous_state
       environment
       build()
       relax()
       reconstruct_path()
    }
    class SearchingStrategy {
       findPlanFor(problem)
    }
    class Agent {
      searchingStrategy: SearchingStrategy
      problem: SearchProblem
      act(state)
    }
    Agent *-- SearchingStrategy
    Agent *-- SearchProblem
    SearchingStrategy <|-- BFS
    SearchingStrategy <|-- DFS
    SearchingStrategy <|-- AStar
    SearchingStrategy <|-- Dijkstra
    SearchingStrategy <|-- LinearProgramming
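A minimal Python sketch of the model above, assuming the SearchProblem interface from the class diagram (start_state(), is_goal(state), valid_actions_from(state), next_state(state, action)); the concrete problem is up to you. Breadth-first search expands states level by level, avoids revisiting states, and reconstructs the plan once it reaches the goal.

from collections import deque

def bfs(problem):
    # Breadth-first search over a SearchProblem-like object.
    start = problem.start_state()
    frontier = deque([start])
    previous = {start: None}  # doubles as the visited set and back-pointers
    while frontier:
        state = frontier.popleft()
        if problem.is_goal(state):
            # Reconstruct the path S -> A -> ... -> Goal from the back-pointers.
            path = []
            while state is not None:
                path.append(state)
                state = previous[state]
            return list(reversed(path))
        for action in problem.valid_actions_from(state):
            successor = problem.next_state(state, action)
            if successor not in previous:  # avoid cycles and infinite loops
                previous[successor] = state
                frontier.append(successor)
    return None  # no plan exists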

Searching strategies

Searching for solutions

Considerations.

We build a tree or graph on demand by searching strategies.

We have to avoid cycles in order to prevent infinite loops.

Each new state saves a reference to its previous state, and we only choose valid actions, so when our search algorithm reaches the goal, it reconstructs the path from the start state.

$S \to A \to \dots \to Goal$

The encoding of states and actions matters.

Uninformed search strategies

BFS

DFS

Informed Search Strategies

A*, Greedy Search, Hill Climbing, Simulated Annealing, Best-First Search

Heuristic Functions

A heuristic is a function that estimates the distance from the current state to the goal.

$f(n)$ is the real or estimated cost of the solution through $n$.

$g(n)$ is the cost to reach $n$ from the start state: $\sum_{i=0}^{n} cost(i, action, i+1)$.

$h(n)$ is the estimated cost to reach the goal state from state $n$, so it uses the available information from the problem or environment state in order to estimate the cost.

$h^*(n)$ is the real cost to reach the goal state from state $n$: $\sum_{i=n}^{g} cost(i, action, i+1)$.

Note that $h(n) \to 0$ and $h^*(n) \to 0$ as $n$ approaches the goal.

Properties

Main idea: $\text{estimated heuristic} \le \text{actual cost}$.

Admissibility.

$0 \le h(x) \le \text{cost to goal}$

Consistency.

$h(x) - h(y) \le cost(x, y)$

Dominance.

Optimal.

How do you find a heuristic function?

Relax your problem; use the available information about the current state or the goal; use min and max functions; or use distance functions such as Manhattan distance, Euclidean distance, Hamming distance, and other norms.
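For instance, here is a minimal sketch of a relaxed-problem heuristic for grid navigation (the goal coordinates are illustrative). Manhattan distance pretends the agent can move through walls, so it never overestimates the true cost, which makes it admissible.

def manhattan(state, goal=(0, 0)):
    # Relaxation: ignore walls. |dx| + |dy| is the exact cost of the
    # relaxed problem, hence a lower bound on the real cost.
    (x, y), (gx, gy) = state, goal
    return abs(x - gx) + abs(y - gy)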

Beyond Classical Search.

Heuristic. Safe steps.

Offline, Online.

Solving problems by searching means running graph algorithms that generate new nodes by heuristics and safe steps and then test them; wrong answers are rejected.

Hands-on Projects

8-Queen solver

Hanoi tower

Maze solver

Graph Theory Visualizer: Maze. (2022, July 03). Retrieved from https://graph-theory.sanchezcarlosjr.com

Project 0 - Unix, Python and Autograder Tutorial - CS 188: Introduction to Artificial Intelligence, Spring 2021. (2022, September 29). Retrieved from https://inst.eecs.berkeley.edu/~cs188/sp21/project0/#question-1-addition

The farmer, fox, goose, and grain

Integral solver

Pacman

Project 1 - Search - CS 188: Introduction to Artificial Intelligence, Spring 2021. (2022, September 29). Retrieved from https://inst.eecs.berkeley.edu/~cs188/fa22/projects/proj1/

Worked examples

References

https://aimacode.github.io/aima-javascript/3-Solving-Problems-By-Searching/

How to solve it: Modern Heuristics by Zbigniew Michalewicz, David B. Fogel.

Adversarial Search

Minimax

https://www.youtube.com/watch?v=l-hh51ncgDI&ab_channel=SebastianLague

Monte Carlo tree search

matchmaking algorithms
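A minimal minimax sketch (the game interface, with is_terminal, utility, result, and moves, is assumed here, in the spirit of the Tic-Tac-Toe and Pacman projects below): each player plays as if the opponent also plays optimally.

def minimax(game, state, maximizing=True):
    # Value of a state when both players play optimally.
    if game.is_terminal(state):
        return game.utility(state)
    values = [minimax(game, game.result(state, move), not maximizing)
              for move in game.moves(state)]
    return max(values) if maximizing else min(values)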

Hands-on Projects

Tic Tac Toe

Chess

Pacman v2

Chess engines. (2023, April 20). Retrieved from https://www.chessengines.org

Project 2. (2022, October 13). Retrieved from https://inst.eecs.berkeley.edu/~cs188/fa22/projects/proj2

Constraint Satisfaction Problems

Knowledge, reasoning, and planning

Logical Agents

Knowledge, Syntax, Semantics

Prolog,

Relational databases, SQL, Datalog?

Knowledge base (domain-specific facts) + inference engine.

Syntax, set of possible worlds, truth condition.

Sound Algorithm.

Complete Algorithm.

Theorem-proving.

Model-checking.

https://www.youtube.com/watch?v=CAsq7hm3sbI&ab_channel=IITDelhiJuly2018

https://www.youtube.com/watch?v=xFpndTg7ZqA&t=1s&ab_channel=IITDelhiJuly2018

https://www.youtube.com/watch?v=h6zCkrZ8ehE&t=1s&ab_channel=RichNeapolitan

tammet. (2022, December 13). gkc. Retrieved from https://github.com/tammet/gkc

Program

class KnowledgeAgent:
    def __init__(self, kb):
        self.kb = kb  # knowledge base: domain-specific facts + inference engine
        self.t = 0    # time step

    def act(self, environment):
        # tell/ask and the sentence constructors come from the logic layer.
        tell(self.kb, make_percept_sentence(environment, self.t))
        action = ask(self.kb, make_action_query(self.t))
        tell(self.kb, make_action_sentence(action, self.t))
        self.t += 1
        return action

Inference machine

First-Order Logic

Inference in First-Order Logic

Worked examples

Projects

Card fraud detector

Make an online quiz system about Artificial Intelligence

Eight queens

Pacman Finder

https://inst.eecs.berkeley.edu/~cs188/sp21/project3/

Wordle Solver

https://swi-prolog.discourse.group/t/wordle-solver/5124

https://cheatle.occasionallycogent.com/

Resources

You may also consider picking up some of the following books

  • Clocksin - Mellish: Programming in Prolog

FAQ

What are real-world projects where people use PROLOG?

https://www.quora.com/What-is-Prolog-used-for-today

https://www.cs.nmsu.edu/ALP/2011/03/natural-language-processing-with-prolog-in-the-ibm-watson-system/

https://www.drdobbs.com/parallel/the-practical-application-of-prolog/184405220

Classical Planning

Planning and Acting in the Real World

Knowledge representation

Uncertain knowledge and reasoning

Quantifying Uncertainty

Probabilistic Reasoning

Probabilistic Reasoning over Time

Making Simple Decisions

Making Complex Decisions

Machine Learning

Machine learning is the field of study that gives computers the ability to learn without being explicitly programmed (Arthur Samuel, 1959). Traditional programming and classic artificial intelligence involve writing rules that act on data to produce answers. But if you flip this approach, you get machine learning: we gather a large amount of data and answers, apply a learning algorithm, and as an output we acquire rules, or models. These models can then make predictions without being specifically programmed to perform the task.
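A minimal sketch of that flip (assuming scikit-learn; the XOR data is made up): data and answers go in, and the learned model, the "rules", comes out.

from sklearn.tree import DecisionTreeClassifier

X = [[0, 0], [0, 1], [1, 0], [1, 1]]        # data
y = [0, 1, 1, 0]                            # answers (XOR)
model = DecisionTreeClassifier().fit(X, y)  # learning algorithm -> rules
print(model.predict([[1, 0]]))              # predicts [1] without hand-written rules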

We classify machine learning into supervised learning, unsupervised learning, and reinforcement learning. The table below gives you an overview of learning algorithms.

Created by https://solclover.com with Plotly
| Learning algorithm | When to use | Relevant metrics |
| --- | --- | --- |
| Linear Regression | When there's a linear relationship between the input and output. | Mean Squared Error (MSE), R-squared, Adjusted R-squared |
| Logistic Regression | For binary classification problems. | Accuracy, Precision, Recall, AUC-ROC |
| Decision Trees | When there's a need to understand the decision-making process. Useful for both classification and regression. | Gini Index and Information Gain for model construction; Accuracy, Precision, Recall for evaluation |
| Random Forest | When model interpretability is less important and you need higher performance. | Out-of-bag (OOB) error, Accuracy, Precision, Recall |
| K-Nearest Neighbors | When instances of the same class are generally close to each other in the feature space. | Accuracy, Precision, Recall, F1 Score |
| Support Vector Machines | When there's a clear margin of separation between classes. | Accuracy, Precision, Recall, F1 Score |
| Neural Networks | For complex problems like image recognition, speech recognition, and natural language processing. | Depends on the task, but often includes Accuracy, Precision, Recall, AUC-ROC, and loss metrics like Cross-Entropy Loss |
| XGBoost | Best for heterogeneous structured datasets. | |

Ensembles

1.1 Introduction
1.2 Applications
1.3 Main approaches to machine learning
1.4 Machine learning paradigms
1.5 Basic concepts
1.6 Fundamental problems
1.7 Evaluation of learned models

2.1 Introduction
2.2 Historical development of the paradigm
2.3 Decision trees
2.4 Rule induction
2.5 Applications
2.6 Selected topics

3.1 Introduction
3.2 Historical development of the paradigm
3.3 Genetic algorithms
3.4 Genetic programming
3.5 Applications
3.6 Bio-inspired algorithms

4.1 Introduction
4.2 Historical development of the paradigm
4.3 Bayes' theorem
4.4 Naive Bayes
4.5 Applications
4.6 Probabilistic graphical models

5.1 Introduction
5.2 Historical development of the paradigm
5.3 Artificial Neural Networks (ANNs)
5.4 The backpropagation algorithm
5.5 Applications
5.6 Review of ANN architectures

6.1 Introduction
6.2 Historical development of the paradigm
6.3 K-nearest neighbors
6.4 Support vector machines
6.5 Applications
6.6 Selected topics

https://github.com/afshinea/stanford-cs-229-machine-learning/tree/master/en

https://course.fast.ai/

https://www.fast.ai/

https://realpython.com/python-ai-neural-network/

https://huggingface.co/

Books

  1. François Fleuret’s Homepage. (n.d.). Retrieved June 16, 2023, from https://fleuret.org/francois/#lbdl

Notes

ICML. (n.d.). International Conference on Machine Learning. https://icml.cc/


Domingos, P. (2017). The Master Algorithm. The MIT Press.

GECCO. (2022). The Genetic and Evolutionary Computation Conference. https://gecco-2022.sigevo.org/HomePage


Mitchell, T. (1997). Machine Learning. McGraw Hill.


Murphy, K. (2012). Machine Learning: A Probabilistic Perspective. The MIT Press.


NeurIPS (2021). Conference on Neural Information Processing Systems.
https://nips.cc/

Bishop, C. M. (2006). Pattern Recognition and Machine Learning. Springer.


CVF. (n.d.). Computer Vision Foundation.
http://openaccess.thecvf.com/menu.py

Nunes, L. (2006). Fundamentals of Natural Computing: Basic Concepts, Algorithms, and Applications. Chapman & Hall/CRC. [Classic].

Russell, S., & Norvig, P. (2009). Artificial Intelligence: A Modern Approach. Pearson. [Classic].

Sucar, L. E. (2015). Probabilistic Graphical Models: Principles and Applications. Springer. [Classic].

Tan, P., Steinbach, M., Karpatne, A. & Kumar, V. (2018). Introduction to Data Mining (2nd ed.). Pearson.

  1. Clasificación con Árboles de Decisión: el algoritmo CART | Codificando Bits. (n.d.). Retrieved June 16, 2023, from https://www.codificandobits.com/blog/clasificacion-arboles-decision-algoritmo-cart/
  1. Sanz, F. (2020, November 30). Cómo funciona el algoritmo XGBoost en Python. The Machine Learners. https://www.themachinelearners.com/xgboost-python/
  1. Graff, M. (2022). Aprendizaje Computacional. https://ingeotec.github.io/AprendizajeComputacional/

Learning from Examples

When you have a dataset with features ($X$) and labels ($Y$), supervised learning means finding the mapping from $X$ to $Y$.

Your algorithm is going to learn from examples (aka supervised learning); that is, you have a training set, you apply a learning algorithm such as linear regression to it, and it produces a hypothesis (a model that maps input features to the target). You might ask how to get the training set, how to deploy the hypothesis, and how to know which method to apply; the answers come in the sections below.

graph TD
  TrainingSet --> LearningAlgorithm
  LearningAlgorithm -->  Hypothesis

Given the features $\chi$, the hypothesis $h$ is a predictor of the target $y$. The features $\chi$ denote the space of input values, and the target $y$ the space of output values. So the supervised learning goal is to find a good predictor $h: \chi \to y$.

class Model:
    def fit(self, training_set):
        # Apply a learning algorithm to the training examples
        # and generate a hypothesis.
        ...

    def predict(self, instances):
        # Apply the hypothesis to new instances.
        ...

We call it a regression problem when $y$ is continuous; when $y$ is discrete, we call it a classification problem.

Suppose you have a linear regression problem. You may represent $h$ as $h(x) = mx + b$, an affine function. More generally, $h(\mathbf{x}) = \boldsymbol{\theta} \mathbf{x}^T$ with $\boldsymbol{\theta} = [\theta_0, \theta_1, \theta_2, ..., \theta_n]$ and $\mathbf{x} = [x_0, x_1, ..., 1]$, where $\boldsymbol{\theta}$ are the "parameters" that allow us to make good predictions.
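A minimal NumPy sketch (the training set is made up): fitting the parameters $\theta$ of $h(x) = mx + b$ by least squares.

import numpy as np

# Made-up training set: y is roughly 2x + 1 plus noise.
x = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y = np.array([1.1, 2.9, 5.2, 7.1, 8.8])

# Design matrix [x, 1], so that h(x) = theta @ [x, 1] = m*x + b.
X = np.stack([x, np.ones_like(x)], axis=1)
theta, *_ = np.linalg.lstsq(X, y, rcond=None)
m, b = theta
print(f"h(x) = {m:.2f}x + {b:.2f}")  # close to 2x + 1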

Maximum likelihood estimation

Maximum likelihood estimation is the goal of training classifiers; that is, we find the parameters $\theta$ that maximize the probability of the actual observed data. $p(y=1|x;\theta)$ refers to the conditional probability that the output is the class $y=1$ given the input variables $x$ and the parameters of the model $\theta$.
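Written out (a standard formulation, assuming the $m$ training examples are independent and identically distributed), the estimate is:

$\hat{\theta} = \arg\max_{\theta} \prod_{i=1}^{m} p(y^{(i)} \mid x^{(i)}; \theta) = \arg\max_{\theta} \sum_{i=1}^{m} \log p(y^{(i)} \mid x^{(i)}; \theta)$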

independently and identically distributed

Types of problem

Pattern recognition

Recommender systems

Types of data

  1. Vector Data: This is the most common and simplest form of data in machine learning. The dataset is a 2D tensor where each individual data point can be encoded as a vector. Examples can be anything from housing price prediction data (features being the number of rooms, location, size of the house, etc.) to text data (after applying some sort of vectorization like bag-of-words or TF-IDF).
  1. Natural language.
  1. Timeseries or Sequence Data: Time series data captures a series of data points recorded over regular time intervals. The order of data points is important here because the same set of data points in a different time order might mean something entirely different. Sequence data is very similar, but time isn't necessarily a factor here. Examples of these are stock price data, weather data, or any type of data where time plays a crucial role. For sequence data, a sentence or a DNA sequence would be a good example as the order of words or genes is important.
  1. Image Data: Images are represented as 3D tensors (height, width, color_depth). However, a batch of images used for training a model is stored in a 4D tensor (batch_size, height, width, color_depth). Deep learning models like Convolutional Neural Networks (CNNs) are designed to extract features from these 4D tensors and use them to classify images, detect objects, and more. Applications range from medical imaging (detecting diseases) to self-driving cars (identifying pedestrians, signs, etc.).
  1. Video Data: Video data can be thought of as a series of images, so naturally, this extends the image data tensor by one more dimension, the frame dimension. So a video dataset would be a 5D tensor (batch_size, frames, height, width, color_depth). Video data is used in various applications like activity recognition, video synthesis, and object tracking in videos.
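A quick NumPy illustration of the tensor shapes described above (the sizes are arbitrary):

import numpy as np

vectors = np.zeros((100, 8))              # 2D: (samples, features)
series  = np.zeros((100, 50, 8))          # 3D: (samples, timesteps, features)
images  = np.zeros((100, 32, 32, 3))      # 4D: (batch_size, height, width, color_depth)
videos  = np.zeros((100, 16, 32, 32, 3))  # 5D: (batch_size, frames, height, width, color_depth)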

Data labeling

https://github.com/HumanSignal/awesome-data-labeling

Fine-tuning

What is Transfer Learning?

Datasets

Notebooks

https://www.youtube.com/watch?v=T-fAkfU9j_o&ab_channel=Elpensamientoenllamas

Natural Computing

NACO

https://fcampelo.github.io/EC-Bestiary/

black hole algorithm

Mandelbrot set from scratch, Markov text generation, and John Conway's Game of Life are good warm-up exercises.

Pattern Recognition

  1. Bishop, C. M. (2006). Pattern Recognition and Machine Learning. Springer. Available at: Microsoft Research
  1. Duda, R. O., Stork, D. G., & Hart, P. E. (2001). Pattern Classification (2nd ed.). Wiley.
  1. Fu, K. (1974). Syntactic Methods in Pattern Recognition. Academic Press.
  1. Jürgen, M. & Matthias, N. (2018). Pattern Recognition: Introduction, Features, Classifiers and Principles. Berlin: De Gruyter Oldenbourg (De Gruyter Graduate). Available at: EBSCOhost.
  1. Koutroumbas, K. & Theodoridis, S. (2009). Pattern Recognition. Academic Press. Available at: EBSCOhost
  1. Murty, M. N. & Devi, V. S. (2011). Pattern Recognition: An Algorithmic Approach. Springer London. Available at: EBSCOhost
  1. Massachusetts Institute of Technology. (n.d.). MIT OpenCourseWare.
    Pattern Recognition and Analysis.
    https://ocw.mit.edu/courses/media-arts-and-sciences/mas-622j-pattern-recognition-and-analysis-fall-2006/syllabus/

Deep Learning

https://people.idsia.ch/~juergen/deep-learning-history.html

François Chollet. Deep Learning with Python. Manning Publications (2021).

Tensor

Automatic differentiation

The purpose of the activation function is to introduce nonlinearities into the model. Artificial neural networks are inspired by real neural networks.

What is a gradient? An optimization method (gradient descent) follows it to reduce the loss.

The chain rule lets us pass gradients from a neuron's output back to its input.
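A minimal sketch of one neuron's forward and backward pass (the numbers are made up), showing the chain rule carrying the gradient from the output back to the input:

import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))  # the activation: a nonlinearity

# Forward propagation: z = w*x + b, a = sigmoid(z), loss = (a - y)^2
x, w, b, y = 0.5, 0.3, 0.1, 1.0
z = w * x + b
a = sigmoid(z)
loss = (a - y) ** 2

# Backpropagation (chain rule): dloss/dw = dloss/da * da/dz * dz/dw
dloss_da = 2 * (a - y)
da_dz = a * (1 - a)               # derivative of sigmoid at z
dloss_dw = dloss_da * da_dz * x   # gradient for the weight
dloss_dx = dloss_da * da_dz * w   # gradient passed back to the input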

Activation

Forward propagation

Backpropagation

https://playground.tensorflow.org

https://realpython.com/python-ai-neural-network/

Neural Networks and Deep Learning http://neuralnetworksanddeeplearning.com/

Batch

Input

Activations

Weights

Output

Optimizers


Hyperparameters

In a Keras model, hyperparameters such as optimizer, loss, and metrics have crucial roles in defining how the model will be trained and evaluated. Let's discuss each of these and the context in which they should be used:

  1. Optimizer: Optimizers in Keras help to adjust the attributes of your neural network such as weights and learning rate in order to reduce the losses. Different optimizers suit different kinds of problems and can significantly affect the model's performance and the speed of convergence.
    • SGD: Stochastic Gradient Descent, it's the most basic optimizer. It's robust but can be slow and sensitive to the learning rate choice.
    • RMSprop: Usually a good choice for recurrent neural networks.
    • Adam: A good default choice for many problems, it combines the advantages of RMSprop and SGD with momentum.
    • Adagrad, Adadelta, Adamax, Nadam: Other variants of optimizers, each with its strengths, but in most cases, Adam should suffice.
  1. Loss: Loss function or cost function is a method to calculate the disparity between the predicted output and the actual output. This is the function that the model will strive to minimize.
    • Mean Squared Error (MSE): Used for regression problems (predicting a continuous value).
    • Binary Cross-Entropy: Used for binary classification problems (predicting a yes/no outcome).
    • Categorical Cross-Entropy: Used for multi-class classification problems, where the outputs are one-hot encoded.
    • Sparse Categorical Cross-Entropy: Like categorical cross-entropy, but for integer targets.
    • Weighted cross-entropy loss: Used for unbalanced multi-class classification problems, where the outputs are one-hot encoded.
  1. Metrics: Metrics are used to judge the performance of your model. Choosing the right metric is essential to judge your model accurately.
    • Accuracy: Suitable for classification problems, especially if the classes are balanced.
    • Precision, Recall, F1-score: These are more informative than accuracy for binary classification, especially if the classes are imbalanced.
    • MSE, RMSE, MAE (Mean Absolute Error): Suitable for regression problems.

Other hyperparameters include the learning rate, batch size, and number of epochs.

Choosing the right hyperparameters often involves trial and error and can be guided by experience, knowledge about the problem and the data, or hyperparameter tuning techniques such as grid search or random search.
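Putting it together, a minimal Keras sketch (assuming TensorFlow is installed; the layer sizes and input shape are illustrative) of compiling a binary classifier with the hyperparameters discussed above:

import tensorflow as tf

model = tf.keras.Sequential([
    tf.keras.layers.Dense(64, activation="relu", input_shape=(20,)),
    tf.keras.layers.Dense(1, activation="sigmoid"),  # binary output
])
model.compile(
    optimizer="adam",            # a good default optimizer
    loss="binary_crossentropy",  # matches the binary classification task
    metrics=["accuracy"],        # add Precision/Recall if classes are imbalanced
)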

Convolutional neural networks

locality structure

Recurrent neural networks

  1. Neural networks are universal approximators of continuous functions: mathematical functions
  1. Recurrent neural networks are equivalent to Turing machines: algorithms
  1. The backpropagation algorithm will find a configuration of the network that imitates the behavior of the data

"Multilayer feedforward networks with a nonpolynomial activation function can approximate any function". Neural Networks. 6 (6): 861–867. Siegelmann, H. T., & Sontag, E. D. (1992, July). On the computational power of neural nets. In Proceedings of the fifth annual workshop on Computational learning theory (pp. 440-449).
Rumelhart, D. E., Hinton, G. E., & Williams, R. J. (1986). Learning representations by back-propagating errors. nature, 323(6088), 533-536.

[In English] Python AI: How to Build a Neural Network & Make Predictions

https://realpython.com/python-ai-neural-network/

[In Spanish, Google-translated] Python AI: Cómo construir una red neuronal y hacer predicciones https://realpython-com.translate.goog/python-ai-neural-network/?_x_tr_sl=en&_x_tr_tl=es&_x_tr_hl=en-US&_x_tr_pto=wapp

Sites

  1. Lewis, O. (2023). Awesome Artificial Intelligence (AI).

    [English] https://github.com/owainlewis/awesome-artificial-intelligence

Books

  1. Russell, S., & Norvig, P. (2021). Artificial Intelligence: A Modern Approach (4th US ed.). [English, official site] https://aima.cs.berkeley.edu/
  1. Zhang, A., Lipton, Z. C., Li, M., & Smola, A. J. (2020). Dive into Deep Learning. [English] https://d2l.ai/

Courses

  1. Deep Learning (NYU-DLSP21). (n.d.). https://atcold.github.io/NYU-DLSP21/

Videos

  1. Irving Vasquez. (2022, August 23). Introducción a las redes neuronales - Presentación del curso. [Spanish]

Understanding LSTM Networks:

http://colah.github.io/posts/2015-08-Understanding-LSTMs/

Recurrent Neural Networks and LSTM explained:

https://purnasaigudikandula.medium.com/recurrent-neural-networks-and-lstm-explained-7f51c7f6bbb9

Recurrent Neural Networks (RNN) and Long Short-Term Memory (LSTM): https://www.youtube.com/watch?v=WCUNPb-5EYI

https://homes.cs.washington.edu/~pedrod/

TensorFlow

TensorFlow APIs for training

TensorFlow offers the input pipeline (tf.data), Keras, and Estimator. Distributed processing is done with tf.distribute.

Why an input pipeline? Because the data might not fit in memory; a pipeline also utilizes hardware efficiently and decouples loading from preprocessing. ETL describes the typical stages of batch processing: the extract stage reads from memory or remote storage and parses the file format; the transform stage performs domain-specific transformations; the load stage transfers the data to the accelerator.

GPU/TPU processing power has a big gap over CPU processing power, so the input pipeline must keep the accelerator fed.

A typical batch-processing pipeline for deep learning looks like this:

import tensorflow as tf

# Read data from storage as a stream (the file name and parse_example are illustrative).
dataset = tf.data.TFRecordDataset("train.tfrecord")
# Apply pipeline operators; they execute as a dataflow graph.
dataset = (dataset
           .map(parse_example, num_parallel_calls=tf.data.AUTOTUNE)
           .shuffle(1024).batch(32)
           .prefetch(tf.data.AUTOTUNE))
# Build the architecture of the model with high-level APIs (e.g., Keras), then:
model.fit(dataset)

Optimizations include software pipelining, parallel transformation, and parallel extraction.

TensorFlow Datasets is a project to onboard new users with ready-to-use datasets.

Unsupervised Learning - Clustering

💡
You may want to apply DBSCAN by default.
https://towardsdatascience.com/the-5-clustering-algorithms-data-scientists-need-to-know-a36d136ef68
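A minimal sketch (assuming scikit-learn; the two-moons data is synthetic): DBSCAN finds density-based clusters without specifying the number of clusters up front.

from sklearn.cluster import DBSCAN
from sklearn.datasets import make_moons

X, _ = make_moons(n_samples=300, noise=0.05, random_state=0)
labels = DBSCAN(eps=0.2, min_samples=5).fit_predict(X)  # -1 labels noise points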

Knowledge in Learning

Learning Probabilistic Models

Reinforcement Learning

* Communicating, perceiving, and acting

MACTI (Temporal)

Induction

https://www.youtube.com/watch?v=9AwJrXAz9QA

https://www.youtube.com/watch?v=CDYLHa63ws4

https://www.youtube.com/watch?v=ERYgaGKaHoE

https://www.youtube.com/watch?v=KX4DdZeRAsI

https://www.youtube.com/watch?v=zdpDR_F2ovg

https://www.youtube.com/watch?v=eAmFytbeNTc

Morozov, E. (2023, April 3). Ni es inteligente ni es artificial: esa etiqueta es una herencia de la Guerra Fría. El País. https://elpais.com/ideas/2023-04-03/ni-es-inteligente-ni-es-artificial-esa-etiqueta-es-una-herencia-de-la-guerra-fria.html

Varios. (2023, March 10). Declaración de Montevideo sobre Inteligencia Artificial y su impacto en América Latina. https://www.fundacionsadosky.org.ar/declaracion-de-montevideo-sobre-inteligencia-artificial-y-su-impacto-en-america-latina/

Podcast T4-E06-Sebastián Ramírez-Contribuyendo al Opensource • Saturdays.AI. (n.d.). Retrieved June 12, 2023

[Podcast] https://saturdays.ai/2022/09/07/podcast-t4-e06-sebastian-ramirez-contribuyendo-al-opensource/

GPT-3: La supernova del modelado del lenguaje | Ivan Vladimir Meza Ruiz. (2023). Personal blog. https://turing.iimas.unam.mx/~ivanvladimir/posts/chat-gpt/

Programming in Python

Others

https://github.com/ivanvladimir/Proyectos-MeIA/tree/main

Explainability of Complex Machine Learning Models

graph TD
  ExplanationMethods --> ExplainableByDesign --> ExplanationsForGlassBoxes
  ExplainableByDesign --> EngineeredExplanations
  ExplanationMethods --> PostHocExplanationsForBlackBoxModels
  PostHocExplanationsForBlackBoxModels --> Local
  Local --> Counterfactuals
  Local --> FeatureImportance
  FeatureImportance-->LIME
  FeatureImportance-->SHAP
  FeatureImportance-->DALEX
  FeatureImportance-->NAM
  FeatureImportance-->CIU
  FeatureImportance-->GRADCAM
  FeatureImportance-->IG
  Local --> Prototypes
  PostHocExplanationsForBlackBoxModels --> Global
  Global--> Prototypes
  Global-->SetOfLocalExplanations
  Global-->ModelDistillation

  

Post-hoc local feature importance, rule-based explanations, prototypes, counterfactuals.
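A minimal post-hoc local feature-importance sketch (assuming the shap package and scikit-learn; the model and data are illustrative):

import shap
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier

X, y = load_iris(return_X_y=True)
model = RandomForestClassifier(random_state=0).fit(X, y)  # the black box
explainer = shap.TreeExplainer(model)        # model-specific SHAP explainer
shap_values = explainer.shap_values(X[:5])   # local attributions per feature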

Perception

Robotics

*Philosophical Foundations

Weak AI: Can Machines Act Intelligently?

Strong AI: Can Machines Really Think?

The Ethics and Risks of Developing Artificial Intelligence

AI: Present and Future

Build your product with Artificial Intelligence

Low level

TensorFlow

OpenCV

dlib

Mid level

https://developers.google.com/mediapipe

High level

face_recognition

https://github.com/steven2358/awesome-generative-ai

TODO

independently and identically distributed