
# Probability

Tags: Computer science, Math, Philosophy, Science. Dates: January 21, 2021; June 12, 2022.

# Resources

• Definition and examples of outcome | define outcome - Probability - Free Math Dictionary Online. Icoachmath.com. Retrieved June 10, 2021, from http://www.icoachmath.com/math_dictionary/outcome.html

• Set theory

# Story

## Characters

Andrey Kolmogorov

# Key questions

## Why study probability?

Love of wisdom.

Also, more practically: if you want to win against someone in a game of chance, probability is the tool.

"I am not much given to regret, so I puzzled over this one a while. Should have taken much more statistics in college, I think." (Max Levchin, PayPal co-founder and Slide founder.) Quote of the week on the website of the American Statistical Association, November 23, 2010.


# Probability

Outcome. A possible result of an experiment.

Sample space. The set of all outcomes of an experiment. We call it $S$.

Event. A subset of the sample space.

Population. Defined later, in the descriptive statistics section.

Experiment. Any activity or process whose outcome is subject to uncertainty.

What is the goal of probability? Probability measures the chance that an event $A$ will occur, denoted $P(A)$. Probability does not by itself say which decisions are good, and it does not predict the future!

How should we think about the elements of a sample space? Remember that each element is ontologically distinct: A is A (the law of identity). For example, in a set of books, each book has some characteristic that distinguishes it from every other.

## A Set Theory Dictionary for probability problems 

If an outcome is not an element of $A$, then it is an element of the complement $A^c$.
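A minimal sketch of this dictionary in Python, using built-in `set` operations; the die-roll sample space and events below are made-up examples:

```python
S = {1, 2, 3, 4, 5, 6}   # sample space: one roll of a die
A = {2, 4, 6}            # event: the roll is even

A_c = S - A              # complement A^c: "A does not occur"
print(A_c)               # {1, 3, 5}

# An outcome that is not an element of A is an element of A^c:
print(all(x in A_c for x in S if x not in A))  # True

B = {4, 5, 6}            # event: the roll is at least 4
print(A | B)             # union: "A or B occurs"
print(A & B)             # intersection: "A and B occur"
```

Union, intersection, and complement on sets are exactly "or", "and", and "not" on events.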

## Probability approaches

### Classical Approach or Naive Probability

Let $S$ be a finite sample space, let $A\subseteq S$ be an event, and assume all outcomes are equally likely. Then $P(A)=\dfrac{|A|}{|S|}$.

• Example. Birthday problem. There are $k$﻿ people in a room. Assume each person's birthday is equally likely to be any of the 365 days of the year (we exclude February 29), and that people's birthdays are independent (we assume there are no twins in the room). What is the probability that two or more people in the group have the same birthday?
• Frank, P.; Goldstein, S.; Kac, M.; Prager, W.; Szegö, G.; Birkhoff, G., eds. (1964). Selected Papers of Richard von Mises. 2. Providence, Rhode Island: Amer. Math. Soc. pp. 313–334.

Assume the outcomes are equally likely; then the naive definition of probability applies.

$S=\text{All possible k persons' birthday sequences in a room}\\=\{ (x_1,x_2,...,x_k) | x_1\in [1,365],x_2\in [1,365],...,x_k\in [1,365] \}$﻿

But computing $|\text{At least 1 birthday match}|$ directly is hard, so we work with its complement.

Counting sequences with no repeated birthday: the first person has 365 choices, the second 364, and so on, so $|\text{No birthday match}| = 365\cdot364\cdots(365-k+1)$, a number of permutations (not $365!$). Hence $P(\text{No birthday match})=\dfrac{365\cdot364\cdots(365-k+1)}{365^k}$.

Therefore $P(\text{At least 1 birthday match})=1-\dfrac{365\cdot364\cdots(365-k+1)}{365^k}$.

Figure: probability that in a room of $k$ people at least two were born on the same day. This probability first exceeds 0.5 when $k=23$. For $k\ge366$ we are guaranteed a match, by the pigeonhole principle.
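The complement computation is easy to check numerically; a small sketch (the function name `p_match` is mine):

```python
def p_match(k):
    """P(at least two of k people share a birthday), 365 equally likely days."""
    p_no_match = 1.0
    for i in range(k):
        p_no_match *= (365 - i) / 365   # 365/365 * 364/365 * ... * (365-k+1)/365
    return 1 - p_no_match

print(p_match(22))   # just under 1/2
print(p_match(23))   # just over 1/2: 23 people already make a match more likely than not
```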

## The Personal Opinion (Subjective) Approach

Probability as a personal degree of belief. On its own, the least reliable kind of knowledge.

## Relative Frequency Theory

An interpretation of probability rooted in statistics: repeat the experiment $n$ times and take the relative frequency $\dfrac{\#\text{ of times } A \text{ occurs}}{n}$ as an approximation of $P(A)$, which it approaches as $n$ grows.
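A quick relative-frequency sketch, simulating a fair coin (the seed and number of repetitions are arbitrary choices):

```python
import random

random.seed(42)                 # reproducible run
n = 100_000                     # number of repetitions of the experiment
heads = sum(random.random() < 0.5 for _ in range(n))
print(heads / n)                # relative frequency, close to P(heads) = 0.5
```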

## Modern Approach or Axiomatic Probability

The general definition of probability. A probability space consists of a sample space $S$; an event space $\mathcal{F}$, a collection of subsets of $S$; and a probability function $P$, which takes an event $A\in\mathcal{F}$ and returns $P(A)$, satisfying the following axioms:

1. $P(A)\ge0$ for every event $A$.
1. $P(S)=1$.
1. If $A_1,A_2,\dots$ are disjoint events, then $P\left(\bigcup_{j}A_j\right)=\sum_{j}P(A_j)$.

From the axioms it follows that $P$ assigns each event a real number between 0 and 1 as output, i.e. $0\le P(A)\le1$.

Theorem 1.1. Let $A_1,A_2,\dots$ be mutually exclusive events. Then $P\left(\bigcup_i A_i\right)=\sum_i P(A_i)$.

Theorem 1.2. $P(A^c)=1-P(A)$.

• Proof.

$A$ and $A^c$ are disjoint events and their union is $S$.

By the second axiom and the first theorem, $1=P(S)=P(A\cup A^c)=P(A)+P(A^c)$, so $P(A^c)=1-P(A)$.

Theorem 1.3

• Proof.

Theorem 1.4 (Inclusion-exclusion). For any events $A_1,\dots,A_n$,

$P\left(\bigcup_{i=1}^n A_i\right)=\sum_i P(A_i)-\sum_{i<j}P(A_i\cap A_j)+\sum_{i<j<k}P(A_i\cap A_j\cap A_k)-\cdots+(-1)^{n+1}P(A_1\cap\cdots\cap A_n)$

• Proof.

## Worked examples.

• A die is a cube whose 6 sides are labeled with the integers from 1 to 6. The die is fair if all 6 sides are equally likely to come up on top when the die is rolled. The plural form of "die" is "dice". Why $P(\text{the total after rolling 4 fair dice is 21})>P(\text{the total after rolling 4 fair dice is 22})?$﻿

$S=\text{4 fair dice sequences}=\{(d_1,d_2,d_3,d_4)\mid d_i\in[1,6]\}$, with $|S|=6^4=1296$ equally likely outcomes.

The possible totals are the integers from 4 to 24, but the totals are not equally likely, so we must count sequences.

$\text{Total is 21}$: the sequences summing to 21 are the rearrangements of $(6,5,5,5)$, of $(6,6,6,3)$, and of $(6,6,5,4)$.

$|\text{Total is 21}|=\dfrac{4!}{3!}+\dfrac{4!}{3!}+\dfrac{4!}{2!}=4+4+12=20$

$\text{Total is 22}$: the sequences summing to 22 are the rearrangements of $(6,6,6,4)$ and of $(6,6,5,5)$.

$|\text{Total is 22}|=\dfrac{4!}{3!}+\dfrac{4!}{2!\,2!}=4+6=10$

$P(\text{the total after rolling 4 fair dice is 21})=\dfrac{|\text{Total is 21}|}{|S|}=\dfrac{20}{1296}$

$P(\text{the total after rolling 4 fair dice is 22})=\dfrac{|\text{Total is 22}|}{|S|}=\dfrac{10}{1296}$
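The counts can be verified by brute-force enumeration of all $6^4$ equally likely ordered sequences:

```python
from itertools import product

rolls = list(product(range(1, 7), repeat=4))   # all ordered 4-dice outcomes
n21 = sum(sum(r) == 21 for r in rolls)
n22 = sum(sum(r) == 22 for r in rolls)
print(len(rolls), n21, n22)   # 1296 20 10
print(n21 / len(rolls), n22 / len(rolls))
```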

• A palindrome is an expression such as "A man, a plan, a canal: Panama" that reads the same backwards as forwards, ignoring spaces, capitalization, and punctuation. Assume for this problem that all words of the specified length are equally likely, that there are no spaces or punctuation, and that the alphabet consists of the lowercase letters a,b,…,z. Why is $P(\text{a random 2-letter word is a palindrome})=P(\text{a random 3-letter word is a palindrome})?$

$P(\text{a random 2-letter word is a palindrome})=\dfrac{|\text{2-letter palindrome}|}{|\text{2-letter word}|}=\dfrac{26}{26^2}=\dfrac{1}{26}$﻿

$P(\text{a random 3-letter word is a palindrome})=\dfrac{|\text{3-letter palindrome}|}{|\text{3-letter word}|}=\dfrac{26\cdot26}{26^3}=\dfrac{1}{26}$﻿
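A brute-force check of both counts (the helper `pal_fraction` is mine):

```python
from itertools import product

letters = "abcdefghijklmnopqrstuvwxyz"

def pal_fraction(length):
    # Fraction of equally likely words of the given length that are palindromes.
    words = [''.join(w) for w in product(letters, repeat=length)]
    return sum(w == w[::-1] for w in words) / len(words)

print(pal_fraction(2))   # 26 / 26^2 = 1/26
print(pal_fraction(3))   # 26*26 / 26^3 = 1/26
```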

• Three people get into an empty elevator at the first floor of a building that has 10 floors. Each presses the button for their desired floor (unless one of the others has already pressed that button). Assume that they are equally likely to want to go to any of floors 2 through 10, independently of each other. What is the probability that the buttons for 3 consecutive floors are pressed?

There are $9^3=729$ equally likely choices. The consecutive triples $\{2,3,4\},\dots,\{8,9,10\}$ number 7, each realizable in $3!=6$ orders, so $P=\dfrac{7\cdot3!}{9^3}=\dfrac{42}{729}=\dfrac{14}{243}$.
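Enumerating the choices confirms the answer $14/243$ (floors 2 through 10, i.e. 9 choices per person, are assumed, consistent with that answer):

```python
from fractions import Fraction
from itertools import product

floors = range(2, 11)                      # desired floors: 2 through 10
consecutive = 0
total = 0
for c in product(floors, repeat=3):
    total += 1
    s = sorted(set(c))
    if len(s) == 3 and s[2] - s[0] == 2:   # three distinct consecutive floors
        consecutive += 1
print(Fraction(consecutive, total))        # 14/243
```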

• Why is the probability that all 3 people in a group of 3 were born on January 1 less than the probability that in a group of 3 people, one was born on January 1, another on January 2, and the remaining one on January 3? (There is only one assignment of people to birthdays in the first case, but $3!=6$ in the second, so the second event is 6 times as likely.)
• Martin and Gale play an exciting game of "toss the coin," where they toss a fair coin until the pattern HH occurs (two consecutive Heads) or the pattern TH occurs (Tails followed immediately by Heads). Martin wins the game if and only if the first appearance of the pattern HH occurs before the first appearance of the pattern TH. Note that this game is scored with a 'moving window'; that is, in the event of TTHH on the first four flips, Gale wins, since TH appeared on flips two and three before HH appeared on flips three and four. Why is it true that Martin is less likely to win? Because as soon as a Tails is tossed, TH is guaranteed to occur before HH, so Martin wins only if the first two flips are HH.
• Elk dwell in a certain forest. There are $N$ elk, of which a simple random sample of size $n$ are captured and tagged ("simple random sample" means that all $N \choose n$ sets of $n$ elk are equally likely). The captured elk are returned to the population, and then a new sample is drawn, this time with size $m$. This is an important method that is widely used in ecology, known as capture-recapture. What is the probability that exactly $k$ of the $m$ elk in the new sample were previously tagged? (Assume that an elk that was captured before doesn't become more or less likely to be captured again.) The answer is the hypergeometric probability $\dfrac{{n\choose k}{N-n\choose m-k}}{{N\choose m}}$.

• Montmort's matching problem.

## R, "vector thinking"

To create a vector:

```r
# name <- c(values)
vector <- c(3, 1, 4, 1, 5, 9)
```

R is a structured language: calls look like `fn(parameters)`.

To get the largest value:

```r
max(vector)
```

To simulate an experiment, draw from a sample space with `sample`:

```r
sample(6, 2)  # draw 2 values from 1:6 without replacement
```

# Conditional probability

## Definition.

Let $A$ and $B$ be events with $P(B)>0$. The conditional probability of $A$ given that the event $B$ has occurred ("$A$ given $B$"), denoted $P(A|B)$, is defined as

$P(A|B)=\dfrac{P(A\cap B)}{P(B)}$

Conditional probability models a learning process: $P(A)$ is our knowledge of event $A$ before the experiment takes place (the prior probability of $A$), and $B$ is the evidence we observe; $P(A|B)$ is the updated probability after seeing $B$.

$A\cap B$ means that $A$ and $B$ happen simultaneously, while $A|B$ means that $A$ happens given that $B$ has happened.

## Worked examples

• Mr. Jones has two children. The older is a girl. What is the probability that both children are girls? Mr. Smith has two children. At least one of them is a boy. What is the probability that both children are boys? This was posed by Martin Gardner in Scientific American.

Assume that gender is binary, $P(boy)=P(girl)$﻿, and that the genders of two children are independent.

$\text{Children}=\{GG,GB,BG,BB\}$, where the first position indicates the elder child.

$P(\text{both girls}|\text{elder is a girl})=\dfrac{P(\text{both girls }\cap \text{elder is a girl})}{P(\text{elder is girl})}=\dfrac{1/4}{2/4}=1/2$﻿

$P(\text{both boys}|\text{at least one boy})=\dfrac{P(\text{both boys }\cap \text{at least one boy})}{P(\text{at least one boy})}=\dfrac{1/4}{3/4}=1/3$﻿
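Both answers can be read off by enumerating the four equally likely families:

```python
from fractions import Fraction

families = ["GG", "GB", "BG", "BB"]    # first letter = elder child

elder_girl = [f for f in families if f[0] == "G"]
print(Fraction(elder_girl.count("GG"), len(elder_girl)))              # 1/2

at_least_one_boy = [f for f in families if "B" in f]
print(Fraction(at_least_one_boy.count("BB"), len(at_least_one_boy)))  # 1/3
```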

• A spam filter is designed by looking at commonly occurring phrases in spam. Suppose that 80% of email is spam. In 10% of the spam emails, the phrase "free money" is used, whereas this phrase is only used in 1% of non-spam emails. A new email has just arrived, which does mention "free money". What is the probability that it is spam?

$A:\text{ event that an email is spam}$

$B:\text{ event that an email mentions "free money"}$

By Bayes' rule and the law of total probability,

$P(A|B)=\dfrac{P(B|A)P(A)}{P(B|A)P(A)+P(B|A^c)P(A^c)}=\dfrac{0.10\times0.80}{0.10\times0.80+0.01\times0.20}=\dfrac{0.080}{0.082}\approx0.976$
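A numeric sketch of the Bayes' rule computation for this filter (variable names are mine):

```python
p_spam = 0.80            # P(A): prior probability that an email is spam
p_fm_spam = 0.10         # P(B|A): "free money" appears in spam
p_fm_ham = 0.01          # P(B|A^c): "free money" appears in non-spam

# Law of total probability for the denominator, then Bayes' rule:
p_fm = p_fm_spam * p_spam + p_fm_ham * (1 - p_spam)
p_spam_fm = p_fm_spam * p_spam / p_fm
print(p_spam_fm)         # 0.08 / 0.082, roughly 0.976
```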

• The screens used for a certain type of cell phone are manufactured by 3 companies, A, B, and C. The proportions of screens supplied by A, B, and C are 0.5, 0.3, and 0.2, respectively, and their screens are defective with probabilities 0.01, 0.02, and 0.03, respectively. Given that the screen on such a phone is defective, what is the probability that Company A manufactured it?

$P(A)=0.5,P(B)=0.3,P(C)=0.2$﻿

$P(D|A)=0.01,P(D|B)=0.02,P(D|C)=0.03$﻿

$P(A|D)=\dfrac{P(D|A)P(A)}{P(D|A)P(A)+P(D|B)P(B)+P(D|C)P(C)}=\dfrac{0.5\times0.01}{0.5\times0.01+0.3\times0.02+0.2\times0.03}=\dfrac{0.005}{0.017}\approx0.294$

A family has 3 children, creatively named $A$﻿, $B$﻿, and $C$﻿.

• Discuss intuitively whether the event "$A$﻿ is older than $B$﻿" is independent of the event "$A$﻿ is older than $C$﻿".

You might think: "they are independent events, since neither causes the other." But that is not true; they are dependent. Write $x>y$ to mean that $x$ is older than $y$. Learning that $A>C$ is evidence that $A$ is relatively old, which makes $A>B$ more likely. The effect is clearest with many children: if there are $n$ children $A_0,A_1,\dots,A_{n-1}$, then learning that $A_0$ is older than all of $A_1,\dots,A_{n-2}$ makes it very likely that $A_0$ is the eldest, and hence older than $A_{n-1}$ as well.

Therefore the events are dependent: conditioning on one raises the probability of the other.

• Find the probability that $A$﻿ is older than $B$﻿, given that $A$﻿ is older than $C$﻿.

$R:\text{A is older than B}$﻿

$T:\text{A is older than C}$﻿

$S=\{ABC,ACB,BAC,BCA,CAB,CBA\}$﻿, where each element is a birth order.

$P(R\cap T)=P(\text{A is the eldest child})=\dfrac{|\{ABC,ACB\}|}{|S|}=\dfrac{2}{6}=\dfrac{1}{3}$

$P(T)=\dfrac{|\{ABC,ACB,BAC\}|}{|S|}=\dfrac{3}{6}=\dfrac{1}{2}$

$P(R|T)=\dfrac{P(R\cap T)}{P(T)}=\dfrac{1/3}{1/2}=\dfrac{2}{3}$
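Enumerating the six equally likely birth orders checks the result:

```python
from fractions import Fraction
from itertools import permutations

orders = list(permutations("ABC"))     # eldest listed first; 6 equally likely
T = [o for o in orders if o.index("A") < o.index("C")]   # A older than C
R_T = [o for o in T if o.index("A") < o.index("B")]      # ...and older than B
print(Fraction(len(R_T), len(T)))      # 2/3
```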

# Overview and descriptive statistics

Data.

A statistician collects data.

Population. The set of all measurements of interest to the experimenter.

Sample. A subset of the population. (What do we measure?)

Variable. A characteristic that varies across experimental units. Example: hair color.

Experimental units. The objects on which a variable is measured. (Who do we measure? Think of each unit as a black box we probe.)

A measurement, or datum, results when a variable is actually observed on an experimental unit.

A set of measurements, called data, can be either a sample or a population. I.e. $\text{Measurements} = data = sample$﻿ or $\text{Measurements} = data = population$﻿.

Variable types. Qualitative variables measure a characteristic; quantitative variables measure a numerical quantity, either discrete (countable) or continuous (not countable).

How many variables have you measured?

• Univariate data.
• Bivariate data.
• Multivariate data.

#### Statistics

| Descriptive statistics | Inferential statistics |
| --- | --- |
| We can enumerate the population easily. | We cannot enumerate the population easily, so we choose a sample. |
| Describes the population; no inference needed, conclusions follow directly. | Inference (i.e., supported conclusions) about the population from samples. |

We take samples to make inferences about the population, and then predict the future behavior of the black box, secure stable knowledge, and make decisions. Remember: past results are no guarantee of future performance. "There are three kinds of lies: lies, damned lies, and statistics." Make statistics work for you, not lie for you!

# Inferential statistics

1. Define the objective.
1. Design of the experiment.
1. Collect data with math standard.
1. Make inferences.
1. Determine reliability of the inference.

## Graphing Variables

Use a data distribution to describe:

• What values of the variables have been measured?
• How often each value has occurred?

# Descriptive Statistics

Measures of Location

Measures of Variability

Chebyshev theorem

Z-score

A z-score (also called a standard score) tells you how many standard deviations a data point lies from the mean: $z=\dfrac{x-\mu}{\sigma}$.

Standard deviation.
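A small z-score sketch with the Python standard library (the data set is made up):

```python
import statistics

data = [2, 4, 4, 4, 5, 5, 7, 9]
mu = statistics.mean(data)        # 5
sigma = statistics.pstdev(data)   # population standard deviation: 2
z = (9 - mu) / sigma
print(z)                          # 9 sits 2 standard deviations above the mean
```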

# Probability

Definition. Probability is the logic of uncertainty.

## How to count?

If you want to count outcomes, how to count them?

Experiment.

Random experiment.

Sample space. The set of all possible outcomes.

Event. Set of possible outcomes.

Elementary event. An event containing only one outcome. An elementary event $\{s\}$ and the outcome $s$ are often written interchangeably for simplicity.

$|S|=\text{number of outcomes}$

Experimental probability is probability determined from the results of an experiment repeated many times: we compute the probability of a future event from our observations of past events.

Theoretical probability is probability that is determined on the basis of reasoning. Axiomatic.

Combination

Bayes

Conditional

# Random Variables

## Probability Distributions for Discrete Random Variables.

If X is a discrete random variable, the function given by f(x)=P(X=x) for each x within the range of X is called the probability distribution of X.

## Probability Distributions for Continuous Random Variables. Probability density function.

$F(x)=P(X\le x)=\int_{-\infty}^{x} f(t)\,dt$

where $F(x)$ is the (cumulative) probability distribution function and $f(x)$ the density function.

$\sigma_X=\sqrt{Var(X)}$

```javascript
// Standard deviation of a discrete distribution.
// Assumes table is an array of [value, probability] pairs and mu is the mean
// (the pair layout is an assumption; the original weighted by the value itself).
const sigma = (table, mu) =>
  Math.sqrt(
    table.reduce((acc, [x, p]) => acc + Math.pow(x - mu, 2) * p, 0)
  );
```


# Distributions

## Discrete Uniform Distribution

### What does the random variable characterize or measure?

The discrete uniform distribution is characterized by a constant probability $1/n$ over the $n=b-a+1$ integer values $x\in[a,b]$ of a discrete random variable. Its parameters are $a$ and $b$.

### Formula and graph of the distribution.

The probability mass function of a discrete uniform random variable is $f(x)=\dfrac{1}{b-a+1}=\dfrac{1}{n}$ for $x\in[a,b]$, where the $n$ support values are distinct.

Its cumulative distribution function: for $i\in[a,b]$, $F(i;a,b)=\dfrac{\lfloor i\rfloor-a+1}{b-a+1}$, where $i$ is the argument of the function, $a$ is the smallest value of the support, and $b$ the largest.

https://dk81.github.io/dkmathstats_site/rmath-uniform-plots.html

$M_X(t)=\dfrac{e^{at}-e^{(b+1)t}}{n(1-e^t)}$﻿

### Mean.

$E(X)=\dfrac{a+b}{2}$

### Variance.

$Var(X)=\dfrac{(b-a+1)^2-1}{12}$
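A quick check of the mean and variance formulas on a fair die ($a=1$, $b=6$, so $n=6$); note the discrete variance is $\frac{(b-a+1)^2-1}{12}$, not the continuous $\frac{(b-a)^2}{12}$:

```python
from fractions import Fraction

a, b = 1, 6
n = b - a + 1
values = range(a, b + 1)
mean = Fraction(sum(values), n)
var = sum((x - mean) ** 2 for x in values) / n
print(mean)   # (a+b)/2 = 7/2
print(var)    # ((b-a+1)^2 - 1)/12 = 35/12
```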

## Bernoulli Distribution

### What does the random variable characterize or measure?

It models a single experiment with two possible results, "success" and "failure", with probabilities $p$ and $1-p$ respectively; the number of successes then has a Bernoulli distribution. Its parameter is $p$.

### Formula and graph of the distribution.

$f(x,p)=p^x(1-p)^{1-x}$ for $x=0,1$, so that $P(X=1)=f(1,p)=p$ and $P(X=0)=f(0,p)=1-p$.

$F(k,p)=\begin{cases} 0 & k<0 \\ 1-p & 0\le k<1 \\ 1 & k\ge1 \end{cases}$

$M_X(t)=q+pe^t$, where $q=1-p$.

### Mean.

$E(X)=p$

### Variance.

$Var(X)=pq$

## Binomial Distribution

### What does the random variable characterize or measure?

1. $n$ Bernoulli trials.
1. The trials are identical and independent, i.e., the probability of success $p$ does not change from one trial to the next.
1. The random variable is the number of successes obtained in the $n$ trials.

### Formula and graph of the distribution.

$f(x;n,p)={n \choose x}p^x(1-p)^{n-x}$, where $n$ is the number of trials and $x\in[0,n]$.

$F(x,n,p)=\sum^{\lfloor x\rfloor}_{i=0}{n\choose i}p^i(1-p)^{n-i}$ (a partial sum of the pmf).

$M_X(t)=(1-p+pe^t)^n$﻿

### Mean.

$E(X)=np$

### Variance.

$Var(X)=npq$
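The pmf, mean, and variance can be checked directly by enumeration ($n$ and $p$ here are arbitrary):

```python
from math import comb

n, p = 10, 0.3
pmf = [comb(n, x) * p**x * (1 - p)**(n - x) for x in range(n + 1)]

print(sum(pmf))                                    # 1 (up to rounding)
mean = sum(x * f for x, f in enumerate(pmf))
var = sum((x - mean) ** 2 * f for x, f in enumerate(pmf))
print(mean, var)                                   # np = 3, npq = 2.1
```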

## Multinomial Distribution

### What does the random variable characterize or measure?

It is the generalization of the binomial distribution to $k$ categories or events instead of 2 (success or failure). Its parameters are $n>0$ and $p_1,\dots,p_k$ with $\sum p_i=1$.

### Formula and graph of the distribution.

$f(x)=\dfrac{n!}{x_1!\,x_2!\cdots x_k!}p_1^{x_1}\cdots p_k^{x_k}$

$M_X(t)=(\sum^k_{i=1}p_ie^{t_i})^n$﻿

### Mean.

$E(X_i)=np_i$

### Variance.

$Var(X_i)=np_i(1-p_i)$

## Geometric Distribution

### What does the random variable characterize or measure?

1. A sequence of Bernoulli trials.
1. Identical and independent, with the same probability of success $p$ (the parameter), so that $P(X=1)=P[\text{success on the first trial}]=p$.
1. The random variable is the number of trials $x$ needed to obtain the first success. Its sample space is $S=\{S,FS,FFS,FFFS,\dots\}$ (S = success, F = failure).

### Formula and graph of the distribution.

$f(x)=(1-p)^{x-1}p$, where $0<p\le1$ and $x=1,2,3,\dots$

$F(x)=1-(1-p)^x$

$M_X(t)=\dfrac{pe^t}{1-(1-p)e^t}$, for $t<-\ln(1-p)$.

### Mean.

$E(X)=\dfrac{1}{p}$

### Variance.

$Var(X)=\dfrac{1-p}{p^2}$
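A numeric sanity check of the mean and variance, truncating the infinite support far out where the tail is negligible ($p$ is arbitrary):

```python
p = 0.25
xs = range(1, 2000)                        # truncation of the infinite support
pmf = [(1 - p) ** (x - 1) * p for x in xs]

mean = sum(x * f for x, f in zip(xs, pmf))
var = sum((x - mean) ** 2 * f for x, f in zip(xs, pmf))
print(sum(pmf))   # 1 (up to truncation)
print(mean)       # 1/p = 4
print(var)        # (1-p)/p^2 = 12
```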

## Negative Binomial Distribution

### What does the random variable characterize or measure?

1. A sequence of Bernoulli trials, identical and independent, with the same probability of success $p$ (the parameter).
1. Trials are observed until exactly $r$ successes have occurred, where $r$ is fixed by the experimenter.
1. The random variable is the number of trials $x$ needed to obtain the $r$ successes.

### Formula and graph of the distribution.

$f(x)={x-1\choose r-1}(1-p)^{x-r}p^r$ for $r=1,2,3,\dots$ and $x=r,r+1,r+2,\dots$

$M_X(t)=\left(\dfrac{pe^t}{1-(1-p)e^t}\right)^r$ for $t<-\ln(1-p)$.

### Mean.

$E(X)=\dfrac{r}{p}$

### Variance.

$Var(X)=\dfrac{r(1-p)}{p^2}$

## Poisson Distribution

### What does the random variable characterize or measure?

The random variable measures the number of occurrences $x$ of a specified event within a given unit of time, length, or space $s$, during which an average of $\lambda$ such events per unit can be expected (the rate of occurrences). The events occur at random and independently of one another. The distribution arises from the need to handle large binomial distributions.

The Poisson distribution is often used in situations where we are counting the number of successes in a particular region or interval of time, and there are a large number of trials, each with a small probability of success. The Poisson paradigm is also called the law of rare events. The interpretation of "rare" is that the $p_j$ are small, not that $\lambda$ is small.

### Formula and graph of the distribution.

$f(x)=\dfrac{e^{-k}k^x}{x!}$ for $x=0,1,2,\dots$ and $k>0$,

where $k=\lambda s$, $\lambda$ is the average number of events per unit, and $s$ is the size of the observation period.

$M_X(t)=e^{k(e^t-1)}$

### Mean

$E(X)=k$

### Variance

$Var(X)=k$
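The "large $n$, small $p$" origin can be seen numerically: with $n=1000$ and $p=0.002$ (so $k=np=2$), the binomial pmf is close to the Poisson pmf. The numbers here are arbitrary:

```python
from math import comb, exp, factorial

n, p = 1000, 0.002
k = n * p                                   # Poisson parameter, k = 2

for x in range(6):
    binom = comb(n, x) * p**x * (1 - p)**(n - x)
    pois = exp(-k) * k**x / factorial(x)
    print(x, binom, pois)                   # the two columns nearly agree
```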

## Hypergeometric Distribution

### What does the random variable characterize or measure?

1. The experiment consists of drawing a random sample of size $n$, without replacement and without regard to order, from a set of $N$ objects.
1. Of the $N$ objects, $r$ have the trait (characteristic) of interest, while the other $N-r$ objects do not.
1. The random variable is the number of objects in the sample that have the trait.

### Formula and graph of the distribution.

$f(x)=\dfrac{{r \choose x}{N-r \choose n-x}}{{N \choose n}}$, where $N$, $r$, and $n$ are positive integers and the parameters,

such that $\max(0,\,n-(N-r))\le x\le\min(n,r)$.

The cumulative distribution function can be expressed in terms of the generalized hypergeometric function.

### Mean.

$E(X)=n\dfrac{r}{N}$

### Variance.

$Var(X)=n\dfrac{r(N-r)(N-n)}{N^2(N-1)}$

### Considerations.

• If the number of units sampled ($n$) is small relative to the number of objects from which the sample is drawn ($N$), then the binomial distribution can be used to approximate hypergeometric probabilities.
• A general rule is that the approximation is usually satisfactory if $n/N\le0.05$.
• If $n$ is small relative to $N$, the composition of the sampled group does not change much from one draw to the next, even though sampled objects are kept out. Thus the probability of success does not change appreciably from one draw to the next and, for any practical purpose, can be treated as a constant.
• Hence the distribution of $X$, the number of successes obtained in $n$ draws, can be approximated by the binomial distribution with parameters $n$ and $p=r/N$.
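The binomial approximation above can be checked numerically; here $N$, $r$, $n$ are arbitrary values chosen so that $n/N=0.01\le0.05$:

```python
from math import comb

N, r, n = 1000, 300, 10
p = r / N

def hyper(x):
    # Hypergeometric pmf: x tagged objects in a sample of n from N, r tagged.
    return comb(r, x) * comb(N - r, n - x) / comb(N, n)

def binom(x):
    return comb(n, x) * p**x * (1 - p)**(n - x)

worst = max(abs(hyper(x) - binom(x)) for x in range(n + 1))
print(worst)   # the largest pointwise gap between the two pmfs is small
```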

# Worked examples

1. A random variable is:

A function whose domain is the sample space.

2. So that $f(y)=P(Y=y)$ is given for every $y$, the probability distribution of a discrete variable $Y$ can be represented by:

A graph, a table, or a formula.

6.31 A point $D$ is chosen on the line segment $AB$, whose midpoint is $C$ and whose length is $a$. If $X$, the distance from $D$ to $A$, is a random variable having the uniform density with $\alpha=0$ and $\beta=a$, what is the probability that $AD$, $BD$, and $AC$ will form a triangle? (Strictly less than, in order to call it a triangle: with "less than or equal" we would be talking about a line, which clearly fails the defining property of a three-sided polygon.)

We know that in every triangle, the sum of the lengths of any two sides is always greater than the length of the remaining side.

In our case the three lengths are $AD=X$, $BD=a-X$, and $AC=\dfrac{a}{2}$, so the triangle inequalities read:

$X+(a-X)>\dfrac{a}{2}$ (always true), $\quad X+\dfrac{a}{2}>a-X$, $\quad (a-X)+\dfrac{a}{2}>X$.

The last two give $X>\dfrac{a}{4}$ and $X<\dfrac{3a}{4}$.

Therefore, since $X$ is uniform on $[0,a]$, the probability is:

$P\left(\dfrac{a}{4}<X<\dfrac{3a}{4}\right)=\dfrac{3a/4-a/4}{a}=\dfrac{1}{2}$
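A Monte Carlo sketch of the answer, simulating the event $a/4<X<3a/4$ derived from the triangle inequalities:

```python
import random

random.seed(1)
a = 1.0
trials = 200_000
hits = sum(a / 4 < random.uniform(0, a) < 3 * a / 4 for _ in range(trials))
print(hits / trials)   # close to 1/2
```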

# Accuracy (predictions, outcomes)

```python
import numpy as np

def accuracy(predictions, outcomes):
    # Fraction of correct predictions; equivalently,
    # np.count_nonzero(predictions == outcomes) / len(predictions).
    return np.mean(predictions == outcomes)
```

# Stochastic process

A stochastic process is a collection of random variables, indexed by an ordered time variable.