Probability distributions and special functions in scipy

I’ll cover basic special functions such as beta and gamma, and note how to use them in scipy.

GitHub

  • The file in Jupyter notebook format is here

Google Colaboratory

  • If you want to run it in Google Colaboratory, it is here (func/func_nb.ipynb)

Author’s environment

The author’s OS is macOS, so command options may differ from those on Linux and other Unix systems.

! sw_vers
ProductName: Mac OS X
ProductVersion: 10.14.6
BuildVersion: 18G95
!python -V
Python 3.5.5 :: Anaconda, Inc.

Import the basic libraries and check their versions.

%matplotlib inline
%config InlineBackend.figure_format = 'svg'

import matplotlib
import matplotlib.pyplot as plt
import scipy
import numpy as np

print('matplotlib version :', matplotlib.__version__)
print('scipy version :', scipy.__version__)
print('numpy version :', np.__version__)
matplotlib version : 2.2.2
scipy version : 1.4.1
numpy version : 1.18.1

Beta function

$$ B(\alpha, \beta) = \int_0^1x^{\alpha -1 } (1-x)^{\beta -1} dx $$

In general, the arguments $\alpha, \beta$ of the beta function extend to complex numbers, but in data analysis it is common to take integers greater than 1. It brings back memories of my school days. The beta function is also called the Euler integral of the first kind. It is used when working with the beta distribution

$$ p(x|\alpha, \beta) = \frac{x^{\alpha -1 } (1-x)^{\beta -1} }{B(\alpha, \beta)} $$

where it acts as the normalizing constant.

We will note how to get the value of the beta function in scipy.

from scipy.special import beta

# \alpha = 2, \beta = 2
print('beta(2,2) = {:.4f}'.format(beta(2,2)))

# \alpha = 3, \beta = 4
print('beta(3,4) = {:.4f}'.format(beta(3,4)))

# \alpha = 5, \beta = 2
print('beta(5,2) = {:.4f}'.format(beta(5,2)))
beta(2,2) = 0.1667
beta(3,4) = 0.0167
beta(5,2) = 0.0333
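As a sanity check, the values above can be reproduced from the integral definition. The sketch below is not part of the original notebook. It assumes `scipy.integrate.quad` for the numerical integration, and it also uses the well-known identity $B(\alpha, \beta) = \Gamma(\alpha)\Gamma(\beta) / \Gamma(\alpha + \beta)$, which the text does not state. The helper name `beta_by_integration` is made up for illustration.

```python
import numpy as np
from scipy.integrate import quad
from scipy.special import beta, gamma
from scipy.stats import beta as beta_dist


def beta_by_integration(a, b):
    """Evaluate B(a, b) directly from its integral definition on [0, 1]."""
    value, _abs_err = quad(lambda x: x ** (a - 1) * (1 - x) ** (b - 1), 0.0, 1.0)
    return value


for a, b in [(2, 2), (3, 4), (5, 2)]:
    numeric = beta_by_integration(a, b)
    via_gamma = gamma(a) * gamma(b) / gamma(a + b)
    print('B({},{}): integral = {:.4f}, gamma ratio = {:.4f}, scipy = {:.4f}'.format(
        a, b, numeric, via_gamma, beta(a, b)))

# The beta density written out by hand, compared with scipy.stats.beta.
# The point x = 0.3 and the parameters (3, 4) are arbitrary illustrative choices.
x0, a0, b0 = 0.3, 3, 4
pdf_manual = x0 ** (a0 - 1) * (1 - x0) ** (b0 - 1) / beta(a0, b0)
pdf_scipy = beta_dist.pdf(x0, a0, b0)
print('beta pdf at x = 0.3: manual = {:.6f}, scipy.stats = {:.6f}'.format(pdf_manual, pdf_scipy))
```

If the three columns agree, `scipy.special.beta` matches the definition given above, and dividing by it does normalize the density.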

Gamma function

The gamma function is defined by

$$ \Gamma(x) = \int_0^\infty t^{x-1}e^{-t} dt $$

and is, as is often said, a generalization of the factorial. It is also called the Euler integral of the second kind.

If $x$ is a positive integer, then

$$ \Gamma(x) = (x-1)! $$

holds. The following properties also hold. I know I keep saying it, but this takes me back.

  • $ \displaystyle \Gamma(x+1) = x\Gamma(x)$
  • $ \displaystyle \Gamma\left(\frac{1}{2}\right) = \sqrt{\pi} $
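These properties are easy to check numerically. The following is a small sketch using `scipy.special.gamma`. The test point `x = 2.5` is an arbitrary choice, not something from the original notebook.

```python
import math

import numpy as np
from scipy.special import gamma

# Gamma(n) = (n - 1)! for positive integers n
for n in range(1, 7):
    print('gamma({}) = {:.1f}, ({} - 1)! = {}'.format(n, gamma(n), n, math.factorial(n - 1)))

# Recurrence Gamma(x + 1) = x * Gamma(x), checked at a non-integer point
x = 2.5
lhs = gamma(x + 1)
rhs = x * gamma(x)
print('gamma(x + 1) = {:.6f}, x * gamma(x) = {:.6f}'.format(lhs, rhs))

# Gamma(1/2) = sqrt(pi)
half = gamma(0.5)
print('gamma(1/2) = {:.6f}, sqrt(pi) = {:.6f}'.format(half, math.sqrt(math.pi)))
```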

The gamma function shows up in many places; the one that shares its name is the gamma distribution.

$$ P(x) = \frac{\lambda^k x^{k-1}e^{-\lambda x}}{\Gamma(k)} $$

Details of the gamma distribution can be found here.
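The density above can be compared with `scipy.stats.gamma`. This is a sketch under the assumption that scipy's shape parameter `a` corresponds to $k$ and that `scale` corresponds to $1/\lambda$. The values `k = 3` and `lam = 2` are arbitrary illustrative choices.

```python
import numpy as np
from scipy.special import gamma as gamma_func
from scipy.stats import gamma as gamma_dist

k, lam = 3.0, 2.0  # illustrative shape k and rate lambda
x = np.linspace(0.1, 5.0, 50)

# Density written out exactly as in the formula above
pdf_manual = lam ** k * x ** (k - 1) * np.exp(-lam * x) / gamma_func(k)

# scipy parameterizes the distribution by shape `a` and `scale` (= 1 / rate)
pdf_scipy = gamma_dist.pdf(x, a=k, scale=1.0 / lam)

print('densities agree:', np.allclose(pdf_manual, pdf_scipy))
```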

Logit function

Is “odds” here the same as the odds you often hear about in horse racing? I don’t bet on horses, so I don’t know. Please tell me if you do. Anyway, when the probability of an event happening is $p$, $$\frac{p}{1-p}$$ is called the odds. Its logarithm, $$\log p - \log(1-p)$$ is called the log odds.

$$ f(p) = \log \frac{p}{1-p} $$ is called the logit function.

from scipy.special import logit

# logit is -inf at p = 0 and +inf at p = 1, so sample strictly inside (0, 1)
x = np.linspace(0.001, 0.999, 100)
y = logit(x)

plt.grid()
plt.plot(x,y)
plt.show()
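To connect the plot with the definitions, the odds and log odds can be computed by hand and compared with `scipy.special.logit`. This is only a sketch, and the probabilities in `p` are arbitrary sample values.

```python
import numpy as np
from scipy.special import logit

p = np.array([0.1, 0.25, 0.5, 0.75, 0.9])

odds = p / (1 - p)                     # p / (1 - p)
log_odds = np.log(p) - np.log(1 - p)   # log p - log(1 - p)

for pi, oi, li, si in zip(p, odds, log_odds, logit(p)):
    print('p = {:.2f}: odds = {:.4f}, log odds = {:.4f}, logit = {:.4f}'.format(pi, oi, li, si))
```

At $p = 0.5$ the odds are 1 and the log odds are 0, which is where the curve crosses the horizontal axis.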

Logistic functions

In general, $$ f(x)= \frac{a}{1+e^{-k(x-x_0)}}$$ is called a logistic function. The special case $a = k = 1, x_0 = 0$ is called the sigmoid function, or S-curve. In practice, the sigmoid function (or its multivariate version) comes up more often than the general logistic function.
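The general form can be written directly with NumPy. The function name `logistic` and its default arguments below are hypothetical, chosen only to mirror the symbols $a$, $k$, $x_0$ in the formula. This is a minimal sketch, not a scipy API.

```python
import numpy as np


def logistic(x, a=1.0, k=1.0, x0=0.0):
    """General logistic function a / (1 + exp(-k * (x - x0)))."""
    x = np.asarray(x, dtype=float)
    return a / (1.0 + np.exp(-k * (x - x0)))


x = np.linspace(-8, 8, 5)

# With the defaults a = k = 1 and x0 = 0 this reduces to 1 / (1 + exp(-x))
print(logistic(x))

# a sets the upper asymptote, k the steepness, and x0 the midpoint,
# so the value at x = x0 is always a / 2
midpoint_value = logistic(2.0, a=3.0, k=0.5, x0=2.0)
print(midpoint_value)
```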

Sigmoid function

The sigmoid function is $$ f(x)=\frac{1}{1+e^{-x}}$$ In scipy it is available as `scipy.special.expit`.

The graph is as follows. It has a nice S-shaped curve.

from scipy.special import expit

x = np.linspace(-8,8,100)
y = expit(x)

plt.grid()
plt.plot(x,y)
plt.show()
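The sigmoid function and the logit function are inverses of each other, which is why scipy pairs `expit` with `logit`. The sketch below checks this numerically and compares `expit` with the formula written out by hand. The sample points are arbitrary.

```python
import numpy as np
from scipy.special import expit, logit

p = np.array([0.1, 0.25, 0.5, 0.75, 0.9])
x = np.linspace(-8, 8, 9)

p_roundtrip = expit(logit(p))   # should give back p
x_roundtrip = logit(expit(x))   # should give back x

# expit compared with the sigmoid formula written explicitly
sigmoid_manual = 1.0 / (1.0 + np.exp(-x))

print('expit(logit(p)) == p :', np.allclose(p_roundtrip, p))
print('logit(expit(x)) == x :', np.allclose(x_roundtrip, x))
print('expit matches formula:', np.allclose(expit(x), sigmoid_manual))
```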