Neumann Series
In my graduate research, I struggled with expanding $(\mathbf{I} - a\mathbf{A})^{-1}$ and gained insights into the Neumann series, so I will summarize that content.
This article details the definition, properties, and applications of the Neumann series. It particularly focuses on examples of its use in machine learning and recommendation systems. Specific calculations are shown using mathematical formulas and Python code.
Source Code
GitHub
- The Jupyter Notebook file is available here .
Google Colaboratory
- To run it on Google Colaboratory, click here .
Execution Environment
The OS is macOS. Note that the options differ from those for Linux or Unix commands.
!sw_vers
ProductName: macOS
ProductVersion: 13.5.1
BuildVersion: 22G90
!python -V
Python 3.9.17
Import the basic libraries and use watermark to check their versions. Also, set the seed for random numbers.
import random
import numpy as np
seed = 123
random_state = 123
random.seed(seed)
np.random.seed(seed)
from watermark import watermark
print(watermark(python=True, watermark=True, iversions=True, globals_=globals()))
Python implementation: CPython
Python version : 3.9.17
IPython version : 8.17.2
numpy: 1.25.2
Watermark: 2.4.3
Neumann Series for $(\mathbf{I} - a\mathbf{A})^{-1}$
Using the Neumann series to represent $(\mathbf{I} - a\mathbf{A})^{-1}$ is a useful method for calculating the inverse of a matrix $\mathbf{A}$ with a constant $a$. This expansion is particularly effective when the spectral radius of $\mathbf{A}$ is small. As seen, the Neumann series is an extended version of the geometric series sum formula:
$$ \sum_{n=0}^{\infty} a^n = \frac{1}{1 - a} $$
for matrices.
Definition
For a constant $a$ and matrix $\mathbf{A}$, the Neumann series expansion of $(\mathbf{I} - a\mathbf{A})^{-1}$ is expressed as:
$$ (\mathbf{I} - a\mathbf{A})^{-1} = \sum_{n=0}^{\infty} (a\mathbf{A})^n $$
Convergence Condition
For this series to converge, the following condition is required:
$$ |a| \cdot \rho(\mathbf{A}) < 1 $$
where $\rho(\mathbf{A})$ is the spectral radius of the matrix $\mathbf{A}$, representing the maximum absolute value of $\mathbf{A}$’s eigenvalues.
Python Implementation Example
The following Python code is an example of approximating $(\mathbf{I} - a\mathbf{A})^{-1}$ using the Neumann series. To calculate the third-order Neumann series, define the matrix $\mathbf{A}$ and constant $a$, and compute the sum of the Neumann series.
Though it might seem redundant, it is included for completeness.
import numpy as np
from pprint import pprint
# Initial matrix A
A = np.array([[0.3, 0.2], [0.1, 0.4]])
# Constant
a = 0.8
# Calculate the spectral radius
eigenvalues = np.linalg.eig(A)[0]
spectral_radius = max(abs(eigenvalues))
print(f"spectral_radius : {spectral_radius}")
# Convergence condition
if abs(a) * spectral_radius < 1:
I = np.eye(A.shape[0])
# Calculate the inverse matrix approximation
n_iter = 3
A_inv_approx = np.zeros_like(A)
for n in range(n_iter):
A_inv_approx += np.linalg.matrix_power(a * A, n)
# Final inverse matrix approximation
A_inv_approx = np.dot(A_inv_approx, I)
print("=" * 20)
print("Approximated Inverse Matrix : ")
pprint(A_inv_approx.round(2))
else:
print("Convergence condition |a| * ρ(A) < 1 is not satisfied")
# Calculate the actual inverse matrix of A
A_inv = np.linalg.inv(I - a * A)
print("=" * 20)
print("Inverse Matrix : ")
pprint(A_inv.round(2))
spectral_radius : 0.5
====================
Approximated Inverse Matrix :
array([[1.31, 0.25],
[0.12, 1.44]])
====================
Inverse Matrix :
array([[1.35, 0.32],
[0.16, 1.51]])
Conclusion
The Neumann series is an efficient method for finding the inverse matrix under certain conditions. The Neumann series expansion of $(\mathbf{I} - a\mathbf{A})^{-1}$ is effective when the spectral radius of the matrix $\mathbf{A}$ is small and is widely applied in machine learning and linear algebra problems (though I was unaware of this before).
References
- Wikipedia: Neumann Series