Convolution in the Spatial Domain and Element-wise Multiplication in the Frequency Domain in Graph Signal Processing
In graph signal processing, convolution in the spatial domain and element-wise multiplication in the frequency domain fundamentally represent the same concept. Conversely, element-wise multiplication in the spatial domain corresponds to convolution in the frequency domain.
The theorem can be expressed as follows:
Convolution in the Spatial Domain → Element-wise Multiplication in the Frequency Domain
$$ \mathcal{F}(f * g) = \mathcal{F}(f) \cdot \mathcal{F}(g) $$
- $ f * g $ denotes convolution in the spatial domain.
- $ \mathcal{F}(f) $ and $ \mathcal{F}(g) $ represent the Fourier transforms of $f$ and $g$, respectively.
- $\cdot$ indicates element-wise multiplication in the frequency domain.
Element-wise Multiplication in the Frequency Domain → Convolution in the Spatial Domain
$$ \mathcal{F}^{-1}(\mathcal{F}(f) \cdot \mathcal{F}(g)) = f * g $$
Intuitive Explanation
- Convolution in the spatial domain refers to operations such as smoothing signals (images, sounds, etc.) or extracting specific features.
- Element-wise multiplication in the frequency domain involves filtering the frequency components of a signal.
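Before turning to graphs, this correspondence can be verified in a few lines with NumPy's FFT. The script below is a minimal illustration (the signals are random, and note that the discrete Fourier transform pairs with circular rather than linear convolution):
import numpy as np

rng = np.random.default_rng(0)
f = rng.standard_normal(8)
g = rng.standard_normal(8)
num = len(f)

# Circular convolution computed directly in the spatial domain
conv_spatial = np.array([sum(f[j] * g[(i - j) % num] for j in range(num))
                         for i in range(num)])

# The same result via element-wise multiplication in the frequency domain
conv_freq = np.real(np.fft.ifft(np.fft.fft(f) * np.fft.fft(g)))

print(np.allclose(conv_spatial, conv_freq))  # True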
This article explains the theory and Python-based implementation of convolution in the spatial domain and element-wise multiplication in the frequency domain to understand the basics of Graph Neural Networks (GNNs).
Source Code
GitHub
- The Jupyter Notebook file is available here.
Google Colaboratory
- To run the code on Google Colaboratory, click here.
Environment
The OS is macOS. Note that the options for Linux or Unix commands may differ.
!sw_vers
ProductName: macOS
ProductVersion: 15.2
BuildVersion: 24C101
!python -V
Python 3.9.17
Basic libraries are imported, versions are verified using watermark, and random seeds are set.
%matplotlib inline
%config InlineBackend.figure_format = 'svg'
import random
import scipy
import numpy as np
import matplotlib
import matplotlib.pyplot as plt
seed = 123
random_state = 123
random.seed(seed)
np.random.seed(seed)
from watermark import watermark
print(watermark(python=True, watermark=True, iversions=True, globals_=globals()))
Python implementation: CPython
Python version : 3.9.17
IPython version : 8.17.2
matplotlib: 3.8.1
scipy : 1.11.2
numpy : 1.25.2
Watermark: 2.4.3
1. Fundamentals of Graph Signal Processing
1.1 Definition of Graphs and Signals
A graph $G$ consists of a set of vertices $V$ and edges $E$. If the number of vertices is $n = |V|$, the adjacency matrix $\mathbf{A}$ represents the connectivity between vertices. Here, $\mathbf{A}$ is an $n \times n$ matrix where entries typically take values like 1 for an existing edge and 0 otherwise (for weighted graphs, non-zero weights are assigned).
When a scalar or vector-valued signal $x_i$ is defined on each vertex $v_i$, it is referred to as a “graph signal.” If scalar signals are considered for each vertex, the signal can be represented as a vector $\mathbf{x}$ of length $n$: $$ \mathbf{x} = \begin{bmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{bmatrix} $$
For such graph signals, we aim to define convolution operations. However, unlike traditional CNNs, convolution on graphs with their inherent structure is not straightforward. Thus, two approaches—spatial and frequency domain—have been developed.
1.2 Convolution in the Spatial Domain: Neighbor Aggregation
Convolution in the spatial domain involves direct interaction with neighboring vertices. For instance, the update rule in a typical GCN (Graph Convolutional Network) is given as: $$ \mathbf{X}^{(l+1)} = \sigma\left( \tilde{\mathbf{D}}^{-\frac12} \tilde{\mathbf{A}} \tilde{\mathbf{D}}^{-\frac12} \mathbf{X}^{(l)} \mathbf{W}^{(l)} \right) $$ where $\tilde{\mathbf{A}} = \mathbf{A} + \mathbf{I}$ is the adjacency matrix with self-loops, $\tilde{\mathbf{D}}$ is the degree matrix derived from $\tilde{\mathbf{A}}$, $\mathbf{X}^{(l)}$ represents the vertex feature vectors at the $l$-th layer, $\mathbf{W}^{(l)}$ is the learnable parameter matrix, and $\sigma$ is an activation function.
In essence, this method computes averages or weighted sums of neighboring vertices’ information using the adjacency matrix. While often referred to as “convolution in the spatial domain,” it closely resembles neighbor aggregation. The feature vector of a vertex $v_i$ integrates information from its surrounding vertices.
This approach is intuitive and easy to implement, and many practical GNN libraries adopt it. For example, analyzing social network data in industry often involves aggregating features from neighboring users.
1.3 Element-wise Multiplication in the Frequency Domain: Graph Fourier Transform
In the frequency domain, convolution utilizes the graph’s Fourier transform. Similar to how convolution in images or audio corresponds to element-wise multiplication in the frequency domain, the idea extends to graphs using the eigenvectors of the graph Laplacian $\mathbf{L}$: $$ \mathbf{L} = \mathbf{D} - \mathbf{A} $$ where $\mathbf{D}$ is the degree matrix. $\mathbf{L}$ is symmetric, making its eigenvectors orthogonal, forming a basis. A graph signal $\mathbf{x}$ can then be transformed as: $$ \hat{\mathbf{x}} = \mathbf{U}^T \mathbf{x} $$ where $\mathbf{U}$ is the matrix of eigenvectors of $\mathbf{L}$ arranged as columns.
$\hat{\mathbf{x}}$ represents the signal in the frequency domain. As in traditional Fourier transforms, convolution becomes element-wise multiplication in the frequency domain. Thus, convolution of graph signals can be implemented via element-wise multiplication. However, as eigenvectors differ across graphs, eigen-decomposition becomes computationally expensive for large graphs, posing a challenge for this approach.
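These properties are easy to confirm numerically. The following snippet is a minimal illustration using a 3-vertex path graph (chosen purely for brevity): it checks that the eigenvector matrix returned by np.linalg.eigh is orthonormal, so the graph Fourier transform round-trips exactly.
# Laplacian of a 3-vertex path graph 0 - 1 - 2 (illustrative example)
L_path = np.array([[ 1., -1.,  0.],
                   [-1.,  2., -1.],
                   [ 0., -1.,  1.]])

# eigh guarantees an orthonormal eigenvector basis for a symmetric matrix
lam_path, U_path = np.linalg.eigh(L_path)

x_path = np.array([1.0, -2.0, 0.5])
x_hat_path = U_path.T @ x_path   # graph Fourier transform
x_back = U_path @ x_hat_path     # inverse transform

print(np.allclose(U_path.T @ U_path, np.eye(3)))  # True: U is orthonormal
print(np.allclose(x_back, x_path))                # True: perfect reconstruction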
2. Correspondence Between Spatial and Frequency Domains
2.1 Basic Properties of Convolution
In traditional discrete Fourier transforms (DFT) for 1D or 2D data, the following correspondence holds:
- Convolution in the spatial domain: $$ \mathbf{x} * \mathbf{y} $$
- Element-wise multiplication in the frequency domain: $$ \hat{\mathbf{x}} \odot \hat{\mathbf{y}} $$ where $\odot$ represents element-wise multiplication.
Using the eigenvectors of the graph Laplacian as the basis, a similar correspondence is established for graphs.
2.2 Approximation in GCN
Direct computation of convolution in the frequency domain requires eigen-decomposition, which is computationally expensive for large graphs. Therefore, many methods approximate the Laplacian using polynomial expansions. A well-known example is Chebyshev polynomials, enabling iterative convolution computations without explicit eigen-decomposition.
For instance, the simple GCN update rule can be interpreted as a low-order Chebyshev polynomial approximation. This demonstrates how spatial methods can be derived as approximations of frequency-domain convolution.
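To make this concrete, below is a minimal sketch of Chebyshev-based filtering; the helper cheb_filter, its parameters, and the use of the exact largest eigenvalue are illustrative simplifications rather than ChebNet's exact formulation. The filter $h(\lambda)$ is fitted by a degree-$K$ Chebyshev expansion on $[0, \lambda_{\max}]$ and applied with the standard three-term recurrence, so only repeated matrix-vector products with the scaled Laplacian are needed:
import numpy as np

def cheb_filter(L, x, h, K=8):
    # Largest eigenvalue; exact here for the demo. In practice it is replaced
    # by a cheap upper bound such as 2 * max degree or a few power iterations.
    lmax = np.linalg.eigvalsh(L)[-1]

    # Fit a degree-K Chebyshev expansion of h on [0, lmax]
    grid = np.linspace(0.0, lmax, 200)
    coeffs = np.polynomial.chebyshev.Chebyshev.fit(
        grid, h(grid), K, domain=[0.0, lmax]).coef

    # Scaled Laplacian whose eigenvalues lie in [-1, 1]
    L_s = (2.0 / lmax) * L - np.eye(L.shape[0])

    # Three-term recurrence: T_0 x = x, T_1 x = L_s x,
    # T_k x = 2 L_s T_{k-1} x - T_{k-2} x
    t_prev, t_curr = x, L_s @ x
    y = coeffs[0] * t_prev + coeffs[1] * t_curr
    for k in range(2, K + 1):
        t_prev, t_curr = t_curr, 2.0 * (L_s @ t_curr) - t_prev
        y = y + coeffs[k] * t_curr
    return y

# Tiny demo on a triangle graph: the Chebyshev result matches the exact
# spectral filtering with h(lambda) = exp(-0.5 * lambda)
A_demo = np.ones((3, 3)) - np.eye(3)
L_demo = np.diag(A_demo.sum(axis=1)) - A_demo
x_demo = np.array([1.0, 0.0, -1.0])

y_demo = cheb_filter(L_demo, x_demo, lambda lam: np.exp(-0.5 * lam))
lam_d, U_d = np.linalg.eigh(L_demo)
y_exact = U_d @ (np.exp(-0.5 * lam_d) * (U_d.T @ x_demo))
print(np.allclose(y_demo, y_exact, atol=1e-6))  # True
Each recurrence step costs only one (sparse) matrix-vector product, which is why this trick scales where explicit eigen-decomposition does not.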
2.3 Summary of Advantages and Disadvantages
- Spatial Domain Methods:
  - Advantages: Intuitive, easy to implement, and scalable to large graphs.
  - Disadvantages: The connection to a precise notion of convolution is only implicit, and filters are limited to aggregation over local neighborhoods.
- Frequency Domain Methods:
  - Advantages: A precise, theoretically grounded definition of convolution.
  - Disadvantages: Computationally expensive for large graphs, and filters tied to one graph's eigenvectors do not transfer to another graph.
- Polynomial Approximation:
  - Bridges the two domains: frequency-domain filters are approximated by polynomials in the Laplacian, avoiding explicit eigen-decomposition.
3. Examples: Python Implementation
This section demonstrates the spatial and frequency domain methods using a simple sample graph. Python implementations are provided for better understanding.
3.1 Creating a Sample Graph
import numpy as np
# Number of vertices
n = 5
# Creating an adjacency matrix
A = np.array([[0, 1, 0, 0, 1],
[1, 0, 1, 0, 0],
[0, 1, 0, 1, 0],
[0, 0, 1, 0, 1],
[1, 0, 0, 1, 0]], dtype=np.float32)
A
array([[0., 1., 0., 0., 1.],
[1., 0., 1., 0., 0.],
[0., 1., 0., 1., 0.],
[0., 0., 1., 0., 1.],
[1., 0., 0., 1., 0.]], dtype=float32)
# Graph signal (scalar value on each vertex)
x = np.array([1, 2, 3, 4, 5], dtype=np.float32)
x
array([1., 2., 3., 4., 5.], dtype=float32)
Here, a simple graph with five vertices is created, where the vertices are connected in a ring structure. Each vertex has a scalar value as its signal.
3.2 Neighbor Aggregation in the Spatial Domain
Using the adjacency matrix, neighbor information is aggregated. A simplified example of GCN-like normalization is shown below:
# Degree matrix D (reused in Section 3.3 to build the graph Laplacian)
D = np.diag(np.sum(A, axis=1))
# Adding self-loops
A_hat = A + np.eye(n)
# Degree matrix of A_hat and its inverse square root for symmetric normalization
D_hat = np.diag(np.sum(A_hat, axis=1))
D_hat_inv_sqrt = np.diag(1.0 / np.sqrt(np.sum(A_hat, axis=1)))
# Single layer operation in the spatial domain (simplified GCN)
W = np.ones((1, 1), dtype=np.float32)
Z = D_hat_inv_sqrt @ A_hat @ D_hat_inv_sqrt @ x.reshape(-1, 1) @ W
Z = Z.flatten()
print("Spatial Conv Result:", Z)
Spatial Conv Result: [2.66666667 2. 3. 4. 3.33333333]
In this example, $\mathbf{W}$ is set to all ones for simplicity. In practice, $\mathbf{W}$ is typically a higher-dimensional matrix learned during training. The result $Z$ represents the aggregated signal in the spatial domain.
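To make the role of $\mathbf{W}$ concrete, here is a hedged extension of the same layer to multi-dimensional features. The feature dimensions and the random weights standing in for a trained $\mathbf{W}$ are purely illustrative; A_hat, D_hat_inv_sqrt, n, and seed are reused from the cells above.
# Illustrative only: 2-dimensional input features and a random (2, 3) weight
# matrix standing in for a learned W
rng = np.random.default_rng(seed)
X = rng.standard_normal((n, 2))
W_multi = rng.standard_normal((2, 3))

# One GCN layer with a ReLU activation
Z_multi = np.maximum(D_hat_inv_sqrt @ A_hat @ D_hat_inv_sqrt @ X @ W_multi, 0.0)
print(Z_multi.shape)  # (5, 3): one 3-dimensional feature vector per vertex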
3.3 Element-wise Multiplication in the Frequency Domain
The frequency domain method is demonstrated below. For small graphs, eigen-decomposition is feasible; however, this becomes impractical for large-scale graphs.
# Graph Laplacian
L = D - A
# Eigen-decomposition: eigh (not eig) is used because L is symmetric.
# eigh returns real eigenvalues in ascending order and an orthonormal U,
# so that U.T really is the inverse transform. With eig, eigenvectors
# belonging to repeated eigenvalues (this ring graph has them) need not be
# orthogonal, which silently corrupts the transform.
Lambda, U = np.linalg.eigh(L)
# Lambda: Eigenvalues (ascending)
# U: Orthonormal eigenvectors (columns)
# Graph Fourier Transform
x_hat = U.T @ x
# Apply a simple low-pass filter in the frequency domain
h_hat = np.exp(-0.5 * Lambda)  # Filter function e^(-0.5 * λ)
# Element-wise multiplication in the frequency domain
y_hat = x_hat * h_hat
# Inverse Graph Fourier Transform
y = U @ y_hat
print("Frequency Domain Conv Result:", np.round(y, 4))
Frequency Domain Conv Result: [2.3351 2.2905 3.     3.7095 3.6649]
In this case, the filter $e^{-0.5 \lambda}$ attenuates the components corresponding to higher eigenvalues, acting as a low-pass filter: the filtered signal is pulled toward the all-vertex mean of 3. For large graphs, eigen-decomposition becomes a computational bottleneck, making it difficult to use frequency domain methods directly in industrial applications.
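Because the filter is $h(\lambda) = e^{-0.5\lambda}$, the entire spectral pipeline above is equivalent to applying a single matrix function, $\mathbf{U}\,\mathrm{diag}(e^{-0.5\lambda_i})\,\mathbf{U}^T = \exp(-0.5\mathbf{L})$, in the spatial domain. This gives an independent sanity check using scipy.linalg.expm:
from scipy.linalg import expm

# The spectral low-pass filter equals the matrix exponential applied directly
y_expm = expm(-0.5 * L) @ x
print(np.allclose(y, y_expm))  # True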
4. Mathematical Details and Numerical Examples
4.1 Definition of Convolution
Convolution on graphs can be understood as a mapping from one signal space to another: $$ \text{Conv}: \mathbb{R}^n \times \mathbb{R}^n \rightarrow \mathbb{R}^n $$ Here, $\mathbb{R}^n$ represents the space of scalar signals for graphs with $n$ vertices. Regardless of the domain—spatial or frequency—the result remains a signal in $\mathbb{R}^n$.
4.2 Example of Filter Operators
For a filter operator $g(\mathbf{L})$ expressed as a polynomial: $$ g(\mathbf{L}) = \sum_{k=0}^{K} \alpha_k \mathbf{L}^k $$ Applying it to a signal $\mathbf{x}$ results in: $$ g(\mathbf{L})\mathbf{x} = \sum_{k=0}^{K} \alpha_k \mathbf{L}^k \mathbf{x} $$ Since $\mathbf{L} = \mathbf{U} \boldsymbol{\Lambda} \mathbf{U}^T$ implies $\mathbf{L}^k = \mathbf{U} \boldsymbol{\Lambda}^k \mathbf{U}^T$, we have $g(\mathbf{L}) = \mathbf{U}\, g(\boldsymbol{\Lambda})\, \mathbf{U}^T$: applying a polynomial in $\mathbf{L}$ in the spatial domain is exactly element-wise multiplication by $g(\lambda)$ in the frequency domain. This allows the operation to be implemented in the spatial domain without explicit eigen-decomposition, reducing computational complexity.
4.3 Numerical Example
Consider a cycle graph with four vertices where each vertex has two connections. The Laplacian $\mathbf{L}$ is: $$ \mathbf{L} = \begin{bmatrix} 2 & -1 & 0 & -1 \\ -1 & 2 & -1 & 0 \\ 0 & -1 & 2 & -1 \\ -1 & 0 & -1 & 2 \end{bmatrix} $$
The eigenvalues of $\mathbf{L}$ are $0, 2, 4, 2$ (order may vary). Using these eigenvalues, a filter $g(\lambda) = e^{-0.5\lambda}$ can be applied to a signal $\mathbf{x}$, and the result $\mathbf{y}$ can be computed either in the frequency domain or approximated in the spatial domain.
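This small example can be checked directly; the signal x4 below is an arbitrary illustrative choice.
# Sanity check of the 4-vertex cycle example
L4 = np.array([[ 2., -1.,  0., -1.],
               [-1.,  2., -1.,  0.],
               [ 0., -1.,  2., -1.],
               [-1.,  0., -1.,  2.]])

lam4, U4 = np.linalg.eigh(L4)
print(lam4)  # [0. 2. 2. 4.] up to numerical rounding

x4 = np.array([1.0, 2.0, 3.0, 4.0])
y4 = U4 @ (np.exp(-0.5 * lam4) * (U4.T @ x4))
print(np.round(y4, 4))  # approximately [2.0645 2.1998 2.8002 2.9355]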
5. Code Examples and Numerical Experiments
Here, we delve into more detailed numerical experiments, comparing spatial and frequency domain approaches. Additionally, we demonstrate polynomial approximations for applying filters.
5.1 Polynomial Approximation of the Laplacian
As an example, consider a simple implementation that expands the filter in plain powers $\mathbf{L}^k$ (a monomial basis) rather than Chebyshev polynomials, for simplicity.
def polynomial_filter(L, x, coeff_list):
    """
    Simple function to apply a polynomial filter of the form g(L) = sum(alpha_k * L^k)
    Parameters:
        L (ndarray): Graph Laplacian matrix
        x (ndarray): Graph signal
        coeff_list (list): List of coefficients [α0, α1, α2, ...]
    Returns:
        y (ndarray): Filtered graph signal
    """
    y = np.zeros_like(x, dtype=np.float64)  # accumulate in float64 regardless of input dtype
    L_power = np.eye(L.shape[0])  # start with L^0 = I
    for k, alpha in enumerate(coeff_list):
        if k > 0:
            L_power = L_power @ L  # now holds L^k
        y += alpha * (L_power @ x)
    return y
# Example: Applying g(L) = I - 0.1 * L (K = 1)
coeff_list = [1.0, -0.1]
y_poly = polynomial_filter(L, x, coeff_list)
print("Polynomial Filter Result:", y_poly)
Polynomial Filter Result: [1.5 2. 3. 4. 4.5]
Here, $g(\mathbf{L}) = \mathbf{I} - 0.1 \mathbf{L}$ acts as a simple first-order polynomial filter. In the frequency domain, this corresponds to $g(\lambda) = 1 - 0.1\lambda$. The results match when directly compared with frequency domain calculations.
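This match can be confirmed in one step: since $g(\mathbf{L}) = \mathbf{U}\, g(\boldsymbol{\Lambda})\, \mathbf{U}^T$, applying $g(\lambda) = 1 - 0.1\lambda$ in the frequency domain must reproduce y_poly up to floating-point error.
# Cross-check against the frequency-domain computation with g(lambda) = 1 - 0.1 * lambda
lam_p, U_p = np.linalg.eigh(L)
y_freq = U_p @ ((1.0 - 0.1 * lam_p) * (U_p.T @ x))
print(np.allclose(y_poly, y_freq))  # True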
5.2 Comparing Results and Discussion
Spatial Domain (Neighbor Aggregation): The result reflects the local structure of the graph (via the adjacency matrix). This approach efficiently scales to large graphs, making it suitable for industrial applications.
Frequency Domain (Element-wise Multiplication): The result leverages precise filtering based on eigenvalues and eigenvectors of the Laplacian. It is computationally expensive but provides theoretical rigor.
Polynomial Approximation: Serves as a bridge between the two domains, enabling spatial implementation of frequency-based ideas. By avoiding eigen-decomposition, it balances efficiency and accuracy, making it viable for practical applications.
6. Industrial Use Cases and Applications
Graph Neural Networks (GNNs) and graph signal processing find numerous applications in industries. Below are a few examples:
6.1 Recommender Systems
Recommender systems can model user-item relationships as graphs. Users and items are represented as vertices, and edges capture interactions or similarities. By applying GNNs:
- Neighbor aggregation captures user preferences from nearby users.
- Item similarity graphs can identify recommendations.
Typically, spatial domain methods dominate due to scalability. Polynomial approximations like Chebyshev polynomials further enhance the model’s expressive power without sacrificing computational efficiency.
6.2 Social Network Analysis
Analyzing social networks often involves tracking relationships and interactions. GNNs are used to:
- Aggregate information from neighboring nodes to infer user behavior.
- Detect communities or influential nodes.
Spatial domain methods are preferred for their interpretability and scalability, while frequency domain insights help refine theoretical models.
6.3 Advertisement Targeting
In targeted advertising, graph structures model relationships between users and products. GNNs help:
- Predict user interests based on similar users.
- Recommend products by aggregating neighborhood interactions.
While frequency domain methods offer precision, most real-world systems adopt scalable spatial approximations.
7. Practical Considerations and Future Directions
7.1 Scaling to Large Graphs
Spatial domain methods scale well to large graphs because they avoid eigen-decomposition, whose $O(n^3)$ cost makes direct spectral methods infeasible for graphs with millions of nodes. Future research focuses on scalable spectral approximations and sampling techniques.
7.2 Heterogeneous and Dynamic Graphs
Real-world applications often involve heterogeneous graphs (multiple vertex/edge types) or dynamic graphs (evolving edges over time). These require:
- Flexible neighbor aggregation rules for spatial methods.
- Adaptations in Laplacian or adjacency matrix representations for frequency methods.
7.3 Research Trends
Emerging trends include:
- Combining GNNs with Transformers or attention mechanisms.
- Developing efficient training methods for large-scale distributed environments.
- Exploring alternative approximations like random feature mapping or higher-order polynomials.
These advancements directly impact industries such as e-commerce, social media, and network security.
8. Conclusion
This article explained the correspondence between spatial and frequency domain methods in graph signal processing. While spatial methods dominate industrial use cases for their scalability, frequency methods provide a rigorous theoretical foundation. Polynomial approximations bridge the gap, making spectral insights practical for large-scale graphs.
Key Takeaways:
- Spatial Domain: Intuitive, scalable, suitable for real-world graphs.
- Frequency Domain: Precise but computationally expensive.
- Polynomial Approximation: Combines the best of both worlds for practical applications.
References:
- Kipf, T. N., & Welling, M. (2017). “Semi-Supervised Classification with Graph Convolutional Networks.” ICLR.
- Bruna, J., Zaremba, W., Szlam, A., & LeCun, Y. (2014). “Spectral Networks and Locally Connected Networks on Graphs.” ICLR.
- Defferrard, M., Bresson, X., & Vandergheynst, P. (2016). “Convolutional Neural Networks on Graphs with Fast Localized Spectral Filtering.” NIPS.