Bass Model: Diffusion Prediction of New Products and Implementation in Python

Introduction

In marketing, when a new smartphone, app, or EV (electric vehicle) enters the market, we often want to predict how it will spread among people.

The Bass Diffusion Model is one of the best-known approaches in marketing and innovation research for mathematically modeling how new products diffuse.

This article explains the theoretical background of the Bass Model, then shows how to simulate its behavior in Python and how to fit it to actual (or simulated) data.

Source Code

GitHub

  • The Jupyter Notebook file is available here

Google Colaboratory

  • To run on Google Colaboratory, click here

Execution Environment

The OS is macOS. Note that command options may differ from those on Linux or other Unix systems.

!sw_vers
ProductName:		macOS
ProductVersion:		15.5
BuildVersion:		24F74
!python -V
Python 3.14.0

Import basic libraries.

%matplotlib inline
%config InlineBackend.figure_format = 'svg'

import numpy as np
import matplotlib.pyplot as plt
from scipy.optimize import curve_fit

# Graph style settings
plt.style.use('seaborn-v0_8-darkgrid')

1. Theory of Bass Model

The Bass Model was proposed by Frank Bass in 1969. This model assumes that the probability of a non-adopter adopting the product depends on the following two factors:

  1. Innovation Effect ($p$): The tendency to adopt spontaneously due to external influences (such as advertising).
  2. Imitation Effect ($q$): The tendency to adopt due to word-of-mouth or social pressure from those who have already adopted.

1.1 Differential Equation

Let $N(t)$ be the cumulative number of adopters at time $t$, and $M$ be the potential maximum market size. The number of new adopters $n(t) = \frac{dN(t)}{dt}$ is expressed by the following differential equation:

$$ \frac{dN(t)}{dt} = \left( p + q \frac{N(t)}{M} \right) (M - N(t)) $$

Here:

  • $p$: Coefficient of innovation
  • $q$: Coefficient of imitation
  • $M$: Market potential

This equation can be interpreted as “a certain proportion $(p + q \frac{N(t)}{M})$ of non-adopters $(M - N(t))$ will newly adopt”. As adoption progresses and $N(t)$ increases, the imitation effect $q \frac{N(t)}{M}$ becomes stronger, accelerating the diffusion.
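Before turning to the closed-form solution derived in the next subsection, the differential equation itself can be checked directly. The sketch below integrates it with a simple forward-Euler scheme (the step size dt = 0.001 and the parameter values are arbitrary illustrative choices) and compares the result against the analytical formula:

```python
import numpy as np

def bass_cumulative(t, M, p, q):
    """Closed-form cumulative adopters N(t) (Section 1.2)."""
    e = np.exp(-(p + q) * t)
    return M * (1 - e) / (1 + (q / p) * e)

# Forward-Euler integration of dN/dt = (p + q*N/M) * (M - N), with N(0) = 0
M, p, q = 10000, 0.03, 0.38  # illustrative parameters
dt = 0.001
N = 0.0
for _ in range(int(20 / dt)):
    N += dt * (p + q * N / M) * (M - N)

print(N)                             # numerical N(20)
print(bass_cumulative(20, M, p, q))  # analytical N(20)
```

The two values agree to within the integration error, which confirms that the closed form solves the differential equation.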

1.2 Analytical Solution

Solving this differential equation under the condition $N(0)=0$, the cumulative number of adopters $N(t)$ is as follows:

$$ N(t) = M \frac{1 - e^{-(p+q)t}}{1 + \frac{q}{p}e^{-(p+q)t}} $$

Also, the number of new adopters $n(t)$ at each time point is:

$$ n(t) = M \frac{p(p+q)^2 e^{-(p+q)t}}{(p + q e^{-(p+q)t})^2} $$

The cumulative number of adopters $N(t)$ draws an S-shaped curve, and the number of new adopters $n(t)$ draws a bell-shaped curve.
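Setting $\frac{dn(t)}{dt} = 0$ gives the well-known peak of the bell curve: adoption peaks at $t^* = \frac{\ln(q/p)}{p+q}$ with height $n(t^*) = \frac{M(p+q)^2}{4q}$. A quick numerical check (the function is restated so the snippet runs on its own; the parameter values are illustrative):

```python
import numpy as np

def bass_adoption(t, M, p, q):
    """New adopters n(t)."""
    e = np.exp(-(p + q) * t)
    return M * p * (p + q) ** 2 * e / (p + q * e) ** 2

M, p, q = 10000, 0.03, 0.38
t_star = np.log(q / p) / (p + q)     # analytical peak time
n_star = M * (p + q) ** 2 / (4 * q)  # analytical peak height

# Numerical check: the maximum on a fine grid matches the closed form
t = np.linspace(0, 20, 20001)
n = bass_adoption(t, M, p, q)
print(t_star, t[np.argmax(n)])  # both ~ 6.19
print(n_star, n.max())
```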

1.3 Assumptions of the Bass Model

The Bass Model makes the following assumptions:

  • Market potential $M$ is fixed
  • Price and marketing strategies do not change over time
  • All consumers are homogeneous

2. Simulation with Python

Let’s implement the formulas as Python functions and check the behavior when parameters are changed.

def bass_cumulative(t, M, p, q):
    """Cumulative adopters N(t)"""
    exponent = np.exp(-(p + q) * t)
    return M * (1 - exponent) / (1 + (q / p) * exponent)

def bass_adoption(t, M, p, q):
    """New adopters n(t)"""
    exponent = np.exp(-(p + q) * t)
    numerator = M * p * (p + q)**2 * exponent
    denominator = (p + q * exponent)**2
    return numerator / denominator

# Parameter settings
M = 10000  # Market size
p = 0.03   # Innovation coefficient
q = 0.38   # Imitation coefficient (said to be an average value)

t = np.linspace(0, 20, 100)
N_t = bass_cumulative(t, M, p, q)
n_t = bass_adoption(t, M, p, q)

plt.figure(figsize=(12, 5))

# New Adopters (Adoption)
plt.subplot(1, 2, 1)
plt.plot(t, n_t, label='Adoption $n(t)$', color='tab:blue', lw=2)
plt.title('New Adopters per Time')
plt.xlabel('Time')
plt.ylabel('Count')
plt.legend()

# Cumulative Adopters (Cumulative)
plt.subplot(1, 2, 2)
plt.plot(t, N_t, label='Cumulative $N(t)$', color='tab:orange', lw=2)
plt.axhline(y=M, color='gray', linestyle='--', label='Market Potential $M$')
plt.title('Cumulative Adopters')
plt.xlabel('Time')
plt.ylabel('Count')
plt.legend()

plt.tight_layout()
plt.show()

The graph on the left corresponds to annual sales (new adoption), and the graph on the right corresponds to cumulative sales (cumulative adoption). You can see how it starts slowly, reaches a peak, and then decreases.
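To build intuition for the parameters, it helps to vary $q$ while holding $p$ and $M$ fixed (the values below are illustrative). Since the peak time is $\ln(q/p)/(p+q)$, a stronger imitation effect both raises the peak and pulls it earlier:

```python
import numpy as np

def bass_adoption(t, M, p, q):
    """New adopters n(t)."""
    e = np.exp(-(p + q) * t)
    return M * p * (p + q) ** 2 * e / (p + q * e) ** 2

M, p = 10000, 0.03
t = np.linspace(0, 30, 3001)

peaks = {}
for q in [0.2, 0.38, 0.6]:
    n = bass_adoption(t, M, p, q)
    peaks[q] = t[np.argmax(n)]
    print(f"q={q}: peak at t={peaks[q]:.2f}, height={n.max():.0f}")
```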

3. Fitting to Data

If you have actual sales data, you can use scipy.optimize.curve_fit to estimate the parameters $M, p, q$. Here, we create hypothetical data with some noise added and try to recover the parameters from it.

# Generate hypothetical data (True parameters: M=10000, p=0.02, q=0.4)
true_params = [10000, 0.02, 0.4]
time_points = np.arange(1, 16)
y_data = bass_adoption(time_points, *true_params)

# Add noise
np.random.seed(42)
noise = np.random.normal(0, 100, size=len(time_points))
y_data_noisy = y_data + noise
y_data_noisy = np.maximum(y_data_noisy, 0) # Set negative values to 0

# Parameter estimation
# Constrain p, q to be in 0~1 range, and M to be positive for stability
bounds = ([0, 0, 0], [np.inf, 1, 1])
popt, pcov = curve_fit(bass_adoption, time_points, y_data_noisy, p0=[5000, 0.01, 0.1], bounds=bounds)

M_est, p_est, q_est = popt

print(f"Estimated M: {M_est:.2f} (True: {true_params[0]})")
print(f"Estimated p: {p_est:.4f} (True: {true_params[1]})")
print(f"Estimated q: {q_est:.4f} (True: {true_params[2]})")
Estimated M: 10068.23 (True: 10000)
Estimated p: 0.0193 (True: 0.02)
Estimated q: 0.4281 (True: 0.4)
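curve_fit also returns the covariance matrix pcov, whose diagonal gives approximate variances of the estimates. The sketch below re-creates the fit above and reports rough one-sigma standard errors (the exact numbers depend on the noise draw and the SciPy version, so treat them as indicative):

```python
import numpy as np
from scipy.optimize import curve_fit

def bass_adoption(t, M, p, q):
    e = np.exp(-(p + q) * t)
    return M * p * (p + q) ** 2 * e / (p + q * e) ** 2

# Re-create the noisy data and the fit from above
np.random.seed(42)
time_points = np.arange(1, 16)
y = bass_adoption(time_points, 10000, 0.02, 0.4)
y_noisy = np.maximum(y + np.random.normal(0, 100, size=len(time_points)), 0)

popt, pcov = curve_fit(bass_adoption, time_points, y_noisy,
                       p0=[5000, 0.01, 0.1], bounds=([0, 0, 0], [np.inf, 1, 1]))
perr = np.sqrt(np.diag(pcov))  # approximate 1-sigma standard errors
for name, val, err in zip(["M", "p", "q"], popt, perr):
    print(f"{name} = {val:.4f} +/- {err:.4f}")
```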

Let’s plot the predicted curve using the estimated parameters.

plt.figure(figsize=(8, 5))

# Actual data (Scatter plot)
plt.scatter(time_points, y_data_noisy, color='red', label='Observed Data')

# Estimated curve
t_smooth = np.linspace(0, 20, 100)
y_fit = bass_adoption(t_smooth, *popt)
plt.plot(t_smooth, y_fit, color='blue', label='Fitted Bass Model')

plt.title('Bass Model Fitting')
plt.xlabel('Time')
plt.ylabel('New Adopters')
plt.legend()
plt.show()

Conclusion

The Bass Model, despite resting on very simple assumptions, explains the diffusion process of many products remarkably well. It is especially useful for forecasting the timing of peak demand and the eventual market size from early data shortly after launch.
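As an example of such a forecast, the closed form for $N(t)$ can be inverted to find when adoption reaches a given fraction $F$ of the market potential: $t_F = \frac{1}{p+q}\ln\frac{1 + Fq/p}{1 - F}$. A sketch with illustrative parameters (in practice $p$ and $q$ would come from a fit like the one above):

```python
import numpy as np

def bass_cumulative(t, M, p, q):
    """Cumulative adopters N(t)."""
    e = np.exp(-(p + q) * t)
    return M * (1 - e) / (1 + (q / p) * e)

def time_to_fraction(F, p, q):
    """Time at which cumulative adoption reaches fraction F of M,
    obtained by inverting N(t)/M = F."""
    return np.log((1 + F * q / p) / (1 - F)) / (p + q)

M, p, q = 10000, 0.02, 0.4  # illustrative parameters
for F in [0.5, 0.9, 0.95]:
    tF = time_to_fraction(F, p, q)
    print(f"{F:.0%} of market reached at t = {tF:.1f}")
```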