[SQL] NULL Handling and COALESCE Function
In this article, I will explain how to handle NULL in SQL and the COALESCE function that conveniently processes it.
In SQL, NULL is a special marker that indicates “value does not exist” or “unknown”.
It is clearly distinguished from the numeric 0 or empty string '', and can cause unexpected behavior in calculations and comparisons.
The COALESCE function is essential for properly handling this NULL.
Preparation
In this site, I will use python libraries sqlite3 and ipython-sql to execute and explain SQL on Jupyter Notebook.
Also, I will use Polars to display dataframes.
github
- The jupyter notebook file on github is here
google colaboratory
- If you want to run it on google colaboratory, click here
Author’s Environment
Here is the author’s environment.
!sw_vers
ProductName: macOS
ProductVersion: 15.5
BuildVersion: 24F74
!python -V
Python 3.12.12
Import necessary libraries.
import sqlite3
import polars as pl
%load_ext sql
%config SqlMagic.feedback = True
%config SqlMagic.style = '_DEPRECATED_DEFAULT'
Data Preparation
As a sample, create a product table Products.
Assume that some products do not have price (price) or description (description) set (NULL).
import os
if os.path.exists("data.db"):
os.remove("data.db")
os.system("touch data.db")
os.system("chmod 664 data.db")
# Database connection
%sql sqlite:///data.db
'Connected: @data.db'
%%sql
CREATE TABLE Products (
id INTEGER PRIMARY KEY,
name TEXT,
price INTEGER,
description TEXT
);
INSERT INTO Products (name, price, description) VALUES
('商品A', 1000, '素晴らしい商品です'),
('商品B', NULL, '価格未定'),
('商品C', 500, NULL),
('商品D', NULL, NULL);
* sqlite:///data.db
Done.
4 rows affected.
[]
%%sql
SELECT * FROM Products;
* sqlite:///data.db
Done.
| id | name | price | description |
|---|---|---|---|
| 1 | 商品A | 1000 | 素晴らしい商品です |
| 2 | 商品B | None | 価格未定 |
| 3 | 商品C | 500 | None |
| 4 | 商品D | None | None |
NULL Problems
For example, suppose you want to calculate the amount with consumption tax (10%) added to the price.
What happens when you execute the normal calculation formula price * 1.1?
query = """
SELECT
name,
price,
price * 1.1 as price_with_tax
FROM
Products;
"""
with sqlite3.connect("data.db") as conn:
df = pl.read_database(query, connection=conn)
display(df)
shape: (4, 3)
| name | price | price_with_tax |
|---|---|---|
| “商品A” | 1000 | 1100.0 |
| “商品B” | null | null |
| “商品C” | 500 | 550.0 |
| “商品D” | null | null |
For rows where price is NULL, the calculation result also becomes NULL.
This is because SQL basically has a rule that all calculation results including NULL become NULL.
Utilizing COALESCE Function
COALESCE(argument1, argument2, ...) is a function that returns the first non-NULL value among the arguments.
Using this, let’s perform processing such as “treat as 0 if price is NULL” and “display ‘Unknown’ if description is NULL”.
query = """
SELECT
name,
-- Return 0 if price is NULL
COALESCE(price, 0) as safe_price,
-- Calculate using safe_price
COALESCE(price, 0) * 1.1 as price_with_tax,
-- Return 'Unknown' if description is NULL
COALESCE(description, '不明') as safe_description
FROM
Products;
"""
with sqlite3.connect("data.db") as conn:
df = pl.read_database(query, connection=conn)
display(df)
shape: (4, 4)
| name | safe_price | price_with_tax | safe_description |
|---|---|---|---|
| “商品A” | 1000 | 1100.0 | “素晴らしい商品です” |
| “商品B” | 0 | 0.0 | “価格未定” |
| “商品C” | 500 | 550.0 | “不明” |
| “商品D” | 0 | 0.0 | “不明” |
Explanation
COALESCE(price, 0): Returns the value ofpriceif it exists, otherwise returns0.COALESCE(description, '不明'): Returns the value ofdescriptionif it exists, otherwise returns'不明'(Unknown).
This allows you to prevent calculation results from becoming NULL and fill in missing values for display.
Summary
- In SQL calculations,
NULLmakes the resultNULL(infects). - By using the COALESCE function, you can replace
NULLwith a specific value (default value).
When aggregating data or creating reports, actively use COALESCE to prevent calculation errors due to unexpected NULL.