[sqlite3] How to Use EXISTS and IN

[SQL] How to Use EXISTS and IN

In this article, I will explain the difference between SQL EXISTS and IN. Both are often used to extract “rows that meet certain conditions” using subqueries, but there are differences in behavior and performance.

In particular, let’s look at how to use them from the perspectives of NULL handling and performance.

Preparation

In this site, I will use python libraries sqlite3 and ipython-sql to execute and explain SQL on Jupyter Notebook. Also, I will use Polars to display dataframes.

github

The jupyter notebook file on github is here

google colaboratory

If you want to run it on google colaboratory, click here

Author’s Environment

Here is the author’s environment.

!sw_vers

ProductName:		macOS
ProductVersion:		15.5
BuildVersion:		24F74

!python -V

Python 3.12.12

Import necessary libraries.

import sqlite3
import polars as pl

%load_ext sql
%config SqlMagic.feedback = True

%config SqlMagic.style = '_DEPRECATED_DEFAULT'

Data Preparation

As a sample, create a customer table Customers and an order table Orders.

import os

if os.path.exists("data.db"):
    os.remove("data.db")

os.system("touch data.db")
os.system("chmod 664 data.db")

# Database connection
%sql sqlite:///data.db

'Connected: @data.db'

%%sql
CREATE TABLE Customers (
    id INTEGER PRIMARY KEY,
    name TEXT
);

CREATE TABLE Orders (
    id INTEGER PRIMARY KEY,
    customer_id INTEGER,
    amount INTEGER
);

INSERT INTO Customers (name) VALUES
    ('Aさん'),
    ('Bさん'),
    ('Cさん'),
    ('Dさん');

INSERT INTO Orders (customer_id, amount) VALUES
    (1, 100),
    (1, 200),
    (3, 500);

 * sqlite:///data.db
Done.
Done.
4 rows affected.
3 rows affected.





[]

%%sql
SELECT * FROM Customers;

 * sqlite:///data.db
Done.

id	name
1	Aさん
2	Bさん
3	Cさん
4	Dさん

%%sql
SELECT * FROM Orders;

 * sqlite:///data.db
Done.

id	customer_id	amount
1	1	100
2	1	200
3	3	500

Task: Extract customers with order history

A-san and C-san have order history, but B-san and D-san do not. Let’s extract only customers with order history.

Method 1: Using IN

Use the IN clause to extract customers included in the list of customer_id in the Orders table.

query = """
SELECT
    *
FROM
    Customers
WHERE
    id IN (SELECT customer_id FROM Orders);
"""

with sqlite3.connect("data.db") as conn:
    df = pl.read_database(query, connection=conn)
    display(df)

shape: (2, 2)

id	name
1	“Aさん”
3	“Cさん”

Method 2: Using EXISTS

Use the EXISTS clause to check whether there are related rows in the Orders table.

query = """
SELECT
    *
FROM
    Customers c
WHERE
    EXISTS (
        SELECT
            1
        FROM
            Orders o
        WHERE
            c.id = o.customer_id
    );
"""

with sqlite3.connect("data.db") as conn:
    df = pl.read_database(query, connection=conn)
    display(df)