[SQL] SQL Execution Environment in Python and Basic SELECT

I will note down SQL (Structured Query Language) techniques, which are essential for data analysis and system development, for my own reference with actual examples.

In this article, I will explain how to build an environment to execute and verify SQL on Python, which will be used in this series, and the most basic SELECT statement.

In this series, I will use SQLite, which is lightweight and easy to handle, and Polars, a fast dataframe library, in combination.

Preparation

In this site, I will use python libraries sqlite3 and ipython-sql to execute and explain SQL on Jupyter Notebook. Also, I will use Polars to display dataframes.

github

  • The jupyter notebook file on github is here

google colaboratory

  • If you want to run it on google colaboratory, click here

Author’s Environment

Here is the author’s environment.

!sw_vers
ProductName:		macOS
ProductVersion:		15.5
BuildVersion:		24F74
!python -V
Python 3.12.12

Import necessary libraries.

import sqlite3
import polars as pl

%load_ext sql
%config SqlMagic.feedback = True

%config SqlMagic.style = '_DEPRECATED_DEFAULT'

Data Preparation

First, create a database file (SQLite) for executing SQL, and create a Users table as sample data.

import os

if os.path.exists("data.db"):
    os.remove("data.db")

os.system("touch data.db")
os.system("chmod 664 data.db")

# Database connection
%sql sqlite:///data.db
'Connected: @data.db'
%%sql
CREATE TABLE Users (
    id INTEGER PRIMARY KEY,
    name TEXT,
    age INTEGER,
    country TEXT
);

INSERT INTO Users (name, age, country) VALUES
    ('Alice', 25, 'USA'),
    ('Bob', 30, 'UK'),
    ('Charlie', 35, 'USA'),
    ('Dave', 20, 'Japan'),
    ('Eve', 28, 'Japan');
 * sqlite:///data.db
Done.
5 rows affected.





[]

SQL Execution and Result Confirmation

In this series, I adopt a method to receive SQL execution results as a Polars DataFrame and display them beautifully. By connecting with sqlite3 and using pl.read_database, you can directly convert SQL results into a DataFrame.

Basic SELECT

First, let’s retrieve all the contents of the table.

query = """
SELECT * FROM Users;
"""

with sqlite3.connect("data.db") as conn:
    df = pl.read_database(query, connection=conn)
    display(df)

shape: (5, 4)

idnameagecountry
1“Alice”25“USA”
2“Bob”30“UK”
3“Charlie”35“USA”
4“Dave”20“Japan”
5“Eve”28“Japan”

Condition Specification (WHERE) and Sorting (ORDER BY)

Next, let’s specify conditions to filter data and sort it. Example: Retrieve users whose country is ‘Japan’, in descending order of age (largest first).

query = """
SELECT
    name,
    age,
    country
FROM
    Users
WHERE
    country = 'Japan'
ORDER BY
    age DESC;
"""

with sqlite3.connect("data.db") as conn:
    df = pl.read_database(query, connection=conn)
    display(df)

shape: (2, 3)

nameagecountry
“Eve”28“Japan”
“Dave”20“Japan”

Summary

In this article, I introduced the setup of the SQL execution environment used in this series and how to execute basic queries.

  • SQLite: A file-based database that is easy to use.
  • Polars: A fast and easy-to-use dataframe library. Also useful for displaying SQL results.

From the next time onwards, I will explain more practical SQL techniques (Window functions, CTEs, joins, etc.) using this environment.