[SQL] String Concatenation by Group (GROUP_CONCAT)

In this article, I will explain the SQL GROUP_CONCAT function. This is an aggregate function that concatenates strings from rows grouped by GROUP BY into one string, such as comma-separated.

It is very convenient in cases such as “I want to display multiple tags associated with one article in one line”.

Preparation

In this site, I will use python libraries sqlite3 and ipython-sql to execute and explain SQL on Jupyter Notebook. Also, I will use Polars to display dataframes.

github

  • The jupyter notebook file on github is here

google colaboratory

  • If you want to run it on google colaboratory, click here

Author’s Environment

Here is the author’s environment.

!sw_vers
ProductName:		macOS
ProductVersion:		15.5
BuildVersion:		24F74
!python -V
Python 3.12.12

Import necessary libraries.

import sqlite3
import polars as pl

%load_ext sql
%config SqlMagic.feedback = True

%config SqlMagic.style = '_DEPRECATED_DEFAULT'

Data Preparation

As a sample, create a post table Posts and a tag table Tags.

import os

if os.path.exists("data.db"):
    os.remove("data.db")

os.system("touch data.db")
os.system("chmod 664 data.db")

# Database connection
%sql sqlite:///data.db
'Connected: @data.db'
%%sql
CREATE TABLE Posts (
    id INTEGER PRIMARY KEY,
    title TEXT
);

CREATE TABLE Tags (
    post_id INTEGER,
    tag_name TEXT
);

INSERT INTO Posts (id, title) VALUES
    (1, 'SQL入門'),
    (2, 'Pythonデータ分析');

INSERT INTO Tags (post_id, tag_name) VALUES
    (1, 'Database'),
    (1, 'Beginner'),
    (1, 'SQL'),
    (2, 'Python'),
    (2, 'DataScience'),
    (2, 'Pandas');
 * sqlite:///data.db
Done.
Done.
2 rows affected.
6 rows affected.





[]
%%sql
SELECT * FROM Posts;
 * sqlite:///data.db
Done.
idtitle
1SQL入門
2Pythonデータ分析
%%sql
SELECT * FROM Tags;
 * sqlite:///data.db
Done.
post_idtag_name
1Database
1Beginner
1SQL
2Python
2DataScience
2Pandas

Task: Display tags separated by commas for each post

If you JOIN normally, the number of rows increases by the number of tags. I want to consolidate this into one row per post.

query = """
SELECT
    p.title,
    GROUP_CONCAT(t.tag_name, ', ') as tags
FROM
    Posts p
JOIN
    Tags t ON p.id = t.post_id
GROUP BY
    p.id;
"""

with sqlite3.connect("data.db") as conn:
    df = pl.read_database(query, connection=conn)
    display(df)

shape: (2, 2)

titletags
“SQL入門”“Database, Beginner, SQL”
“Pythonデータ分析”“Python, DataScience, Pandas”

Explanation

  • GROUP_CONCAT(column name, delimiter): Concatenates the values of the specified column with the specified delimiter.
  • GROUP BY p.id: Group by post.

This allows you to express multiple values like a list in one row.

Note: Differences by Database

String concatenation functions have different names and syntax depending on the database product.

  • MySQL / SQLite: GROUP_CONCAT
  • PostgreSQL: STRING_AGG
  • Oracle: LISTAGG

Check the documentation of the database you are using.

Summary

  • Using GROUP_CONCAT, you can concatenate values within a group into one string.
  • It is very convenient when formatting data for report output or display on the application side.