Skip to main content

PostgreSQL JSON Types

Introduction

PostgreSQL, a powerful open-source relational database, offers robust support for storing and querying JSON (JavaScript Object Notation) data. JSON has become an essential data format for web applications and APIs, allowing for the storage of flexible, schema-less data within a traditional relational database.

PostgreSQL provides two specialized data types for working with JSON:

  1. json - Stores an exact copy of the input text
  2. jsonb - Stores data in a decomposed binary format

In this tutorial, you'll learn how to effectively use these JSON data types in PostgreSQL, enabling you to combine the flexibility of NoSQL databases with the reliability and feature richness of PostgreSQL.

Understanding JSON vs. JSONB

Before diving into examples, let's understand the key differences between the two JSON types PostgreSQL offers:

Featurejsonjsonb
Storage formatText (exact copy)Binary (decomposed)
ProcessingSlower for operationsFaster for operations
IndexingLimited supportFull support (GIN indexes)
Preserves whitespaceYesNo
Preserves key orderYesNo
Duplicate keysPreserves allKeeps last value only

The jsonb type is generally recommended for most use cases due to its better performance and indexing capabilities.

Creating Tables with JSON Types

Let's start by creating a table that uses JSON data types:

sql
CREATE TABLE product_catalog (
id SERIAL PRIMARY KEY,
name VARCHAR(100) NOT NULL,
details JSON,
attributes JSONB
);

In this example:

  • details uses the json type to store product details
  • attributes uses the jsonb type to store product attributes

Inserting JSON Data

You can insert JSON data in several ways:

sql
-- Insert using JSON string literals
INSERT INTO product_catalog (name, details, attributes)
VALUES (
'Smartphone X',
'{"brand": "TechCo", "model": "X10", "year": 2023}',
'{"color": "black", "storage": "128GB", "features": ["waterproof", "5G"]}'
);

-- Insert using PostgreSQL's JSON construction functions
INSERT INTO product_catalog (name, details, attributes)
VALUES (
'Laptop Pro',
json_build_object('brand', 'ComputerCo', 'model', 'Pro15', 'year', 2023),
jsonb_build_object('color', 'silver', 'ram', '16GB', 'features',
jsonb_build_array('touchscreen', 'backlit keyboard'))
);

Querying JSON Data

PostgreSQL provides powerful operators and functions for querying JSON data:

Basic Retrieval

Retrieve the entire JSON document:

sql
SELECT name, details, attributes 
FROM product_catalog
WHERE id = 1;

Output:

    name     |                   details                   |                             attributes
-------------+--------------------------------------------+---------------------------------------------------------------------
Smartphone X | {"brand": "TechCo", "model": "X10", "year": 2023} | {"color": "black", "storage": "128GB", "features": ["waterproof", "5G"]}

Extracting Values

Extract specific values using the -> and ->> operators:

sql
-- -> returns JSON value
-- ->> returns text value

-- Get brand as JSON
SELECT name, details->'brand' AS brand_json
FROM product_catalog;

-- Get brand as text
SELECT name, details->>'brand' AS brand_text
FROM product_catalog;

-- Access array elements (zero-based)
SELECT name, attributes->'features'->0 AS first_feature
FROM product_catalog;

Output:

    name     | brand_json
-------------+------------
Smartphone X | "TechCo"
Laptop Pro | "ComputerCo"

name | brand_text
-------------+------------
Smartphone X | TechCo
Laptop Pro | ComputerCo

name | first_feature
-------------+---------------
Smartphone X | "waterproof"
Laptop Pro | "touchscreen"

Filtering by JSON Values

You can filter based on JSON field values:

sql
-- Find products from TechCo
SELECT name, details
FROM product_catalog
WHERE details->>'brand' = 'TechCo';

-- Find products with waterproof feature
SELECT name
FROM product_catalog
WHERE attributes->'features' @> '"waterproof"'::jsonb;

The @> operator checks if the left JSON contains the right JSON as a subset.

Working with JSONB Indexes

One of the biggest advantages of jsonb over json is its support for efficient indexing:

sql
-- Create a GIN index on the entire attributes field
CREATE INDEX idx_product_attributes ON product_catalog USING GIN (attributes);

-- Create a GIN index specifically for the features array
CREATE INDEX idx_product_features ON product_catalog USING GIN ((attributes->'features'));

These indexes significantly speed up queries that:

  • Check for containment (@>)
  • Use path expressions (->, ->>)
  • Search for keys (?, ?&, ?|)

JSON Modification Functions

PostgreSQL provides functions to modify JSON values:

sql
-- Add or update fields in a JSONB document
UPDATE product_catalog
SET attributes = attributes || '{"warranty": "2 years", "price": 899.99}'::jsonb
WHERE id = 1;

-- Remove a field from a JSONB document
UPDATE product_catalog
SET attributes = attributes - 'price'
WHERE id = 1;

-- Remove multiple fields
UPDATE product_catalog
SET attributes = attributes - '{price, warranty}'::text[]
WHERE id = 1;

Aggregating JSON Data

PostgreSQL can aggregate JSON data from multiple rows:

sql
-- Combine JSON objects from different rows
SELECT jsonb_agg(attributes) AS all_attributes
FROM product_catalog;

-- Combine specific JSON values into an array
SELECT jsonb_agg(attributes->'features') AS all_features
FROM product_catalog;

Output:

all_attributes
----------------------------------------------------------------
[{"color": "black", "storage": "128GB", "features": ["waterproof", "5G"]}, {"ram": "16GB", "color": "silver", "features": ["touchscreen", "backlit keyboard"]}]

all_features
----------------------------------------------------------------
[["waterproof", "5G"], ["touchscreen", "backlit keyboard"]]

Practical Use Case: Product Catalog API

Let's explore a practical example for an e-commerce API:

sql
-- Create a more realistic product table
CREATE TABLE products (
id SERIAL PRIMARY KEY,
sku VARCHAR(50) UNIQUE NOT NULL,
name VARCHAR(100) NOT NULL,
base_price DECIMAL(10, 2) NOT NULL,
metadata JSONB
);

-- Insert some products
INSERT INTO products (sku, name, base_price, metadata)
VALUES
('PHONE-X10', 'Smartphone X', 699.99,
'{"brand": "TechCo",
"specs": {"screen": "6.1 inch", "processor": "OctaCore 2.5GHz", "camera": "48MP"},
"variants": [
{"color": "black", "storage": "128GB", "price_adjustment": 0},
{"color": "black", "storage": "256GB", "price_adjustment": 100},
{"color": "silver", "storage": "128GB", "price_adjustment": 0},
{"color": "silver", "storage": "256GB", "price_adjustment": 100}
],
"tags": ["smartphone", "5G", "new arrival"],
"in_stock": true
}'),

('LAPTOP-PRO15', 'Laptop Pro', 1299.99,
'{"brand": "ComputerCo",
"specs": {"screen": "15.6 inch", "processor": "QuadCore 3.2GHz", "ram": "16GB"},
"variants": [
{"color": "silver", "storage": "512GB SSD", "price_adjustment": 0},
{"color": "silver", "storage": "1TB SSD", "price_adjustment": 200},
{"color": "space gray", "storage": "512GB SSD", "price_adjustment": 0},
{"color": "space gray", "storage": "1TB SSD", "price_adjustment": 200}
],
"tags": ["laptop", "high-performance", "thin"],
"in_stock": true
}');

Now let's perform some API-like queries:

sql
-- Get all available colors for a product
SELECT DISTINCT jsonb_array_elements(metadata->'variants')->>'color' AS available_colors
FROM products
WHERE sku = 'PHONE-X10';

-- Find products with specific tags
SELECT name, base_price
FROM products
WHERE metadata->'tags' @> '["5G"]'::jsonb;

-- Get pricing for all variants of a product
SELECT
name,
(v->>'color') AS color,
(v->>'storage') AS storage,
base_price + COALESCE((v->>'price_adjustment')::numeric, 0) AS price
FROM products,
jsonb_array_elements(metadata->'variants') AS v
WHERE sku = 'LAPTOP-PRO15'
ORDER BY color, storage;

Output for the last query:

    name    |    color    |  storage   |  price  
------------+-------------+------------+---------
Laptop Pro | silver | 512GB SSD | 1299.99
Laptop Pro | silver | 1TB SSD | 1499.99
Laptop Pro | space gray | 512GB SSD | 1299.99
Laptop Pro | space gray | 1TB SSD | 1499.99

Performance Considerations

When working with JSON types in PostgreSQL, keep these tips in mind:

  1. Use jsonb instead of json for better performance in most cases
  2. Create appropriate indexes for common query patterns
  3. Consider extracting frequently queried fields into regular columns
  4. Use containment operators (@>) with indexes for the best performance
  5. Be aware that very large JSON documents may impact performance

Summary

PostgreSQL's JSON types provide a powerful way to work with semi-structured data within a relational database. The json type preserves the exact input format, while the jsonb type offers better performance and indexing capabilities.

Key takeaways:

  • Use json when you need to preserve exact formatting and order
  • Use jsonb for better query performance and indexing support
  • Take advantage of operators like ->, ->>, and @> for querying JSON data
  • Use GIN indexes to speed up queries on JSONB columns
  • JSON types allow for flexible schema designs without sacrificing the benefits of a relational database

Exercises

  1. Create a table to store user profiles with a mix of structured and unstructured data using both regular columns and a JSONB column.
  2. Write a query to find all products that have both "5G" and "new arrival" tags.
  3. Modify the product metadata to add a new field called "reviews" that contains an array of review objects.
  4. Write a query that calculates the average price of all variants for each product.
  5. Create an appropriate index to optimize a query that frequently searches for products by their specifications.

Additional Resources



If you spot any mistakes on this website, please let me know at feedback@compilenrun.com. I’d greatly appreciate your feedback! :)