PostgreSQL JSON Types
Introduction
PostgreSQL, a powerful open-source relational database, offers robust support for storing and querying JSON (JavaScript Object Notation) data. JSON has become an essential data format for web applications and APIs, allowing for the storage of flexible, schema-less data within a traditional relational database.
PostgreSQL provides two specialized data types for working with JSON:
json
- Stores an exact copy of the input textjsonb
- Stores data in a decomposed binary format
In this tutorial, you'll learn how to effectively use these JSON data types in PostgreSQL, enabling you to combine the flexibility of NoSQL databases with the reliability and feature richness of PostgreSQL.
Understanding JSON vs. JSONB
Before diving into examples, let's understand the key differences between the two JSON types PostgreSQL offers:
Feature | json | jsonb |
---|---|---|
Storage format | Text (exact copy) | Binary (decomposed) |
Processing | Slower for operations | Faster for operations |
Indexing | Limited support | Full support (GIN indexes) |
Preserves whitespace | Yes | No |
Preserves key order | Yes | No |
Duplicate keys | Preserves all | Keeps last value only |
The jsonb
type is generally recommended for most use cases due to its better performance and indexing capabilities.
Creating Tables with JSON Types
Let's start by creating a table that uses JSON data types:
CREATE TABLE product_catalog (
id SERIAL PRIMARY KEY,
name VARCHAR(100) NOT NULL,
details JSON,
attributes JSONB
);
In this example:
details
uses thejson
type to store product detailsattributes
uses thejsonb
type to store product attributes
Inserting JSON Data
You can insert JSON data in several ways:
-- Insert using JSON string literals
INSERT INTO product_catalog (name, details, attributes)
VALUES (
'Smartphone X',
'{"brand": "TechCo", "model": "X10", "year": 2023}',
'{"color": "black", "storage": "128GB", "features": ["waterproof", "5G"]}'
);
-- Insert using PostgreSQL's JSON construction functions
INSERT INTO product_catalog (name, details, attributes)
VALUES (
'Laptop Pro',
json_build_object('brand', 'ComputerCo', 'model', 'Pro15', 'year', 2023),
jsonb_build_object('color', 'silver', 'ram', '16GB', 'features',
jsonb_build_array('touchscreen', 'backlit keyboard'))
);
Querying JSON Data
PostgreSQL provides powerful operators and functions for querying JSON data:
Basic Retrieval
Retrieve the entire JSON document:
SELECT name, details, attributes
FROM product_catalog
WHERE id = 1;
Output:
name | details | attributes
-------------+--------------------------------------------+---------------------------------------------------------------------
Smartphone X | {"brand": "TechCo", "model": "X10", "year": 2023} | {"color": "black", "storage": "128GB", "features": ["waterproof", "5G"]}
Extracting Values
Extract specific values using the ->
and ->>
operators:
-- -> returns JSON value
-- ->> returns text value
-- Get brand as JSON
SELECT name, details->'brand' AS brand_json
FROM product_catalog;
-- Get brand as text
SELECT name, details->>'brand' AS brand_text
FROM product_catalog;
-- Access array elements (zero-based)
SELECT name, attributes->'features'->0 AS first_feature
FROM product_catalog;
Output:
name | brand_json
-------------+------------
Smartphone X | "TechCo"
Laptop Pro | "ComputerCo"
name | brand_text
-------------+------------
Smartphone X | TechCo
Laptop Pro | ComputerCo
name | first_feature
-------------+---------------
Smartphone X | "waterproof"
Laptop Pro | "touchscreen"
Filtering by JSON Values
You can filter based on JSON field values:
-- Find products from TechCo
SELECT name, details
FROM product_catalog
WHERE details->>'brand' = 'TechCo';
-- Find products with waterproof feature
SELECT name
FROM product_catalog
WHERE attributes->'features' @> '"waterproof"'::jsonb;
The @>
operator checks if the left JSON contains the right JSON as a subset.
Working with JSONB Indexes
One of the biggest advantages of jsonb
over json
is its support for efficient indexing:
-- Create a GIN index on the entire attributes field
CREATE INDEX idx_product_attributes ON product_catalog USING GIN (attributes);
-- Create a GIN index specifically for the features array
CREATE INDEX idx_product_features ON product_catalog USING GIN ((attributes->'features'));
These indexes significantly speed up queries that:
- Check for containment (
@>
) - Use path expressions (
->
,->>
) - Search for keys (
?
,?&
,?|
)
JSON Modification Functions
PostgreSQL provides functions to modify JSON values:
-- Add or update fields in a JSONB document
UPDATE product_catalog
SET attributes = attributes || '{"warranty": "2 years", "price": 899.99}'::jsonb
WHERE id = 1;
-- Remove a field from a JSONB document
UPDATE product_catalog
SET attributes = attributes - 'price'
WHERE id = 1;
-- Remove multiple fields
UPDATE product_catalog
SET attributes = attributes - '{price, warranty}'::text[]
WHERE id = 1;
Aggregating JSON Data
PostgreSQL can aggregate JSON data from multiple rows:
-- Combine JSON objects from different rows
SELECT jsonb_agg(attributes) AS all_attributes
FROM product_catalog;
-- Combine specific JSON values into an array
SELECT jsonb_agg(attributes->'features') AS all_features
FROM product_catalog;
Output:
all_attributes
----------------------------------------------------------------
[{"color": "black", "storage": "128GB", "features": ["waterproof", "5G"]}, {"ram": "16GB", "color": "silver", "features": ["touchscreen", "backlit keyboard"]}]
all_features
----------------------------------------------------------------
[["waterproof", "5G"], ["touchscreen", "backlit keyboard"]]
Practical Use Case: Product Catalog API
Let's explore a practical example for an e-commerce API:
-- Create a more realistic product table
CREATE TABLE products (
id SERIAL PRIMARY KEY,
sku VARCHAR(50) UNIQUE NOT NULL,
name VARCHAR(100) NOT NULL,
base_price DECIMAL(10, 2) NOT NULL,
metadata JSONB
);
-- Insert some products
INSERT INTO products (sku, name, base_price, metadata)
VALUES
('PHONE-X10', 'Smartphone X', 699.99,
'{"brand": "TechCo",
"specs": {"screen": "6.1 inch", "processor": "OctaCore 2.5GHz", "camera": "48MP"},
"variants": [
{"color": "black", "storage": "128GB", "price_adjustment": 0},
{"color": "black", "storage": "256GB", "price_adjustment": 100},
{"color": "silver", "storage": "128GB", "price_adjustment": 0},
{"color": "silver", "storage": "256GB", "price_adjustment": 100}
],
"tags": ["smartphone", "5G", "new arrival"],
"in_stock": true
}'),
('LAPTOP-PRO15', 'Laptop Pro', 1299.99,
'{"brand": "ComputerCo",
"specs": {"screen": "15.6 inch", "processor": "QuadCore 3.2GHz", "ram": "16GB"},
"variants": [
{"color": "silver", "storage": "512GB SSD", "price_adjustment": 0},
{"color": "silver", "storage": "1TB SSD", "price_adjustment": 200},
{"color": "space gray", "storage": "512GB SSD", "price_adjustment": 0},
{"color": "space gray", "storage": "1TB SSD", "price_adjustment": 200}
],
"tags": ["laptop", "high-performance", "thin"],
"in_stock": true
}');
Now let's perform some API-like queries:
-- Get all available colors for a product
SELECT DISTINCT jsonb_array_elements(metadata->'variants')->>'color' AS available_colors
FROM products
WHERE sku = 'PHONE-X10';
-- Find products with specific tags
SELECT name, base_price
FROM products
WHERE metadata->'tags' @> '["5G"]'::jsonb;
-- Get pricing for all variants of a product
SELECT
name,
(v->>'color') AS color,
(v->>'storage') AS storage,
base_price + COALESCE((v->>'price_adjustment')::numeric, 0) AS price
FROM products,
jsonb_array_elements(metadata->'variants') AS v
WHERE sku = 'LAPTOP-PRO15'
ORDER BY color, storage;
Output for the last query:
name | color | storage | price
------------+-------------+------------+---------
Laptop Pro | silver | 512GB SSD | 1299.99
Laptop Pro | silver | 1TB SSD | 1499.99
Laptop Pro | space gray | 512GB SSD | 1299.99
Laptop Pro | space gray | 1TB SSD | 1499.99
Performance Considerations
When working with JSON types in PostgreSQL, keep these tips in mind:
- Use
jsonb
instead ofjson
for better performance in most cases - Create appropriate indexes for common query patterns
- Consider extracting frequently queried fields into regular columns
- Use containment operators (
@>
) with indexes for the best performance - Be aware that very large JSON documents may impact performance
Summary
PostgreSQL's JSON types provide a powerful way to work with semi-structured data within a relational database. The json
type preserves the exact input format, while the jsonb
type offers better performance and indexing capabilities.
Key takeaways:
- Use
json
when you need to preserve exact formatting and order - Use
jsonb
for better query performance and indexing support - Take advantage of operators like
->
,->>
, and@>
for querying JSON data - Use GIN indexes to speed up queries on JSONB columns
- JSON types allow for flexible schema designs without sacrificing the benefits of a relational database
Exercises
- Create a table to store user profiles with a mix of structured and unstructured data using both regular columns and a JSONB column.
- Write a query to find all products that have both "5G" and "new arrival" tags.
- Modify the product metadata to add a new field called "reviews" that contains an array of review objects.
- Write a query that calculates the average price of all variants for each product.
- Create an appropriate index to optimize a query that frequently searches for products by their specifications.
Additional Resources
If you spot any mistakes on this website, please let me know at feedback@compilenrun.com. I’d greatly appreciate your feedback! :)