Skip to main content

MySQL Localization

Introduction

In today's globally connected world, applications often need to support multiple languages and regional formats. MySQL's localization features allow you to build applications that properly handle various languages, character sets, and regional settings. This is crucial for developing international applications that work correctly across different countries and cultures.

Localization in MySQL involves several key components:

  • Character sets - Define which symbols are available for use
  • Collations - Determine how strings are compared and sorted
  • Time zones - Handle different regional time representations
  • Locale-specific date and time formats - Display dates in culturally appropriate ways

This guide will help you understand and implement MySQL localization features in your applications, ensuring they can be used effectively worldwide.

Understanding Character Sets and Collations

Character Sets Basics

A character set is a collection of symbols and encodings. MySQL supports a wide variety of character sets to accommodate different languages and writing systems.

Viewing Available Character Sets

You can see all available character sets in MySQL using:

sql
SHOW CHARACTER SET;

Example output:

+----------+-----------------------------+---------------------+--------+
| Charset | Description | Default collation | Maxlen |
+----------+-----------------------------+---------------------+--------+
| armscii8 | ARMSCII-8 Armenian | armscii8_general_ci | 1 |
| ascii | US ASCII | ascii_general_ci | 1 |
| big5 | Big5 Traditional Chinese | big5_chinese_ci | 2 |
| binary | Binary pseudo charset | binary | 1 |
| cp1250 | Windows Central European | cp1250_general_ci | 1 |
| cp1251 | Windows Cyrillic | cp1251_general_ci | 1 |
| utf8 | UTF-8 Unicode | utf8_general_ci | 3 |
| utf8mb4 | UTF-8 Unicode | utf8mb4_0900_ai_ci | 4 |
| ... | ... | ... | ... |
+----------+-----------------------------+---------------------+--------+

Collations Explained

A collation is a set of rules that determine how character strings are compared and sorted. Each character set in MySQL has at least one collation, and many have several.

Viewing Collations for a Character Set

To see all collations for a specific character set:

sql
SHOW COLLATION WHERE Charset = 'utf8mb4';

Example output:

+----------------------------+---------+-----+---------+----------+---------+
| Collation | Charset | Id | Default | Compiled | Sortlen |
+----------------------------+---------+-----+---------+----------+---------+
| utf8mb4_0900_ai_ci | utf8mb4 | 255 | Yes | Yes | 0 |
| utf8mb4_0900_as_ci | utf8mb4 | 305 | | Yes | 0 |
| utf8mb4_bin | utf8mb4 | 46 | | Yes | 1 |
| utf8mb4_croatian_ci | utf8mb4 | 245 | | Yes | 8 |
| ... | ... | ... | ... | ... | ... |
+----------------------------+---------+-----+---------+----------+---------+

Naming Conventions for Collations

Collation names follow patterns that tell you about their behavior:

  • _bin: Binary comparison (case-sensitive, based on code points)
  • _ci: Case-insensitive comparison
  • _cs: Case-sensitive comparison
  • _ai: Accent-insensitive comparison
  • _as: Accent-sensitive comparison

For example, utf8mb4_general_ci is case-insensitive, while utf8mb4_bin is case-sensitive.

Setting Up Character Sets and Collations

Server-Level Configuration

You can configure default character sets and collations in the MySQL configuration file (my.cnf or my.ini):

ini
[mysqld]
character-set-server = utf8mb4
collation-server = utf8mb4_unicode_ci

Database-Level Configuration

When creating a new database:

sql
CREATE DATABASE multilingual_app
CHARACTER SET utf8mb4
COLLATE utf8mb4_unicode_ci;

To modify an existing database:

sql
ALTER DATABASE multilingual_app
CHARACTER SET utf8mb4
COLLATE utf8mb4_unicode_ci;

Table-Level Configuration

When creating tables:

sql
CREATE TABLE product_descriptions (
id INT AUTO_INCREMENT PRIMARY KEY,
language_code CHAR(2),
description TEXT
) CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci;

Column-Level Configuration

You can also set character sets for individual columns:

sql
CREATE TABLE customer_feedback (
id INT AUTO_INCREMENT PRIMARY KEY,
comment TEXT CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci,
rating TINYINT
);

Connection-Level Configuration

To set character sets for your current connection:

sql
SET NAMES 'utf8mb4' COLLATE 'utf8mb4_unicode_ci';

Checking Current Settings

To check the current character set and collation settings:

sql
SHOW VARIABLES LIKE 'character\_set\_%';
SHOW VARIABLES LIKE 'collation\_%';

Practical Example: Multi-language Product Catalog

Let's create a simple multi-language product catalog to demonstrate MySQL localization:

sql
-- Create database with proper character set and collation
CREATE DATABASE product_catalog
CHARACTER SET utf8mb4
COLLATE utf8mb4_unicode_ci;

USE product_catalog;

-- Products table
CREATE TABLE products (
product_id INT AUTO_INCREMENT PRIMARY KEY,
price DECIMAL(10, 2),
image_url VARCHAR(255)
);

-- Product translations table
CREATE TABLE product_translations (
translation_id INT AUTO_INCREMENT PRIMARY KEY,
product_id INT,
language_code CHAR(2),
product_name VARCHAR(100),
description TEXT,
FOREIGN KEY (product_id) REFERENCES products(product_id),
UNIQUE KEY (product_id, language_code)
);

-- Insert sample product
INSERT INTO products (price, image_url)
VALUES (19.99, 'coffee_maker.jpg');

-- Insert translations
INSERT INTO product_translations
(product_id, language_code, product_name, description)
VALUES
(1, 'en', 'Coffee Maker', 'A premium coffee maker with temperature control'),
(1, 'es', 'Cafetera', 'Una cafetera premium con control de temperatura'),
(1, 'fr', 'Cafetière', 'Une cafetière haut de gamme avec contrôle de température'),
(1, 'de', 'Kaffeemaschine', 'Eine Premium-Kaffeemaschine mit Temperaturregelung'),
(1, 'zh', '咖啡机', '一款带温度控制的高级咖啡机');

Retrieving Localized Content

To get product information in a specific language:

sql
SELECT p.product_id, p.price, t.product_name, t.description
FROM products p
JOIN product_translations t ON p.product_id = t.product_id
WHERE t.language_code = 'es';

Result:

+-----------+-------+-------------+-------------------------------------------+
| product_id | price | product_name | description |
+-----------+-------+-------------+-------------------------------------------+
| 1 | 19.99 | Cafetera | Una cafetera premium con control de temperatura |
+-----------+-------+-------------+-------------------------------------------+

Working with Time Zones

Setting the Server Time Zone

In MySQL configuration file (my.cnf or my.ini):

ini
[mysqld]
default-time-zone = '+00:00'

Or using SQL:

sql
SET GLOBAL time_zone = '+00:00';

Setting Session Time Zone

sql
SET time_zone = '+02:00';  -- Set to Eastern European Time

Checking Available Time Zones

MySQL maintains a time zone table that must be populated:

sql
SELECT * FROM mysql.time_zone_name LIMIT 5;

If empty, you need to populate it using the mysql_tzinfo_to_sql utility.

Storing and Converting Dates

Creating a Table with Timestamp

sql
CREATE TABLE global_events (
event_id INT AUTO_INCREMENT PRIMARY KEY,
event_name VARCHAR(100),
event_time TIMESTAMP,
created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
);

Converting Between Time Zones

sql
-- Insert an event (stored in the server's time zone)
INSERT INTO global_events (event_name, event_time)
VALUES ('Product Launch', '2023-06-15 14:30:00');

-- Retrieve the event time in different time zones
SELECT
event_name,
event_time,
CONVERT_TZ(event_time, '+00:00', '+09:00') AS japan_time,
CONVERT_TZ(event_time, '+00:00', '-05:00') AS new_york_time
FROM global_events;

Result:

+---------------+---------------------+---------------------+---------------------+
| event_name | event_time | japan_time | new_york_time |
+---------------+---------------------+---------------------+---------------------+
| Product Launch| 2023-06-15 14:30:00 | 2023-06-15 23:30:00 | 2023-06-15 09:30:00 |
+---------------+---------------------+---------------------+---------------------+

Locale-Specific Date Formatting

MySQL provides functions to format dates according to locale preferences:

sql
SELECT 
event_name,
DATE_FORMAT(event_time, '%W, %D %M %Y %H:%i') AS formatted_en,
DATE_FORMAT(event_time, '%a %d %b %Y %H:%i') AS formatted_short
FROM global_events;

Result:

+---------------+--------------------------------------+-------------------------+
| event_name | formatted_en | formatted_short |
+---------------+--------------------------------------+-------------------------+
| Product Launch| Thursday, 15th June 2023 14:30 | Thu 15 Jun 2023 14:30 |
+---------------+--------------------------------------+-------------------------+

Best Practices for MySQL Localization

  1. Use UTF-8MB4: Always use utf8mb4 instead of utf8 character set. It supports the full range of Unicode characters including emoji.

  2. Set Character Sets at All Levels: Configure character sets at server, database, table, and column levels for consistent behavior.

  3. Always Store UTC Times: Store timestamps in UTC and convert to local time zones only for display purposes.

  4. Use Prepared Statements: Use prepared statements to prevent character encoding issues when submitting queries.

  5. Test with Real Multilingual Data: Test your application with real multilingual data, including complex scripts like Arabic, Thai, or Chinese.

  6. Use Appropriate Collations: Choose collations based on your sorting needs (case-sensitive vs. insensitive, accent-sensitive vs. insensitive).

Troubleshooting Common Localization Issues

Garbled Characters

If you see garbled characters (often appearing as ???? or ):

  1. Check all character set variables:

    sql
    SHOW VARIABLES LIKE 'char%';
  2. Ensure your application connection uses the same character set:

    sql
    SET NAMES utf8mb4;

Sorting Issues

If strings aren't sorted as expected for a particular language:

  1. Verify you're using the correct collation for the language:
    sql
    -- For German sorting
    ALTER TABLE german_books MODIFY title VARCHAR(100) COLLATE utf8mb4_german2_ci;

Incorrect Time Zone Conversion

If time zones aren't converting properly:

  1. Check if the time zone tables are loaded:

    sql
    SELECT COUNT(*) FROM mysql.time_zone_name;
  2. Ensure you're using the CONVERT_TZ function correctly:

    sql
    -- Use named time zones instead of offsets for DST handling
    SELECT CONVERT_TZ('2023-01-01 12:00:00', 'UTC', 'Europe/Paris');

Real-World Application: International E-commerce Platform

Let's design a simplified schema for an international e-commerce platform:

sql
CREATE DATABASE global_shop
CHARACTER SET utf8mb4
COLLATE utf8mb4_unicode_ci;

USE global_shop;

CREATE TABLE products (
product_id INT AUTO_INCREMENT PRIMARY KEY,
sku VARCHAR(20) UNIQUE,
price DECIMAL(10, 2),
created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
);

CREATE TABLE product_translations (
id INT AUTO_INCREMENT PRIMARY KEY,
product_id INT,
language_code CHAR(2),
product_name VARCHAR(100),
description TEXT,
FOREIGN KEY (product_id) REFERENCES products(product_id),
UNIQUE KEY (product_id, language_code)
);

CREATE TABLE currencies (
currency_code CHAR(3) PRIMARY KEY,
exchange_rate DECIMAL(10, 6) -- Rate compared to base currency
);

CREATE TABLE orders (
order_id INT AUTO_INCREMENT PRIMARY KEY,
user_id INT,
order_time TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
currency_code CHAR(3),
total_amount DECIMAL(10, 2),
FOREIGN KEY (currency_code) REFERENCES currencies(currency_code)
);

Handling Multiple Languages and Currencies

sql
-- Add products
INSERT INTO products (sku, price) VALUES ('WIDGET-1', 29.99);

-- Add translations
INSERT INTO product_translations
(product_id, language_code, product_name, description)
VALUES
(1, 'en', 'Premium Widget', 'High-quality widget for all your needs'),
(1, 'fr', 'Widget Premium', 'Widget de haute qualité pour tous vos besoins'),
(1, 'ja', 'プレミアムウィジェット', '高品質のウィジェットであなたのすべてのニーズに対応');

-- Add currencies
INSERT INTO currencies (currency_code, exchange_rate) VALUES
('USD', 1.0),
('EUR', 0.85),
('JPY', 110.25);

-- Create order with localized currency
INSERT INTO orders (user_id, currency_code, total_amount)
VALUES (42, 'EUR', 25.49);

Query to Retrieve Product in User's Preferred Language and Currency

sql
SET @user_language := 'fr';
SET @user_currency := 'EUR';

SELECT
p.product_id,
t.product_name,
t.description,
p.price AS base_price,
ROUND(p.price * c.exchange_rate, 2) AS localized_price,
@user_currency AS currency
FROM
products p
JOIN
product_translations t ON p.product_id = t.product_id AND t.language_code = @user_language
JOIN
currencies c ON c.currency_code = @user_currency;

Result:

+-----------+----------------+---------------------------------------------+------------+----------------+----------+
| product_id | product_name | description | base_price | localized_price | currency |
+-----------+----------------+---------------------------------------------+------------+----------------+----------+
| 1 | Widget Premium | Widget de haute qualité pour tous vos besoins | 29.99 | 25.49 | EUR |
+-----------+----------------+---------------------------------------------+------------+----------------+----------+

Summary

MySQL's localization features enable you to build truly international applications. In this guide, we've covered:

  • Character sets and collations for handling different languages and writing systems
  • Configuring localization settings at various levels (server, database, table, column)
  • Working with time zones for global applications
  • Displaying dates in locale-specific formats
  • Building multi-language applications with translation tables
  • Managing multiple currencies for international e-commerce

By implementing these localization techniques, your MySQL-based applications can provide a seamless experience for users worldwide, regardless of their language, region, or cultural preferences.

Additional Resources and Exercises

Resources for Further Learning

Practice Exercises

  1. Multi-language Blog System: Create a schema for a blog that supports posts in multiple languages, with the ability to show the same content in the user's preferred language.

  2. Time Zone Conversion: Build a meeting scheduler that stores meetings in UTC but displays times in each participant's local time zone.

  3. Collation Comparison: Create a table with the same text in different collations and write queries to compare sorting results.

  4. Currency Converter: Implement a system that keeps track of daily currency exchange rates and allows for converting product prices between different currencies.

  5. Locale-Specific Formatting: Create a function that formats dates differently based on regional preferences (MM/DD/YYYY for US, DD/MM/YYYY for Europe, etc.).

By mastering MySQL localization, you'll be able to develop more inclusive, user-friendly applications that work seamlessly across different cultures and languages.



If you spot any mistakes on this website, please let me know at [email protected]. I’d greatly appreciate your feedback! :)