Every column in a database is, in a sense, a labelled box. The label tells the engine what may be stored inside, how much room to reserve, and how the contents should behave when compared, sorted, or summed. Choosing the right label — the right data type — is one of the quietest yet most consequential decisions in SQL design.
If you have just learned how to write a SELECT statement, data types might feel like
background noise: a VARCHAR here, an INT there, and the database seems to
figure things out. But the moment your application grows — handling money, dates across time zones,
or millions of rows — those quiet decisions begin to speak loudly. Storage costs balloon. Queries slow
down. Numbers round in ways nobody asked for. Most of these problems trace back to a column whose
label did not quite fit its contents.
This article is a guided tour of the SQL data type landscape. We will walk through the main families of types, see how the major database systems implement them differently, and finish with practical advice for picking the right label for the right box.
Why Data Types Matter
Imagine moving house. You could throw everything into identical, oversized boxes and worry about it later — but unpacking would be chaos. A glass vase shoved next to a cast-iron pan is a recipe for breakage; books packed into a box meant for a refrigerator waste space and strain backs. Sensible movers label boxes: kitchen / fragile / books / electronics. The label is a contract. It tells everyone how to handle the contents.
SQL data types serve the same function, for three reasons:
-
Data integrity. A column declared as
DATErejects the string"banana"outright. The database refuses to store nonsense, which means your queries can trust what they read. -
Storage efficiency. A 2-byte
SMALLINTuses a quarter of the disk space of an 8-byteBIGINT. Multiply that by a billion rows and the difference becomes hardware. - Performance. Indexes on tightly typed columns are smaller and faster. Comparisons between two integers are faster than comparisons between two strings that happen to look like integers. The query optimiser can plan more aggressively when it knows exactly what it is dealing with.
A Map of the SQL Type Landscape
NULL.Numeric Types
Numeric types split cleanly into two camps: exact and approximate. Exact types store every digit you give them. Approximate types store a binary approximation that is fast to compute but can drift in the lower decimal places.
The integer family — SMALLINT (2 bytes), INTEGER / INT
(4 bytes), and BIGINT (8 bytes) — covers whole numbers. Pick the smallest one that
comfortably holds the largest value you expect, with a little headroom.
DECIMAL(p, s) (also written NUMERIC) stores fixed-precision decimals
exactly. FLOAT, REAL, and DOUBLE PRECISION use the IEEE 754
standard — excellent for scientific calculations, dangerous for money. The classic warning:
0.1 + 0.2 in floating point is not exactly 0.3, and a million such tiny
errors can silently corrupt a financial report.
Character and String Types
Three names dominate: CHAR, VARCHAR, and TEXT.
CHAR(n) is fixed length — store cat in a CHAR(10) and the engine
pads it to ten characters with spaces. It is efficient when every value really is the same length,
such as country codes or currency codes. VARCHAR(n) is variable length up to n
characters. TEXT handles free-form, potentially large strings with no practical length
cap in most engines.
Date and Time Types
Four conceptual types cover almost everything. DATE stores a calendar date with no
time. TIME stores a time of day with no date. TIMESTAMP stores both, and
many databases offer TIMESTAMP WITH TIME ZONE — essential for any application whose
users live in more than one place. INTERVAL, natively supported in PostgreSQL,
represents a duration: "3 days", "2 hours 15 minutes".
Boolean, Binary and Modern Types
A BOOLEAN column stores TRUE, FALSE, or NULL.
Conceptually the simplest type, yet one of the most fragmented in practice: not every engine has a
native boolean. Binary types hold raw bytes — images, encrypted blobs, file contents. PostgreSQL
calls this BYTEA; MySQL and SQLite use BLOB; SQL Server uses
VARBINARY. For large media files it is usually better to store the file in object
storage and keep only the URL in the database.
On the modern frontier, JSON columns store JSON documents as text and let you query
inside them. PostgreSQL adds JSONB, a binary-encoded variant that supports indexing.
Amazon Redshift uses its own SUPER type for semi-structured data. PostgreSQL also
supports native ARRAY types, letting a single column hold an ordered list of same-type
values — powerful when used carefully, problematic when used to avoid proper schema design.
The Same Idea, Five Different Dialects
Standard SQL provides the vocabulary, but every engine has its own nuances. The table below highlights the most common differences a beginner is likely to encounter.
| Type family | PostgreSQL | MySQL | SQL Server | SQLite | Redshift |
|---|---|---|---|---|---|
| Integers | SMALLINT, INTEGER, BIGINT | TINYINT, SMALLINT, INT, BIGINT | TINYINT, SMALLINT, INT, BIGINT | INTEGER (flexible size) | SMALLINT, INTEGER, BIGINT |
| Exact decimal | NUMERIC / DECIMAL | DECIMAL / NUMERIC | DECIMAL / NUMERIC | NUMERIC affinity only | DECIMAL / NUMERIC |
| Text | VARCHAR(n), TEXT (unlimited) | VARCHAR, TEXT (~64 KB), LONGTEXT (4 GB) | VARCHAR(n), NVARCHAR (Unicode) | TEXT (limits ignored) | VARCHAR(n), max 65,535 bytes |
| Boolean | BOOLEAN (native) | TINYINT(1) alias | BIT (0 / 1 / NULL) | INTEGER 0 / 1 | BOOLEAN (native) |
| Date & time | DATE, TIMESTAMP, TIMESTAMPTZ, INTERVAL | DATE, DATETIME, TIMESTAMP (UTC) | DATE, DATETIME2, DATETIMEOFFSET | No native type; use TEXT or INTEGER | DATE, TIMESTAMP, TIMESTAMPTZ |
| JSON / semi-structured | JSON, JSONB (indexable) | JSON (text, no indexing) | JSON via NVARCHAR functions | TEXT + JSON functions | SUPER (schemaless, PartiQL) |
The Special Case of NULL
NULL deserves its own section because it is not a value but a marker for the
absence of a value. It is valid in every type — a missing integer, a missing date, a missing string
— and it follows three-valued logic: NULL = NULL does not return true; it returns
NULL. That single rule confuses more beginners than perhaps any other feature of SQL.
When designing a column, decide consciously whether NULL should be allowed. Adding
NOT NULL turns an implicit "we'll figure it out" into an explicit, enforced contract.
Do not use sentinel values like -1, the empty string, or
1900-01-01 to fake missing data — that is exactly what NULL is for.
Choosing the Right Type
The guiding principle is simple: be as specific as possible, and no more so. The type system is documentation that the database itself enforces. Vague types invite vague data.
-
Avoid
VARCHAR(255)as a default. It is a relic from old MySQL row-format limits. A two-letter country code isCHAR(2). An email address isVARCHAR(254)per RFC 5321. A free-form comment isTEXT. -
Use
DECIMALfor money. Currency calculations must be exact to the cent.FLOATandREALwill eventually round in ways accountants will not forgive.DECIMAL(12, 2)handles values up to ten billion with two decimal places. -
Prefer
TIMESTAMP WITH TIME ZONEfor events. Anything that happens — an order placed, a message sent — should be stored with timezone awareness. Storing local time without an offset is how teams discover their data has shifted by an hour twice a year. -
Pick the smallest integer that fits. A flag with values 0–10 does not need a
BIGINT. A user ID for a million-user app does not need one either — though an Internet-scale service may eventually. -
Reach for JSON sparingly. If a field has structure you query against regularly,
model it as columns. Reserve
JSON/JSONB/SUPERfor genuinely variable, document-shaped data.
A Well-Typed CREATE TABLE
Putting it all together, here is a CREATE TABLE for a hypothetical e-commerce
orders table. Notice how each column tells a clear, honest story about what it holds.
Every column carries an honest label. order_id is a BIGINT because we
expect a large number of orders. order_code is fixed-length because business codes are.
Money is DECIMAL. Booleans are BOOLEAN. Timestamps know their time zone.
JSONB covers the inevitable "we'll add more attributes later" case without polluting
the schema. The next developer who reads this table will understand the data without asking a
single question.
Main References
-
PostgreSQL Global Development Group. PostgreSQL Documentation: Data Types.
https://www.postgresql.org/docs/current/datatype.html -
Oracle Corporation. MySQL 8.0 Reference Manual: Data Types.
https://dev.mysql.com/doc/refman/8.0/en/data-types.html -
Microsoft. Data types (Transact-SQL) — SQL Server Documentation.
https://learn.microsoft.com/en-us/sql/t-sql/data-types/data-types-transact-sql -
SQLite Consortium. Datatypes In SQLite.
https://www.sqlite.org/datatype3.html -
Amazon Web Services. Amazon Redshift Database Developer Guide: Data types.
https://docs.aws.amazon.com/redshift/latest/dg/c_Supported_data_types.html