
开发者SQL速查表:2026年从基础到高级查询
📷 Myburgh Roux / Pexels开发者SQL速查表:2026年从基础到高级查询
涵盖SELECT、JOIN、子查询、窗口函数、CTE、索引和优化技巧的全面SQL参考指南。
SQL在2026年仍然是开发者最重要的技能之一。无论你是在构建REST API、处理分析管道,还是调试生产数据,编写和理解SQL都是不可避免的。本速查表超越了基础内容,提供了从简单的SELECT语句到高级窗口函数和查询优化策略的全面实用参考。
基本SELECT查询
每段SQL之旅都从SELECT语句开始。它是所有数据检索操作的基础。
选择列
-- Select specific columns
SELECT first_name, last_name, email
FROM users;
-- Select all columns (avoid in production code)
SELECT *
FROM users;
-- Aliasing columns for clarity
SELECT
first_name AS "First Name",
last_name AS "Last Name",
created_at AS "Registration Date"
FROM users;
-- Selecting distinct values
SELECT DISTINCT department
FROM employees;
使用WHERE过滤
WHERE子句允许你根据条件过滤行。理解运算符优先级和NULL处理对于编写正确的查询至关重要。
-- Basic comparison
SELECT * FROM products
WHERE price > 50.00;
-- Multiple conditions with AND / OR
SELECT * FROM orders
WHERE status = 'shipped'
AND total_amount > 100
AND created_at >= '2026-01-01';
-- IN operator for multiple values
SELECT * FROM users
WHERE country IN ('US', 'CA', 'GB', 'DE');
-- BETWEEN for ranges (inclusive)
SELECT * FROM events
WHERE event_date BETWEEN '2026-01-01' AND '2026-12-31';
-- Pattern matching with LIKE
SELECT * FROM products
WHERE name LIKE '%wireless%'; -- contains 'wireless'
-- Handling NULLs (= does NOT work with NULL)
SELECT * FROM users
WHERE phone IS NULL;
SELECT * FROM users
WHERE phone IS NOT NULL;
一个常见的错误是写WHERE column = NULL而不是WHERE column IS NULL。前者总是计算为UNKNOWN,不返回任何行。
使用ORDER BY排序
-- Single column sort
SELECT * FROM products
ORDER BY price DESC;
-- Multiple column sort
SELECT * FROM employees
ORDER BY department ASC, salary DESC;
-- Sort by column position (less readable, but valid)
SELECT first_name, last_name, hire_date
FROM employees
ORDER BY 3 DESC;
-- NULLS FIRST / NULLS LAST (PostgreSQL, Oracle)
SELECT * FROM tasks
ORDER BY due_date ASC NULLS LAST;
限制结果
-- PostgreSQL / MySQL
SELECT * FROM logs
ORDER BY created_at DESC
LIMIT 100;
-- With offset for pagination
SELECT * FROM products
ORDER BY id
LIMIT 20 OFFSET 40; -- Page 3 (20 items per page)
-- SQL Server
SELECT TOP 100 * FROM logs
ORDER BY created_at DESC;
-- Standard SQL (FETCH FIRST)
SELECT * FROM logs
ORDER BY created_at DESC
FETCH FIRST 100 ROWS ONLY;
聚合函数和GROUP BY
聚合函数将多行折叠为单个汇总值。它们几乎总是与GROUP BY一起使用。
-- Common aggregate functions
SELECT
department,
COUNT(*) AS employee_count,
AVG(salary) AS avg_salary,
MIN(salary) AS min_salary,
MAX(salary) AS max_salary,
SUM(salary) AS total_payroll
FROM employees
GROUP BY department;
-- Filtering groups with HAVING
SELECT
category,
COUNT(*) AS product_count,
AVG(price) AS avg_price
FROM products
GROUP BY category
HAVING COUNT(*) > 10
ORDER BY avg_price DESC;
-- COUNT variations
SELECT
COUNT(*) AS total_rows, -- counts all rows including NULLs
COUNT(email) AS has_email, -- counts non-NULL emails
COUNT(DISTINCT city) AS cities -- counts unique cities
FROM users;
关键区别:WHERE在聚合之前过滤单个行,而HAVING在聚合之后过滤分组。你不能在WHERE子句中引用聚合函数。
JOIN:组合表
JOIN是SQL真正变得强大的地方。理解不同的类型对于处理关系数据至关重要。
INNER JOIN
只返回两个表中都有匹配值的行。
SELECT
o.id AS order_id,
o.order_date,
c.name AS customer_name,
c.email
FROM orders o
INNER JOIN customers c ON o.customer_id = c.id;
LEFT JOIN(LEFT OUTER JOIN)
返回左表的所有行和右表的匹配行。不匹配的右侧列返回NULL。
-- Find all customers and their orders (including customers with no orders)
SELECT
c.name,
c.email,
o.id AS order_id,
o.total_amount
FROM customers c
LEFT JOIN orders o ON c.id = o.customer_id;
-- Find customers who have never placed an order
SELECT c.name, c.email
FROM customers c
LEFT JOIN orders o ON c.id = o.customer_id
WHERE o.id IS NULL;
RIGHT JOIN和FULL OUTER JOIN
-- RIGHT JOIN: all rows from the right table
SELECT
e.name AS employee,
d.name AS department
FROM employees e
RIGHT JOIN departments d ON e.department_id = d.id;
-- FULL OUTER JOIN: all rows from both tables
SELECT
s.name AS student,
c.title AS course
FROM students s
FULL OUTER JOIN enrollments e ON s.id = e.student_id
FULL OUTER JOIN courses c ON e.course_id = c.id;
CROSS JOIN
生成两个表的笛卡尔积。第一个表的每一行与第二个表的每一行组合。
-- Generate all possible size-color combinations
SELECT
s.size_name,
c.color_name
FROM sizes s
CROSS JOIN colors c;
Self JOIN
表与自身连接,适用于层次结构或比较数据。
-- Find employees and their managers
SELECT
e.name AS employee,
m.name AS manager
FROM employees e
LEFT JOIN employees m ON e.manager_id = m.id;
子查询
子查询是嵌套在其他查询中的查询。它们可以出现在SELECT、FROM、WHERE和HAVING子句中。
WHERE中的子查询
-- Find products priced above the average
SELECT name, price
FROM products
WHERE price > (SELECT AVG(price) FROM products);
-- Find customers who placed orders in the last 30 days
SELECT name, email
FROM customers
WHERE id IN (
SELECT DISTINCT customer_id
FROM orders
WHERE order_date >= CURRENT_DATE - INTERVAL '30 days'
);
相关子查询
相关子查询引用外部查询的列,每个外部行执行一次。它们很强大,但在大数据集上可能比JOIN慢。
-- Find employees who earn more than their department average
SELECT e.name, e.salary, e.department_id
FROM employees e
WHERE e.salary > (
SELECT AVG(e2.salary)
FROM employees e2
WHERE e2.department_id = e.department_id
);
-- EXISTS check
SELECT c.name
FROM customers c
WHERE EXISTS (
SELECT 1
FROM orders o
WHERE o.customer_id = c.id
AND o.total_amount > 1000
);
FROM中的子查询(派生表)
SELECT
dept_stats.department,
dept_stats.avg_salary,
dept_stats.employee_count
FROM (
SELECT
department,
AVG(salary) AS avg_salary,
COUNT(*) AS employee_count
FROM employees
GROUP BY department
) AS dept_stats
WHERE dept_stats.employee_count > 5
ORDER BY dept_stats.avg_salary DESC;
公用表表达式(CTE)
CTE提供了一种编写模块化、可读SQL的方法。它们定义在单个查询执行期间存在的临时命名结果集。
基本CTE
WITH active_customers AS (
SELECT
c.id,
c.name,
c.email,
COUNT(o.id) AS order_count,
SUM(o.total_amount) AS total_spent
FROM customers c
JOIN orders o ON c.id = o.customer_id
WHERE o.order_date >= '2026-01-01'
GROUP BY c.id, c.name, c.email
)
SELECT *
FROM active_customers
WHERE total_spent > 500
ORDER BY total_spent DESC;
多个CTE
WITH
monthly_revenue AS (
SELECT
DATE_TRUNC('month', order_date) AS month,
SUM(total_amount) AS revenue
FROM orders
WHERE order_date >= '2025-01-01'
GROUP BY DATE_TRUNC('month', order_date)
),
revenue_with_growth AS (
SELECT
month,
revenue,
LAG(revenue) OVER (ORDER BY month) AS prev_month_revenue,
ROUND(
(revenue - LAG(revenue) OVER (ORDER BY month))
/ LAG(revenue) OVER (ORDER BY month) * 100, 2
) AS growth_pct
FROM monthly_revenue
)
SELECT *
FROM revenue_with_growth
ORDER BY month;
递归CTE
递归CTE非常适合处理层次数据,如组织架构图、分类树或图遍历。
WITH RECURSIVE org_chart AS (
-- Base case: top-level managers (no manager)
SELECT id, name, manager_id, 1 AS level
FROM employees
WHERE manager_id IS NULL
UNION ALL
-- Recursive case: employees with a manager already in the result
SELECT e.id, e.name, e.manager_id, oc.level + 1
FROM employees e
JOIN org_chart oc ON e.manager_id = oc.id
)
SELECT
REPEAT(' ', level - 1) || name AS org_tree,
level
FROM org_chart
ORDER BY level, name;
窗口函数
窗口函数对与当前行相关的一组行执行计算,但不会将它们折叠为单个输出行。它们是现代SQL中最强大的功能之一。
ROW_NUMBER、RANK和DENSE_RANK
SELECT
name,
department,
salary,
ROW_NUMBER() OVER (ORDER BY salary DESC) AS row_num,
RANK() OVER (ORDER BY salary DESC) AS rank,
DENSE_RANK() OVER (ORDER BY salary DESC) AS dense_rank
FROM employees;
区别:如果两名员工在第2名并列,ROW_NUMBER会任意分配2和3,RANK会给两者都分配2然后跳到4,DENSE_RANK会给两者都分配2然后继续到3。
分区窗口函数
-- Top 3 earners per department
WITH ranked AS (
SELECT
name,
department,
salary,
ROW_NUMBER() OVER (
PARTITION BY department
ORDER BY salary DESC
) AS dept_rank
FROM employees
)
SELECT * FROM ranked
WHERE dept_rank <= 3;
LAG和LEAD
这些函数让你能够访问前一行或后一行的值。
SELECT
order_date,
total_amount,
LAG(total_amount, 1) OVER (ORDER BY order_date) AS prev_day_amount,
LEAD(total_amount, 1) OVER (ORDER BY order_date) AS next_day_amount,
total_amount - LAG(total_amount, 1) OVER (ORDER BY order_date) AS daily_change
FROM daily_sales;
累计总和和移动平均
SELECT
order_date,
total_amount,
SUM(total_amount) OVER (
ORDER BY order_date
ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
) AS running_total,
AVG(total_amount) OVER (
ORDER BY order_date
ROWS BETWEEN 6 PRECEDING AND CURRENT ROW
) AS seven_day_avg
FROM daily_sales;
NTILE分桶
-- Divide customers into 4 quartiles by spending
SELECT
name,
total_spent,
NTILE(4) OVER (ORDER BY total_spent DESC) AS spending_quartile
FROM customer_summary;
索引策略
编写高效的SQL不仅仅是查询语法的问题,还要确保数据库能够快速执行你的查询。索引是实现这一目标的主要机制。
索引类型
-- B-tree index (default, good for equality and range queries)
CREATE INDEX idx_users_email ON users (email);
-- Composite index (column order matters)
CREATE INDEX idx_orders_customer_date
ON orders (customer_id, order_date DESC);
-- Unique index (also enforces uniqueness)
CREATE UNIQUE INDEX idx_users_email_unique ON users (email);
-- Partial index (PostgreSQL - index only a subset of rows)
CREATE INDEX idx_orders_pending
ON orders (created_at)
WHERE status = 'pending';
-- GIN index for full-text search (PostgreSQL)
CREATE INDEX idx_articles_search
ON articles USING GIN (to_tsvector('english', title || ' ' || body));
何时创建索引
索引加速读取但减慢写入。在WHERE子句、JOIN条件和ORDER BY子句中频繁出现的列上创建索引。除非使用部分索引,否则避免在基数非常低的列(如boolean列)上建索引。
(a, b, c)上的复合索引可以满足对a、a AND b或a AND b AND c的过滤查询,但不能满足单独的b或c。列的顺序很重要。
使用EXPLAIN检查查询性能
-- PostgreSQL
EXPLAIN ANALYZE
SELECT * FROM orders
WHERE customer_id = 42
AND order_date >= '2026-01-01';
-- MySQL
EXPLAIN
SELECT * FROM orders
WHERE customer_id = 42
AND order_date >= '2026-01-01';
在大表上看到顺序扫描(Seq Scan)表示缺少索引。你希望看到的是Index Scan或Index Only Scan。
查询优化技巧
1. 避免SELECT *
始终指定你需要的列。这减少了I/O、内存使用,并可能启用仅索引扫描。
-- Bad
SELECT * FROM users WHERE id = 42;
-- Good
SELECT id, name, email FROM users WHERE id = 42;
2. 对大型子查询使用EXISTS而不是IN
-- Slower with large subquery results
SELECT * FROM customers
WHERE id IN (SELECT customer_id FROM orders);
-- Faster: stops at first match
SELECT * FROM customers c
WHERE EXISTS (SELECT 1 FROM orders o WHERE o.customer_id = c.id);
3. 避免在索引列上使用函数
-- Bad: index on created_at is not used
SELECT * FROM orders
WHERE YEAR(created_at) = 2026;
-- Good: index-friendly range scan
SELECT * FROM orders
WHERE created_at >= '2026-01-01'
AND created_at < '2027-01-01';
4. 尽可能使用UNION ALL代替UNION
UNION去除重复(需要排序),而UNION ALL不会。如果你知道结果集已经是唯一的,始终优先使用UNION ALL。
-- Slower (deduplicates)
SELECT name FROM employees
UNION
SELECT name FROM contractors;
-- Faster (no deduplication)
SELECT name FROM employees
UNION ALL
SELECT name FROM contractors;
5. 对大数据使用批量操作
-- Instead of deleting millions of rows at once
-- Delete in batches to avoid locking the table
DELETE FROM logs
WHERE created_at < '2025-01-01'
LIMIT 10000;
-- Repeat until 0 rows affected
6. 使用适当的数据类型
更小的数据类型意味着内存和索引页中可以容纳更多行。在范围允许时使用INTEGER而不是BIGINT。对索引列使用适当长度的VARCHAR而不是TEXT。
实用SQL模式
UPSERT (INSERT ON CONFLICT)
-- PostgreSQL
INSERT INTO user_settings (user_id, theme, language)
VALUES (42, 'dark', 'en')
ON CONFLICT (user_id)
DO UPDATE SET
theme = EXCLUDED.theme,
language = EXCLUDED.language;
-- MySQL
INSERT INTO user_settings (user_id, theme, language)
VALUES (42, 'dark', 'en')
ON DUPLICATE KEY UPDATE
theme = VALUES(theme),
language = VALUES(language);
使用CASE进行数据透视
SELECT
product_id,
SUM(CASE WHEN month = 1 THEN revenue ELSE 0 END) AS jan,
SUM(CASE WHEN month = 2 THEN revenue ELSE 0 END) AS feb,
SUM(CASE WHEN month = 3 THEN revenue ELSE 0 END) AS mar
FROM monthly_revenue
GROUP BY product_id;
生成日期序列
-- PostgreSQL: generate a series of dates
SELECT generate_series(
'2026-01-01'::date,
'2026-12-31'::date,
'1 day'::interval
)::date AS date;
-- Use with LEFT JOIN to fill gaps in time series data
SELECT
d.date,
COALESCE(s.total, 0) AS daily_total
FROM generate_series('2026-01-01'::date, '2026-03-31'::date, '1 day') AS d(date)
LEFT JOIN daily_sales s ON s.sale_date = d.date
ORDER BY d.date;
快速参考表
| 操作 | 语法 |
|---|---|
| 过滤行 | WHERE condition |
| 排序结果 | ORDER BY column ASC/DESC |
| 限制行数 | LIMIT n 或 FETCH FIRST n ROWS ONLY |
| 去除重复 | SELECT DISTINCT |
| 分组和聚合 | GROUP BY ... HAVING |
| 组合表 | JOIN ... ON |
| 临时结果集 | WITH cte AS (...) |
| 行编号 | ROW_NUMBER() OVER (...) |
| 前一行/下一行 | LAG() / LEAD() OVER (...) |
| 累计总和 | SUM() OVER (ORDER BY ... ROWS ...) |
| 插入或更新 | INSERT ... ON CONFLICT DO UPDATE |
总结
SQL是一门广泛的语言,但掌握本速查表中的模式将涵盖你作为开发者会遇到的绝大多数真实场景。从扎实的SELECT和JOIN基础开始,然后逐步将CTE和窗口函数添加到你的工具包中。特别是随着数据量的增长,始终将索引和查询性能放在心上。
试试我们的SQL格式化工具来美化你的查询。
收藏此页面,需要快速参考时随时回来。内化SQL的最佳方式是定期练习,无论是对自己的数据库编写查询、使用SQLite内存沙箱等平台,还是探索开放数据集。你今天建立的模式将在整个职业生涯中受益。