Formatting Standards
These guidelines should be followed when writing SQL.
Write keywords in ALL CAPS
SELECT id,
name,
email
FROM app_users
WHERE is_active = TRUE;
Create vertical alignment using tabs
There should be a minimum of one tab after the longest command. Other items may need to be tabbed more than that to maintain vertical alignment.
SELECT u.age,
COUNT(u.id) AS num_of_users
FROM app_users AS u
WHERE u.is_active = TRUE
GROUP BY u.age
HAVING COUNT(u.id) > 50
ORDER BY u.age;
Nest JOIN conditions under the table name
SELECT u.age,
COUNT(u.id) AS num_of_users
FROM app_users AS u
JOIN address AS a
ON u.id = a.user_id AND
a.is_shipping = TRUE AND
a.city = 'Chicago'
WHERE u.is_active = TRUE AND
u.age > 25
GROUP BY u.age;
Explicitly list fields
When selecting fields it's important to only list the fields you need, especially when joining tables. Always reference the full table name (app_users.id
) or use an alias (u.id
) as show in the example below. It's possible for a field with the same name to exist in multiple tables. If your query only uses one table, there is no need to use an alias.
SELECT u.id,
u.name,
u.email,
u.age,
a.city,
a.zip_code
FROM app_users AS u
JOIN address AS a
ON u.id = a.user_id AND
a.is_shipping = TRUE
WHERE u.is_active = TRUE;
There is an exception to this rule when using a CTE. Since you would have already listed the fields in the CTE definition, using the *
is ok.
WITH active_users AS (
SELECT u.id,
u.name,
u.email,
u.age,
a.city,
a.zip_code
FROM app_users AS u
JOIN address AS a
ON u.id = a.user_id AND
a.is_shipping = TRUE
WHERE u.is_active = TRUE
)
SELECT *
FROM active_users;
Using Stored Procedures
When your project favors raw SQL over an ORM, your code can quickly become littered with queries in string variables. A better approach is to organize your queries as stored procedures in the database. This results in a clean separation between application code and database operations, as well as an increase in performance because the database server can store pre-compiled functions.
The syntax for creating stored procedures is very similar across all flavors of SQL, however each will have their own specific differences. Here are links to documentation for the three most common databases we encounter:
When favoring this approach over an ORM, you will still track database migrations. This process will be a little more manual, but will ultimately grant you fine-grained control over all database schema changes during a project. FlywayDB is our preferred migration tool for this approach.
With FlywayDB, you will maintain a top level directory in your project repo: /sql
. Each time you need to create or alter a database entity, you will add a new SQL file containing those instructions in this directory. The FlywayDB CLI will execute each migration file in order based on the the tool’s file naming requirements, as well as maintain migration history in a schema_version
database table.
Examples
The following is an example of creating a simple stored procedure in PostgreSQL:
CREATE OR REPLACE FUNCTION sp_get_employees_for_city(city VARCHAR)
RETURNS TABLE(id INTEGER, name VARCHAR) AS $$
BEGIN
RETURN QUERY
SELECT e.id,
e.name
FROM employees AS e
WHERE lower(e.city) = lower(city)
ORDER BY e.name;
END; $$
LANGUAGE PLPGSQL;
The following is an example of calling the stored procedure from Python:
import psycopg2
conn = psycopg2.connect(database="test", user="postgres", password="secret")
cur = conn.cursor()
cur.callproc('sp_get_employees_for_city', ['Chicago'])
rows = cur.fetchall()
for r in rows:
print(r)