SQL, PostgreSQL, Postgres, and psql

SQL stands for Stuctured Query Language. It's the most common language for relational database management systems.

A relational database organizes data into relations, which...well, here's how the vocabulary breaks down in practice:

relation = table
field, attribute = column
record, tuple, instance = row

If you're wondering, "Is there some other way to organize data besides tables?" Yes. Although you see tables come up in the other models, they're treated differently — not as relations but as objects or something else.

PostgreSQL is an object-relational database management system. Postgres is just a nickname for PostgreSQL. An object-relational database tries to combine the good parts of the relational and the object-oriented models. For example, you can define your own data types and write simpler code to execute SQL queries.

psql lets you interact with PostgreSQL in a terminal. This is where you'd do SQL things like create tables and make queries. psql has its own commands, apart from SQL. The common psql commands are:

\l : list databases
\d : list relations (tables) in the current database
\d table_name : describe the named table
\c db_name : connect (switch to) the named database
\q : quit psql

And here are some common SQL commands:

Create an empty table:

CREATE TABLE table_name();

Delete a table:

DROP TABLE table_name;

Create a defined table (one of the columns has to be labeled as the PRIMARY KEY):

CREATE TABLE table_name(
             column1_name datatype,
             column2_name datatype
);

Don't forget the semi-colon to close a SQL command. Also, when creating a table, don't forget the parentheses and commas to separate columns.

An example of creating a defined table:

CREATE TABLE employees(
             id          int         PRIMARY KEY     NOT NULL,
             name        char(50)    NOT NULL,
             age         int         NOT NULL,
             address     text,
             salary      real
);

For more info on data types, see the docs or TutorialsPoint.

SQL keywords are case-insensitive, so you don't have to type them in uppercase. It's just common to write them that way.

Here are some common SQL queries.

Installing PostgreSQL on Mac OS X

All of the code below is run in Terminal.

To install without Homebrew, see PostgreSQL Documentation.

Uninstall Old Versions

Back up data.

Uninstall Homebrew version(s) of PostgreSQL. --force uninstalls all versions.

brew rm postgresql --force

Uninstall other things manually.

russbrooks.com had a couple other commands that didn't work for me. current is the PostgreSQL version number. Change as appropriate.

sudo rm -rf /Applications/PostgreSQL\ current
sudo rm -rf /etc/postgres-reg.ini
sudo rm -rf /Library/StartupItems/postgresql-current
sudo rm -rf /Library/PostgreSQL/9.6

rm by itself removes (deletes) a file. rm -r recursively deletes a directory. Adding -f (rm -rf) forces the deletion.

Restart and make sure there are no postgres processes running in Activity Monitor.

Install PostgreSQL

Skip Homebrew update if you updated recently.

brew update
brew install postgresql

Initialize the database. One person said postgres is the default user account. PostgreSQL says postgres is the PostgreSQL server. Same difference? Who knows. Confusing.

initdb /usr/local/var/postgres

If it says that the directory exists, then remove it first before initializing.

rm -rf /usr/local/var/postgres
initdb /usr/local/var/postgres

Or just install Postgres.app, but you might have to delete the old postgres directory if it already exists.

Start PostgreSQL Server

If the initialization is successful, it'll say to start the database server using:

pg_ctl -D /usr/local/var/postgres -l logfile start

Or (if you haven't already) just install Postgres.app and double-click to start a server. Postgres.app runs on port 5432. There might be some yelling between the app and Terminal if they want to run on the same port or if there's already a server running.

Start psql.

If you want to interact with Postgres at the # prompt (like connect to databases, create tables, list anything, ...):

psql postgres

postgres is the database that was initialized earlier. If you want to connect to a different database when you open psql, then use that database name instead.

After psql starts, you'll see postgres=# as the prompt. When you enter \l to list all the databases — and assuming postgres is the only database that was initialized since installing PostgreSQL — there should be 3 databases listed: postgres, template0, template1.

For more psql goodness, see the docs.

If you get error messages about a role (user) or a database not existing...

Create User

To create a superuser that can create databases:

createuser -P -s -e -d some_username

-P will prompt you for a password for the new user. -s makes the new user a superuser. -e echoes the commands that createuser generates and sends them to the server (I don't know what this means). -d allows the new user to create databases. See PostgreSQL Documentation for more details.

Create Database

To create a database and assign it to an owner (ideally one with POWAH!):

createdb some_database -O some_username

-O assigns a user as the owner of the database. Add -w if you want don't want a password required.

See PostgreSQL Documentation for more details.

Delete Database

dropdb some_database

Other Randomness

To find out the username a database is running under:

ps aux | grep database_name

Resources

Homebrew: package manager for OS X

Importing a CSV File

First, create a table with columns that match the ones from the csv file. The column names don't have to match, but the column data should.

For example, here are the contents of a file called pets.csv. The first line is the header row with the names of the columns:

name,age,type
Lassie,4,dog
Dory,1,fish
Leonardo,12,turtle
Tom,7,cat
Jerry,1,mouse
Kiara,6,cat
Toto,4,dog
Fluffy,3,rabbit
Clifford,8,dog
Snoopy,1,dog
Garfield,5,cat

Here's one way to create a corresponding table. Notice how name, age, and type follow the same order as the columns in the csv file.

CREATE TABLE pets(
             id     serial         NOT NULL   PRIMARY KEY,
             name   varchar(30)    NOT NULL,
             age    int            NOT NULL,
             type   varchar(20)    NOT NULL
);

Then, to import the csv file, use the following pattern:

COPY table_name(columns, to, copy, into) FROM '/path/to/file/file-name.csv' DELIMITER ',' CSV HEADER;

To find the file path, navigate to the file's directory and type pwd (print working directory). The complete file path should include the file name (with the extension) at the end.

So for our pets example, the command would be:

COPY pets(name, age, type) FROM '/Users/kathychang/Desktop/pets.csv' DELIMITER ',' CSV HEADER;

The column names listed in the parentheses are the columns in table you're copying into, not the column names in the data file you're copying from. In other words, COPY pets(name, age, type) is short for "copy into the columns name, age, type in the table pets".

You don't need HEADER if the csv file doesn't include a header row.

You also don't need to write out all the column names if there is a one-to-one match between the columns in the table and the columns in the file. In our pets example, we wrote out the column names because there is a mismatch: the table includes an id column, while the csv file doesn't.

For more info about COPY, see the docs.

Common SQL Queries

Some example queries (in green) and their outputs (in white). The commands are entered in psql (so the prompt looks something like postgres=#).

For more info on queries, see the docs.

SELECT: General Format

SELECT column_name(s) FROM table_name WHERE expression_to_match;

Select Everything

Select all of the columns for all of the pets (the rows) in the pets table.

SELECT * FROM pets;

 id |   name   | age |  type
----+----------+-----+--------
  1 | Lassie   |   4 | dog
  2 | Dory     |   1 | fish
  3 | Leonardo |  12 | turtle
  4 | Tom      |   7 | cat
  5 | Jerry    |   1 | mouse
  6 | Kiara    |   6 | cat
  7 | Toto     |   4 | dog
  8 | Fluffy   |   3 | rabbit
  9 | Clifford |   8 | dog
 10 | Snoopy   |   1 | dog
 11 | Garfield |   5 | cat
(11 rows)

Select Specific Things

Select all of the columns for the pets that are dogs (the value in the type column matches 'dog').

SELECT * FROM pets WHERE type = 'dog';

 id |   name   | age | type
----+----------+-----+------
  1 | Lassie   |   4 | dog
  7 | Toto     |   4 | dog
  9 | Clifford |   8 | dog
 10 | Snoopy   |   1 | dog
(4 rows)

Select only the name, age, and id columns (in that order) for the pets that are dogs.

SELECT name, age, id FROM pets WHERE type = 'dog';

   name   | age | id
----------+-----+----
 Lassie   |   4 |  1
 Toto     |   4 |  7
 Clifford |   8 |  9
 Snoopy   |   1 | 10
(4 rows)

Select only the name, type, and age columns for the pets that are not dogs and are older than 3. Also, sort the results by age in descending order first, then by name in ascending order.

SELECT name, type, age FROM pets WHERE type != 'dog' AND age > 3 ORDER BY age DESC, name;

   name   |  type  | age
----------+--------+-----
 Leonardo | turtle |  12
 Tom      | cat    |   7
 Kiara    | cat    |   6
 Garfield | cat    |   5
(4 rows)

Ascending order is the default, but you can add ASC after the column name if you'd like. For example, ORDER BY age DESC, name ASC.

Select Things Using Aggregate Functions

Common SQL Aggregate Functions:

COUNT()
MIN()
MAX()
AVG()
SUM()

General format:

SELECT aggregate_function(column_name) FROM table_name;

Example:

SELECT AVG(age) FROM pets;

        avg
--------------------
 4.7272727272727273
(1 row)

Count all of the rows (even if they include null values) in the pets table.

SELECT COUNT(*) FROM pets;

 count
-------
    11
(1 row)

Count the number of dogs and label the result 'NumberOfDogs'. Specifically, count the number of values (excluding null) in the id column in the pets table for only dogs.

SELECT COUNT(id) AS NumberOfDogs FROM pets WHERE type = 'dog';

 numberofdogs
--------------
            4
(1 row)

Count the number of pets who are younger than the average age of all the pets in the table.

SELECT COUNT(*) AS NumberOfYoungerPets FROM pets WHERE age < (SELECT AVG(age) FROM pets);

 numberofyoungerpets
---------------------
                   6
(1 row)

Similar idea, except this time show the name, age, and type columns and sort the results by age.

SELECT name, age, type FROM pets WHERE age < (SELECT AVG(age) FROM pets) ORDER BY age;

  name  | age |  type
--------+-----+--------
 Dory   |   1 | fish
 Jerry  |   1 | mouse
 Snoopy |   1 | dog
 Fluffy |   3 | rabbit
 Lassie |   4 | dog
 Toto   |   4 | dog
(6 rows)

To get the familiar kind of output that aggregates data into groups (instead of showing every record in the group), SELECT and GROUP BY the column you want to collapse into groups. You can also get the counts for each group if you run the COUNT function.

SELECT type, COUNT(id) AS NumberOfPets FROM pets GROUP BY type ORDER BY NumberOfPets DESC;

  type  | numberofpets
--------+--------------
 dog    |            4
 cat    |            3
 fish   |            1
 turtle |            1
 rabbit |            1
 mouse  |            1
(6 rows)

Same thing, except there's a LIMIT. The limit is applied after the sorting.

SELECT type, COUNT(id) AS NumberOfPets FROM pets GROUP BY type ORDER BY NumberOfPets DESC LIMIT 3;

 type | numberofpets
------+--------------
 dog  |            4
 cat  |            3
 fish |            1
(3 rows)

see the docs.

SQL + Files = A Good Time

Running Queries from a SQL File

A sql file contains queries as is. It's like copying the queries you'd normally write at a prompt and pasting them into a file.

For example, here are the contents of a sql file called pets.sql :

SELECT * FROM pets;

SELECT name, type, age FROM pets WHERE type != 'dog' AND age > 3 ORDER BY age DESC, name;

To run the queries in a sql file from a psql prompt (like postgres=#):

\i '/path/to/file/file-name.sql'

Example:

\i '/Users/kathychang/Desktop/pets.sql'

From a terminal prompt (a shell, like ~$):

psql -f 'path/to/file/file-name.sql' database_name

Example:

psql -f '/Users/kathychang/Desktop/pets.sql' postgres

Saving the Output from Queries to a File

From the psql prompt (like postgres=#):

COPY '/path/to/file/file-name.csv' DELIMITER ',' CSV HEADER;

Example:

COPY (SELECT name, age, id FROM pets WHERE type = 'dog') TO '/Users/kathychang/Desktop/dogs.csv' DELIMITER ',' CSV HEADER;

Use \COPY instead of COPY if you need admin privileges.

For more info on COPY, see the docs and Stack Overflow.

REFERENCES

PostgreSQL Subqueries

Table of Contents

SQL, PostgreSQL, Postgres, and psql

Installing PostgreSQL on Mac OS X

Uninstall Old Versions

Install PostgreSQL

Start PostgreSQL Server

Start psql.

Create User

Create Database

Delete Database

Other Randomness

Resources

Importing a CSV File

Common SQL Queries

SELECT: General Format

Select Everything

Select Specific Things

Select Things Using Aggregate Functions

SQL + Files = A Good Time

Running Queries from a SQL File

Saving the Output from Queries to a File

REFERENCES