But first...what is SQL, PostgreSQL, Postgres, and psql?
Table of Contents
SQL, PostgreSQL, Postgres, and psql
SQL stands for Stuctured Query Language. It's the most common language for relational database management systems.
A relational database organizes data into relations, which...well, here's how the vocabulary breaks down in practice:
- relation = table
- field, attribute = column
- record, tuple, instance = row
If you're wondering, "Is there some other way to organize data besides tables?" Yes. Although you see tables come up in the other models, they're treated differently — not as relations but as objects or something else.
PostgreSQL is an object-relational database management system. Postgres is just a nickname for PostgreSQL. An object-relational database tries to combine the good parts of the relational and the object-oriented models. For example, you can define your own data types and write simpler code to execute SQL queries.
psql lets you interact with PostgreSQL in a terminal. This is where you'd do SQL things like create tables and make queries. psql has its own commands, apart from SQL. The common psql commands are:
\l
: list databases\d
: list relations (tables) in the current database\d table_name
: describe the named table\c db_name
: connect (switch to) the named database\q
: quit psql
And here are some common SQL commands:
Create an empty table:
CREATE TABLE table_name();
Delete a table:
DROP TABLE table_name;
Create a defined table (one of the columns has to be labeled as the PRIMARY KEY):
CREATE TABLE table_name( column1_name datatype, column2_name datatype );
Don't forget the semi-colon to close a SQL command. Also, when creating a table, don't forget the parentheses and commas to separate columns.
An example of creating a defined table:
CREATE TABLE employees( id int PRIMARY KEY NOT NULL, name char(50) NOT NULL, age int NOT NULL, address text, salary real );
For more info on data types, see the docs or TutorialsPoint.
SQL keywords are case-insensitive, so you don't have to type them in uppercase. It's just common to write them that way.
Here are some common SQL queries.
Installing PostgreSQL on Mac OS X
All of the code below is run in Terminal.
To install without Homebrew, see PostgreSQL Documentation.
Uninstall Old Versions
Uninstall Homebrew version(s) of PostgreSQL. --force
uninstalls all versions.
brew rm postgresql --force
Uninstall other things manually.
russbrooks.com had a couple other commands that didn't work for me. current is the PostgreSQL version number. Change as appropriate.
sudo rm -rf /Applications/PostgreSQL\ current sudo rm -rf /etc/postgres-reg.ini sudo rm -rf /Library/StartupItems/postgresql-current sudo rm -rf /Library/PostgreSQL/9.6
rm
by itself removes (deletes) a file. rm -r
recursively deletes a directory. Adding -f
(rm -rf
) forces the deletion.
Restart and make sure there are no postgres processes running in Activity Monitor.
Install PostgreSQL
Skip Homebrew update if you updated recently.
brew update brew install postgresql
Initialize the database. One person said postgres is the default user account. PostgreSQL says postgres is the PostgreSQL server. Same difference? Who knows. Confusing.
initdb /usr/local/var/postgres
If it says that the directory exists, then remove it first before initializing.
rm -rf /usr/local/var/postgres initdb /usr/local/var/postgres
Or just install Postgres.app, but you might have to delete the old postgres directory if it already exists.
Start PostgreSQL Server
If the initialization is successful, it'll say to start the database server using:
pg_ctl -D /usr/local/var/postgres -l logfile start
Or (if you haven't already) just install Postgres.app and double-click to start a server. Postgres.app runs on port 5432. There might be some yelling between the app and Terminal if they want to run on the same port or if there's already a server running.
Start psql.
If you want to interact with Postgres at the # prompt (like connect to databases, create tables, list anything, ...):
psql postgres
postgres is the database that was initialized earlier. If you want to connect to a different database when you open psql, then use that database name instead.
After psql starts, you'll see postgres=#
as the prompt. When you enter \l
to list all the databases — and assuming postgres is the only database that was initialized since installing PostgreSQL — there should be 3 databases listed: postgres, template0, template1.
For more psql goodness, see the docs.
If you get error messages about a role (user) or a database not existing...
Create User
To create a superuser that can create databases:
createuser -P -s -e -d some_username
-P
will prompt you for a password for the new user. -s
makes the new user a superuser. -e
echoes the commands that createuser generates and sends them to the server (I don't know what this means). -d
allows the new user to create databases. See PostgreSQL Documentation for more details.
Create Database
To create a database and assign it to an owner (ideally one with POWAH!):
createdb some_database -O some_username
-O
assigns a user as the owner of the database. Add -w
if you want don't want a password required.
See PostgreSQL Documentation for more details.
Delete Database
dropdb some_database
Other Randomness
To find out the username a database is running under:
ps aux | grep database_name
Resources
- Homebrew: package manager for OS X
Importing a CSV File
First, create a table with columns that match the ones from the csv file. The column names don't have to match, but the column data should.
For example, here are the contents of a file called pets.csv. The first line is the header row with the names of the columns:
name,age,type Lassie,4,dog Dory,1,fish Leonardo,12,turtle Tom,7,cat Jerry,1,mouse Kiara,6,cat Toto,4,dog Fluffy,3,rabbit Clifford,8,dog Snoopy,1,dog Garfield,5,cat
Here's one way to create a corresponding table. Notice how name, age, and type follow the same order as the columns in the csv file.
CREATE TABLE pets( id serial NOT NULL PRIMARY KEY, name varchar(30) NOT NULL, age int NOT NULL, type varchar(20) NOT NULL );
Then, to import the csv file, use the following pattern:
COPY table_name(columns, to, copy, into) FROM '/path/to/file/file-name.csv' DELIMITER ',' CSV HEADER;
To find the file path, navigate to the file's directory and type pwd
(print working directory). The complete file path should include the file name (with the extension) at the end.
So for our pets example, the command would be:
COPY pets(name, age, type) FROM '/Users/kathychang/Desktop/pets.csv' DELIMITER ',' CSV HEADER;
The column names listed in the parentheses are the columns in table you're copying into, not the column names in the data file you're copying from. In other words, COPY pets(name, age, type)
is short for "copy into the columns name, age, type in the table pets".
You don't need HEADER
if the csv file doesn't include a header row.
You also don't need to write out all the column names if there is a one-to-one match between the columns in the table and the columns in the file. In our pets example, we wrote out the column names because there is a mismatch: the table includes an id column, while the csv file doesn't.
For more info about COPY
, see the docs.
Common SQL Queries
Some example queries (in green) and their outputs (in white). The commands are entered in psql (so the prompt looks something like postgres=#
).
For more info on queries, see the docs.
SELECT: General Format
SELECT column_name(s) FROM table_name WHERE expression_to_match;
Select Everything
Select all of the columns for all of the pets (the rows) in the pets table.
SELECT * FROM pets;
id | name | age | type ----+----------+-----+-------- 1 | Lassie | 4 | dog 2 | Dory | 1 | fish 3 | Leonardo | 12 | turtle 4 | Tom | 7 | cat 5 | Jerry | 1 | mouse 6 | Kiara | 6 | cat 7 | Toto | 4 | dog 8 | Fluffy | 3 | rabbit 9 | Clifford | 8 | dog 10 | Snoopy | 1 | dog 11 | Garfield | 5 | cat (11 rows)
Select Specific Things
Select all of the columns for the pets that are dogs (the value in the type column matches 'dog').
SELECT * FROM pets WHERE type = 'dog';
id | name | age | type ----+----------+-----+------ 1 | Lassie | 4 | dog 7 | Toto | 4 | dog 9 | Clifford | 8 | dog 10 | Snoopy | 1 | dog (4 rows)
Select only the name, age, and id columns (in that order) for the pets that are dogs.
SELECT name, age, id FROM pets WHERE type = 'dog';
name | age | id ----------+-----+---- Lassie | 4 | 1 Toto | 4 | 7 Clifford | 8 | 9 Snoopy | 1 | 10 (4 rows)
Select only the name, type, and age columns for the pets that are not dogs and are older than 3. Also, sort the results by age in descending order first, then by name in ascending order.
SELECT name, type, age FROM pets WHERE type != 'dog' AND age > 3 ORDER BY age DESC, name;
name | type | age ----------+--------+----- Leonardo | turtle | 12 Tom | cat | 7 Kiara | cat | 6 Garfield | cat | 5 (4 rows)
Ascending order is the default, but you can add ASC after the column name if you'd like. For example, ORDER BY age DESC, name ASC
.
Select Things Using Aggregate Functions
Common SQL Aggregate Functions:
COUNT()
MIN()
MAX()
AVG()
SUM()
General format:
SELECT aggregate_function(column_name) FROM table_name;
Example:
SELECT AVG(age) FROM pets;
avg -------------------- 4.7272727272727273 (1 row)
Count all of the rows (even if they include null values) in the pets table.
SELECT COUNT(*) FROM pets;
count ------- 11 (1 row)
Count the number of dogs and label the result 'NumberOfDogs'. Specifically, count the number of values (excluding null) in the id column in the pets table for only dogs.
SELECT COUNT(id) AS NumberOfDogs FROM pets WHERE type = 'dog';
numberofdogs -------------- 4 (1 row)
Count the number of pets who are younger than the average age of all the pets in the table.
SELECT COUNT(*) AS NumberOfYoungerPets FROM pets WHERE age < (SELECT AVG(age) FROM pets);
numberofyoungerpets --------------------- 6 (1 row)
Similar idea, except this time show the name, age, and type columns and sort the results by age.
SELECT name, age, type FROM pets WHERE age < (SELECT AVG(age) FROM pets) ORDER BY age;
name | age | type --------+-----+-------- Dory | 1 | fish Jerry | 1 | mouse Snoopy | 1 | dog Fluffy | 3 | rabbit Lassie | 4 | dog Toto | 4 | dog (6 rows)
To get the familiar kind of output that aggregates data into groups (instead of showing every record in the group), SELECT
and GROUP BY
the column you want to collapse into groups. You can also get the counts for each group if you run the COUNT
function.
SELECT type, COUNT(id) AS NumberOfPets FROM pets GROUP BY type ORDER BY NumberOfPets DESC;
type | numberofpets --------+-------------- dog | 4 cat | 3 fish | 1 turtle | 1 rabbit | 1 mouse | 1 (6 rows)
Same thing, except there's a LIMIT
. The limit is applied after the sorting.
SELECT type, COUNT(id) AS NumberOfPets FROM pets GROUP BY type ORDER BY NumberOfPets DESC LIMIT 3;
type | numberofpets ------+-------------- dog | 4 cat | 3 fish | 1 (3 rows)
SQL + Files = A Good Time
Running Queries from a SQL File
A sql file contains queries as is. It's like copying the queries you'd normally write at a prompt and pasting them into a file.
For example, here are the contents of a sql file called pets.sql :
SELECT * FROM pets; SELECT name, type, age FROM pets WHERE type != 'dog' AND age > 3 ORDER BY age DESC, name;
To run the queries in a sql file from a psql prompt (like postgres=#
):
\i '/path/to/file/file-name.sql'
Example:
\i '/Users/kathychang/Desktop/pets.sql'
From a terminal prompt (a shell, like ~$
):
psql -f 'path/to/file/file-name.sql' database_name
Example:
psql -f '/Users/kathychang/Desktop/pets.sql' postgres
Saving the Output from Queries to a File
From the psql prompt (like postgres=#
):
COPY '/path/to/file/file-name.csv' DELIMITER ',' CSV HEADER;
Example:
COPY (SELECT name, age, id FROM pets WHERE type = 'dog') TO '/Users/kathychang/Desktop/dogs.csv' DELIMITER ',' CSV HEADER;
Use \COPY
instead of COPY
if you need admin privileges.
For more info on COPY
, see the docs and Stack Overflow.