PostgreSQL: A Comprehensive Guide to SELECT Statement

PostgreSQL: A Comprehensive Guide to SELECT Statement

·

23 min read

Table of contents

Introduction

When working with databases, one of the most essential and frequently used operations is retrieving data. In PostgreSQL, the SELECT statement allows you to retrieve data from one or more tables based on specified conditions. This powerful statement forms the backbone of data querying in PostgreSQL. In this blog post, we will explore the SELECT statement in depth, covering its syntax, usage, and various examples.

Syntax

The basic syntax of the SELECT statement is as follows:

SELECT column1, column2, ...
FROM table_name
WHERE condition;
  • column1, column2, ...: The columns you want to retrieve data from. Use * to select all columns.

  • table_name: The name of the table from which data will be retrieved.

  • condition: (Optional) The condition(s) that specify which rows to select based on specific criteria.

Examples

Let's dive into some practical examples to understand how the SELECT statement works.

1. Retrieving all columns from a table

To retrieve all columns from a table, use the * wildcard:

SELECT *
FROM employees;

This query will fetch all records from the employees table, including all columns.

2. Selecting specific columns

To select specific columns, list their names after the SELECT keyword:

SELECT first_name, last_name, email
FROM employees;

This query will retrieve only the first_name, last_name, and email columns from the employees table.

3. Adding aliases

You can use aliases to provide temporary names to columns:

SELECT first_name AS "First Name", last_name AS "Last Name", email AS "Email"
FROM employees;

The query will display the columns with aliases, making the output more descriptive.

4. Filtering data with WHERE

The WHERE clause allows you to filter data based on specific conditions:

SELECT first_name, last_name, hire_date
FROM employees
WHERE department = 'HR';

This query will retrieve the first_name, last_name, and hire_date columns from employees who belong to the 'HR' department.

5. Combining conditions

You can combine multiple conditions using logical operators such as AND and OR:

SELECT first_name, last_name, hire_date
FROM employees
WHERE department = 'HR' AND salary > 50000;

This query will fetch employees from the 'HR' department with a salary greater than $50,000.

6. Sorting results

You can sort the query results using the ORDER BY clause:

SELECT first_name, last_name, hire_date
FROM employees
WHERE department = 'HR'
ORDER BY hire_date DESC;

This query will retrieve 'HR' department employees sorted in descending order based on their hire dates.

7. Limiting the number of results

Use the LIMIT clause to restrict the number of rows returned:

SELECT first_name, last_name
FROM employees
LIMIT 10;

This query will retrieve only the first 10 rows from the employees table.

Grouping Data with GROUP BY

In addition to retrieving individual records, the SELECT statement in PostgreSQL also allows us to perform aggregate functions on data. The GROUP BY clause is used to group rows based on one or more columns, and then aggregate functions can be applied to these groups to obtain summary information. Let's explore how this works with examples.

8. Using GROUP BY with Aggregate Functions

Consider a scenario where we have a sales table with columns product, category, and price. We can use the GROUP BY clause to find the total sales for each product category:

SELECT category, SUM(price) AS total_sales
FROM sales
GROUP BY category;

This query will group the records by the category column and calculate the total sales for each category using the SUM aggregate function.

9. Filtering Groups with HAVING

The HAVING clause works similarly to the WHERE clause, but it is used specifically with the GROUP BY clause to filter groups based on aggregate results:

SELECT category, AVG(price) AS average_price
FROM sales
GROUP BY category
HAVING AVG(price) > 100;

This query will calculate the average price for each product category and then filter out those categories with an average price greater than $100.

10. Grouping by Multiple Columns

You can also use multiple columns in the GROUP BY clause to create more granular groups:

SELECT category, product, COUNT(*) AS total_sales
FROM sales
GROUP BY category, product;

This query will group the data by both category and product, and then count the number of sales for each unique combination.

11. Using GROUP BY with ORDER BY

Combining GROUP BY with ORDER BY allows you to sort the groups based on aggregate values:

SELECT department, AVG(salary) AS avg_salary
FROM employees
GROUP BY department
ORDER BY avg_salary DESC;

This query will group employees by department, calculate the average salary for each department, and then sort the results in descending order based on the average salary.

12. Grouping by Date or Time Intervals

In some cases, you might want to group data by date or time intervals. PostgreSQL allows you to use functions like DATE_TRUNC for this purpose:

SELECT DATE_TRUNC('month', order_date) AS month, SUM(total_amount) AS monthly_sales
FROM orders
GROUP BY DATE_TRUNC('month', order_date);

This query will group the orders by the month in which they were placed and calculate the total sales for each month.

Distinct Records with DISTINCT

When querying data from a database, you might encounter scenarios where you want to eliminate duplicate records from the result set. The DISTINCT keyword in the SELECT statement allows you to retrieve only unique records from the specified columns. Let's explore how to use DISTINCT with some examples.

13. Retrieving Distinct Values

Consider a colors table with a column named color_name. If you want to retrieve all unique color names from this table, you can use the DISTINCT keyword as follows:

SELECT DISTINCT color_name
FROM colors;

This query will fetch only the distinct color names from the colors table, removing any duplicate entries.

14. Combining DISTINCT with Multiple Columns

You can use DISTINCT with multiple columns to retrieve unique combinations of values:

SELECT DISTINCT first_name, last_name
FROM employees;

This query will return only the unique combinations of first_name and last_name from the employees table.

15. Using DISTINCT with ORDER BY

You can combine DISTINCT with ORDER BY to sort the distinct values:

SELECT DISTINCT department
FROM employees
ORDER BY department;

This query will retrieve the distinct department names from the employees table and sort them in alphabetical order.

16. Limiting the Number of Distinct Results

If you want to retrieve a specific number of distinct records, you can use LIMIT in combination with DISTINCT:

SELECT DISTINCT product_name
FROM sales
LIMIT 5;

This query will fetch the first 5 distinct product names from the sales table.

17. Using DISTINCT with Aggregate Functions

DISTINCT can be used in conjunction with aggregate functions to perform calculations on unique values:

SELECT department, COUNT(DISTINCT job_title) AS unique_job_titles
FROM employees
GROUP BY department;

This query will count the number of unique job titles within each department and present the results grouped by department.

Combining Data with JOIN

In real-world scenarios, data is often distributed across multiple tables in a relational database. The JOIN operation allows you to combine data from two or more tables based on related columns. This enables you to retrieve and analyze data across multiple entities. Let's dive into various types of joins with examples.

18. Inner Join

The INNER JOIN retrieves only the rows that have matching values in both tables.

Consider two tables: employees and departments. To retrieve a list of employees along with their corresponding department names, you can use an inner join as follows:

SELECT employees.first_name, employees.last_name, departments.department_name
FROM employees
INNER JOIN departments ON employees.department_id = departments.department_id;

This query will combine data from both tables based on the department_id, displaying the first_name and last_name of employees along with their respective department_name.

19. Left Join (or Left Outer Join)

The LEFT JOIN retrieves all rows from the left table and the matching rows from the right table. If there's no match in the right table, NULL values are returned for the columns of the right table.

Continuing from the previous example, suppose you want to retrieve all employees and their corresponding department names. However, you also want to include employees who have not been assigned to any department. You can use a left join as follows:

SELECT employees.first_name, employees.last_name, departments.department_name
FROM employees
LEFT JOIN departments ON employees.department_id = departments.department_id;

This query will return all employees, along with their department names if available. Employees without a department assignment will have NULL in the department_name column.

20. Right Join (or Right Outer Join)

The RIGHT JOIN is the opposite of the LEFT JOIN. It retrieves all rows from the right table and the matching rows from the left table. If there's no match in the left table, NULL values are returned for the columns of the left table.

For instance, if you want to see a list of departments and their employees, including departments without any employees, you can use a right join:

SELECT departments.department_name, employees.first_name, employees.last_name
FROM employees
RIGHT JOIN departments ON employees.department_id = departments.department_id;

This query will return all departments along with their employees if available. Departments without any employees will have NULL values in the first_name and last_name columns.

21. Full Outer Join

The FULL OUTER JOIN retrieves all rows from both tables and combines them. If there's no match in one of the tables, NULL values are returned for the columns of the table without a match.

For example, if you want to see a comprehensive list of all employees and all departments, including employees without a department assignment and departments without any employees, you can use a full outer join:

SELECT employees.first_name, employees.last_name, departments.department_name
FROM employees
FULL OUTER JOIN departments ON employees.department_id = departments.department_id;

This query will return all employees and all departments, merging the data from both tables. Any employee without a department and any department without an employee will have NULL values in the corresponding columns.

Utilizing Subqueries

In SQL, a subquery is a query nested inside another query, allowing you to retrieve data from one or more tables based on the results of an inner query. Subqueries are a powerful tool for writing complex and efficient queries. Let's delve into subqueries with examples using sample tables.

Sample Tables

Consider two sample tables: orders and customers.

orders Table

order_idcustomer_idtotal_amount
1101150.00
2102200.00
310175.00
4103300.00
510250.00

customers Table

customer_idcustomer_name
101John
102Jane
103Smith
104Emily

22. Basic Subquery

A simple subquery can be used to retrieve data based on a condition from another table. For example, to find all orders made by customer 'Jane', you can use a subquery like this:

SELECT order_id, total_amount
FROM orders
WHERE customer_id = (SELECT customer_id FROM customers WHERE customer_name = 'Jane');

Output:

order_idtotal_amount
2200.00
550.00

23. Subquery with Aggregate Function

You can also use subqueries with aggregate functions to perform calculations based on related data. To find the total amount spent by each customer, you can use a subquery like this:

SELECT customer_id, customer_name, 
  (SELECT SUM(total_amount) FROM orders WHERE orders.customer_id = customers.customer_id) AS total_spent
FROM customers;

Output:

customer_idcustomer_nametotal_spent
101John225.00
102Jane250.00
103Smith300.00
104EmilyNULL

24. Subquery with IN Operator

The IN operator allows you to check if a value exists within a set of values returned by a subquery. For instance, to find orders made by any of the top-spending customers, you can use a subquery with the IN operator:

SELECT order_id, total_amount
FROM orders
WHERE customer_id IN (SELECT customer_id FROM customers ORDER BY total_spent DESC LIMIT 2);

Output:

order_idtotal_amount
2200.00
4300.00
550.00

25. Subquery with EXISTS Operator

The EXISTS operator allows you to check if a subquery returns any rows. For example, to find customers who have placed at least one order, you can use the EXISTS operator:

SELECT customer_id, customer_name
FROM customers
WHERE EXISTS (SELECT 1 FROM orders WHERE orders.customer_id = customers.customer_id);

Output:

customer_idcustomer_name
101John
102Jane
103Smith

Uniting Data with the UNION Operator

In SQL, the UNION operator allows you to combine the results of two or more SELECT queries into a single result set. It is used to unite data from different tables or queries that have the same column structure. Let's explore the UNION operator with examples using sample tables.

Sample Tables

Consider two sample tables: employees and contractors.

employees Table

emp_idfirst_namelast_namedepartment
1JohnSmithHR
2JaneDoeMarketing
3MichaelJohnsonFinance

contractors Table

contractor_idfirst_namelast_namedepartment
101EmilyBrownIT
102JamesLeeHR
103LilyWangMarketing

26. Basic UNION

The basic usage of the UNION operator is to combine the results of two similar queries. For instance, if you want to retrieve a list of all employees and contractors, you can use a UNION like this:

SELECT emp_id, first_name, last_name, department
FROM employees
UNION
SELECT contractor_id, first_name, last_name, department
FROM contractors;

Output:

emp_idfirst_namelast_namedepartment
1JohnSmithHR
2JaneDoeMarketing
3MichaelJohnsonFinance
101EmilyBrownIT
102JamesLeeHR
103LilyWangMarketing

27. UNION with Order and Limit

You can also use ORDER BY and LIMIT with UNION to sort the final result and restrict the number of rows returned. For example, to get a list of all employees and contractors sorted alphabetically by their last names and limited to the top 4, you can use the following query:

SELECT emp_id, first_name, last_name, department
FROM employees
UNION
SELECT contractor_id, first_name, last_name, department
FROM contractors
ORDER BY last_name
LIMIT 4;

Output:

emp_idfirst_namelast_namedepartment
2JaneDoeMarketing
1JohnSmithHR
101EmilyBrownIT
103LilyWangMarketing

28. UNION with Different Columns

To use the UNION operator, the number and data types of columns in all queries must match. If the columns differ, you can use NULL values to fill in the missing columns. For instance, if the employees table has an additional column called position, and the contractors table lacks this column, you can still use UNION with NULL values:

SELECT emp_id, first_name, last_name, department, position
FROM employees
UNION
SELECT contractor_id, first_name, last_name, department, NULL
FROM contractors;

Output:

emp_idfirst_namelast_namedepartmentposition
1JohnSmithHRNULL
2JaneDoeMarketingNULL
3MichaelJohnsonFinanceNULL
101EmilyBrownITNULL
102JamesLeeHRNULL
103LilyWangMarketingNULL

Handling NULL Values with the IS NULL Operator

In SQL, the IS NULL operator allows you to check for the presence of NULL values in columns. NULL represents the absence of a value, and it is essential to handle NULL values properly in your queries to avoid unexpected results. Let's explore how to use the IS NULL operator with examples using sample tables.

Sample Table

Consider a sample table employees with the following data:

emp_idfirst_namelast_namedepartmentsalary
1JohnSmithHR50000
2JaneDoeMarketing60000
3MichaelJohnsonNULL55000
4EmilyBrownFinanceNULL
5JamesLeeHR48000

29. Basic IS NULL

The basic usage of the IS NULL operator is to check for NULL values in a specific column. For example, to find employees with missing department information, you can use the following query:

SELECT emp_id, first_name, last_name
FROM employees
WHERE department IS NULL;

Output:

emp_idfirst_namelast_name
3MichaelJohnson

30. Combining IS NULL and IS NOT NULL

You can combine IS NULL and IS NOT NULL to retrieve records with and without NULL values. For instance, to find employees with and without salary information, you can use the following query:

SELECT emp_id, first_name, last_name
FROM employees
WHERE salary IS NULL OR salary IS NOT NULL;

This query will return all records from the employees table because it includes both employees with known salaries and employees without salary information.

31. Handling NULL Values with COALESCE

The COALESCE function allows you to replace NULL values with a specified default value. For example, to display the department names of employees and use 'N/A' for those without a department, you can use the following query:

SELECT emp_id, first_name, last_name, COALESCE(department, 'N/A') AS department
FROM employees;

Output:

emp_idfirst_namelast_namedepartment
1JohnSmithHR
2JaneDoeMarketing
3MichaelJohnsonN/A
4EmilyBrownFinance
5JamesLeeHR

Utilizing the BETWEEN Operator for Range Queries

In SQL, the BETWEEN operator allows you to retrieve data that falls within a specified range of values. This operator is particularly useful for querying data that meets specific criteria based on numerical or date ranges. Let's explore how to use the BETWEEN operator with examples using sample tables.

Sample Table

Consider a sample table employees with the following data:

emp_idfirst_namelast_namehire_datesalary
1JohnSmith2021-03-1550000
2JaneDoe2020-12-0160000
3MichaelJohnson2022-02-1055000
4EmilyBrown2023-01-0575000
5JamesLee2023-06-2048000

32. Basic BETWEEN

The basic usage of the BETWEEN operator is to find records that fall within a specified range. For example, to retrieve employees with salaries between $50,000 and $60,000, you can use the following query:

SELECT emp_id, first_name, last_name, salary
FROM employees
WHERE salary BETWEEN 50000 AND 60000;

Output:

emp_idfirst_namelast_namesalary
1JohnSmith50000
2JaneDoe60000
3MichaelJohnson55000

33. Using NOT BETWEEN

You can also use the NOT BETWEEN operator to retrieve records that fall outside the specified range. For example, to find employees with salaries not between $50,000 and $60,000, you can use the following query:

SELECT emp_id, first_name, last_name, salary
FROM employees
WHERE salary NOT BETWEEN 50000 AND 60000;

Output:

emp_idfirst_namelast_namesalary
4EmilyBrown75000
5JamesLee48000

34. BETWEEN with Dates

The BETWEEN operator is also useful for querying date ranges. For example, to retrieve employees hired between January 1, 2022, and December 31, 2022, you can use the following query:

SELECT emp_id, first_name, last_name, hire_date
FROM employees
WHERE hire_date BETWEEN '2022-01-01' AND '2022-12-31';

Output:

emp_idfirst_namelast_namehire_date
3MichaelJohnson2022-02-10

35. BETWEEN with Dates and Times

You can also use the BETWEEN operator with date and time data types. For example, to retrieve employees hired between January 1, 2022, 00:00:00, and January 1, 2022, 23:59:59, you can use the following query:

SELECT emp_id, first_name, last_name, hire_date
FROM employees
WHERE hire_date BETWEEN '2022-01-01 00:00:00' AND '2022-01-01 23:59:59';

Output:

emp_idfirst_namelast_namehire_date
3MichaelJohnson2022-02-10

Using the LIKE Operator for Pattern Matching

In SQL, the LIKE operator allows you to perform pattern matching on string values. It is particularly useful for searching for specific patterns or substrings within columns. Let's explore how to use the LIKE operator with examples using sample tables.

Sample Table

Consider a sample table products with the following data:

product_idproduct_name
1Apples
2Bananas
3Oranges
4Pineapple
5Strawberry Jam
6Orange Juice
7Apple Pie

36. Basic LIKE

The basic usage of the LIKE operator is to find records that match a specific pattern. The % symbol represents zero, one, or multiple characters, and the _ symbol represents a single character. For example, to retrieve products that start with "App", you can use the following query:

SELECT product_id, product_name
FROM products
WHERE product_name LIKE 'App%';

Output:

product_idproduct_name
1Apples
7Apple Pie

37. Using NOT LIKE

Similar to the NOT BETWEEN operator, you can use the NOT LIKE operator to retrieve records that do not match a specific pattern. For example, to find products that do not contain the word "Juice" in their name, you can use the following query:

SELECT product_id, product_name
FROM products
WHERE product_name NOT LIKE '%Juice%';

Output:

product_idproduct_name
1Apples
2Bananas
3Oranges
4Pineapple
5Strawberry Jam
7Apple Pie

38. Combining LIKE with other Operators

You can combine the LIKE operator with other operators to perform more complex searches. For example, to find products that start with "O" and have exactly seven characters in their name, you can use the following query:

SELECT product_id, product_name
FROM products
WHERE product_name LIKE 'O______' AND product_name NOT LIKE '% %';

Output:

product_idproduct_name
3Oranges

Replacing NULL Values with COALESCE

In SQL, the COALESCE function is used to replace NULL values with a specified default value. This function is handy when you want to display meaningful data instead of NULLs in your query results. Let's explore how to use the COALESCE function with examples using sample tables.

Sample Table

Consider a sample table employees with the following data:

emp_idfirst_namelast_namedepartmentsalary
1JohnSmithHR50000
2JaneDoeMarketing60000
3MichaelJohnsonNULLNULL
4EmilyBrownFinanceNULL
5JamesLeeHR48000

39. Basic Usage of COALESCE

The basic usage of the COALESCE function is to replace NULL values with a default value of your choice. For example, to display the department name and replace NULL values with "Not Assigned," you can use the following query:

SELECT emp_id, first_name, last_name, COALESCE(department, 'Not Assigned') AS department
FROM employees;

Output:

emp_idfirst_namelast_namedepartment
1JohnSmithHR
2JaneDoeMarketing
3MichaelJohnsonNot Assigned
4EmilyBrownFinance
5JamesLeeHR

40. Using COALESCE with Multiple Columns

You can use the COALESCE function with multiple columns. The function returns the first non-NULL value from the list. For example, if you have a backup email address column and you want to display the backup email if the primary email is NULL, you can use the following query:

SELECT emp_id, first_name, last_name, COALESCE(primary_email, backup_email) AS email
FROM employees;

Assuming we have an additional email table with the following data:

Output:

emp_idfirst_namelast_nameemail
1JohnSmithjohn@example.com
2JaneDoejane.doe@example.com
3MichaelJohnsonmichael@example.com
4EmilyBrownemily@example.com
5JamesLeejames@example.com

41. Using COALESCE with Aggregate Functions

You can also use COALESCE in combination with aggregate functions to handle NULL values during calculations. For example, to find the average salary of employees and replace NULL values with 0, you can use the following query:

SELECT COALESCE(AVG(salary), 0) AS avg_salary
FROM employees;

Output:

avg_salary
42600

Finding the Greatest and Least Values with GREATEST() and LEAST()

In SQL, the GREATEST() and LEAST() functions are used to find the highest and lowest values, respectively, from a list of expressions or column values. These functions are particularly useful when you need to compare multiple values and determine the maximum or minimum among them. Let's explore how to use the GREATEST() and LEAST() functions with examples using sample tables.

Sample Table

Consider a sample table employees with the following data:

emp_idfirst_namelast_nameagesalary
1JohnSmith3050000
2JaneDoe2860000
3MichaelJohnson3555000
4EmilyBrown2775000
5JamesLee3248000

42. Using GREATEST()

The GREATEST() function returns the highest value from a list of expressions or column values. For example, to find the highest salary among the employees, you can use the following query:

SELECT GREATEST(salary) AS highest_salary
FROM employees;

Output:

highest_salary
75000

43. Using LEAST()

The LEAST() function, on the other hand, returns the lowest value from a list of expressions or column values. To find the lowest age among the employees, you can use the following query:

SELECT LEAST(age) AS lowest_age
FROM employees;

Output:

lowest_age
27

44. Using GREATEST() and LEAST() Together

You can also use both GREATEST() and LEAST() functions in the same query to find multiple extreme values. For instance, to determine the oldest and youngest employees based on their ages, you can use the following query:

SELECT GREATEST(age) AS oldest_age, LEAST(age) AS youngest_age
FROM employees;

Output:

oldest_ageyoungest_age
3527

45. Using GREATEST() with Multiple Columns

The GREATEST() function can be used with multiple columns to find the highest value among different columns. For example, to find the maximum age and salary among the employees, you can use the following query:

SELECT GREATEST(age, salary) AS max_age_or_salary
FROM employees;

Output:

max_age_or_salary
50000
60000
55000
75000
48000