SQL DISTINCT Clause
In this tutorial you will learn how to remove duplicate values from a result set.
Retrieving Distinct Values
When fetching data from a database table, the result set may contain duplicate rows or values. If you want to remove these duplicate values, you can specify the keyword DISTINCT directly after the SELECT keyword, as demonstrated below:
Syntax
The DISTINCT clause is used to remove duplicate rows from the result set:
SELECT DISTINCT column_list
FROM table_name;
Here, column_list is a comma-separated list of column or field names of a database table (e.g. name, age, country, etc.) whose values you want to fetch.
Note: The DISTINCT clause behaves similarly to the UNIQUE constraint, except in the way it treats nulls. A UNIQUE constraint treats two NULL values as different from each other, so a column with this constraint can typically hold more than one NULL, whereas DISTINCT treats all NULL values as duplicates of one another.
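The UNIQUE side of this note can be tried out with a minimal sketch. It assumes SQLite, driven through Python's standard sqlite3 module so that no database server is needed; the contacts table and its email column are hypothetical names used only for illustration. Other database systems may handle NULL in UNIQUE columns differently (SQL Server, for example, permits only one), so treat this as a demonstration of the behavior described above rather than a guarantee for every product.

```python
import sqlite3

# Hypothetical table for illustration only (not part of the tutorial's data).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE contacts (id INTEGER PRIMARY KEY, email TEXT UNIQUE)")

# Two NULL emails are both accepted: UNIQUE does not treat them as equal.
conn.execute("INSERT INTO contacts (email) VALUES (NULL)")
conn.execute("INSERT INTO contacts (email) VALUES (NULL)")

# A repeated non-NULL value, by contrast, violates the constraint.
conn.execute("INSERT INTO contacts (email) VALUES ('a@example.com')")
try:
    conn.execute("INSERT INTO contacts (email) VALUES ('a@example.com')")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

null_count = conn.execute(
    "SELECT COUNT(*) FROM contacts WHERE email IS NULL"
).fetchone()[0]
print(null_count, duplicate_rejected)
```

Both NULL rows are stored, while the second copy of the non-NULL email is refused.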
Let's check out some examples that demonstrate how it actually works.
Suppose we have a customers table in our database with the following records:
+---------+--------------------+-----------+-------------+
| cust_id | cust_name          | city      | postal_code |
+---------+--------------------+-----------+-------------+
|       1 | Maria Anders       | Berlin    | 12209       |
|       2 | Fran Wilson        | Madrid    | 28023       |
|       3 | Dominique Perrier  | Paris     | 75016       |
|       4 | Martin Blank       | Turin     | 10100       |
|       5 | Thomas Hardy       | Portland  | 97219       |
|       6 | Christina Aguilera | Madrid    | 28001       |
+---------+--------------------+-----------+-------------+
Now execute the following statement, which returns the city value from every row of this table.
Example
SELECT city FROM customers;
After execution, you'll get output something like this:
+-----------+
| city      |
+-----------+
| Berlin    |
| Madrid    |
| Paris     |
| Turin     |
| Portland  |
| Madrid    |
+-----------+
If you look at the output carefully, you'll find that the city "Madrid" appears twice in the result set. A list of cities should contain each city only once, so let's fix this problem.
Removing Duplicate Data
The following statement uses DISTINCT to generate a list of all the distinct cities in the customers table.
Example
SELECT DISTINCT city FROM customers;
After executing the above command, you'll get output something like this:
+-----------+
| city      |
+-----------+
| Berlin    |
| Madrid    |
| Paris     |
| Turin     |
| Portland  |
+-----------+
As you can see, this time there are no duplicate values in the result set.
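Because column_list may name more than one column, it can help to see that DISTINCT compares whole rows of the selected columns, not each column separately. The following is a minimal sketch that loads the customers records shown earlier into an in-memory SQLite database through Python's sqlite3 module; the use of SQLite and the TEXT type for postal_code are assumptions made for illustration. It counts rows only, since the order of rows is not guaranteed without ORDER BY.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Column types are assumed; the tutorial does not specify them.
conn.execute(
    "CREATE TABLE customers ("
    "cust_id INTEGER PRIMARY KEY, cust_name TEXT, city TEXT, postal_code TEXT)"
)
conn.executemany(
    "INSERT INTO customers VALUES (?, ?, ?, ?)",
    [
        (1, "Maria Anders", "Berlin", "12209"),
        (2, "Fran Wilson", "Madrid", "28023"),
        (3, "Dominique Perrier", "Paris", "75016"),
        (4, "Martin Blank", "Turin", "10100"),
        (5, "Thomas Hardy", "Portland", "97219"),
        (6, "Christina Aguilera", "Madrid", "28001"),
    ],
)

# One column: the two "Madrid" rows collapse into one.
cities = conn.execute("SELECT DISTINCT city FROM customers").fetchall()

# Two columns: the Madrid rows differ in postal_code, so both are kept.
pairs = conn.execute(
    "SELECT DISTINCT city, postal_code FROM customers"
).fetchall()

print(len(cities), len(pairs))
```

With a single column, the six rows reduce to five cities. With city and postal_code together, all six rows remain, because no two rows share the same combination of values.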
Note: If you use the SELECT DISTINCT statement on a column that has multiple NULL values, SQL keeps one NULL value and removes the others from the result set, because DISTINCT treats all NULL values as the same value.
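The behavior in this note can be checked with a small sketch, again assuming SQLite through Python's sqlite3 module. The visitors table and its rows are hypothetical and exist only to provide a column with several NULL values; Python's None stands for SQL NULL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table with several missing (NULL) city values.
conn.execute("CREATE TABLE visitors (name TEXT, city TEXT)")
conn.executemany(
    "INSERT INTO visitors VALUES (?, ?)",
    [
        ("Maria Anders", "Berlin"),
        ("Fran Wilson", None),
        ("Thomas Hardy", None),
        ("Martin Blank", None),
    ],
)

rows = conn.execute("SELECT DISTINCT city FROM visitors").fetchall()

# Keep only the result rows whose city is NULL.
null_rows = [row for row in rows if row[0] is None]

print(len(rows), len(null_rows))
```

The three NULL cities appear as a single NULL row in the result, next to the one row for Berlin.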