Pandas Unique – pd.Series.unique()

Have you ever wondered what the distinct (unique) values are within your pandas DataFrame column? That is where Pandas Unique comes into play.

pandas.Series.unique()

I have two main uses for finding distinct values:

  1. When I’m trying to visually see what values lie within a pandas column. I will often go between .unique() and .value_counts() to get a feel for my data.
  2. When I’m trying to iterate through a DataFrame. I will loop through the unique values of a column, then filter the DataFrame by each unique value. Check out the examples below for an instance of this.
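To make the first use concrete, here is a minimal sketch of switching between the two methods. The Series and its values are made up for illustration and are not part of the dataset used later in this post.

```python
import pandas as pd

# Hypothetical sample data, made up for this illustration
types = pd.Series(['Restaurant', 'Restaurant', 'Bar', 'Bar', 'Bar'])

# .unique() answers "which values are present?"
distinct_types = types.unique()
print(distinct_types)

# .value_counts() answers "how often does each value appear?"
type_counts = types.value_counts()
print(type_counts)
```

.unique() gives the quick list of values, and .value_counts() adds the frequency of each one.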

Pseudo Code: Look at a Pandas Series, then return only the distinct values. No duplicates allowed.


Pandas Unique - Find the distinct or unique values of your pandas column

.unique() Parameters

None! .unique() does not take any parameters. Call it on a Series (or pass your values to pd.unique()) and the distinct values will be returned to you.

One thing to note: You can call pd.unique() and pass a list of values, or you can call pd.Series.unique() and get the distinct values right on your series.
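As a rough sketch of the two call styles, the snippet below runs both on the same data. The Series is a made-up example, and it is passed to both forms because a Series is an input that works for either one.

```python
import pandas as pd

# Hypothetical values, made up for this illustration
names = pd.Series(['500 Club', 'The Square', '500 Club'])

# Method form: call .unique() on the Series itself
from_method = names.unique()

# Function form: hand the Series to the top-level pd.unique()
from_function = pd.unique(names)

print(from_method)
print(from_function)
```

Both calls should return the same distinct values, in the order they first appear.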

Now the fun part: let’s take a look at a code sample.

In [1]:
import pandas as pd

Pandas Unique

Pandas Unique will show you the unique values within your dataset or Series. This is very useful when you're trying to understand the cardinality (the number of distinct elements) of a column.

Let's run through an example

  1. Find the unique values within a Pandas column

And one application

  1. Iterate through a DataFrame's column's unique values. Then filter the DataFrame and do something with your data

Let's first create a DataFrame

In [6]:
df = pd.DataFrame([('Foreign Cinema', 'Restaurant', 289.0),
                   ('Liho Liho', 'Restaurant', 224.0),
                   ('500 Club', 'Bar', 80.5),
                   ('The Square', 'Bar', 19.34),
                   ('The Square', 'Bar', 29.30),
                   ('Foreign Cinema', 'Restaurant', 340.03),
                   ('500 Club', 'Bar', 50.7),
                   ('500 Club', 'Bar', 45.2)],
                  columns=('name', 'type', 'AvgBill'))
df
Out[6]:
             name        type  AvgBill
0  Foreign Cinema  Restaurant   289.00
1       Liho Liho  Restaurant   224.00
2        500 Club         Bar    80.50
3      The Square         Bar    19.34
4      The Square         Bar    29.30
5  Foreign Cinema  Restaurant   340.03
6        500 Club         Bar    50.70
7        500 Club         Bar    45.20

1. Find the unique values within a Pandas column

Say you want to find the unique values within a pandas column. All you need to do is call .unique() on the column you're interested in. Let's first find the unique values within 'name', then within 'type'.

In [4]:
df['name'].unique()
Out[4]:
array(['Foreign Cinema', 'Liho Liho', '500 Club', 'The Square'],
      dtype=object)
In [5]:
df['type'].unique()
Out[5]:
array(['Restaurant', 'Bar'], dtype=object)
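
If you only need the cardinality, meaning how many distinct values there are rather than what they are, you can count the result of .unique() or call .nunique() directly. The snippet below is a small sketch on a made-up Series, not on the DataFrame above.

```python
import pandas as pd

# Hypothetical values, made up for this illustration
names = pd.Series(['Foreign Cinema', 'Liho Liho', '500 Club', 'Foreign Cinema'])

# Count the distinct values by measuring the result of .unique()
count_from_unique = len(names.unique())

# .nunique() returns the same count directly (it skips missing values by default)
count_from_nunique = names.nunique()

print(count_from_unique, count_from_nunique)
```

With no missing values in the data, the two counts agree.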

Application: 1. Iterate through a DataFrame's column's unique values. Then filter the DataFrame and do something with your data

Oftentimes I'll use .unique() when I want to iterate through the subsets of my DataFrame. Here I'm going to iterate through the unique values within the 'name' column and find the sum of the 'AvgBill' column for each one.

There are more efficient ways of doing this, but we'll use this as the demonstration.

In [7]:
unique_values = df['name'].unique()
unique_values
Out[7]:
array(['Foreign Cinema', 'Liho Liho', '500 Club', 'The Square'],
      dtype=object)
In [9]:
for rezy in unique_values:
    df_single_rezy = df[df['name']==rezy]
    
    print("Your total bill for {} is {}".format(rezy, df_single_rezy['AvgBill'].sum()))
Your total bill for Foreign Cinema is 629.03
Your total bill for Liho Liho is 224.0
Your total bill for 500 Club is 176.39999999999998
Your total bill for The Square is 48.64
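
One of the more efficient ways mentioned earlier is to let pandas do the grouping for you. This is a sketch of that alternative using groupby, rebuilt on the same sample data so it runs on its own. Treat it as one reasonable option, not the only way to do it.

```python
import pandas as pd

# Same sample data as the DataFrame created earlier in this post
df = pd.DataFrame([('Foreign Cinema', 'Restaurant', 289.0),
                   ('Liho Liho', 'Restaurant', 224.0),
                   ('500 Club', 'Bar', 80.5),
                   ('The Square', 'Bar', 19.34),
                   ('The Square', 'Bar', 29.30),
                   ('Foreign Cinema', 'Restaurant', 340.03),
                   ('500 Club', 'Bar', 50.7),
                   ('500 Club', 'Bar', 45.2)],
                  columns=('name', 'type', 'AvgBill'))

# Group rows by 'name' and sum 'AvgBill' within each group.
# sort=False keeps the groups in order of first appearance, like .unique() does.
totals = df.groupby('name', sort=False)['AvgBill'].sum()

for name, total in totals.items():
    print("Your total bill for {} is {}".format(name, total))
```

This replaces the explicit filter inside the loop with a single grouped sum, which tends to matter more as the number of unique values grows.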

