Selecting Data – Pandas loc & iloc[] – The Guide

When it comes to selecting data from your DataFrame, Pandas loc and iloc are two top favorites. They are fast and easy to read when reviewing code later. Let's see how DataFrame loc and iloc compare.

Pandas loc selects data based on the labels of your index (row/column labels), whereas Pandas iloc selects data based on integer position (position 0, 1, 2, etc.).

Pandas loc/iloc are best used when you want a range of data. To select/set a single cell, check out Pandas .at[].
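As a minimal sketch of the single-cell accessors mentioned above: .at works with labels (like loc) and .iat works with integer positions (like iloc). The small `cells` frame below is a hypothetical example, not part of the original walkthrough.

```python
import pandas as pd

# Hypothetical two-row frame, used only for illustration.
cells = pd.DataFrame({"AvgBill": [289.0, 224.0], "Rating": [5.2, 4.3]},
                     index=["Foreign Cinema", "Liho Liho"])

# .at takes a row label and a column label and returns one cell.
bill_by_label = cells.at["Liho Liho", "AvgBill"]

# .iat takes a row position and a column position.
bill_by_position = cells.iat[1, 0]

# Both accessors can also assign to a single cell.
cells.at["Liho Liho", "Rating"] = 4.5
```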

Let’s break down index label vs position:

  1. Index Labels (df.loc[]) – This is the label, or what your row/column actually says. By default, your row labels are the same as their positions, because pandas assigns a 0-based integer index when you don't set one. Your column labels will be the names of your columns.
  2. Index Positions (df.iloc[]) – Your index position is an integer representing where a row or column sits relative to the others. Think of this as row numbers and column numbers. Remember that Python starts its index at 0 (vs 1 like humans).

1. pd.DataFrame.loc['row_label']
2. pd.DataFrame.loc['row_label', 'column_label']
3. pd.DataFrame.iloc[row_position]
4. pd.DataFrame.iloc[row_position, column_position]

Pseudo code: For a given DataFrame, return a subset of rows/columns based on their label (loc) or position (iloc).
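The difference between label and position is easiest to see when the row labels are integers that do not match the row positions. The sketch below assumes a hypothetical `scores` frame with labels 10, 20, 30.

```python
import pandas as pd

# Hypothetical frame whose integer row labels differ from the row positions.
scores = pd.DataFrame({"score": [0.9, 0.7, 0.4]}, index=[10, 20, 30])

by_label = scores.loc[10, "score"]   # the row labelled 10
by_position = scores.iloc[0, 0]      # the first row, first column

# Position 10 does not exist in a three-row frame, so iloc raises IndexError.
try:
    scores.iloc[10]
    position_10_exists = True
except IndexError:
    position_10_exists = False
```

Here loc[10] and iloc[0] point at the same row, while iloc[10] fails because there are only three positions.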

Selecting Data Via Pandas loc & iloc[]


3 Methods To Select Data Via loc & iloc

Method 1 – Via Scalar

You can select a single row, single column, or single value via scalar (single) values.

pd.DataFrame.loc['row_label', 'column_label']
pd.DataFrame.loc[:, 'column_label'] # To select single column

pd.DataFrame.iloc[row_position, col_position]
pd.DataFrame.iloc[:, col_position] # To select single column

If you want to select multiple columns you’ll need to use a list or slice.
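What comes back from a scalar selection depends on how much you pin down. A rough sketch, using a hypothetical `bars` frame: a row label plus a column label gives one cell, while a single row or single column gives a Series.

```python
import pandas as pd

# Hypothetical frame, used only for illustration.
bars = pd.DataFrame({"type": ["bar", "bar"], "AvgBill": [80.5, 25.3]},
                    index=["500 Club", "The Square"])

one_value = bars.loc["500 Club", "AvgBill"]   # a single cell
one_row = bars.loc["500 Club"]                # Series indexed by column labels
one_column = bars.iloc[:, 1]                  # Series indexed by row labels
```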

Method 2 – Multiple Rows & Columns Via List

You can select multiple items by passing a list of labels or index positions. Remember to use index labels with loc and index positions with iloc. This comes in handy when you are selecting rows, or rows and columns, from your DataFrame.

pd.DataFrame.loc[['row_label1', 'row_label2'],
                 ['column_label1', 'column_label2']]

pd.DataFrame.iloc[[row_pos1, row_pos2],
                  [col_pos1, col_pos2]]
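Two details of list selection are worth a quick sketch (the `venues` frame is a hypothetical example): a one-item list still returns a DataFrame, where a bare scalar returns a Series, and the result follows the order of the list you pass.

```python
import pandas as pd

# Hypothetical frame, used only for illustration.
venues = pd.DataFrame({"type": ["Restaurant", "bar"], "Rating": [5.2, 3.9]},
                      index=["Foreign Cinema", "500 Club"])

as_series = venues.loc["500 Club"]     # scalar label -> Series
as_frame = venues.loc[["500 Club"]]    # one-item list -> one-row DataFrame

# Rows and columns come back in the order the lists give them.
reordered = venues.iloc[[1, 0], [1, 0]]
```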

Method 3 – Multiple Items Via Slicing

The last way to do data selection is via slices. You can slice with either labels or index positions. Think of it as “select all rows/columns between item 1 and item 2.” Note that loc slices include the end label, while iloc slices stop just before the end position.

pd.DataFrame.loc['row_label1' : 'row_label2',
                 'column_label1' : 'column_label2']

pd.DataFrame.iloc[row_pos1 : row_pos2,
                  col_pos1 : col_pos2]
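One behavior to keep in mind when slicing: a label slice with loc includes the end label, while a position slice with iloc follows normal Python slicing and excludes the end position. A small sketch with a hypothetical `letters` frame:

```python
import pandas as pd

# Hypothetical frame with four labelled rows.
letters = pd.DataFrame({"value": [1, 2, 3, 4]}, index=["a", "b", "c", "d"])

label_slice = letters.loc["a":"c"]    # includes "c": rows a, b, c
position_slice = letters.iloc[0:2]    # excludes position 2: rows a, b
```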

All of the patterns above are valid selections on a Pandas DataFrame. Let’s take a look at a code sample.

In [1]:
import pandas as pd

Pandas Selecting Data Via loc[] & iloc[]

We will run through 3 examples of selections with both loc (index labels) and iloc (index positions).

  1. Selecting via scalars
  2. Selecting via lists
  3. Selecting via slices

First, let's create our DataFrame

In [2]:
df = pd.DataFrame([('Foreign Cinema', 'Restaurant', 289.0, 5.2),
                   ('Liho Liho', 'Restaurant', 224.0, 4.3),
                   ('500 Club', 'bar', 80.5, 3.9),
                   ('The Square', 'bar', 25.30, 1.7)],
                  columns=('name', 'type', 'AvgBill', 'Rating'))
df.set_index('name', inplace=True)
df
                      type  AvgBill  Rating
name
Foreign Cinema  Restaurant    289.0     5.2
Liho Liho       Restaurant    224.0     4.3
500 Club               bar     80.5     3.9
The Square             bar     25.3     1.7

1. Selecting via scalars

To select via scalar (single value), simply pass an index label for loc and an index position for iloc. I'll show a few ways to select a single value, a whole row, and a whole column.

In [3]:
df.loc['Foreign Cinema', 'type']
'Restaurant'
In [4]:
df.loc[: , 'type']
Foreign Cinema    Restaurant
Liho Liho         Restaurant
500 Club                 bar
The Square               bar
Name: type, dtype: object
In [5]:
df.loc["Liho Liho"]
type       Restaurant
AvgBill           224
Rating            4.3
Name: Liho Liho, dtype: object

And via iloc, using index positions:

In [6]:
df.iloc[2, 1]
80.5
In [7]:
df.iloc[:, 1]
Foreign Cinema    289.0
Liho Liho         224.0
500 Club           80.5
The Square         25.3
Name: AvgBill, dtype: float64
In [8]:
df.iloc[2]
type        bar
AvgBill    80.5
Rating      3.9
Name: 500 Club, dtype: object

2. Selecting via lists

If you want multiple items, you can pass a list of labels or positions.

In [9]:
df.loc[['Foreign Cinema', '500 Club'], ['AvgBill', 'Rating']]
                AvgBill  Rating
name
Foreign Cinema    289.0     5.2
500 Club           80.5     3.9
In [10]:
df.iloc[[0,2], [0,1]]
                      type  AvgBill
name
Foreign Cinema  Restaurant    289.0
500 Club               bar     80.5

3. Selecting via slices

Finally, let's look at how to select multiple items via slices. Slices are when you want to select everything in between two items.

In [11]:
df.loc['Foreign Cinema' : '500 Club', 'type' : 'Rating']
                      type  AvgBill  Rating
name
Foreign Cinema  Restaurant    289.0     5.2
Liho Liho       Restaurant    224.0     4.3
500 Club               bar     80.5     3.9
In [12]:
df.iloc[0 : 2, 1 : 2]
                AvgBill
name
Foreign Cinema    289.0
Liho Liho         224.0
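The loc slice in In [11] includes its end labels, while the iloc slice in In [12] stops before its end positions. As a sketch, here is one way to reproduce the In [11] selection with iloc: each stop value has to be one past the last item you want. The DataFrame is rebuilt so the snippet runs on its own; `by_label`, `by_position`, and `same_selection` are names introduced here for illustration.

```python
import pandas as pd

# Rebuild the DataFrame from the walkthrough.
df = pd.DataFrame([('Foreign Cinema', 'Restaurant', 289.0, 5.2),
                   ('Liho Liho', 'Restaurant', 224.0, 4.3),
                   ('500 Club', 'bar', 80.5, 3.9),
                   ('The Square', 'bar', 25.30, 1.7)],
                  columns=['name', 'type', 'AvgBill', 'Rating'])
df = df.set_index('name')

# Label slice: both end labels are included.
by_label = df.loc['Foreign Cinema':'500 Club', 'type':'Rating']

# Position slice: stop values are one past the last row/column wanted.
by_position = df.iloc[0:3, 0:3]

same_selection = by_label.equals(by_position)
```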
