Pandas Drop – pd.DataFrame.drop()

If you want to drop a column and have it returned to you, check out pandas pop.

A lot of data can be too much data. When that happens, you need to remove items from your dataset, and pandas drop is the tool for the job.

Drop – pd.DataFrame.drop() removes data (rows or columns) from your DataFrame. You can drop a single row or column, or several at once.

Pseudo code: Remove (or drop) single/multiple rows/columns from a DataFrame or Series

Example: Below we drop the column "type" from our DataFrame

[Image: Pandas Drop - Dropping a column from a Pandas DataFrame]

Pandas Drop

Let's take a look at the different parameters you can pass to pd.DataFrame.drop():

  • labels – A single row/column label, or a list-like of labels, to drop. You need to pass this unless you use index or columns instead.
  • axis (Default: 0) – Specifies whether labels refers to rows or columns. axis=0 or 'index' tells pandas you want to remove rows. axis=1 or 'columns' tells pandas you want to remove columns.
  • index – Optional alternative to labels: a single label or a list of row labels to drop. If you set index, pandas assumes that you're dropping rows, so there is no need to specify axis.
  • columns – Similar to index: if the columns parameter is set, pandas assumes that you're dropping columns. You do not need to set axis.
  • level – For when you have a MultiIndex. Here you can specify which level of the MultiIndex your labels (from above) refer to.
  • inplace (Default: False) – If False, pandas returns a new DataFrame with the data dropped and leaves your original untouched. If True, pandas drops the data from your existing DataFrame and returns None.
  • errors (Default: 'raise') – By default you'll get an error (a KeyError) if you reference a row or a column that doesn't exist. If you want to skip missing labels instead, set errors to 'ignore'; only the labels that do exist are dropped.
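
The level parameter isn't used in the walkthrough that follows, so here is a minimal sketch of how it might look. The city names and the extra bill amounts are made up for illustration; they are not part of the dataset used in the rest of this post.

```python
import pandas as pd

# Hypothetical data: a two-level row index of (city, type)
idx = pd.MultiIndex.from_tuples(
    [("SF", "Restaurant"), ("SF", "bar"), ("NYC", "Restaurant"), ("NYC", "bar")],
    names=["city", "type"],
)
df_multi = pd.DataFrame({"AvgBill": [289.0, 80.5, 310.0, 95.0]}, index=idx)

# Drop every row whose "type" level is "bar"
no_bars = df_multi.drop("bar", level="type")

# Same idea with the index keyword: drop every row whose "city" level is "SF"
no_sf = df_multi.drop(index="SF", level="city")

print(no_bars)
print(no_sf)
```

Without level, pandas would look for the label "bar" in the outermost level ("city") and raise an error because it isn't there.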
In [27]:
import pandas as pd

Pandas Drop

Dropping in pandas means to remove rows or columns from your dataset.

Let's first create a DataFrame

In [28]:
df = pd.DataFrame([('Foreign Cinema', 'Restaurant', 289.0),
                   ('Liho Liho', 'Restaurant', 224.0),
                   ('500 Club', 'bar', 80.5),
                   ('The Square', 'bar', 25.30)],
                  columns=('name', 'type', 'AvgBill'))
df
Out[28]:
             name        type  AvgBill
0  Foreign Cinema  Restaurant    289.0
1       Liho Liho  Restaurant    224.0
2        500 Club         bar     80.5
3      The Square         bar     25.3

Dropping Columns

I want to use this DataFrame example multiple times, so I'm going to create a copy first.

In [29]:
df_drop_column = df.copy()

Now let's remove the "type" column from our dataset. We set axis=1 to specify that we are dropping columns. Note that drop returns a new DataFrame, so df_drop_column itself is left unchanged.

In [30]:
df_drop_column.drop("type", axis=1)
Out[30]:
             name  AvgBill
0  Foreign Cinema    289.0
1       Liho Liho    224.0
2        500 Club     80.5
3      The Square     25.3

Alternatively, we could pass the column(s) that we want to drop to the "columns" parameter and skip axis entirely.

In [31]:
df_drop_column.drop(columns='type')
Out[31]:
             name  AvgBill
0  Foreign Cinema    289.0
1       Liho Liho    224.0
2        500 Club     80.5
3      The Square     25.3

If I wanted to drop multiple columns, say "type" and "AvgBill", then I could pass a list of columns to drop.

In [32]:
df_drop_column.drop(["type", "AvgBill"], axis=1)
Out[32]:
             name
0  Foreign Cinema
1       Liho Liho
2        500 Club
3      The Square
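
None of the cells above changed df_drop_column, because drop returns a new DataFrame by default. Here is a small sketch of how you might keep the result, and what inplace=True does instead. It uses a shortened two-row version of the example data, and the variable names are just for illustration.

```python
import pandas as pd

df = pd.DataFrame(
    [("Foreign Cinema", "Restaurant", 289.0), ("500 Club", "bar", 80.5)],
    columns=("name", "type", "AvgBill"),
)

# drop returns a new DataFrame; assign it to keep the result
result = df.drop(columns="type")
print(list(df.columns))      # the original still has "type"
print(list(result.columns))  # the result does not

# drop("type", axis=1) and drop(columns="type") give the same result
same = df.drop("type", axis=1).equals(df.drop(columns="type"))
print(same)

# With inplace=True the DataFrame is modified directly and drop returns None
df_inplace = df.copy()
returned = df_inplace.drop(columns="type", inplace=True)
print(returned)
print(list(df_inplace.columns))
```

Assigning the result back (df = df.drop(...)) is usually easier to follow than inplace=True, since it works the same way in a method chain.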

Dropping Rows

In [33]:
df_drop_rows = df.copy()

In order to drop rows, we need to specify labels within the index that we want to drop. Most of the time this will be row numbers, but double check your data!

In [34]:
df_drop_rows.drop(1, axis=0)
Out[34]:
             name        type  AvgBill
0  Foreign Cinema  Restaurant    289.0
2        500 Club         bar     80.5
3      The Square         bar     25.3

Again, we could also give a label or list of labels to "index", and pandas will know to remove rows (instead of columns).

In [35]:
df_drop_rows.drop(index=3)
Out[35]:
             name        type  AvgBill
0  Foreign Cinema  Restaurant    289.0
1       Liho Liho  Restaurant    224.0
2        500 Club         bar     80.5

Or, if we wanted to drop multiple rows, we could pass a list of index labels.

In [36]:
df_drop_rows.drop([1,2], axis=0)
Out[36]:
             name        type  AvgBill
0  Foreign Cinema  Restaurant    289.0
3      The Square         bar     25.3
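
The "double check your data" warning matters most when the index isn't the default row numbers. As a rough sketch, suppose the venue names were used as the index (an assumption made only for this example). drop works on labels, not positions, so you would pass the name, or look the label up from its position first. The same trick lets you drop rows that match a condition.

```python
import pandas as pd

# Same example data, but indexed by venue name (assumption for this sketch)
df_named = pd.DataFrame(
    {"type": ["Restaurant", "Restaurant", "bar", "bar"],
     "AvgBill": [289.0, 224.0, 80.5, 25.3]},
    index=["Foreign Cinema", "Liho Liho", "500 Club", "The Square"],
)

# The index labels are strings here, so we drop by name
by_label = df_named.drop("Liho Liho")

# To drop by position, translate the position into its label first
by_position = df_named.drop(df_named.index[1])

# To drop rows matching a condition, pass the index of the matching rows
no_bars = df_named.drop(df_named[df_named["type"] == "bar"].index)

print(by_label)
print(no_bars)
```

Here df_named.drop(1) would raise a KeyError, because 1 is not one of the index labels.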

Suppressing Errors

Say you accidentally reference a row or column that isn't in your dataset. This would normally raise an error. However, you can suppress this error by setting "errors" to 'ignore'.

In [37]:
df_drop_column.drop("sample_non_existent_column", axis=1, errors='ignore')
Out[37]:
             name        type  AvgBill
0  Foreign Cinema  Restaurant    289.0
1       Liho Liho  Restaurant    224.0
2        500 Club         bar     80.5
3      The Square         bar     25.3
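
For comparison, here is a short sketch of what happens with the default errors='raise', and how errors='ignore' behaves when only some of the labels are missing. It uses a shortened two-row, two-column version of the example data.

```python
import pandas as pd

df = pd.DataFrame({"name": ["Foreign Cinema", "500 Club"],
                   "AvgBill": [289.0, 80.5]})

# Default behavior: a missing label raises a KeyError
try:
    df.drop("sample_non_existent_column", axis=1)
    raised = False
except KeyError as exc:
    raised = True
    print(f"KeyError: {exc}")

# With errors='ignore', labels that exist are still dropped
# and the missing one is silently skipped
partial = df.drop(columns=["AvgBill", "sample_non_existent_column"],
                  errors="ignore")
print(list(partial.columns))
```

Keep in mind that errors='ignore' can also hide typos in your column names, so it's worth using only when a missing label is really expected.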
