Pandas Fill NA – DataFrame.fillna()

Your data may have NA (Not Available) values within your DataFrame. Think of these as blank, null, or not present values. Many pandas functions will give you a hard time if you use them with NAs. That is where Pandas Fill NA, .fillna(), comes into play.

Pandas Fill NA will fill in your DataFrame <NA> values with another value of your choice. You can also “backfill” or “forward fill” your cells with other values from the DataFrame.

1. pd.DataFrame.fillna(value="Value To Fill")

Pseudo code: With all of my NA values, fill them in with something concrete.

Pandas Fill NA - Fill your Not Available (NA) values in your DataFrame with a value of your choice.

Fill NA Parameters

.fillna() starts off simple, but unlocks a ton of value once you start backfilling and forward filling. Let’s take a look at the parameters:

  • value (scalar, dict, Series, or DataFrame): This single parameter has a ton of value packed into it. Let’s take a look at each option. The examples further down show them in use.
    • Scalar: Fill in your DataFrame’s missing values with a single other value.
    • Dict: Fill in your missing values with different values per column. The dict keys are column labels and the dict values are what to fill with.
    • Series: Same as dict above, you can customize your fill values per column. Make sure that your Series index holds the column labels of the DataFrame you are filling.
    • DataFrame: The big one, fill in your DataFrame with values from another DataFrame. Values are matched on both row and column labels.
  • method (‘backfill’, ‘bfill’, ‘pad’, ‘ffill’, None), default None: *Awesome parameter alert* This parameter will fill your NAs with the last known observation or next known observation. It is great when you want to fill your NAs with other values within your DataFrame.
    • backfill/bfill – Fill in your NAs with the next known observation.
    • pad/ffill – Also known as “forward fill”. This will fill your NAs with the last known observation.
  • axis (‘index’ or ‘columns’): You can use the method above to backfill or forward fill along an axis. Do you want values to propagate down each column (‘index’) or across each row (‘columns’)?
  • inplace: If True, this will fill in your DataFrame in place, meaning a copy will not be returned and your old DataFrame will be overwritten.
  • limit: The number of consecutive NA values you wish to fill forward/backward. Any cell that is further than your limit from the last/first known observation will continue to be NA. Check out the examples below to see this in action.
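
To make the copy-versus-in-place behavior concrete, here is a minimal sketch. The `scores` DataFrame and its column names are made up for illustration and are not part of the examples that follow.

```python
import numpy as np
import pandas as pd

# Hypothetical data, for illustration only.
scores = pd.DataFrame({"a": [1.0, np.nan, 3.0], "b": [np.nan, 5.0, np.nan]})

# By default, fillna returns a new DataFrame and leaves the original untouched.
filled = scores.fillna(0)
na_in_original = int(scores.isna().sum().sum())  # original still has its NAs
na_in_copy = int(filled.isna().sum().sum())      # the returned copy has none
print(na_in_original, na_in_copy)

# With inplace=True, `scores` itself is modified.
scores.fillna(0, inplace=True)
na_after_inplace = int(scores.isna().sum().sum())
print(na_after_inplace)
```

Assigning the result to a new name, as in the first call, is usually the easier pattern to reason about.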

Pandas Fill NA has a ton of flexibility. The range of this function is best learned through examples. Let’s run through a couple:


In [1]:
import pandas as pd

Pandas Fill NA

Pandas Fill NA has a ton of functionality and flexibility once you dive into the parameters. Let's start off simple then explore around.

We will run through 4 examples:

  1. Default Fill NA
  2. Fill NA based off of the index - specific values for rows and columns
  3. Fill NA - Backfill and Forward fill
  4. Fill NA - Backfill and Forward fill w/ limits

But first, let's create our DataFrame with NA values. Luckily Pandas has a pd.NA that we can use.

In [6]:
df = pd.DataFrame([('Foreign Cinema', 'Restaurant', pd.NA),
                   ('Liho Liho', 'Restaurant', 224.0),
                   (pd.NA, 'bar', 80.5),
                   (pd.NA, 'bar', pd.NA),
                   (pd.NA, 'bar', 65.23),
                   ('Blue Barn', pd.NA, 361.98)],
                  columns=('name', 'type', 'AvgBill'))
df
Out[6]:
   name            type        AvgBill
0  Foreign Cinema  Restaurant  <NA>
1  Liho Liho       Restaurant  224
2  <NA>            bar         80.5
3  <NA>            bar         <NA>
4  <NA>            bar         65.23
5  Blue Barn       <NA>        361.98

1. Default Fill NA

To start off, let's fill in our NA values with the string "No Value Available". You can also use a number, a timestamp, or anything else you want.

Notice how all of the NAs have been replaced.

In [7]:
df.fillna("No Value Available")
Out[7]:
   name                type                AvgBill
0  Foreign Cinema      Restaurant          No Value Available
1  Liho Liho           Restaurant          224
2  No Value Available  bar                 80.5
3  No Value Available  bar                 No Value Available
4  No Value Available  bar                 65.23
5  Blue Barn           No Value Available  361.98

2. Fill NA based off of the index - specific values for rows and columns

However, "No Value Available" is a weird fill value for a numeric column like AvgBill. Luckily Pandas will allow us to fill in values per column with a dict, Series, or DataFrame.

dict = {key: value} key=column label, value=fill_with

Notice how columns that I don't specify do not get filled in.

In [13]:
df.fillna({'name': 'No Name Rest.', 'type': 'No Name Type'})
Out[13]:
   name            type          AvgBill
0  Foreign Cinema  Restaurant    <NA>
1  Liho Liho       Restaurant    224
2  No Name Rest.   bar           80.5
3  No Name Rest.   bar           <NA>
4  No Name Rest.   bar           65.23
5  Blue Barn       No Name Type  361.98
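
If every column should get its own default, the dict can simply list all of them. This is a sketch on the same data; the 0.0 default for AvgBill is an assumption chosen for illustration, not a recommendation.

```python
import pandas as pd

df = pd.DataFrame(
    [("Foreign Cinema", "Restaurant", pd.NA),
     ("Liho Liho", "Restaurant", 224.0),
     (pd.NA, "bar", 80.5),
     (pd.NA, "bar", pd.NA),
     (pd.NA, "bar", 65.23),
     ("Blue Barn", pd.NA, 361.98)],
    columns=["name", "type", "AvgBill"],
)

# One fill value per column label; the 0.0 for AvgBill is an assumed default.
fill_values = {"name": "No Name Rest.", "type": "No Name Type", "AvgBill": 0.0}
filled = df.fillna(fill_values)
print(filled)
```

Each column now keeps a fill value of its own kind: strings for name and type, a number for AvgBill.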

To fill with a Series, have its index be the column labels you want to fill, and its values be the fill values.

In [15]:
s = pd.Series(data=["No Name Type2", 100], index=["type", 'AvgBill'])
s
Out[15]:
type       No Name Type2
AvgBill              100
dtype: object
In [16]:
df.fillna(s)
Out[16]:
   name            type           AvgBill
0  Foreign Cinema  Restaurant     100.00
1  Liho Liho       Restaurant     224.00
2  <NA>            bar            80.50
3  <NA>            bar            100.00
4  <NA>            bar            65.23
5  Blue Barn       No Name Type2  361.98
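
The DataFrame option for value is not shown above, so here is a minimal sketch of it. The `fallback` frame and its contents are hypothetical; the point is only that each NA is replaced by the value sitting at the same row label and column label in the other DataFrame.

```python
import pandas as pd

df = pd.DataFrame(
    {"name": ["Foreign Cinema", pd.NA, "Blue Barn"],
     "AvgBill": [pd.NA, 80.5, pd.NA]}
)

# Hypothetical fallback table sharing df's index and column labels.
fallback = pd.DataFrame(
    {"name": ["Unknown 0", "Unknown 1", "Unknown 2"],
     "AvgBill": [10.0, 20.0, 30.0]}
)

# Only the NA cells of df are replaced; existing values are kept as they are.
filled = df.fillna(fallback)
print(filled)
```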

3. Fill NA - Backfill and Forward fill

Next up is Backfill and Forward Fill - These awesome methods help you fill in null values with other values from your DataFrame.

Backfill = 'Step back and fill your values'

Forward Fill = 'Step forward and fill your values'

Notice below how Blue Barn fills in the 3 missing restaurant names above it. Blue Barn is stepped back and filled in. There is no row after Row 5, Column type, so nothing gets filled in there.

In [20]:
df.fillna(method='bfill', axis=0)
Out[20]:
   name            type        AvgBill
0  Foreign Cinema  Restaurant  224.00
1  Liho Liho       Restaurant  224.00
2  Blue Barn       bar         80.50
3  Blue Barn       bar         65.23
4  Blue Barn       bar         65.23
5  Blue Barn       <NA>        361.98

Here the inverse happens, the values that do the filling are propagated forward.

In [23]:
df.fillna(method='ffill', axis=0)
Out[23]:
   name            type        AvgBill
0  Foreign Cinema  Restaurant  <NA>
1  Liho Liho       Restaurant  224
2  Liho Liho       bar         80.5
3  Liho Liho       bar         80.5
4  Liho Liho       bar         65.23
5  Blue Barn       bar         361.98
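
Recent pandas releases (2.1 and later) deprecate the method argument of fillna in favor of the dedicated DataFrame.bfill() and DataFrame.ffill() methods. As a sketch, the two fills above can be written like this on the same data:

```python
import pandas as pd

df = pd.DataFrame(
    [("Foreign Cinema", "Restaurant", pd.NA),
     ("Liho Liho", "Restaurant", 224.0),
     (pd.NA, "bar", 80.5),
     (pd.NA, "bar", pd.NA),
     (pd.NA, "bar", 65.23),
     ("Blue Barn", pd.NA, 361.98)],
    columns=["name", "type", "AvgBill"],
)

# Backfill: each NA takes the next known value below it in the same column.
back = df.bfill()

# Forward fill: each NA takes the last known value above it in the same column.
forward = df.ffill()

print(back)
print(forward)
```

Cells with no known value in the fill direction (row 5 of type for bfill, row 0 of AvgBill for ffill) stay NA, just like in the outputs above.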

You can also back/forward fill along the column axis (axis=1), which fills across each row. Notice how 'bar' fills the NAs of the 'name' column, and 361.98 fills the NA in the 'type' column.

In [25]:
df.fillna(method='bfill', axis=1)
Out[25]:
   name            type        AvgBill
0  Foreign Cinema  Restaurant  <NA>
1  Liho Liho       Restaurant  224
2  bar             bar         80.5
3  bar             bar         <NA>
4  bar             bar         65.23
5  Blue Barn       361.98      361.98
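
Filling across columns tends to be more useful when the columns hold comparable values. Here is a small sketch with made-up quarterly figures (the `sales` frame and its labels are hypothetical). It uses DataFrame.ffill(axis=1), which performs the same forward fill as method='ffill'.

```python
import numpy as np
import pandas as pd

# Hypothetical quarterly figures; each row is one store.
sales = pd.DataFrame(
    {"q1": [10.0, 20.0], "q2": [np.nan, 25.0], "q3": [np.nan, np.nan]},
    index=["store_a", "store_b"],
)

# axis=1 carries the last known value to the right along each row.
carried = sales.ffill(axis=1)
print(carried)
```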

4. Fill NA - Backfill and Forward fill w/ limits

Say you have a ton of NAs and you want to forward or backfill them. However, you don't want to forward fill or backfill too many cells ahead/behind. You can set a limit which will tell pandas how many consecutive cells to fill.

Here we will set the limit to 2 and the 3rd cell will not get forward filled.

In [26]:
df.fillna(method='ffill', axis=0, limit=2)
Out[26]:
   name            type        AvgBill
0  Foreign Cinema  Restaurant  <NA>
1  Liho Liho       Restaurant  224
2  Liho Liho       bar         80.5
3  Liho Liho       bar         80.5
4  <NA>            bar         65.23
5  Blue Barn       bar         361.98
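
To isolate the effect of limit, here is a minimal sketch on a single Series mirroring the run of missing names above. It uses Series.ffill() and Series.bfill(), which accept the same limit argument. The variable names are illustrative only.

```python
import pandas as pd

names = pd.Series(["Liho Liho", pd.NA, pd.NA, pd.NA, "Blue Barn"])

# Forward fill at most 2 consecutive NAs; the 3rd NA in the run is left alone.
forward_2 = names.ffill(limit=2)

# Backfill at most 1 NA; only the cell directly above "Blue Barn" is filled.
back_1 = names.bfill(limit=1)

print(forward_2)
print(back_1)
```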


Official Documentation