Pandas Replace Values - pd.DataFrame.replace()

Want to replace values in your DataFrame with something else? No problem. That is where pandas replace comes in.

Pandas DataFrame.replace() is a small but powerful function that will replace (or swap) values in your DataFrame with another value. What starts as a simple function can quickly be expanded to cover most of your scenarios.

YourDataFrame.replace(to_replace='what you want to replace',
                      value='what you want to replace with')

This function is similar to setting a value via DataFrame.at[] or DataFrame.iloc[]/loc[]. However, with those you have to say where the value lives. With .replace(), pandas will do the searching for you.
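To make the difference concrete, here is a minimal sketch. The small DataFrame and the variable names are made up for illustration and are not part of the notebook further down.

```python
import pandas as pd

# Hypothetical two-column frame, just for this comparison
df = pd.DataFrame({"X": [1, 2, 3], "Y": [2, 5, 2]})

# With .loc you must say where the value lives (row label and column).
by_location = df.copy()
by_location.loc[1, "X"] = 20

# With .replace() you only say what the value is; pandas finds every match.
by_value = df.replace(2, 20)

print(by_location)
print(by_value)
```

The .loc version changes one cell, while .replace() changes every 2 it finds, including the ones in column Y.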

Beginner Pandas users will have fun doing simple replaces, but the kung-fu Pandas master will go 3 levels deep.

Pseudo code: Find current values within my DataFrame, then replace them with another value.

Pandas Replace

.replace() starts off easy, but quickly gets nuanced as you dig deeper. Here's a breakdown of the most common ways to use pandas replace.

Code | Plain Language
df.replace(0, 5) | Replace all of the 0s in your DataFrame with 5s
df.replace([0, 1, 2, 3], 4) | Replace all the 0s, 1s, 2s, 3s in your DataFrame with 4s
df.replace([0, 1, 2, 3], [4, 3, 2, 1]) | Replace all the 0s with 4s, 1s with 3s, 2s with 2s, and 3s with 1s. Note: if you pass two lists they both must be the same length
df.replace({0: 10, 1: 100}) | Using a dict – Replace 0s with 10s, and 1s with 100s.
df.replace({'A': 0, 'B': 5}, 100) | Replace 0s in column "A" with 100, and replace 5s in column "B" with 100
df.replace({'C': {1: 100, 3: 300}}) | Using a dict – Within column "C" replace 1s with 100 and 3s with 300
df.replace(to_replace=r'^ba.$', value='new', regex=True) | Replace anything that matches the regex '^ba.$' with "new"
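Two rows of this table are not covered by the notebook examples below: the column dict with a single replacement value, and the regex form. Here is a short sketch of both. The DataFrame and its column names "A", "B" and "C" are assumptions chosen to line up with the table.

```python
import pandas as pd

# Hypothetical frame with columns named to match the table rows
df = pd.DataFrame({"A": [0, 1, 2],
                   "B": [5, 0, 7],
                   "C": ["bat", "foo", "bar"]})

# Column dict + scalar value: 0 is replaced only in "A", 5 only in "B".
# The 0 sitting in column "B" is left alone.
col_specific = df.replace({"A": 0, "B": 5}, 100)

# Regex: "." matches any single character, so "bat" and "bar" both
# match ^ba.$ while "foo" does not.
regex_replaced = df.replace(to_replace=r"^ba.$", value="new", regex=True)

print(col_specific)
print(regex_replaced)
```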

Replace Parameters

  • to_replace: The value, list of values, or regex pattern(s) that you would like to replace. If you pass a dict, it can also carry the replacement values (keys are what to find, values are what to put in their place). There are a ton of details here, so we recommend referring to the official documentation for more.
  • value: The value(s) that will do the replacing. Note: You can leave this out if the dict in your to_replace parameter already holds the replacements. In recent pandas versions (1.4 and later, as far as we know) an explicit value=None is read as "replace with None", so it is safer to omit the argument than to pass None.
  • inplace (Default: False): If True, the operation is done in place (it writes over your current DataFrame). If False, a new DataFrame is returned and the original is left alone.
  • limit: The max size gap you would like to forward or back fill. Example: You may want to fill from values that are 2-3 rows away, but do you really want to fill from values that are 30 rows away?
  • regex (Default: False): Whether to_replace should be read as a regular expression.
  • method: The fill method to use when to_replace is a scalar, list, or tuple and value is None. Note: Recent pandas releases (2.1 onward, as far as we know) deprecate both method and limit.
    • pad/ffill – Take the value in the row before the one you're replacing, and fill it going forward
    • bfill – Take the value in the row after the one you're replacing, and fill it going backward.
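The inplace parameter is the easiest one to trip over, so here is a minimal sketch of the two behaviours. The one-column DataFrame and the variable names are made up for illustration.

```python
import pandas as pd

# Hypothetical one-column frame
df = pd.DataFrame({"X": [1, 2, 3]})

# Default (inplace=False): a new DataFrame comes back, df is untouched.
replaced = df.replace(2, 20)
original_after_copy = df["X"].tolist()

# inplace=True: df itself is overwritten, so we don't keep the return value.
df.replace(2, 20, inplace=True)
original_after_inplace = df["X"].tolist()

print(original_after_copy)
print(original_after_inplace)
```

If you forget to assign the result of a default .replace() call, your DataFrame will look as if nothing happened.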

Here's a Jupyter notebook showing how to replace values in Pandas

In [3]:
import pandas as pd

Pandas Replace

Pandas Replace will replace values in your DataFrame with another value. This function starts simple, but gets flexible & fun later on.

We will run through 7 examples:

  1. Single 1<>1 replace across your whole DataFrame
  2. Single Many<>1 replace across your whole DataFrame
  3. Many 1<>1 replaces across your whole DataFrame
  4. Many 1<>1 replaces across your whole DataFrame via a dictionary
  5. 1<>1 column-specific replaces across multiple columns via a dictionary
  6. Many 1<>1 column-specific replaces via a dictionary
  7. Backfill a value with another value in the row below.

Let's create our DataFrame

In [4]:
df = pd.DataFrame({'X': [1, 2, 3, 4, 5],
                   'Y': [5, 6, 7, 8, 9],
                   'Z': ['z', 'y', 'x', 'w', 'v']})
df
Out[4]:
   X  Y  Z
0  1  5  z
1  2  6  y
2  3  7  x
3  4  8  w
4  5  9  v

1. Single 1<>1 replace across your whole DataFrame

Here we will find all instances of a single value in our DataFrame, and replace it with something else.

Here all of the 2s are being replaced with 20s

In [27]:
df.replace(to_replace=2, value=20)
Out[27]:
    X  Y  Z
0   1  5  z
1  20  6  y
2   3  7  x
3   4  8  w
4   5  9  v

2. Single Many<>1 replace across your whole DataFrame

Here we will pass a list of values in our DataFrame that we want to replace with something else

We will replace all 1s, 3s, and 5s with 20

In [28]:
df.replace(to_replace=[1,3,5], value=20)
Out[28]:
    X   Y  Z
0  20  20  z
1   2   6  y
2  20   7  x
3   4   8  w
4  20   9  v

3. Many 1<>1 replaces across your whole DataFrame

Here we will pass two lists, one with the values that need replacing, and one with the values that will do the replacing.

Notice that the 1s get replaced with 10s, the 3s with 30s and the 5s with 50s

In [29]:
df.replace(to_replace=[1,3,5], value=[10,30,50])
Out[29]:
    X   Y  Z
0  10  50  z
1   2   6  y
2  30   7  x
3   4   8  w
4  50   9  v
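When you pass two lists, they need to be the same length. Here is a small sketch of what happens when they are not. We only rely on pandas raising a ValueError here, and the exact wording of the message may differ between versions. The variable error_message is just for illustration.

```python
import pandas as pd

df = pd.DataFrame({"X": [1, 2, 3, 4, 5],
                   "Y": [5, 6, 7, 8, 9]})

try:
    # Three values to find, but only two replacements
    df.replace(to_replace=[1, 3, 5], value=[10, 30])
    error_message = None
except ValueError as exc:
    error_message = str(exc)

print(error_message)
```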

4. Many 1<>1 replaces across your whole DataFrame via a dictionary

Here we will pass a dictionary. The dictionary keys are the values we want to replace and the dictionary values are the values doing the replacing.

We are replacing 1s with 10s, 'z's with 'zz's, and 'v's with 'vvv's

In [30]:
# value is left out: the dict already supplies the replacements
df.replace(to_replace={1: 10, 'z': 'zz', 'v': 'vvv'})
Out[30]:
    X  Y    Z
0  10  5   zz
1   2  6    y
2   3  7    x
3   4  8    w
4   5  9  vvv

5. 1<>1 column-specific replaces across multiple columns via a dictionary

One interesting feature of pandas .replace() is that you can specify values to replace per column. Example: you may want to replace the 1s in your first column, but not in your second column.

To do this, you need to have a nested dict. The parent dict will have the column you want to specify, the child dict will have the values to replace.

Here we are replacing the 5s in column X (only) with 50s

In [16]:
# value is left out: the nested dict already supplies the replacements
df.replace(to_replace={'X': {5: 50}})
Out[16]:
    X  Y  Z
0   1  5  z
1   2  6  y
2   3  7  x
3   4  8  w
4  50  9  v

6. Many 1<>1 column-specific replaces via a dictionary

We'll do the same thing here, but with multiple values within multiple columns.

Here we are doing a few replaces:

  • In column "X": Replace 1s with 10s and 4s with 40s
  • In column "Y": Replace 8s with 80s and 9s with 99s
  • In column "Z": Replace 'z's with 'zzz's, 'y's with 'yyy's and 'x's with 'xx's

In [31]:
df.replace(to_replace={'X': {1: 10, 4: 40},
                       'Y': {8: 80, 9: 99},
                       'Z': {'z': 'zzz', 'y': 'yyy', 'x': 'xx'}})
Out[31]:
    X   Y    Z
0  10   5  zzz
1   2   6  yyy
2   3   7   xx
3  40  80    w
4   5  99    v

7. Backfill a value with another value in the row below.

For this example, we will pass to_replace with value=None. This time, however, we will also set method='bfill', which fills each matched value with the value from the row below it.

Here we are replacing 1, 2, 'w', and 4 with the value in the row below them. This is most helpful when you have NAs (look into using .fillna()) or when you want to overwrite specific values with their neighbors.

Notice that both 1 and 2 were replaced in column X. With method='bfill', the 3 below them filled both.

In [33]:
df.replace([1, 2, 'w', 4], value=None, method='bfill')
Out[33]:
   X  Y  Z
0  3  5  z
1  3  6  y
2  3  7  x
3  5  8  v
4  5  9  v
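Recent pandas releases (2.1 onward, as far as we know) deprecate the method keyword of .replace(), so the call above may warn or fail depending on your version. One possible alternative, sketched here as an assumption rather than an official recipe, is to blank out the matching values with .mask() and then call .bfill(). The names targets and filled are just for illustration.

```python
import pandas as pd

df = pd.DataFrame({'X': [1, 2, 3, 4, 5],
                   'Y': [5, 6, 7, 8, 9],
                   'Z': ['z', 'y', 'x', 'w', 'v']})

targets = [1, 2, 'w', 4]

# Step 1: turn every matching cell into a missing value.
# Step 2: backfill each missing value from the next row below it.
filled = df.mask(df.isin(targets)).bfill()

# Masking turns integer columns into floats, so cast back to the original
# dtypes. This only works when no missing values remain, i.e. the last
# row holds none of the targets.
filled = filled.astype(df.dtypes.to_dict())

print(filled)
```

This gives the same values as the output above: column X becomes 3, 3, 3, 5, 5 and the 'w' in column Z becomes 'v'.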


Official Documentation