KeyError Pandas – How To Fix

Pandas KeyError is frustrating. This error happens because Pandas cannot find what you’re looking for.

To fix this, either:

  1. Preferred option: Make sure that your column label (or row label) is in your dataframe!
  2. Error-catching option: Use df.get('your_column') to look up your column. No error is thrown if it is not found. With df.get('your_column', default=value_if_no_column), the default value is returned instead.

Pseudocode: check whether a column is in your dataframe; if it is not, return the default value.
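As a minimal sketch of that pseudocode, the snippet below calls df.get() on a column that exists, on one that does not, and on a missing one with a default. The small dataframe and the fallback string are assumptions made up for illustration.

```python
import pandas as pd

# Small illustrative frame; the contents are assumptions for this sketch.
df = pd.DataFrame({"name": ["500 Club", "The Square"], "type": ["bar", "bar"]})

# Column exists: get() returns the same Series that df["name"] would.
names = df.get("name")

# Column missing, no default given: get() returns None instead of raising.
missing = df.get("food")

# Column missing, default given: get() hands back the default unchanged.
fallback = df.get("food", default="no food column")

print(names.tolist())
print(missing)
print(fallback)
```

Note that the default can be any object, so it is worth picking one that the rest of your code can actually handle.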

Pandas KeyError


In most cases, think of ‘key’ as the same as ‘name.’ Pandas is telling you that it cannot find your column name. The preferred method is to *make sure your column name is in your dataframe.*

Or, if you would rather avoid the error, you can use df.get('your_column'), which returns None when the column is missing. However, if you don’t know what columns are in your dataframe… do you really know your data?

It’s best to head back upstream with your code and debug where your expectations and dataframe columns mismatch.
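One way to debug that mismatch, sketched here under assumptions: compare the columns your code expects against df.columns and print the real labels with repr(), which exposes stray whitespace or capitalization. The expected list and the deliberately messy "Name " label are hypothetical.

```python
import pandas as pd

# Hypothetical frame whose first label has a capital letter and a trailing space.
df = pd.DataFrame({"Name ": ["Liho Liho"], "type": ["Restaurant"]})

# Hypothetical list of the columns the downstream code relies on.
expected = ["name", "type"]

# Which expected columns are absent from the dataframe?
missing = [col for col in expected if col not in df.columns]
print("Missing:", missing)

# repr() makes invisible differences (spaces, case) easy to spot.
print("Actual labels:", [repr(col) for col in df.columns])

# One possible cleanup: strip whitespace and lowercase every label.
df.columns = df.columns.str.strip().str.lower()
still_missing = [col for col in expected if col not in df.columns]
print("Missing after cleanup:", still_missing)
```

Whether normalizing labels like this is appropriate depends on your data; the point is to look at what the dataframe really contains before reaching for a workaround.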

Try, Except

For a general solution, you can use a try/except block to catch errors in your code. However, beware: a blanket try/except clause is dangerous and poor coding practice if you do not know what you are doing.

If you are ‘catching’ general errors with try/except, any error, not just the KeyError you expected, can slip through your code silently. Unexpected problems then go unnoticed and surface later as a web of complexity.

Goal: Try to never let the reality of your code get too far away from the expectations of your code.
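A narrower alternative to a blanket clause, as a rough sketch: catch only KeyError, and say what happened when you do. The helper name column_or_none is hypothetical, not part of Pandas.

```python
import pandas as pd

df = pd.DataFrame({"name": ["Foreign Cinema"], "type": ["Restaurant"]})


def column_or_none(frame, label):
    """Hypothetical helper: return frame[label], or None if the label is missing."""
    try:
        return frame[label]
    except KeyError:
        # Only KeyError is handled here; any other exception still propagates.
        print(f"Column {label!r} not found; available: {list(frame.columns)}")
        return None


found = column_or_none(df, "name")
not_found = column_or_none(df, "food")
```

Because only KeyError is caught, a typo elsewhere or a wrong data type still fails loudly instead of being hidden.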

Let’s take a look at a sample:


In [1]:
import pandas as pd

Pandas KeyError

Pandas KeyError can be annoying. It generally happens when pandas cannot find the thing you're looking for. Usually this is due to a column it cannot find. It's simple to debug!

Let's check out some examples:

  1. Locating the error
  2. Fixing the error via the root cause
  3. Catching the error with df.get()

First, let's create a DataFrame

In [5]:
df = pd.DataFrame([('Foreign Cinema', 'Restaurant'),
                   ('Liho Liho', 'Restaurant'),
                   ('500 Club', 'bar'),
                   ('The Square', 'bar')],
                  columns=('name', 'type'))

df
Out[5]:
             name        type
0  Foreign Cinema  Restaurant
1       Liho Liho  Restaurant
2        500 Club         bar
3      The Square         bar

Now let's try to call one column that is in our dataframe and one that is NOT.

In [6]:
# Is in our dataframe
df['name']
Out[6]:
0    Foreign Cinema
1         Liho Liho
2          500 Club
3        The Square
Name: name, dtype: object
In [9]:
# Is not in our dataframe
df['food']
---------------------------------------------------------------------------
KeyError                                  Traceback (most recent call last)
/opt/anaconda3/lib/python3.7/site-packages/pandas/core/indexes/base.py in get_loc(self, key, method, tolerance)
   2888             try:
-> 2889                 return self._engine.get_loc(casted_key)
   2890             except KeyError as err:

pandas/_libs/index.pyx in pandas._libs.index.IndexEngine.get_loc()

pandas/_libs/index.pyx in pandas._libs.index.IndexEngine.get_loc()

pandas/_libs/hashtable_class_helper.pxi in pandas._libs.hashtable.PyObjectHashTable.get_item()

pandas/_libs/hashtable_class_helper.pxi in pandas._libs.hashtable.PyObjectHashTable.get_item()

KeyError: 'food'

The above exception was the direct cause of the following exception:

KeyError                                  Traceback (most recent call last)
<ipython-input-9-dce7ca21d87e> in <module>
      1 # Is not in our dataframe
----> 2 df['food']

/opt/anaconda3/lib/python3.7/site-packages/pandas/core/frame.py in __getitem__(self, key)
   2897             if self.columns.nlevels > 1:
   2898                 return self._getitem_multilevel(key)
-> 2899             indexer = self.columns.get_loc(key)
   2900             if is_integer(indexer):
   2901                 indexer = [indexer]

/opt/anaconda3/lib/python3.7/site-packages/pandas/core/indexes/base.py in get_loc(self, key, method, tolerance)
   2889                 return self._engine.get_loc(casted_key)
   2890             except KeyError as err:
-> 2891                 raise KeyError(key) from err
   2892 
   2893         if tolerance is not None:

KeyError: 'food'

Oh no! We got a KeyError. This means that Pandas cannot find "food" within our dataframe. We know why this is. Simply, it's not in our DF.

To get around this, either add a 'food' column or use df.get() to avoid the error. Here we will use .get(); notice that no error is thrown. It returns None, so the cell shows no output.

In [12]:
df.get('food')
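Since df.get() stays quiet when the column is missing, the code that follows needs to handle the None it returns. Here is a small sketch, assuming a cut-down version of the dataframe above; the summary strings are made up for illustration.

```python
import pandas as pd

df = pd.DataFrame(
    [("Foreign Cinema", "Restaurant"), ("500 Club", "bar")],
    columns=["name", "type"],
)

food = df.get("food")  # None, because there is no 'food' column

# Compare with "is None": writing "if not food" would raise a ValueError
# whenever food is a real Series, since a Series has no single truth value.
if food is None:
    summary = "no 'food' column"
else:
    summary = f"{food.nunique()} distinct foods"

print(summary)
```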

This error can also happen when you try to access an index (row) label that doesn't exist. Check out this example:

In [22]:
# Does exist
df.loc[2]
Out[22]:
name    500 Club
type         bar
Name: 2, dtype: object
In [23]:
# Does not
df.loc[9]
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
/opt/anaconda3/lib/python3.7/site-packages/pandas/core/indexes/range.py in get_loc(self, key, method, tolerance)
    350                 try:
--> 351                     return self._range.index(new_key)
    352                 except ValueError as err:

ValueError: 9 is not in range

The above exception was the direct cause of the following exception:

KeyError                                  Traceback (most recent call last)
<ipython-input-23-026debeea1cf> in <module>
      1 # Does not
----> 2 df.loc[9]

/opt/anaconda3/lib/python3.7/site-packages/pandas/core/indexing.py in __getitem__(self, key)
    877 
    878             maybe_callable = com.apply_if_callable(key, self.obj)
--> 879             return self._getitem_axis(maybe_callable, axis=axis)
    880 
    881     def _is_scalar_access(self, key: Tuple):

/opt/anaconda3/lib/python3.7/site-packages/pandas/core/indexing.py in _getitem_axis(self, key, axis)
   1108         # fall thru to straight lookup
   1109         self._validate_key(key, axis)
-> 1110         return self._get_label(key, axis=axis)
   1111 
   1112     def _get_slice_axis(self, slice_obj: slice, axis: int):

/opt/anaconda3/lib/python3.7/site-packages/pandas/core/indexing.py in _get_label(self, label, axis)
   1057     def _get_label(self, label, axis: int):
   1058         # GH#5667 this will fail if the label is not present in the axis.
-> 1059         return self.obj.xs(label, axis=axis)
   1060 
   1061     def _handle_lowerdim_multi_index_axis0(self, tup: Tuple):

/opt/anaconda3/lib/python3.7/site-packages/pandas/core/generic.py in xs(self, key, axis, level, drop_level)
   3480             loc, new_index = self.index.get_loc_level(key, drop_level=drop_level)
   3481         else:
-> 3482             loc = self.index.get_loc(key)
   3483 
   3484             if isinstance(loc, np.ndarray):

/opt/anaconda3/lib/python3.7/site-packages/pandas/core/indexes/range.py in get_loc(self, key, method, tolerance)
    351                     return self._range.index(new_key)
    352                 except ValueError as err:
--> 353                     raise KeyError(key) from err
    354             raise KeyError(key)
    355         return super().get_loc(key, method=method, tolerance=tolerance)

KeyError: 9

This traceback immediately jumps out at me and says, "Hey, I cannot find the label 9 in your row index. Do something about this."

If you wanted to handle this programmatically, the long way would be to check for the label first and proceed only if it is in the index. But usually, I want to make sure I'm calling something I KNOW is in my index.

In [25]:
value = 9

if value in df.index:
    print(df.loc[value])
else:
    print("Not in index")
Not in index
In [26]:
value = 2

if value in df.index:
    print(df.loc[value])
else:
    print("Not in index")
name    500 Club
type         bar
Name: 2, dtype: object
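The same membership check extends to several labels at once. The sketch below splits a list of requested labels into those present in the index and those absent, then selects only the present ones. The wanted list is a hypothetical input.

```python
import pandas as pd

df = pd.DataFrame(
    [("Foreign Cinema", "Restaurant"), ("Liho Liho", "Restaurant"),
     ("500 Club", "bar"), ("The Square", "bar")],
    columns=["name", "type"],
)

wanted = [2, 9]  # hypothetical row labels we would like to look up

# Split the request into labels the index has and labels it lacks.
present = [label for label in wanted if label in df.index]
absent = [label for label in wanted if label not in df.index]

# .loc with a list is safe here because every label in it is known to exist.
subset = df.loc[present]

print(subset)
print("Not in index:", absent)
```

Reporting the absent labels, rather than dropping them silently, keeps the mismatch visible so you can trace it upstream.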

