Add Column To Dataframe Pandas

One of the most common Pandas tasks you’ll do is add more data to your DataFrame. This means you need to become an expert at adding a column to your DataFrame.

There are 5 ways to add a new column to your DataFrame in Pandas: referencing a new column name, df.insert(), df.assign(), a dictionary, and .loc[].

Pseudo code: Using a new scalar or list of data, add a new column to your DataFrame.

Adding Column To Pandas DataFrame

Let’s take a look at the 5 ways you can add a column to your DataFrame. For examples of these, check out the code below.

Declare new column by referencing a new name

Most of the time you’ll be adding a new column to your dataset by referencing a column name that isn’t already there.

You can assign a scalar (a single value) or a list-like object (list, Series, etc.). If you assign a list, it must be the same length as your df; otherwise pandas raises a ValueError.

This method will put the new column at the end of your DataFrame (last column).

df['new_column_name'] = 5 # You'll get a column of all 5s
df['new_column_name'] = [1,2,3,4] # Assuming your df is 4 items long, you'll get a new column of 1,2,3,4
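As a minimal sketch of the rules above, the following uses a made-up 4-row DataFrame (the column names are hypothetical, chosen only for illustration) to show a scalar being repeated on every row, a list filling the column item by item, and a list of the wrong length being rejected.

```python
import pandas as pd

# Hypothetical 4-row DataFrame used only for this illustration.
df = pd.DataFrame({"name": ["a", "b", "c", "d"]})

# A scalar is repeated on every row.
df["score"] = 5

# A list needs exactly one item per row.
df["rank"] = [1, 2, 3, 4]

# A list of the wrong length is rejected with a ValueError,
# and no new column is created.
try:
    df["too_short"] = [1, 2, 3]
    length_error = None
except ValueError as exc:
    length_error = exc

print(df)
print("Error:", length_error)
```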

Using df.insert()

Insert will put a new column in your DataFrame at a specified location. The main advantage is that you get to pick where in your DataFrame the column goes. Note that insert modifies the DataFrame in place rather than returning a new one.
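A short sketch of df.insert(), using a small made-up DataFrame (the column names here are hypothetical). It also shows what happens if you try to insert a column name that already exists, which is easy to hit when re-running a notebook cell.

```python
import pandas as pd

# Hypothetical 2-row DataFrame used only for this illustration.
df = pd.DataFrame({"name": ["a", "b"], "price": [10.0, 20.0]})

# loc is the zero-based position of the new column;
# loc=1 places "stars" right after "name".
df.insert(loc=1, column="stars", value=[4, 5])

# insert() changes df in place. Inserting a column name that already
# exists raises a ValueError unless allow_duplicates=True is passed.
try:
    df.insert(loc=0, column="stars", value=[1, 1])
    duplicate_error = None
except ValueError as exc:
    duplicate_error = exc

print(df)
print("Error:", duplicate_error)
```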


Using df.assign()

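Assign will also add new columns to your DataFrame, but this time you can add multiple columns at once. It returns a new DataFrame containing the added columns and leaves the original unchanged, so assign the result back to a variable if you want to keep it.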

df.assign(new_column=lambda x: x.another_column + 7)
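Here is a small sketch of adding two columns in one df.assign() call. The DataFrame and the new column names (HalfBill, DoubleBill) are made up for illustration. It also checks that the original DataFrame is left untouched.

```python
import pandas as pd

# Hypothetical DataFrame used only for this illustration.
df = pd.DataFrame({"AvgBill": [289.0, 80.5]})

# Each keyword argument becomes a new column in the returned DataFrame.
with_extras = df.assign(
    HalfBill=lambda x: x["AvgBill"] / 2,
    DoubleBill=lambda x: x["AvgBill"] * 2,
)

print(list(df.columns))           # the original still has one column
print(list(with_extras.columns))  # the copy has all three
```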

Using A Dictionary

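One of the most straightforward ways is to simply use a dictionary. In the notebook below, the dictionary's keys (not its values) become the values of the new column, in order, so the dictionary needs one key per row. Newer pandas versions may instead match the keys against the DataFrame's index, so check the result on your version.

people_dict = {'Bob': 'boy', 'Mike': 'boy',
               'Katie': 'girl', 'Stacey': 'girl'}
df['people'] = people_dict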
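If you want to be explicit about which part of the dictionary ends up in the column, one option is to convert it to a list first. This is a sketch under the assumption that your DataFrame has one row per dictionary entry; the DataFrame and column names are made up for illustration.

```python
import pandas as pd

# Hypothetical 4-row DataFrame used only for this illustration.
df = pd.DataFrame({"name": ["Bob", "Mike", "Katie", "Stacey"]})
people_dict = {"Bob": "boy", "Mike": "boy", "Katie": "girl", "Stacey": "girl"}

# The keys, in the order they were written in the dictionary.
df["people_keys"] = list(people_dict.keys())

# The values, in the same order.
df["people_values"] = list(people_dict.values())

print(df)
```

Because a plain list is assigned by position, this behaves the same way regardless of how a bare dictionary is handled.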

Using .loc[]

Not recommended. Try one of the above methods first.

You could also add a new column via the .loc[] indexer, which is generally used for looking up data by label.

df.loc[:,'new_column'] = new_column_series
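One thing to keep in mind when the right-hand side is a Series: the values are matched to rows by index label, not by position. The sketch below uses a made-up DataFrame and a Series whose index is deliberately out of order to show this.

```python
import pandas as pd

# Hypothetical 3-row DataFrame with the default 0..2 index.
df = pd.DataFrame({"name": ["a", "b", "c"]})

# Hypothetical Series with the same labels, listed out of order.
new_column_series = pd.Series([30, 10, 20], index=[2, 0, 1])

# Values land on the row with the matching index label.
df.loc[:, "new_column"] = new_column_series

print(df)
```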

Here’s a Jupyter notebook with a few examples:

In [28]:
import pandas as pd

Pandas Add New DataFrame Column

Let's run through 5 different ways to add a new column to a Pandas DataFrame

  1. By declaring a new column name with a scalar or list of values
  2. By using df.insert()
  3. Using df.assign()
  4. Using a dictionary
  5. Using .loc[]

First, let's create our DataFrame

In [29]:
df = pd.DataFrame([('Foreign Cinema', 'Restaurant', 289.0),
                   ('Liho Liho', 'Restaurant', 224.0),
                   ('500 Club', 'bar', 80.5),
                   ('The Square', 'bar', 25.30)],
           columns=('name', 'type', 'AvgBill'))

             name        type  AvgBill
0  Foreign Cinema  Restaurant    289.0
1       Liho Liho  Restaurant    224.0
2        500 Club         bar     80.5
3      The Square         bar     25.3

1. Declaring a new column name with a scalar or list of values

The easiest way to create a new column is to simply write one out! Then assign either a scalar (single value) or a list of items to it.

In [30]:
df['Day'] = "Monday"

             name        type  AvgBill     Day
0  Foreign Cinema  Restaurant    289.0  Monday
1       Liho Liho  Restaurant    224.0  Monday
2        500 Club         bar     80.5  Monday
3      The Square         bar     25.3  Monday

In [31]:
df['Day'] = ['Monday', 'Tuesday', 'Wednesday', 'Thursday']

             name        type  AvgBill        Day
0  Foreign Cinema  Restaurant    289.0     Monday
1       Liho Liho  Restaurant    224.0    Tuesday
2        500 Club         bar     80.5  Wednesday
3      The Square         bar     25.3   Thursday

2. Using df.insert()

.insert() will do what it sounds like: insert a new column into your DataFrame. The nice part is that you get to pick where your column appears.

In [32]:
df.insert(loc=1, column="Stars", value=[2,2,3,4])

             name  Stars        type  AvgBill        Day
0  Foreign Cinema      2  Restaurant    289.0     Monday
1       Liho Liho      2  Restaurant    224.0    Tuesday
2        500 Club      3         bar     80.5  Wednesday
3      The Square      4         bar     25.3   Thursday

3. Using df.assign()

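.assign() is a bit like .insert(), but you can pass multiple columns at once. It returns a new DataFrame rather than changing df, which is why AvgHalfBill does not show up in the later outputs.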

In [33]:
df.assign(AvgHalfBill=lambda x: x.AvgBill / 2)

             name  Stars        type  AvgBill        Day  AvgHalfBill
0  Foreign Cinema      2  Restaurant    289.0     Monday       144.50
1       Liho Liho      2  Restaurant    224.0    Tuesday       112.00
2        500 Club      3         bar     80.5  Wednesday        40.25
3      The Square      4         bar     25.3   Thursday        12.65

4. Passing a dictionary to your DataFrame

You can also pass a dictionary to your DataFrame. The keys of the dictionary become the values of the new column. Notice how the last entry, "Square", does not match what is in the 'name' column. That is OK here: in the pandas version used for this notebook, the dictionary's values are not used for matching, and pandas inserts the keys in the order they appear in the dictionary.

In [35]:
df['Month'] = {'Jan':'Foreign Cinema', 'Feb':'Liho Liho', 'Apr':'500 Club', 'Dec':'Square'}

             name  Stars        type  AvgBill        Day  Month
0  Foreign Cinema      2  Restaurant    289.0     Monday    Jan
1       Liho Liho      2  Restaurant    224.0    Tuesday    Feb
2        500 Club      3         bar     80.5  Wednesday    Apr
3      The Square      4         bar     25.3   Thursday    Dec
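
If you would rather have pandas match each row by name instead of relying on dictionary order, a different technique is Series.map() with a dictionary keyed by the existing column's values. This is a sketch of that alternative, not what the notebook cell above does; the lookup dictionary month_by_name is hypothetical. Rows with no matching key get a missing value, which makes a mismatch such as "Square" vs. "The Square" visible.

```python
import pandas as pd

df = pd.DataFrame(
    {"name": ["Foreign Cinema", "Liho Liho", "500 Club", "The Square"]}
)

# Hypothetical lookup keyed by venue name. "Square" deliberately
# does not match "The Square" in the name column.
month_by_name = {
    "Foreign Cinema": "Jan",
    "Liho Liho": "Feb",
    "500 Club": "Apr",
    "Square": "Dec",
}

# Each name is looked up in the dictionary; names with no key
# are left as missing values.
df["Month"] = df["name"].map(month_by_name)

print(df)
```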

5. Using .loc[]

Not recommended. There are other (and faster) ways to insert a new column, so try one of those first. But pick your poison!

In [36]:
df.loc[:, "Year"] = [2019, 2020, 1995, 1990]

             name  Stars        type  AvgBill        Day  Month  Year
0  Foreign Cinema      2  Restaurant    289.0     Monday    Jan  2019
1       Liho Liho      2  Restaurant    224.0    Tuesday    Feb  2020
2        500 Club      3         bar     80.5  Wednesday    Apr  1995
3      The Square      4         bar     25.3   Thursday    Dec  1990
