Pandas Melt – pd.melt()

Pandas Melt is not only one of my favorite function names (makes me think of face melting in India Jones – gross clip), but it’s also a crucial data analysis tool.

Pandas pd.melt() will simply turn a wide table, tall. This will ‘unpivot’ your data so column(s) get enumerated into rows.

1. pd.melt(Your_DateFrame)
2. df.melt()

I use this function I want to turn pretty data thats easy to read (many columns) into usable data (many rows) that is easier to analyze.

Pandas Melt - How to unpivot your datatable from wide to tall

In the above scenario, having two columns for dates (8/6 & 8/7) looks good, but it’s harder to do analysis on. I want turn those columns into row permutations.

Pseudo code: Take a column or columns, and transpose them into rows. “Unpivot” your data.

Pandas Melt

Pandas Melt is a function you’ll use when deciding the architecture of your of your data sets. This will ultimately lead to how you think about your analysis and questions you want to answer.

Melt Parameters

  • id_vars: The column or columns you’d like to “unpivot” around. This will be the column that will be expanded out.
  • value_vars (Default: All non-id_vars columns): This is where you set which columns you’d like to unpivot. By default, everything other than id_vars will be unpivoted. You also have the option to specify a subset of columns.
  • var_name: The name of your unpivoted column. This needs to be a single value (scalar).
  • value_name: The name of your new values column.
  • col_level: If you were using a MultiIndex, you specify which level you wanted to melt. This parameter won’t get used much and is optional
  • ignore_index: If you wanted to use your old index, no problem, set ignore_index=False. I’d recommend leaving this default (True) and generating a new index.

Here’s a Jupyter notebook showing how to Melt in Pandas

In [1]:
import pandas as pd
import numpy as np

Pandas Melt

Pandas Melt will turn our dataframe from wide (many columns) to tall (many rows)

Let's run through 2 examples:

  1. Standard Pandas Melt
  2. Standard Pandas Melt with custom column names

First, let's create a wide DataFrame. I'll use different restaurant purchases per day. It's common to see data presented with data per day in columns. However, it's tough to do analysis on.

In [2]:
df = pd.DataFrame.from_dict({"Name": ['Liho Liho', 'Tompkins', 'The Square', 'Chambers'],
                             "8/4/2020": np.random.randint(10,200, size=(1,4))[0],
                             "8/5/2020": np.random.randint(12,200, size=(1,4))[0],
                             "8/6/2020": np.random.randint(12,200, size=(1,4))[0],
                             "8/7/2020": np.random.randint(12,200, size=(1,4))[0]}, orient='columns')
df
Out[2]:
Name8/4/20208/5/20208/6/20208/7/2020
0Liho Liho151131613
1Tompkins15510457101
2The Square62137105184
3Chambers80444962

1. Standard Pandas Melt

In this example, I want to take all of the date columns to the right of "Name" and turn them into rows.

This means that I'll have 4 Liho Liho rows, one for each date that is a column.

I'll specify id_vars="Name" tell pandas thats what I want to 'unpivot' around. Another way - By specifying "Name" in id_vars, I'm telling pandas to take all of the other columns (besides "Name"), and combine them to one column.

Notice how we now have 16 rows -- 4 rows (because there were 4 columns) for every 1 row we had before

In [3]:
df.melt(id_vars='Name')
Out[3]:
Namevariablevalue
0Liho Liho8/4/2020151
1Tompkins8/4/2020155
2The Square8/4/202062
3Chambers8/4/202080
4Liho Liho8/5/202013
5Tompkins8/5/2020104
6The Square8/5/2020137
7Chambers8/5/202044
8Liho Liho8/6/202016
9Tompkins8/6/202057
10The Square8/6/2020105
11Chambers8/6/202049
12Liho Liho8/7/202013
13Tompkins8/7/2020101
14The Square8/7/2020184
15Chambers8/7/202062

If you ever only want to do a subset of your columns, then you'll need to specify them directly through value_vars. Notice how the other columns I did not specify were dropped. Make sure this is what you intend to do.

In [4]:
df.melt(id_vars='Name', value_vars=['8/4/2020', '8/5/2020'])
Out[4]:
Namevariablevalue
0Liho Liho8/4/2020151
1Tompkins8/4/2020155
2The Square8/4/202062
3Chambers8/4/202080
4Liho Liho8/5/202013
5Tompkins8/5/2020104
6The Square8/5/2020137
7Chambers8/5/202044

Warning: A confusing example and I'm not sure when you would use this.

If you didn't want your data dropped like above, then you could specify the columns you wanted to keep in your id_vars. This will unpivot everything else.

In [5]:
df.melt(id_vars=['Name','8/4/2020', '8/5/2020'])
Out[5]:
Name8/4/20208/5/2020variablevalue
0Liho Liho151138/6/202016
1Tompkins1551048/6/202057
2The Square621378/6/2020105
3Chambers80448/6/202049
4Liho Liho151138/7/202013
5Tompkins1551048/7/2020101
6The Square621378/7/2020184
7Chambers80448/7/202062

2. Standard Pandas Melt with custom column names

Very similar to above, but in this case, we will use custom names for our new columns.

You can specify your new long column (what's been unpivoted) through var_name, and your value column through value_name

In [6]:
df.melt(id_vars='Name', var_name='Date', value_name='Transaction_Amount')
Out[6]:
NameDateTransaction_Amount
0Liho Liho8/4/2020151
1Tompkins8/4/2020155
2The Square8/4/202062
3Chambers8/4/202080
4Liho Liho8/5/202013
5Tompkins8/5/2020104
6The Square8/5/2020137
7Chambers8/5/202044
8Liho Liho8/6/202016
9Tompkins8/6/202057
10The Square8/6/2020105
11Chambers8/6/202049
12Liho Liho8/7/202013
13Tompkins8/7/2020101
14The Square8/7/2020184
15Chambers8/7/202062

3. Bonus: You can also call melt directly from pandas

Instead of your DataFrame. However, these are identical.

In [9]:
pd.melt(df, id_vars='Name')
Out[9]:
Namevariablevalue
0Liho Liho8/4/2020151
1Tompkins8/4/2020155
2The Square8/4/202062
3Chambers8/4/202080
4Liho Liho8/5/202013
5Tompkins8/5/2020104
6The Square8/5/2020137
7Chambers8/5/202044
8Liho Liho8/6/202016
9Tompkins8/6/202057
10The Square8/6/2020105
11Chambers8/6/202049
12Liho Liho8/7/202013
13Tompkins8/7/2020101
14The Square8/7/2020184
15Chambers8/7/202062

Link to code above

Check out more Pandas functions on our Pandas Page

Official Documentation