Pandas Write To CSV – pd.DataFrame.to_csv()

Ah, your work is finally done. You’ve made your models and gathered your data insights. All that is left is to save your work. But oh no! How do you do this? That is where Pandas To CSV comes into play.

Pandas To CSV will save your DataFrame to your computer as a comma-separated values (CSV) file. This means that you can access your data at a later time when you are ready to come back to it.

pandas.DataFrame.to_csv('your_file_name')

I save my data files when I'm at a good checkpoint to stop. This means that I've done my transformations and I'm ready to have a record of the new data, or I want to use the data elsewhere, like uploading it to Google Sheets.

Pseudo Code: Write your Pandas DataFrame to a Comma Separated Value file (CSV File)

Pandas .to_csv() Parameters

At a bare minimum you should provide the name of the file you want to create. After that I recommend setting index=False to clean up your data.

  • path_or_buf = The name of the new file that you want to create with your data. If you don’t specify a path, then Pandas will return a string to you.
  • index = By default, when your data is saved, Pandas will include your index. This can be very annoying because when you load up your data again, your old index will be there as a new column. I highly recommend setting index=False unless you have a specific reason not to.
  • sep = By default your file will be a ‘CSV’ which stands for comma separated values. This literally means that your data is separated by commas. However if you didn’t like commas, you could set the ‘sep’ to something else. This separator is usually referred to as the ‘delimiter.’
  • columns = Columns to write. If you only wanted to save a subset of your columns, you can specify that subset here.
  • header = Say you wanted to switch your column names, then you can specify what you want your columns to be called here. This should be a list of the same length as the number of columns you are writing.
  • Other Parameters = Other parameters are not often used and won't be mentioned here. If you're curious, head over to the official documentation (below) and check them out.
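The 'sep' parameter doesn't get its own example further down, so here is a minimal sketch of it. The small DataFrame and the tab separator are just my choices for illustration. I leave out the file name so that to_csv() hands back the text, which makes the separator easy to see.

```python
import io
import pandas as pd

# Small illustrative DataFrame (not the one used in the rest of this post)
df = pd.DataFrame({'name': ['Liho Liho', '500 Club'],
                   'AvgBill': [224.0, 80.5]})

# No path given, so to_csv() returns the text instead of writing a file.
# sep='\t' separates the values with tabs instead of commas.
tsv_text = df.to_csv(sep='\t', index=False)
print(tsv_text)

# When reading the data back, pass the same separator to read_csv()
df_back = pd.read_csv(io.StringIO(tsv_text), sep='\t')
print(df_back)
```

Whatever separator you write with, remember to pass the same one to read_csv(), otherwise each row will be read as one big column.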

Now the fun part, let's take a look at a code sample.

In [1]:
import pandas as pd

Pandas To CSV

Write your DataFrame directly to file using .to_csv(). This function starts simple, but it can get complicated quickly.

  1. Save your data to your python file's location
  2. Save your data to a different location
  3. Explore parameters while saving your file
  4. If you don't specify a file name, Pandas will return a string

First, let's create our DataFrame

In [7]:
df = pd.DataFrame([('Foreign Cinema', 50, 289.0),
                   ('Liho Liho', 45, 224.0),
                   ('500 Club', 102, 80.5),
                   ('The Square', 65, 25.30)],
                  columns=('name', 'num_customers', 'AvgBill'))
df
Out[7]:
             name  num_customers  AvgBill
0  Foreign Cinema             50    289.0
1       Liho Liho             45    224.0
2        500 Club            102     80.5
3      The Square             65     25.3

1. Save your data to your python file's location

To save your data as a CSV in your file's location, all you need to do is specify the new file name. I will also set index=False so my index does not get saved with my file.

In [8]:
df.to_csv('my_new_file.csv', index=False)

Then let's check to make sure that it saved. To do this I'll call pd.read_csv() to read it.

In [9]:
df_saved_file = pd.read_csv('my_new_file.csv')
df_saved_file
Out[9]:
             name  num_customers  AvgBill
0  Foreign Cinema             50    289.0
1       Liho Liho             45    224.0
2        500 Club            102     80.5
3      The Square             65     25.3

Awesome it works.
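To see why I keep setting index=False, here is a small sketch that compares the two cases. The DataFrame is a cut-down version made up for this example, and I write to strings rather than files so nothing is left on disk.

```python
import io
import pandas as pd

# Cut-down illustrative DataFrame
df = pd.DataFrame({'name': ['Liho Liho', '500 Club'],
                   'num_customers': [45, 102]})

# Default behavior: the index is written as an extra, unnamed first column
with_index = df.to_csv()
# index=False: only the real columns are written
without_index = df.to_csv(index=False)

# Read both versions back and compare the column names
cols_with = list(pd.read_csv(io.StringIO(with_index)).columns)
cols_without = list(pd.read_csv(io.StringIO(without_index)).columns)
print(cols_with)
print(cols_without)

# If a file was already saved with its index, index_col=0 turns that
# first column back into the index when loading
restored = pd.read_csv(io.StringIO(with_index), index_col=0)
print(list(restored.columns))
```

The saved index comes back as a column called 'Unnamed: 0', which is the "new column" annoyance mentioned in the parameters list.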

2. Save your data to a different location

If you wanted to save your file to a different location, all you need to do is specify the path of the location you want to save to.

Here I'm starting my path with '..' which means 'go one folder up.' Then I'm saying '/data/' which means 'enter the data folder.' This folder is already created. If it wasn't, then I would get an error.

Then finally I'm specifying my new file name 'my_new_file.csv'

In [10]:
df.to_csv('../data/my_new_file.csv', index=False)

Then let's check to make sure it is there again.

In [11]:
df_saved_file = pd.read_csv('../data/my_new_file.csv')
df_saved_file
Out[11]:
             name  num_customers  AvgBill
0  Foreign Cinema             50    289.0
1       Liho Liho             45    224.0
2        500 Club            102     80.5
3      The Square             65     25.3

Nice!
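Since to_csv() will not create a missing folder for you, one way around the error is to create the folder first. Here is a minimal sketch using pathlib from the standard library. The temporary folder is only a stand-in for a real project directory, and the 'data' folder name is carried over from the example above.

```python
import tempfile
from pathlib import Path

import pandas as pd

# Small illustrative DataFrame
df = pd.DataFrame({'name': ['Liho Liho'], 'AvgBill': [224.0]})

# A temporary folder stands in for your project directory (hypothetical layout)
base = Path(tempfile.mkdtemp())
target = base / 'data' / 'my_new_file.csv'

# Create the 'data' folder (and any missing parents) if it is not there yet.
# exist_ok=True means no error is raised when the folder already exists.
target.parent.mkdir(parents=True, exist_ok=True)

# to_csv() accepts a Path object as well as a plain string
df.to_csv(target, index=False)
print(target.exists())
```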

3. Explore parameters while saving your file

Here I want to explore some of the parameters of to_csv(). I'm going to do two extra things 1) Subset my columns via the 'columns' parameter and 2) rename my columns via the 'header' parameter.

In [14]:
df.to_csv('my_new_file.csv',
          index=False,
          columns=['name', 'AvgBill'],
          header=['new_name', 'NewBill'])
In [15]:
df_saved_file = pd.read_csv('my_new_file.csv')
df_saved_file
Out[15]:
         new_name  NewBill
0  Foreign Cinema    289.0
1       Liho Liho    224.0
2        500 Club     80.5
3      The Square     25.3

See above how only 2 columns were saved, and they were also renamed. This is because I specified the columns/headers parameters.

4. If you don't specify a file name, Pandas will return a string

Finally, let's see what happens when you don't specify a new file name. If you don't, Pandas will return a string. Watch out, this is dangerous if your dataset is large, because the whole file is built as one string in memory.

In [16]:
df.to_csv()
Out[16]:
',name,num_customers,AvgBill\n0,Foreign Cinema,50,289.0\n1,Liho Liho,45,224.0\n2,500 Club,102,80.5\n3,The Square,65,25.3\n'
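The parameter is called path_or_buf because it also accepts a buffer, meaning an open file-like object. As a rough sketch, here is the same data going to an in-memory io.StringIO buffer. The DataFrame and variable names are made up for this example.

```python
import io
import pandas as pd

# Small illustrative DataFrame
df = pd.DataFrame({'name': ['Liho Liho', '500 Club'],
                   'AvgBill': [224.0, 80.5]})

# No path or buffer: to_csv() returns the CSV text as a string
csv_text = df.to_csv(index=False)

# With a buffer: the text is written into the buffer and None is returned
buffer = io.StringIO()
result = df.to_csv(buffer, index=False)

print(result is None)
print(buffer.getvalue().splitlines() == csv_text.splitlines())
```

This is handy when another library wants a file-like object and you would rather not write a temporary file to disk.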

Official Documentation