Pandas Number Of Rows – 6 Methods

Oftentimes you’ll need to know how many rows are in your dataset. Simply put, it is the most foundational metric you can know about your data. Plus, you may want to estimate how long your .apply() function is going to run. We’ll show you 6 pandas ways to count the number of rows.

Pandas number of rows will tell you…drumroll…how many rows you have in your dataset. This is important to know before applying an expensive (long running) function to your dataset. It is crucial to understand while getting to know your data.
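One rough way to use the row count before an expensive .apply() is to time the function on a small sample and scale the result by len(df). The sketch below is only an estimate under simple assumptions (every row costs about the same); slow_row_function, the sample size, and the toy DataFrame are made up for illustration.

```python
import time

import numpy as np
import pandas as pd

# Toy data, invented for this sketch
df = pd.DataFrame(np.random.randint(0, 1000, (10_000, 5)))

def slow_row_function(row):
    # Hypothetical stand-in for an expensive per-row computation
    return sum(value ** 0.5 for value in row)

sample_size = 200  # assumed sample size; pick whatever runs quickly
sample = df.head(sample_size)

start = time.perf_counter()
sample.apply(slow_row_function, axis=1)
elapsed = time.perf_counter() - start

# Scale the sample timing up to the full row count
estimated_seconds = elapsed / len(sample) * len(df)
print(f"{len(df)} rows, estimated runtime: {estimated_seconds:.2f} seconds")
```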

Let’s go over 6 methods, in order of our favorites:

  1. DataFrame Length – len(df)
  2. DataFrame Info – df.info()
  3. DataFrame __len__ – df.__len__()
  4. DataFrame Shape – df.shape
  5. DataFrame Count – df.count()
  6. DataFrame Axes Length – len(df.axes[0])

Pseudo code: Return the number of rows in a pandas DataFrame or Series

Pandas Number Of Rows - Count the number of rows in your DataFrame or Series with 6 different methods: Length, Info, __len__, shape, count, and axis length

6 Methods To Find Row Count

Below are 6 methods to find out how tall your dataset is. We’ve listed them in order from our favorite to our least favorite.

DataFrame Length

len(df)

First up is DataFrame Length. This super easy and fast function returns the length of your DataFrame, which is the number of rows in your dataset. This is my #1 go-to function for finding row count. len() comes from vanilla Python.

DataFrame Info

df.info()

Next is DataFrame Info. Though it is a bit slower, you’ll get more information for free. df.info() will print column names, row count, and how many non-NA values you have in each column. It is useful when trying to get to know your data. I use this when I want to know row count and the characteristics of my columns.

DataFrame __len__

df.__len__()

Fun fact: methods whose names start and end with double underscores go by the nickname “dunder.” df.__len__() is a pass-through method that simply calls len(df.index). It is quick and easy. I don’t use it that often because 1) I have to type out extra characters and 2) the double underscores don’t look clean. But it’s fast!

DataFrame Shape

df.shape[0] - To count rows
df.shape[1] - To count columns

With DataFrame shape you’ll get the shape of your DataFrame. Yes, I know that sentence is circular. Think of shape as the height and width of your table. You’ll get back a tuple with two values: height (rows) and width (columns). Shape works well, but to get the row count you need to reference the first item of the tuple via “[0].”

DataFrame Count

df.count()

DataFrame Count will return the number of non-NA values within each column. I don’t love this one because 1) it’s slower and 2) you need to do extra data work after you call .count(). Be careful: if you have NAs in your dataset, you may get confusing results, because .count() skips them.

DataFrame Axes Length

len(df.axes[0])

Last up is our most verbose option – DataFrame Axes Length. Let’s break this one down. df.axes will return a list of your two axes, rows and columns. [0] will pull the first item (rows) from that list. Then finally len() will find the length, or how many items you have in your axis, which is your row count.

Let’s look at some examples


In [1]:
import pandas as pd
import numpy as np

Pandas Number Of Rows

Counting the number of rows in your dataset is the most basic metric you'll need when getting to know your data. Let's run through examples to find the count of rows in your data.

We will run through 6 methods:

  1. len(df)
  2. df.info()
  3. __len__ - (same as len(df.index))
  4. df.shape[0]
  5. df.count()
  6. len(df.axes[0])

But first, let's create our DataFrame

In [2]:
num_rows = 234
num_columns=5

df = pd.DataFrame(data=np.random.randint(0, 1000, (num_rows, num_columns)),
                  columns=["Column#{}".format(x) for x in range(num_columns)])
df.head()
Out[2]:
   Column#0  Column#1  Column#2  Column#3  Column#4
0       536       597       176       624       464
1       328       958       427       907       319
2       558       765       951       347       975
3       149       111       729       773       334
4       158       374       470       998       117

Method 1: len(df)

The simplest way to find the length of your DataFrame is using vanilla Python. This means calling Python’s len() on your DataFrame.

In [3]:
len(df)
Out[3]:
234
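The same call works on a Series, since a single column pulled from a DataFrame is a Series. A minimal sketch, using a small DataFrame invented for illustration:

```python
import pandas as pd

# Tiny example data, made up for this sketch
df = pd.DataFrame({"a": [10, 20, 30, 40], "b": [1, 2, 3, 4]})
s = df["a"]  # selecting one column gives a Series

print(len(df))  # rows in the DataFrame
print(len(s))   # elements in the Series
```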

Method 2: df.info

Next is df.info(). This one is nice because you get extra information for free. Unfortunately, its output is meant for a human to read, not for a program to consume. If you're going to use the number of rows in your DataFrame somewhere else in your program, use len().

Notice how '234' rows is shown after 'RangeIndex: '

In [4]:
df.info()
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 234 entries, 0 to 233
Data columns (total 5 columns):
 #   Column    Non-Null Count  Dtype
---  ------    --------------  -----
 0   Column#0  234 non-null    int64
 1   Column#1  234 non-null    int64
 2   Column#2  234 non-null    int64
 3   Column#3  234 non-null    int64
 4   Column#4  234 non-null    int64
dtypes: int64(5)
memory usage: 9.3 KB
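To see why df.info() is awkward to use inside a program, here is a small sketch that redirects the summary into a text buffer through the buf parameter. The call itself returns None, so the row count would have to be parsed out of the text. The DataFrame and variable names below are invented for illustration.

```python
import io

import numpy as np
import pandas as pd

df = pd.DataFrame(np.random.randint(0, 1000, (234, 5)))

buffer = io.StringIO()
result = df.info(buf=buffer)   # writes the summary into the buffer
summary = buffer.getvalue()

print(result)                    # df.info() itself returns None
print("234 entries" in summary)  # the row count only exists as text
print(len(df))                   # len() gives you an integer directly
```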

Method 3: df.__len__()

df.__len__ will call len(df.index). It's quick and easy, but takes a few more characters to type which is why we don't like it.

In [5]:
df.__len__()
Out[5]:
234
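As a quick sanity check, the built-in len(), the dunder method, and the length of the index should all agree. A minimal sketch on a toy DataFrame made up for this example:

```python
import pandas as pd

df = pd.DataFrame({"a": [1, 2, 3], "b": [4, 5, 6]})

# len() is the built-in; Python dispatches it to the object's __len__ method
via_builtin = len(df)
via_dunder = df.__len__()
via_index = len(df.index)

print(via_builtin, via_dunder, via_index)
```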

Method 4: df.shape[0]

Next is df.shape which will return a tuple with the 1) row count and 2) column count of your data. Make sure to pull the row count via '[0]' on your shape.

In [6]:
print(df.shape)
print()
print(df.shape[0])
(234, 5)

234
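If you want both numbers, tuple unpacking can read a little cleaner than indexing with [0] and [1]. A small sketch, assuming a DataFrame of the same size as ours; note that a Series has a one-element shape, so the two-name unpacking only applies to DataFrames.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(np.zeros((234, 5)))

n_rows, n_cols = df.shape  # unpack instead of indexing with [0] and [1]
print(n_rows)
print(n_cols)

s = df[0]        # a single column as a Series
print(s.shape)   # a Series has a one-element shape tuple
```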

Method 5: df.count()

Next is df.count() which will count the number of non-na values within each of your columns. You'll need to interpret the data that is returned. Be careful, your data may contain NAs and output misleading results.

In [7]:
df.count()
Out[7]:
Column#0    234
Column#1    234
Column#2    234
Column#3    234
Column#4    234
dtype: int64
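Our DataFrame has no missing values, so every column reports 234. Here is a small sketch of what happens when NAs are present, using a toy DataFrame invented for illustration: the per-column counts drop below the true row count.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "a": [1.0, np.nan, 3.0, 4.0],
    "b": [np.nan, np.nan, 7.0, 8.0],
})

counts = df.count()
print(counts["a"])   # non-NA values in column "a"
print(counts["b"])   # non-NA values in column "b"
print(len(df))       # total rows, NAs included
```

Here column "a" has 3 non-NA values and column "b" has 2, while the DataFrame has 4 rows.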

Method 6: len(df.axes[0])

Last up is len(df.axes[0]). This longer expression pulls out your row axis and then counts its length. Let's look through it step by step.

  1. Return both axes (rows/columns)
  2. Pull out the rows
  3. Count the length
In [8]:
df.axes
Out[8]:
[RangeIndex(start=0, stop=234, step=1),
 Index(['Column#0', 'Column#1', 'Column#2', 'Column#3', 'Column#4'], dtype='object')]
In [9]:
df.axes[0]
Out[9]:
RangeIndex(start=0, stop=234, step=1)
In [10]:
len(df.axes[0])
Out[10]:
234
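If you're curious how the six options compare on speed, you could time them yourself. The sketch below uses Python's timeit on a DataFrame the same size as ours; the number of repetitions is an arbitrary choice, df.info() is pointed at a throwaway buffer so it doesn't flood the screen, and results will vary by machine and pandas version.

```python
import io
import timeit

import numpy as np
import pandas as pd

df = pd.DataFrame(np.random.randint(0, 1000, (234, 5)))

# Each candidate is wrapped in a lambda so timeit can call it repeatedly
candidates = {
    "len(df)": lambda: len(df),
    "df.info()": lambda: df.info(buf=io.StringIO()),
    "df.__len__()": lambda: df.__len__(),
    "df.shape[0]": lambda: df.shape[0],
    "df.count()": lambda: df.count(),
    "len(df.axes[0])": lambda: len(df.axes[0]),
}

timings = {}
for label, func in candidates.items():
    timings[label] = timeit.timeit(func, number=200)

# Print fastest first
for label, seconds in sorted(timings.items(), key=lambda item: item[1]):
    print(f"{label:<16} {seconds:.6f} s for 200 calls")
```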
