Use to_string() to Stop Python from Hiding the Body of the Printed DataFrames | by Yufeng | Apr, 2023


3-Minutes Pandas

What should we do to see the whole printed DataFrame after running a Python script?

Photograph by Pascal Müller on Unsplash

Sometimes, running a Python script without any reported errors is not the only task of the debugging process. We also need to make sure the functions are executed as expected. A typical step in exploratory data analysis is to check what the data looks like before and after a specific data processing step.

So, we need to print out some data frames or key variables during the execution of the script in order to check whether they are “correct”. However, a plain print command often shows only the top and bottom rows of the data frame (as shown in the example below), which makes the checking process unnecessarily hard.

Usually, the data frames are pandas.DataFrame objects, and if you use the print command directly, you may get something like this,

import pandas as pd
import numpy as np

data = np.random.randn(5000, 5)
df = pd.DataFrame(data, columns=['A', 'B', 'C', 'D', 'E'])

print(df.head(100))

print the top 100 rows (image by author)

You may have already noticed that the middle part of the data frame is hidden by three dots. What if we really need to check what the top 100 rows are? For example, we may want to check the result of a specific step in the middle of a large Python script to make sure the functions are executed as expected.

set_option()

One of the most straightforward solutions is to change the default number of rows that Pandas displays,

pd.set_option('display.max_rows', 500)
print(df.head(100))
print the top 100 rows after setting the default number of rows that Pandas displays (image by author)

where set_option is a function that lets you control the behavior of Pandas, including the maximum number of rows or columns to display, as we did above. The first argument, display.max_rows, controls the maximum number of rows to display, and 500 is the value we set as that maximum.
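As a minimal sketch of the effect of display.max_rows, the snippet below renders the same 100-row frame under two settings and compares the results. The names demo_df, truncated_text and full_text are made up for this illustration, and pd.get_option is used only to read the current value.

```python
import numpy as np
import pandas as pd

# Illustrative frame; the names here are assumptions, not from the article
rng = np.random.default_rng(0)
demo_df = pd.DataFrame(rng.standard_normal((100, 5)), columns=list("ABCDE"))

# Read the current limit (60 in a stock pandas install)
print(pd.get_option("display.max_rows"))

# With a limit below the row count, the middle rows are replaced by dots
pd.set_option("display.max_rows", 10)
truncated_text = str(demo_df)

# With a limit above the row count, every row is rendered
pd.set_option("display.max_rows", 500)
full_text = str(demo_df)

# Restore the default so later prints are not affected
pd.reset_option("display.max_rows")

print(len(truncated_text.splitlines()), len(full_text.splitlines()))
```

str(df) produces the same text that print(df) writes, so the two line counts show how much of the frame each setting hides.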

Although this method is widely used, it is not ideal inside an executable Python file, especially when you have several data frames to print and each one should display a different number of rows.

For example, I have a script structured as shown,

## Code Block 1 ##
...
print(df1.head(20))
...

## Code Block 2 ##
...
print(df2.head(100))
...

## Code Block N ##
...
print(df_n)
...

We have different numbers of top rows to show throughout the script: sometimes we want to see the whole printed data frame, and sometimes we only care about the dimensions and structure of the data frame without needing to see all the data.

In such a case, we probably need to call pd.set_option() to set the desired display, or pd.reset_option() to restore the default options, every time before we print a data frame, which makes the script messy and tedious.

## Code Block 1 ##
...
pd.set_option('display.max_rows', 20)
print(df1.head(20))
...

## Code Block 2 ##
...
pd.set_option('display.max_rows', 100)
print(df2.head(100))
...

## Code Block N ##
...
pd.reset_option('display.max_rows')
print(df_n)
...
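Part of what makes this pattern messy is that the option is global: a value set in one block stays in force for every later print until it is changed or reset. A small sketch of that behavior follows, with variable names invented for the illustration.

```python
import pandas as pd

# Value in force before the script touches anything
default_rows = pd.get_option("display.max_rows")

# "Code Block 1" changes the limit
pd.set_option("display.max_rows", 20)
rows_in_block_1 = pd.get_option("display.max_rows")

# "Code Block 2" sets nothing, yet it still sees the value from block 1
rows_in_block_2 = pd.get_option("display.max_rows")

# Only an explicit reset brings back the pandas default
pd.reset_option("display.max_rows")
rows_after_reset = pd.get_option("display.max_rows")

print(default_rows, rows_in_block_1, rows_in_block_2, rows_after_reset)
```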

There is actually a more flexible and effective way of displaying the whole data frame without specifying the display options for Pandas.

to_string()

to_string() directly converts the pd.DataFrame object to a string object, and when we print it out, it ignores the display limit from pandas.

pd.set_option('display.max_rows', 10)
print(df.head(100).to_string())
print the top 100 rows using to_string() (image by author)

We can see above that even though I set the maximum number of rows to display to 10, to_string() helps us print the whole data frame of 100 rows.

The function to_string() converts the whole data frame to a string, so it keeps all the values and indexes of the data frame in the printing step. Since set_option() only affects how pandas objects are displayed, the string we print is not limited by the maximum number of rows set earlier.
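To make the difference concrete, here is a short sketch that compares the default rendering with to_string() under the same restrictive setting. The names sample_df, repr_text and string_text are chosen for this illustration only.

```python
import numpy as np
import pandas as pd

# Same shape as the article's example; names are illustrative
rng = np.random.default_rng(0)
sample_df = pd.DataFrame(
    rng.standard_normal((5000, 5)), columns=["A", "B", "C", "D", "E"]
)

pd.set_option("display.max_rows", 10)
top_100 = sample_df.head(100)

# What print(top_100) would show: truncated by display.max_rows
repr_text = str(top_100)

# What print(top_100.to_string()) would show: a plain str with every row
string_text = top_100.to_string()

pd.reset_option("display.max_rows")

print(type(string_text).__name__)     # str
print(len(string_text.splitlines()))  # one header line plus 100 data rows
```

If a cap is still wanted for a single print, to_string() also accepts its own max_rows argument, which applies to that call only.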

So, the strategy is that you don’t need to set anything through set_option(); you only need to use to_string() to see the whole data frame. It will save you from thinking about which option to set in which part of the script.
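One possible way to apply this strategy across a multi-block script is a tiny wrapper. The helper full_text below is hypothetical (it is not part of pandas or of the original script), and df1 and df2 are stand-in frames.

```python
import numpy as np
import pandas as pd


def full_text(df, n=None):
    """Hypothetical helper: return the first n rows (all rows if n is None)
    as untruncated text, regardless of the pandas display options."""
    subset = df if n is None else df.head(n)
    return subset.to_string()


# Stand-in frames for the script's df1 and df2
df1 = pd.DataFrame(np.arange(200).reshape(50, 4), columns=list("ABCD"))
df2 = pd.DataFrame(np.arange(600).reshape(150, 4), columns=list("ABCD"))

## Code Block 1 ##
block_1_text = full_text(df1, 20)
print(block_1_text)

## Code Block 2 ##
block_2_text = full_text(df2, 100)
print(block_2_text)
```

Each call states how many rows it wants at the point of printing, so no display option needs to be set or reset between blocks.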

Takeaways

  1. Use set_option('display.max_rows') when you have a consistent number of rows to display across the whole script.
  2. Use to_string() if you want to print out the whole Pandas data frame no matter what Pandas options have been set.

Thanks for reading! Hope you enjoy using this Pandas trick in your work!

