Visualization of Data with Pie Charts in Matplotlib | by Diana Rozenshteyn | Oct, 2024


Examples of how to create different types of pie charts using Matplotlib to visualize the results of database analysis in a Jupyter Notebook with Pandas

Picture by Niko Nieminen on Unsplash

While working on my Master’s Thesis titled “Factors Associated with Impactful Scientific Publications in NIH-Funded Heart Disease Research”, I used different types of pie charts to illustrate some of the key findings from the database analysis.

A pie chart can be an effective choice for data visualization when a dataset contains a limited number of categories representing parts of a whole. This makes it well suited to displaying categorical data with an emphasis on comparing the relative proportions of each category.

In this article, I will demonstrate how to create four different types of pie charts using the same dataset to provide a more comprehensive visual representation and deeper insight into the data. To achieve this, I will use Matplotlib, Python’s plotting library, to display pie chart visualizations of the statistical data stored in the dataframe. If you are not familiar with the Matplotlib library, a good start is the Python Data Science Handbook by Jake VanderPlas, specifically the chapter on Visualization with Matplotlib, and matplotlib.org.

First, let’s import all the required libraries and extensions:
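As a minimal sketch (an assumption, not necessarily the exact cell from the notebook), the charts discussed in this article need only pandas and Matplotlib's pyplot interface:

```python
# Assumed minimal imports; the original notebook may load more.
import pandas as pd               # tabular data handling (DataFrame)
import matplotlib.pyplot as plt   # plotting interface used for the pie charts

# In a Jupyter Notebook, the IPython magic `%matplotlib inline` is often run
# in the same cell so that figures render below the code. It is left as a
# comment here because it is not valid syntax in a plain .py script.
```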

Next, we’ll prepare the CSV file for processing:
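The loading step can be sketched as follows. The journal names and counts are invented placeholders rather than the thesis data, the "Journal" column name is an assumption, and the inline CSV text stands in for the real file so that the example runs on its own:

```python
import io
import pandas as pd

# Placeholder CSV text standing in for the article's real file;
# journal names and counts are invented for illustration only.
csv_text = """Journal,Female,Male,Unknown,Total
Journal A,120,300,80,500
Journal B,90,210,50,350
Journal C,60,150,40,250
"""

# With a real file, pass its path to pd.read_csv() instead of a StringIO object.
df = pd.read_csv(io.StringIO(csv_text))

# Sanity check: the three gender columns should add up to "Total" in every row.
gender_cols = ["Female", "Male", "Unknown"]
totals_match = (df[gender_cols].sum(axis=1) == df["Total"]).all()

print(df.head())
print("Totals consistent:", totals_match)
```

Checking that the gender columns sum to the total before plotting is a cheap way to catch a malformed row early.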

The mini dataset used in this article highlights the top 10 journals for heart disease research publications from 2002 to 2020 and is part of a larger database collected for the Master’s Thesis research. The columns “Female,” “Male,” and “Unknown” represent the gender of the first author of the published articles, while the “Total” column reflects the total number of heart disease research articles published in each journal.

Image by the author, showing the output of the Pie_Chart_Artcile_2.py sample code above.

For smaller datasets with fewer categories, a pie chart with exploding slices can effectively highlight a key category by pulling it out slightly from the rest of the chart. Each slice represents a portion of the total, with its size proportional to the data it represents. Labels can be added to each slice to indicate the category, along with percentages to show its share of the total. This visual technique makes the exploded slice stand out without losing the context of the full data representation.
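A minimal sketch of an exploded pie chart for a single journal is shown below. The counts and the journal title are placeholders, and pulling out the "Female" slice is an illustrative choice:

```python
import matplotlib.pyplot as plt

# Placeholder counts for a single journal (not values from the thesis data).
labels = ["Female", "Male", "Unknown"]
counts = [120, 300, 80]

# Offset of each slice from the center, as a fraction of the radius.
# Only the "Female" slice is pulled out here.
explode = [0.1 if label == "Female" else 0.0 for label in labels]

fig, ax = plt.subplots(figsize=(5, 5))
ax.pie(
    counts,
    labels=labels,          # category name next to each slice
    explode=explode,        # which slices to pull out, and by how much
    autopct="%1.1f%%",      # percentage label with one decimal place
    startangle=90,          # start the first slice at the top
)
ax.set_title("Journal A (placeholder)")
```

In a notebook the figure renders at the end of the cell; in a script, a call to plt.show() or fig.savefig() is needed to see it.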

Image by the author, showing the output of the Pie_Chart_Artcile_3.py sample code above.

The same exploding slices technique can be applied to all other entries in the sample dataset, and the resulting charts can be displayed within a single figure. This type of visualization helps to highlight the overrepresentation or underrepresentation of a particular category within the dataset. In the example provided, presenting all 10 charts in one figure shows that none of the top 10 journals in heart disease research published more articles authored by women than by men, thereby emphasizing the gender disparity.
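One way to lay out one pie chart per journal in a single figure is a grid of subplots. The sketch below uses four placeholder journals in a 2 x 2 grid; with the 10 journals from the article, a 2 x 5 grid (ncols = 5) would be a natural choice. The data and the "Journal" column name are assumptions:

```python
import math
import matplotlib.pyplot as plt
import pandas as pd

# Placeholder data; the real dataset has 10 journals.
df = pd.DataFrame({
    "Journal": ["Journal A", "Journal B", "Journal C", "Journal D"],
    "Female": [120, 90, 60, 45],
    "Male": [300, 210, 150, 100],
    "Unknown": [80, 50, 40, 30],
})
gender_cols = ["Female", "Male", "Unknown"]

# Grid size derived from the number of journals.
ncols = 2
nrows = math.ceil(len(df) / ncols)
fig, axes = plt.subplots(
    nrows, ncols, figsize=(4 * ncols, 4 * nrows), squeeze=False
)
axes = axes.ravel()  # flatten the 2D grid so it can be iterated in order

# One exploded pie per journal, with the "Female" slice pulled out.
for ax, journal, counts in zip(axes, df["Journal"], df[gender_cols].to_numpy()):
    ax.pie(
        counts,
        labels=gender_cols,
        explode=[0.1, 0.0, 0.0],
        autopct="%1.1f%%",
        startangle=90,
    )
    ax.set_title(journal)

# Hide any grid cells left over when the journals do not fill the grid.
for ax in axes[len(df):]:
    ax.set_visible(False)

fig.tight_layout()
```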

Gender distributions for the top 10 journals for heart disease research publications, 2002–2020. Image by the author, showing the output of the Pie_Chart_Artcile_4.py sample code above.

A variation of the pie chart, known as a donut chart, can also be used to visualize data. Donut charts, like pie charts, display the proportions of categories that make up a whole, but the center of the donut chart can be used to present additional data. This format is less cluttered visually and can make it easier to compare the relative sizes of slices than a standard pie chart. In the example used in this article, the donut chart highlights that among the top 10 journals for heart disease research publications, the American Journal of Physiology, Heart and Circulatory Physiology published the most articles, accounting for 21.8%.
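A donut chart can be sketched by giving the wedges a width smaller than the radius, which leaves the center free for extra text. The totals below are placeholders, and showing the overall total in the center is just one possible use of that space:

```python
import matplotlib.pyplot as plt
import pandas as pd

# Placeholder totals per journal (not values from the thesis data).
df = pd.DataFrame({
    "Journal": ["Journal A", "Journal B", "Journal C"],
    "Total": [500, 350, 250],
})

fig, ax = plt.subplots(figsize=(6, 6))
ax.pie(
    df["Total"],
    labels=df["Journal"].tolist(),
    autopct="%1.1f%%",
    pctdistance=0.8,                                # place percentages inside the ring
    startangle=90,
    wedgeprops=dict(width=0.4, edgecolor="white"),  # width < radius leaves a hole
)

# Use the empty center for additional information.
ax.text(0, 0, f"Total\n{df['Total'].sum()}", ha="center", va="center", fontsize=14)
ax.set_title("Share of articles per journal (placeholder data)")
```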

Image by the author, showing the output of the Pie_Chart_Artcile_5.py sample code above.

We can enhance the visualization of additional information from the sample dataset by building on the previous donut chart and creating a nested version. The add_artist() method from Matplotlib’s figure module is used to incorporate any additional Artist (such as figures or objects) into the base figure. Similar to the earlier donut chart, this variation displays the distribution of publications across the top 10 journals for heart disease research. However, it also includes an additional layer that shows the gender distribution of first authors for each journal. This visualization highlights that a larger proportion of the first authors are male.
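A nested donut can be sketched by drawing two pies with different radii on the same axes and then adding a white circle with add_artist() to form the hole. This is one possible construction under stated assumptions, not necessarily the one used in the notebook: here add_artist() is called on the Axes object, and the data, colors, and "Journal" column name are placeholders.

```python
import matplotlib.pyplot as plt
import pandas as pd
from matplotlib.patches import Circle, Patch

# Placeholder data; gender counts add up to "Total" in every row.
df = pd.DataFrame({
    "Journal": ["Journal A", "Journal B", "Journal C"],
    "Female": [120, 90, 60],
    "Male": [300, 210, 150],
    "Unknown": [80, 50, 40],
    "Total": [500, 350, 250],
})
gender_cols = ["Female", "Male", "Unknown"]
gender_colors = ["tab:red", "tab:blue", "tab:gray"]  # illustrative colors

fig, ax = plt.subplots(figsize=(7, 7))

# Outer ring: share of articles per journal.
ax.pie(
    df["Total"],
    labels=df["Journal"].tolist(),
    radius=1.0,
    autopct="%1.1f%%",
    pctdistance=0.85,
    startangle=90,
    wedgeprops=dict(edgecolor="white"),
)

# Inner ring: gender counts flattened row by row, so each journal's three
# slices sit directly beneath that journal's outer slice.
inner_counts = df[gender_cols].to_numpy().ravel()
ax.pie(
    inner_counts,
    radius=0.7,
    colors=gender_colors * len(df),
    startangle=90,
    wedgeprops=dict(edgecolor="white"),
)

# White circle added as an extra Artist to create the donut hole.
hole = Circle((0, 0), 0.4, facecolor="white")
ax.add_artist(hole)

# Legend for the inner ring, built from proxy patches.
handles = [Patch(facecolor=c, label=g) for g, c in zip(gender_cols, gender_colors)]
ax.legend(handles=handles, title="First author",
          loc="upper left", bbox_to_anchor=(1.0, 1.0))
ax.set_aspect("equal")
```

The inner slices only line up with the outer ones when each row's gender counts sum to its total and both pies share the same startangle.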

Image by the author, showing the output of the Pie_Chart_Artcile_6.py sample code above.

In conclusion, pie charts are effective for visualizing data with a limited number of categories, as they enable viewers to quickly understand the most important categories or dominant proportions at a glance. In this specific example, the use of four different types of pie charts provides a clear visualization of the gender distribution among first authors in the top 10 journals for heart disease research publications, based on the 2002 to 2020 mini dataset used in this study. It is evident that a higher proportion of the publications’ first authors are males, and none of the top 10 journals for heart disease research published more articles authored by females than by males during the examined period.

The Jupyter Notebook and dataset used for this article can be found on GitHub.

Thank you for reading,

Diana

Note: I used GitHub embeds to publish this article.
