How to Use Conditional Formatting in Pandas to Enhance Data Visualization

Image by Author | DALLE-3 & Canva

While pandas is mainly used for data manipulation and analysis, it can also provide basic data visualization capabilities. However, plain dataframes can make the information look cluttered and overwhelming. So, what can be done to make it better? If you've worked with Excel before, you know that you can highlight important values with different colors, font styles, etc. The idea of using these styles and colors is to communicate the information effectively. You can do similar work with pandas dataframes too, using conditional formatting and the Styler object.

In this article, we'll see what conditional formatting is and how to use it to enhance your data readability.

Conditional Formatting

Conditional formatting is a feature in pandas that allows you to format cells based on some criteria. You can easily highlight outliers, visualize trends, or emphasize important data points with it. The Styler object in pandas provides a convenient way to apply conditional formatting. Before covering the examples, let's take a quick look at how the Styler object works.

What Is the Styler Object & How Does It Work?

You can control the visual representation of a dataframe through its style property (df.style). This property returns a Styler object, which is responsible for styling the dataframe. The Styler object allows you to manipulate the CSS properties of the dataframe to create a visually appealing and informative display. The generic syntax is as follows:



df.style.<method>(<arguments>)

 
Here, <method> is the specific formatting function you want to apply, and <arguments> are the parameters required by that function. The Styler object returns the formatted dataframe without changing the original one. There are two approaches to using conditional formatting with the Styler object:

Built-in Styles: Apply quick formatting styles to your dataframe
Custom Stylization: Create your own formatting rules for the Styler object and pass them through one of the following methods (Styler.applymap: element-wise, or Styler.apply: column-/row-/table-wise). In pandas 2.1 and later, Styler.applymap is deprecated in favor of Styler.map, which takes the same arguments.
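To make the difference between the two custom methods concrete, here is a minimal sketch. The function names (color_negative, bold_max, style_changes) and the small changes dataframe are hypothetical and not part of the examples in this article. An element-wise rule receives a single cell and returns one CSS string, while a column-wise rule receives a whole Series and returns one CSS string per cell.

```python
import pandas as pd


def color_negative(value):
    """Element-wise rule: receives one cell value, returns one CSS string."""
    return "color: red" if value < 0 else ""


def bold_max(column):
    """Column-wise rule: receives a Series, returns one CSS string per cell."""
    return ["font-weight: bold" if v == column.max() else "" for v in column]


def style_changes(df):
    """Chain both rules on a Styler; df itself is left untouched."""
    styler = df.style  # building a Styler requires the jinja2 package
    # Styler.applymap was renamed to Styler.map in pandas 2.1; use whichever exists.
    elementwise = getattr(styler, "map", None) or styler.applymap
    return elementwise(color_negative).apply(bold_max, axis=0)


# Hypothetical data, used only to exercise the two rules directly.
changes = pd.DataFrame({"a": [1, -2, 3], "b": [-4, 5, 0]})
print(color_negative(changes.loc[1, "a"]))
print(bold_max(changes["a"]))
```

In a notebook, calling style_changes(changes) as the last expression of a cell would render the styled table.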

Now, we'll cover some examples of both approaches to help you enhance the visualization of your data.
 
Examples: Built-in Styles
 
Let's create a dummy stock price dataset with columns for Date, Cost Price, Satisfaction Score, and Sales Amount to demonstrate the examples below:

import pandas as pd
import numpy as np

data = {'Date': ['2024-03-05', '2024-03-06', '2024-03-07', '2024-03-08', '2024-03-09', '2024-03-10'],
        'Cost Price': [100, 120, 110, 1500, 1600, 1550],
        'Satisfaction Score': [90, 80, 70, 95, 85, 75],
        'Sales Amount': [1000, 800, 1200, 900, 1100, None]}

df = pd.DataFrame(data)
df

 
Output:
 

Original Unformatted Dataframe
 
1. Highlighting Maximum and Minimum Values
We can use the highlight_max and highlight_min functions to highlight the maximum and minimum values in a column or row. To find them column by column, set axis=0 like this:

# Highlighting Maximum and Minimum Values
cols = ['Cost Price', 'Satisfaction Score', 'Sales Amount']
df.style.highlight_max(color="green", axis=0, subset=cols).highlight_min(color="pink", axis=0, subset=cols)

 
Output:
 

Max & Min Values
 
2. Applying Color Gradients
Color gradients are an effective way to visualize the values in your data. In this case, we will apply the gradient to the satisfaction scores with the colormap set to 'viridis'. This is a type of color coding that ranges from purple (low values) to yellow (high values). Here is how you can do this:

# Applying Color Gradients
df.style.background_gradient(cmap='viridis', subset=['Satisfaction Score'])

 
Output:
 

Colormap - viridis
 
3. Highlighting Null or Missing Values
When we have large datasets, it becomes difficult to identify null or missing values. You can use the built-in df.style.highlight_null function for this purpose. For example, in this case, the sales amount of the sixth entry is missing. You can highlight it like this:

# Highlighting Null or Missing Values
df.style.highlight_null('yellow', subset=['Sales Amount'])

 
Output:
 

Highlighting Missing Values
 
Examples: Custom Stylization Using apply() & applymap()
 
1. Conditional Formatting for Outliers
Suppose that we have a housing dataset with prices, and we want to highlight the houses with outlier prices (i.e., prices that are significantly higher or lower than those in the other neighborhoods). This can be done as follows:

import pandas as pd
import numpy as np

# House prices dataset
df = pd.DataFrame({
    'Neighborhood': ['H1', 'H2', 'H3', 'H4', 'H5', 'H6', 'H7'],
    'Price': [50, 300, 360, 390, 420, 450, 1000],
})

# Calculate Q1 (25th percentile), Q3 (75th percentile) and Interquartile Range (IQR)
q1 = df['Price'].quantile(0.25)
q3 = df['Price'].quantile(0.75)
iqr = q3 - q1

# Bounds for outliers
lower_bound = q1 - 1.5 * iqr
upper_bound = q3 + 1.5 * iqr

# Custom function to highlight outliers
def highlight_outliers(val):
    if val < lower_bound or val > upper_bound:
        return 'background-color: yellow; font-weight: bold; color: black'
    else:
        return ''

# On pandas 2.1+, Styler.map is the preferred name for Styler.applymap
df.style.applymap(highlight_outliers, subset=['Price'])


 
Output:
 

Highlighting Outliers
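It can help to check the bounds by hand before trusting the highlighting. For the seven prices above, Q1 is 330 and Q3 is 435 under pandas' default linear interpolation, so the IQR is 105 and the bounds are 172.5 and 592.5. The sketch below reproduces that arithmetic without any styling; the helper name iqr_bounds and its k parameter are my own, not part of pandas.

```python
import pandas as pd

# Same prices as the housing example above.
prices = pd.Series([50, 300, 360, 390, 420, 450, 1000], name="Price")


def iqr_bounds(values, k=1.5):
    """Return (lower, upper) outlier bounds using the k * IQR rule."""
    q1 = values.quantile(0.25)
    q3 = values.quantile(0.75)
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr


lower, upper = iqr_bounds(prices)

# Values outside the bounds are the ones the styling function would highlight.
outliers = prices[(prices < lower) | (prices > upper)]
print(lower, upper, outliers.tolist())
```

Only the prices 50 and 1000 fall outside the bounds, so those are the two cells that receive the yellow background.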
 
2. Highlighting Trends
Imagine that you run a company and record your sales daily. To analyze the trends, you want to highlight the days when your daily sales increase by 5% or more. You can achieve this using a custom function and the apply method of the Styler object. Here's how:

import pandas as pd

# Dataset of Company's Sales
data = {'date': ['2024-02-10', '2024-02-11', '2024-02-12', '2024-02-13', '2024-02-14'],
        'sales': [100, 105, 110, 115, 125]}

df = pd.DataFrame(data)

# Daily percentage change (the first row is NaN because it has no previous day)
df['pct_change'] = df['sales'].pct_change() * 100

# Highlight the whole row if sales increased by 5% or more
def highlight_trend(row):
    return ['background-color: green; border: 2px solid black; font-weight: bold' if row['pct_change'] >= 5 else '' for _ in row]

df.style.apply(highlight_trend, axis=1)

 
Output:
 

 
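One detail worth checking in the trend example is which rows actually meet the threshold, since percentage changes are floating-point values and the first row is always NaN. The sketch below is one possible variation, not the approach used above: it rounds the percentage change before comparing and builds the whole table of CSS strings at once, which is the shape Styler.apply expects with axis=None. The names flag_growth, highlight_growth_rows, and style_sales are hypothetical.

```python
import pandas as pd


def flag_growth(sales, threshold=5.0):
    """True where sales grew by at least `threshold` percent over the previous row."""
    pct = (sales.pct_change() * 100).round(2)  # rounding avoids float noise at the boundary
    return pct >= threshold  # NaN (first row) compares as False


def highlight_growth_rows(df, threshold=5.0):
    """Return a frame of CSS strings with the same shape as df."""
    flags = flag_growth(df["sales"], threshold)
    styles = pd.DataFrame("", index=df.index, columns=df.columns)
    styles.loc[flags, :] = "background-color: lightgreen; font-weight: bold"
    return styles


def style_sales(df, threshold=5.0):
    """Table-wise styling: axis=None passes the whole dataframe to the function."""
    return df.style.apply(highlight_growth_rows, axis=None, threshold=threshold)


# Same sales figures as the example above.
sales_df = pd.DataFrame({
    "date": ["2024-02-10", "2024-02-11", "2024-02-12", "2024-02-13", "2024-02-14"],
    "sales": [100, 105, 110, 115, 125],
})
flags = flag_growth(sales_df["sales"])
styles = highlight_growth_rows(sales_df)
print(flags.tolist())
```

With these figures, the second day (exactly 5%) and the last day (about 8.7%) are flagged. In a notebook, style_sales(sales_df) would render the highlighted table.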
3. Highlighting Correlated Columns
Correlated columns are important because they show relationships between different variables. For example, if we have a dataset containing age, income, and spending habits and our analysis shows a high correlation (close to 1) between age and income, then it suggests that older people generally have higher incomes. Highlighting correlated columns helps to visually identify these relationships. This approach becomes extremely helpful as the dimensionality of your data increases. Let's explore an example to better understand this concept:

import pandas as pd

# Dataset of people
data = {
    'age': [30, 35, 40, 45, 50],
    'income': [60000, 66000, 70000, 75000, 100000],
    'spending': [10000, 15000, 20000, 18000, 12000]
}

df = pd.DataFrame(data)

# Calculate the correlation matrix
corr_matrix = df.corr()

# Highlight highly correlated columns
def highlight_corr(val):
    if val != 1.0 and abs(val) > 0.5:   # Exclude self-correlation
        return 'background-color: blue; text-decoration: underline'
    else:
        return ''

# On pandas 2.1+, Styler.map is the preferred name for Styler.applymap
corr_matrix.style.applymap(highlight_corr)

 
Output:
 

Correlated Columns
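The check val != 1.0 in the function above excludes the diagonal by value, which would also skip any pair of different columns that happen to be perfectly correlated. A possible alternative, sketched here under my own naming (highlight_corr_table and style_corr are hypothetical), is to exclude the diagonal by position and style the whole matrix in one table-wise call.

```python
import numpy as np
import pandas as pd


def highlight_corr_table(corr, threshold=0.5):
    """Return CSS strings for off-diagonal cells whose |correlation| exceeds threshold."""
    off_diagonal = ~np.eye(len(corr), dtype=bool)  # False on the diagonal, True elsewhere
    strong = (corr.abs() > threshold).to_numpy() & off_diagonal
    css = "background-color: lightblue; text-decoration: underline"
    return pd.DataFrame(np.where(strong, css, ""), index=corr.index, columns=corr.columns)


def style_corr(corr, threshold=0.5):
    """Table-wise styling: axis=None passes the whole matrix to the function."""
    return corr.style.apply(highlight_corr_table, axis=None, threshold=threshold)


# Same data as the example above.
people = pd.DataFrame({
    "age": [30, 35, 40, 45, 50],
    "income": [60000, 66000, 70000, 75000, 100000],
    "spending": [10000, 15000, 20000, 18000, 12000],
})
corr = people.corr()
styles = highlight_corr_table(corr)
print(corr.round(2))
```

For this data only the age and income pair passes the 0.5 threshold, so two symmetric cells are highlighted and the diagonal stays plain.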
 
Wrapping Up
 
These are just some of the examples I showed as a starter to up your data visualization game. You can apply similar techniques to various other problems, such as highlighting duplicate rows, grouping into categories and choosing different formatting for each category, or highlighting peak values. Additionally, there are many other CSS options you can explore in the official documentation. You can even define different properties on hover, like magnifying text or changing color. Check out the "Fun Stuff" section for more cool ideas. This article is part of my Pandas series, so if you enjoyed this, there's plenty more to explore. Head over to my author page for more tips, tricks, and tutorials.
 
 
Kanwal Mehreen Kanwal is a machine learning engineer and a technical writer with a profound passion for data science and the intersection of AI with medicine. She co-authored the ebook "Maximizing Productivity with ChatGPT". As a Google Generation Scholar 2022 for APAC, she champions diversity and academic excellence. She's also recognized as a Teradata Diversity in Tech Scholar, Mitacs Globalink Research Scholar, and Harvard WeCode Scholar. Kanwal is an ardent advocate for change, having founded FEMCodes to empower women in STEM fields.



                    
                            
                
	