10 Methods to Add a Column to Pandas DataFrames | by Soner Yıldırım | Jul, 2023


We often need to derive or create new columns

Photo by Austin Chan on Unsplash

A DataFrame is a two-dimensional data structure with labeled rows and columns. We often need to add new columns as part of data analysis or feature engineering.

There are many different ways to add new columns. Which one suits you best depends on the task at hand.

In this article, we'll learn 10 ways to add a column to Pandas DataFrames.

Let's start by creating a simple DataFrame using the DataFrame constructor of Pandas. We'll pass the data as a Python dictionary in which the keys are the column names and the values are lists holding each column's values.

import pandas as pd

# create DataFrame
df = pd.DataFrame(
    {
        "first_name": ["Jane", "John", "Max", "Emily", "Ashley"],
        "last_name": ["Doe", "Doe", "Dune", "Smith", "Fox"],
        "id": [101, 103, 143, 118, 128]
    }
)

# show DataFrame
df

df (image by author)
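
Since the rendered table is not reproduced here, a quick check can confirm what the constructor produced. The sketch below rebuilds the same DataFrame and inspects its shape and column labels. The variable name check_df is only for illustration and is not part of the original example.

```python
import pandas as pd

# rebuild the DataFrame from the same dictionary used above
check_df = pd.DataFrame(
    {
        "first_name": ["Jane", "John", "Max", "Emily", "Ashley"],
        "last_name": ["Doe", "Doe", "Dune", "Smith", "Fox"],
        "id": [101, 103, 143, 118, 128],
    }
)

# each dictionary key becomes a column; each list supplies that column's values
n_rows, n_cols = check_df.shape
column_names = list(check_df.columns)

print(n_rows, n_cols)
print(column_names)
```

Five rows and three columns come back, one column per dictionary key.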

1. Use a constant value

We can add a new column with a constant value as follows:

df.loc[:, "department"] = "engineering"

# show DataFrame
df

df (image by author)
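
To see what the constant assignment does, here is a minimal sketch on a small stand-in DataFrame (the name demo and its three rows are assumptions made for illustration). The single value on the right-hand side is repeated for every row.

```python
import pandas as pd

# small stand-in DataFrame, used only for this illustration
demo = pd.DataFrame({"id": [101, 103, 143]})

# a scalar on the right-hand side is repeated for every row
demo.loc[:, "department"] = "engineering"

unique_departments = demo["department"].unique().tolist()
print(unique_departments)
print(demo)
```

Every row ends up with the same department value, so the column contains a single unique entry.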

2. Use an array-like structure

We can use an array-like structure to add a new column. In this case, make sure that the number of values in the array is the same as the number of rows in the DataFrame.

df.loc[:, "salary"] = [45000, 43000, 42000, 45900, 54000]

In the example above, we used a Python list. Let's pick the values randomly with NumPy's random module.

import numpy as np

df.loc[:, "salary"] = np.random.randint(40000, 55000, size=5)

# show DataFrame
df
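
The length requirement mentioned above is easy to trip over, so here is a minimal sketch of what happens when it is not met. It uses a small stand-in DataFrame (the names demo and mismatch_raised are assumptions for illustration) and deliberately supplies too few values.

```python
import pandas as pd

# small stand-in DataFrame with three rows, used only for this illustration
demo = pd.DataFrame({"id": [101, 103, 143]})

# three rows but only two values: pandas rejects the assignment
try:
    demo.loc[:, "salary"] = [45000, 43000]
    mismatch_raised = False
except ValueError as err:
    mismatch_raised = True
    print(f"ValueError: {err}")

# with one value per row, the same assignment succeeds
demo.loc[:, "salary"] = [45000, 43000, 42000]
print(demo)
```

Checking len(df) against the length of the list or array before assigning is a cheap way to avoid this error.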
