Why You Should Not Overuse List Comprehensions in Python
Image by Author
In Python, list comprehensions provide a concise syntax to create new lists from existing lists and other iterables. However, once you get used to list comprehensions, you may be tempted to use them even when you shouldn’t.
Remember, your goal is to write simple and maintainable code, not complex code. It’s often helpful to revisit the Zen of Python, a set of aphorisms for writing clean and elegant Python, especially the following:
- Beautiful is better than ugly.
- Simple is better than complex.
- Readability counts.
In this tutorial, we’ll code three examples, each more complex than the previous one, where list comprehensions make the code very difficult to maintain. We’ll then try to write a more maintainable version of each.
So let’s start coding!
Let’s start by reviewing list comprehensions in Python. Suppose you have an existing iterable such as a list or a string, and you’d like to create a new list from it. You can loop through the iterable, process each item, and append the output to a new list like so:
new_list = []
for item in iterable:
    new_list.append(output)
But list comprehensions provide a concise one-line alternative to do the same:
new_list = [output for item in iterable]
In addition, you can also add filtering conditions.
The following snippet:
new_list = []
for item in iterable:
    if condition:
        new_list.append(output)
Can be replaced by this list comprehension:
new_list = [output for item in iterable if condition]
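As a concrete illustration of the two templates above, here is a minimal sketch. The list `numbers` and the choice of squaring the even values are assumptions made only for this example; they are not part of the original snippets.

```python
# Hypothetical input data, chosen only for illustration
numbers = [1, 2, 3, 4, 5, 6]

# Explicit loop: keep the even numbers and square them
squares_loop = []
for num in numbers:
    if num % 2 == 0:
        squares_loop.append(num ** 2)

# Equivalent list comprehension with a filtering condition
squares_comp = [num ** 2 for num in numbers if num % 2 == 0]

print(squares_loop)
print(squares_comp)
```

Both variables end up holding the same list. With a short output expression and a single condition like this, the comprehension is arguably the clearer of the two.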
So list comprehensions help you write Pythonic code and can often make your code cleaner by reducing visual noise.
Now let’s take three examples to understand why you shouldn’t use list comprehensions for tasks that require very complex expressions. In such cases, list comprehensions, instead of making your code elegant, make it difficult to read and maintain.
Problem: Given a number upper_limit, generate a list of all the prime numbers up to that number.
You can break down this problem into two key ideas:
- Checking if a number is prime
- Populating a list with all the prime numbers
The list comprehension expression to do this is as shown:
import math
upper_limit = 50
primes = [x for x in range(2, upper_limit + 1) if x > 1 and all(x % i != 0 for i in range(2, int(math.sqrt(x)) + 1))]
print(primes)
And here’s the output:
Output >>>
[2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37, 41, 43, 47]
At first glance, it’s difficult to see what’s going on. Let’s make it better.
Perhaps, better?
import math
upper_limit = 50
primes = [
    x
    for x in range(2, upper_limit + 1)
    if x > 1 and all(x % i != 0 for i in range(2, int(math.sqrt(x)) + 1))
]
print(primes)
Easier to read, certainly. Now let’s write a truly better version.
A Better Version
Though a list comprehension is actually a good idea for solving this problem, the logic to check for primes inside the list comprehension makes it noisy.
So let’s write a more maintainable version that moves the logic for checking if a number is prime to a separate function, is_prime(), and calls is_prime() in the comprehension expression:
import math

def is_prime(num):
    # A prime has no divisor between 2 and its square root
    return num > 1 and all(num % i != 0 for i in range(2, int(math.sqrt(num)) + 1))

upper_limit = 50
primes = [
    x
    for x in range(2, upper_limit + 1)
    if is_prime(x)
]
print(primes)
Output >>>
[2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37, 41, 43, 47]
Is the better version good enough? This makes the comprehension expression much easier to understand. It’s now clear that the expression collects all numbers up to upper_limit that are prime (where is_prime() returns True).
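One further benefit of pulling the check into its own function is that it can be verified in isolation. The sketch below is a variation, not the article’s exact code: it assumes Python 3.8 or later so that `math.isqrt` is available, and the two lists of test values are chosen here purely for illustration.

```python
import math

def is_prime(num):
    # math.isqrt (Python 3.8+) returns the integer square root directly,
    # which avoids converting a float result back to int
    return num > 1 and all(num % i != 0 for i in range(2, math.isqrt(num) + 1))

# Spot checks on values whose primality is well known (illustrative only)
known_primes = [2, 3, 5, 7, 11, 13]
known_non_primes = [0, 1, 4, 9, 15, 49]

assert all(is_prime(n) for n in known_primes)
assert not any(is_prime(n) for n in known_non_primes)
print("All checks passed")
```

Checks like these would be awkward to write against a condition buried inside a one-line comprehension.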
Problem: Given a matrix, find the following:
- All of the prime numbers
- The indices of the prime numbers
- Sum of the primes
- Prime numbers sorted in descending order
Image by Author
To flatten the matrix and collect the list of all prime numbers, we can use logic similar to the previous example.
However, to find the indices, we have another complex list comprehension expression (I’ve formatted the code so that it’s easy to read).
You can combine checking for primes and getting their indices in a single comprehension, but that will not make things any simpler.
Here’s the code:
import math
from pprint import pprint

my_matrix = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]

def is_prime(num):
    return num > 1 and all(num % i != 0 for i in range(2, int(math.sqrt(num)) + 1))

# Flatten the matrix and filter to contain only prime numbers
primes = [
    x
    for row in my_matrix
    for x in row
    if is_prime(x)
]

# Find indices of prime numbers in the original matrix
prime_indices = [
    (i, j)
    for i, row in enumerate(my_matrix)
    for j, x in enumerate(row)
    if x in primes
]

# Calculate the sum of prime numbers
sum_of_primes = sum(primes)

# Sort the prime numbers in descending order
sorted_primes = sorted(primes, reverse=True)

# Create a dictionary with the results
result = {
    "primes": primes,
    "prime_indices": prime_indices,
    "sum_of_primes": sum_of_primes,
    "sorted_primes": sorted_primes
}
pprint(result)
And the corresponding output:
Output >>>
{'prime_indices': [(0, 1), (0, 2), (1, 1), (2, 0)],
 'primes': [2, 3, 5, 7],
 'sorted_primes': [7, 5, 3, 2],
 'sum_of_primes': 17}
So what’s a better version?
A Better Version
Now for the better version, we can define a series of functions to separate out concerns, so that if there’s a problem, you know which function to go back to and fix.
import math
from pprint import pprint

def is_prime(num):
    return num > 1 and all(num % i != 0 for i in range(2, int(math.sqrt(num)) + 1))

def flatten_matrix(matrix):
    # Collect only the prime entries of the matrix, in row-major order
    flattened_matrix = []
    for row in matrix:
        for x in row:
            if is_prime(x):
                flattened_matrix.append(x)
    return flattened_matrix

def find_prime_indices(matrix, flattened_matrix):
    # Record the (row, column) position of every entry found in flattened_matrix
    prime_indices = []
    for i, row in enumerate(matrix):
        for j, x in enumerate(row):
            if x in flattened_matrix:
                prime_indices.append((i, j))
    return prime_indices

def calculate_sum_of_primes(flattened_matrix):
    return sum(flattened_matrix)

def sort_primes(flattened_matrix):
    return sorted(flattened_matrix, reverse=True)

my_matrix = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
primes = flatten_matrix(my_matrix)
prime_indices = find_prime_indices(my_matrix, primes)
sum_of_primes = calculate_sum_of_primes(primes)
sorted_primes = sort_primes(primes)

result = {
    "primes": primes,
    "prime_indices": prime_indices,
    "sum_of_primes": sum_of_primes,
    "sorted_primes": sorted_primes
}
pprint(result)
This code also gives the same output as before.
Output >>>
{'prime_indices': [(0, 1), (0, 2), (1, 1), (2, 0)],
 'primes': [2, 3, 5, 7],
 'sorted_primes': [7, 5, 3, 2],
 'sum_of_primes': 17}
Is the better version good enough? While this works for a small matrix such as the one in this example, returning a static list is generally not recommended. To generalize to larger dimensions, you can use generators instead.
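A minimal sketch of what a generator-based variant might look like is shown below. The function name `iter_primes_with_indices` is hypothetical (it does not appear in the code above), and yielding each index together with its value is a design choice made for this sketch rather than the only way to do it.

```python
import math

def is_prime(num):
    return num > 1 and all(num % i != 0 for i in range(2, int(math.sqrt(num)) + 1))

def iter_primes_with_indices(matrix):
    """Lazily yield ((row, col), value) for every prime entry (hypothetical helper)."""
    for i, row in enumerate(matrix):
        for j, x in enumerate(row):
            if is_prime(x):
                yield (i, j), x

my_matrix = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]

# Entries are produced one at a time; no intermediate list is built
for index, value in iter_primes_with_indices(my_matrix):
    print(index, value)

# Aggregations such as sum() can consume the generator directly
sum_of_primes = sum(value for _, value in iter_primes_with_indices(my_matrix))
print(sum_of_primes)
```

Because the index and the value are produced in the same pass, this version no longer needs the `x in primes` membership test. Sorting is the one step that still has to materialize every value, since `sorted()` always returns a full list.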
Problem: Parse a given nested JSON string based on conditions and get a list of required values.
Parsing nested JSON strings is challenging because you have to account for the different levels of nesting, the dynamic nature of the JSON response, and the various data types in your parsing logic.
Let’s take an example of parsing a given JSON string based on conditions to get a list of all values that are:
- Integers or list of integers
- Strings or list of strings
You can load a JSON string into a Python dictionary using the loads function from the built-in json module. So we’ll have a nested dictionary over which we run a list comprehension.
The list comprehension uses nested loops to iterate over the nested dictionary. For each value, it constructs a list based on the following conditions:
- If the value is not a dictionary and the key starts with ‘inner_key’, it uses [inner_item].
- If the value is a dictionary with ‘sub_key’, it uses [inner_item['sub_key']].
- If the value is a string or integer, it uses [inner_item].
- If the value is a dictionary, it uses list(inner_item.values()).
Take a look at the code snippet below:
import json

json_string = '{"key1": {"inner_key1": [1, 2, 3], "inner_key2": {"sub_key": "value"}}, "key2": {"inner_key3": "text"}}'

# Parse the JSON string into a Python dictionary
data = json.loads(json_string)

flattened_data = [
    value
    if isinstance(value, (int, str))
    else value
    if isinstance(value, list)
    else list(value)
    for inner_dict in data.values()
    for key, inner_item in inner_dict.items()
    for value in (
        [inner_item]
        if not isinstance(inner_item, dict) and key.startswith("inner_key")
        else [inner_item["sub_key"]]
        if isinstance(inner_item, dict) and "sub_key" in inner_item
        else [inner_item]
        if isinstance(inner_item, (int, str))
        else list(inner_item.values())
    )
]
print(f"Values: {flattened_data}")
Here’s the output:
Output >>>
Values: [[1, 2, 3], 'value', 'text']
As seen, the list comprehension is very difficult to wrap your head around.
Please do yourself and others on the team a favor by never writing such code.
A Better Version
I think the following snippet, which uses nested for loops and an if-elif ladder, is better because it’s easier to understand what’s going on.
# Reuses `data`, the dictionary parsed from json_string in the previous snippet
flattened_data = []
for inner_dict in data.values():
    for key, inner_item in inner_dict.items():
        if not isinstance(inner_item, dict) and key.startswith("inner_key"):
            flattened_data.append(inner_item)
        elif isinstance(inner_item, dict) and "sub_key" in inner_item:
            flattened_data.append(inner_item["sub_key"])
        elif isinstance(inner_item, (int, str)):
            flattened_data.append(inner_item)
        elif isinstance(inner_item, list):
            flattened_data.extend(inner_item)
        elif isinstance(inner_item, dict):
            flattened_data.extend(inner_item.values())
print(f"Values: {flattened_data}")
This gives the expected output, too:
Output >>>
Values: [[1, 2, 3], 'value', 'text']
Is the better version good enough? Well, not really.
That’s because if-elif ladders are often considered a code smell. You may repeat logic across branches, and adding more conditions will only make the code more difficult to maintain.
But for this example, the version with the if-elif ladder and nested loops is still easier to understand than the comprehension expression.
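One possible way to tame the ladder, mirroring the is_prime() refactor from the first example, is to move the branching into a small helper and keep only a simple comprehension at the call site. This is a sketch under assumptions: the helper name `values_for` is hypothetical, and the final fallback of returning an empty list for unhandled types is a choice made here, not something the original code specifies.

```python
import json

json_string = '{"key1": {"inner_key1": [1, 2, 3], "inner_key2": {"sub_key": "value"}}, "key2": {"inner_key3": "text"}}'
data = json.loads(json_string)

def values_for(key, inner_item):
    """Return the list of values to collect for one key/item pair (hypothetical helper)."""
    if not isinstance(inner_item, dict) and key.startswith("inner_key"):
        return [inner_item]
    if isinstance(inner_item, dict) and "sub_key" in inner_item:
        return [inner_item["sub_key"]]
    if isinstance(inner_item, (int, str)):
        return [inner_item]
    if isinstance(inner_item, list):
        return list(inner_item)
    if isinstance(inner_item, dict):
        return list(inner_item.values())
    return []  # assumption: silently skip anything else (e.g. null values)

# The comprehension now only describes the traversal, not the rules
flattened_data = [
    value
    for inner_dict in data.values()
    for key, inner_item in inner_dict.items()
    for value in values_for(key, inner_item)
]
print(f"Values: {flattened_data}")
```

The branches follow the same order as the if-elif ladder, so for the sample JSON string this produces the same list as before. The rules still live in a chain of conditions, but they are now named, documented, and testable apart from the loops.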
The examples we’ve coded so far should give you an idea of how overusing a Pythonic feature such as list comprehensions can often become too much of a good thing. This is true not just for list comprehensions (they’re the most frequently used, though) but also for dictionary and set comprehensions.
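To illustrate the same point for dictionary comprehensions, here is a small sketch. The `scores` data and the `average` helper are made up for this example and do not come from the tutorial’s earlier code.

```python
# Hypothetical data: names mapped to lists of scores
scores = {"alice": [82, 91, 77], "bob": [], "carol": [68, 74]}

# Dense version: filtering, averaging, rounding and key formatting in one expression
averages_dense = {name.title(): round(sum(vals) / len(vals), 1) for name, vals in scores.items() if vals}

def average(values):
    """Mean of a non-empty list, rounded to one decimal place."""
    return round(sum(values) / len(values), 1)

# Spelled-out version: each step is visible and easy to change
averages = {}
for name, vals in scores.items():
    if not vals:
        continue  # skip entries with no scores to avoid dividing by zero
    averages[name.title()] = average(vals)

print(averages_dense)
print(averages)
```

Both dictionaries hold the same content, but in the second form the reason for the filter (avoiding a division by zero) is stated explicitly rather than left for the reader to infer.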
You should always write code that’s easy to understand and maintain. So try to keep things simple even if it means not using some Pythonic features. Keep coding!
Bala Priya C is a developer and technical writer from India. She likes working at the intersection of math, programming, data science, and content creation. Her areas of interest and expertise include DevOps, data science, and natural language processing. She enjoys reading, writing, coding, and coffee! Currently, she’s working on learning and sharing her knowledge with the developer community by authoring tutorials, how-to guides, opinion pieces, and more.