Python: Generator Comprehension

Comprehensions come in many flavors. Using [ ] we get lists. Using { } we can make dictionaries or sets. You might think that ( ) will make a tuple. Instead, you'll get a generator.
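As a quick check of the bracket rules above, this small sketch builds one comprehension of each kind and prints the type it produces. The variable names are made up for illustration.

```python
# One comprehension per bracket style; the names are illustrative only.
squares_list = [n * n for n in range(5)]       # list
squares_set = {n * n for n in range(5)}        # set
squares_dict = {n: n * n for n in range(5)}    # dict
squares_gen = (n * n for n in range(5))        # generator, not a tuple

for obj in (squares_list, squares_set, squares_dict, squares_gen):
    print(type(obj).__name__)

# A tuple needs an explicit tuple() call around a generator expression.
squares_tuple = tuple(n * n for n in range(5))
print(squares_tuple)
```

If a tuple is what you actually want, wrapping the expression in tuple() as in the last lines is the usual route.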

A generator is an object that we can iterate over. Unlike a list, tuple or dictionary, it does not store its values up front: each value is computed only when the iteration asks for it. The generator holds the rules for producing the sequence rather than the sequence itself.
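One way to watch this lazy behavior is to record when each value is actually computed. The sketch below assumes a hypothetical helper, double(), that logs every call. It also uses the built-in next() to ask for a single value.

```python
calls = []

def double(num):
    # Hypothetical helper that records when it is actually called.
    calls.append(num)
    return num * 2

lazy = (double(num) for num in range(3))
calls_before = list(calls)     # still empty: nothing has been computed yet

first = next(lazy)             # computes only the first value
calls_after_one = list(calls)

rest = list(lazy)              # computes the remaining values
print(calls_before, first, calls_after_one, rest)
```

Creating the generator calls double() zero times. The work happens only as values are pulled out.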

Function Generator

The longer way to create a generator is with a function. A generator function uses yield instead of return. Each time the function reaches yield it hands back one value and pauses, then resumes from that point when the next value is requested, until the sequence is exhausted. Here is an example:

def numbers():
  num_list = [1, 2, 3, 4, 5, 6, 7]
  for num in num_list:
    yield num

Calling the function does not run its body yet. It returns a generator object, which we capture in a variable.

my_numbers = numbers()

Printing the generator object doesn't show its values. It only displays what the object is.

> print(my_numbers)
<generator object numbers at 0x7f7fbf9834a0>

What we can now do is iterate over the generator object, which hands back the values one at a time.

for num in my_numbers:
  print(num)
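One detail worth seeing in action is that a generator can only be consumed once. The sketch below repeats the numbers() function from above so it runs on its own, then iterates over the same generator object twice.

```python
def numbers():
    num_list = [1, 2, 3, 4, 5, 6, 7]
    for num in num_list:
        yield num

my_numbers = numbers()

first_pass = [num for num in my_numbers]    # consumes every value
second_pass = [num for num in my_numbers]   # nothing left to yield

print(first_pass)
print(second_pass)

# Calling the function again builds a fresh generator.
fresh_pass = list(numbers())
print(fresh_pass)
```

The second pass comes back empty because the generator is already exhausted. A new call to numbers() is needed to start over.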

Generator Comprehension Version

A comprehension gives us a shortcut for creating the generator.

num_list = [1, 2, 3, 4, 5, 6, 7]
my_numbers = (num for num in num_list)

my_numbers is a generator object equivalent to the one created by the generator function: it yields the same values in the same order.
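To check that the two approaches really behave alike, this sketch builds one generator each way and compares what they produce. The names from_function and from_comprehension are illustrative.

```python
def numbers():
    num_list = [1, 2, 3, 4, 5, 6, 7]
    for num in num_list:
        yield num

num_list = [1, 2, 3, 4, 5, 6, 7]

from_function = numbers()
from_comprehension = (num for num in num_list)

# Both are generator objects...
same_type = type(from_function) is type(from_comprehension)
print(same_type)

# ...and both yield the same values in the same order.
function_values = list(from_function)
comprehension_values = list(from_comprehension)
print(function_values == comprehension_values)
```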

Differences: List vs Generator Comprehension

The code inside a List Comprehension and a Generator Comprehension is the same, so why would you use one instead of the other?

A Generator Comprehension doesn't evaluate a value until it is requested. Because it never holds the whole sequence in memory, the generator object itself can be very small.

import sys

nums_list = [num for num in range(10_000)]
nums_gen = (num for num in range(10_000))

print(sys.getsizeof(nums_list))
print(sys.getsizeof(nums_gen))

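Even though these will loop through exactly the same values, the list has a size of 87616 bytes and the generator object is 112 bytes. The exact numbers depend on the Python version and platform. Note that sys.getsizeof counts only the list's own storage, not the integers it refers to, so the real gap is even larger.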
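A related point is that the generator's size does not grow with the amount of data it will produce. The sketch below assumes CPython (where sys.getsizeof is supported) and uses a hypothetical helper, make_gen(), so both generators come from the same expression.

```python
import sys

def make_gen(limit):
    # Hypothetical helper: returns a generator over range(limit).
    return (num for num in range(limit))

small_gen = make_gen(10)
large_gen = make_gen(10_000_000)

# The generator stores only its current state, so its reported size
# does not depend on how many values it will eventually produce.
print(sys.getsizeof(small_gen), sys.getsizeof(large_gen))

# A list stores a reference to every element, so it grows with the data.
small_list = list(range(10))
large_list = list(range(100_000))
print(sys.getsizeof(small_list), sys.getsizeof(large_list))
```

The two generator sizes should match, while the larger list reports a much bigger size than the smaller one.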

On the other hand, a List Comprehension is complete, so it has a length and can be sorted. A List can also be indexed and sliced.
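Here is a small sketch of those differences. It shows len() and slicing on a list, the error len() raises on a generator, and itertools.islice as one possible stand-in for slicing. The variable names are illustrative.

```python
import itertools

nums_list = [num for num in range(10)]
nums_gen = (num for num in range(10))

print(len(nums_list))    # a list knows its length
print(nums_list[2:5])    # and supports slicing

length_error = None
try:
    len(nums_gen)
except TypeError as error:
    # Generators do not implement __len__.
    length_error = error
    print("len() failed:", error)

# islice gives slice-like access, but it consumes values from the generator.
middle = list(itertools.islice(nums_gen, 2, 5))
print(middle)
```

Keep in mind that islice has to step through the skipped items, and those items are gone from the generator afterwards.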

A Generator can be converted to a List like this:

new_list = list(nums_gen)

Infinite generator

Generators can draw from an unending iterator, producing one value after another. A List has to finish building before the program moves on, so trying this with a List Comprehension would never finish: it would keep looping until the machine runs out of memory.

Let's create an unending sequence of perfect squares. The itertools module has a count function that provides an unending sequence of integers. We can create a comprehension that only returns the value if it's a perfect square, otherwise we keep testing until one appears.

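import itertools
import math

# Keep only the integers whose integer square root squares back to them.
# math.isqrt (Python 3.8+) avoids floating-point rounding on large values.
squares = (num
  for num in itertools.count(start=1)
  if math.isqrt(num) ** 2 == num
)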

Now that we have an object we can iterate over, it will give us an unending sequence. We do need to be careful that we don't create an infinite loop.

for square in squares:
  print(square)

This will continue to output perfect squares until we break out or something crashes.

One option is to use the built-in next() function, which calls the generator's __next__() method and returns the next value in the sequence.

for i in range(10):
  print(next(squares))

This loop will output the next 10 values in the sequence.
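Another way to take a fixed number of values, sketched here as an alternative rather than the approach used above, is itertools.islice. The perfect-square test in this sketch uses math.isqrt (Python 3.8+), which is one of several ways to write that check.

```python
import itertools
import math

squares = (
    num
    for num in itertools.count(start=1)
    if math.isqrt(num) ** 2 == num
)

# islice stops after 10 items, so the unending generator is safe to consume.
first_ten = list(itertools.islice(squares, 10))
print(first_ten)

# The generator remembers its position: the next value follows on from 100.
eleventh = next(squares)
print(eleventh)
```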

Another option is to use some trigger to stop the loop. We can check if the square is divisible by 2, 3, 4 and 5.

for square in squares:
  print(square)
  if (square%2 == 0) and (square%3 == 0) and (square%4 == 0) and (square%5 == 0):
    print(square, " is divisible by 2, 3, 4 and 5")
    break

The generator keeps its position in the sequence, even across different loops. For example, if we ran the code above a second time, the sequence would continue from the next item instead of starting over. If the sequence needs to be restarted, the comprehension has to be defined again.
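If restarting is something you need often, one option is to wrap the comprehension in a small function so that each call builds a fresh generator. The helper name make_squares() is hypothetical, and the perfect-square test again assumes math.isqrt (Python 3.8+).

```python
import itertools
import math

def make_squares():
    # Hypothetical helper: each call returns a new, independent generator.
    return (
        num
        for num in itertools.count(start=1)
        if math.isqrt(num) ** 2 == num
    )

squares = make_squares()
first_run = [next(squares) for _ in range(3)]
second_run = [next(squares) for _ in range(3)]        # continues where it left off

restarted = make_squares()
restarted_run = [next(restarted) for _ in range(3)]   # starts over

print(first_run, second_run, restarted_run)
```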

any() and all()

The any() and all() built-in functions can be very useful. Each takes an iterable and tells us whether any of its values are true or all of them are true, depending on which you use.

You've most likely seen these used with List Comprehensions.

all_even = all([num % 2 == 0 for num in range(10)])
any_even = any([num % 2 == 0 for num in range(10)])

The List Comprehension is built completely first, and only then is all() or any() run.

If we remove the [ ], the expression inside the parentheses becomes a generator comprehension. The great thing is that the function stops as soon as it can confirm an answer. For example, once all() finds a single False it stops and returns False. For any(), the first True gives a result of True, with no need to keep going.
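To make the early stop visible, this sketch records every value that gets tested. The helper is_even() and the checked list are hypothetical names used only for the demonstration.

```python
checked = []

def is_even(num):
    # Hypothetical helper that records every value it is asked about.
    checked.append(num)
    return num % 2 == 0

# Generator version: all() stops at the first False.
lazy_result = all(is_even(num) for num in range(10))
lazy_checked = list(checked)

checked.clear()

# List version: every value is tested before all() even starts.
eager_result = all([is_even(num) for num in range(10)])
eager_checked = list(checked)

print(lazy_result, lazy_checked)
print(eager_result, eager_checked)
```

Both versions return False, but the generator version tests only 0 and 1 while the list version tests all ten values.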

Earlier we looked at how a non-ending sequence can't work with a List Comprehension. Let's see what would happen here.

any_even = any(num % 2 == 0
          for num in itertools.count(start=1))

all_even = all(num % 2 == 0
          for num in itertools.count(start=1))

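In both of these cases the checking continues only until any() or all() can determine the answer. Here that happens within the first two values: any() returns True at the first true value and all() returns False at the first false one. With a condition that is never settled, an unending sequence means the call never returns.

While we can put an unending iterator in these lines, we want to be very careful. If we used num > 0 in the line above, all() would never return at all. We know that every value from itertools.count(start=1) is greater than 0, but all() would have to check every number before it could return True, and the numbers never end.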
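One way to stay safe, sketched here with arbitrary cutoffs, is to bound the unending iterator before handing it to all() or any(). itertools.islice limits the number of values, and itertools.takewhile ends the sequence once a condition fails.

```python
import itertools

# Check only the first 1,000 values; the cutoff is an arbitrary choice.
first_thousand = itertools.islice(itertools.count(start=1), 1_000)
all_positive = all(num > 0 for num in first_thousand)
print(all_positive)

# takewhile ends the sequence as soon as its condition fails.
below_fifty = itertools.takewhile(lambda num: num < 50, itertools.count(start=1))
any_multiple_of_seven = any(num % 7 == 0 for num in below_fifty)
print(any_multiple_of_seven)
```

The trade-off is that the answer only covers the values inside the chosen window, not the whole unending sequence.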

Conclusion

Use a List Comprehension if you need to

  • sort
  • determine length

Use a Generator Comprehension if you want to

  • save memory
  • stop before the entire sequence finishes

Please comment with your favorite comprehension example.

Resources

Python.org : itertools.count

Programming Expert : Check if number is perfect square

PEP 255 : Simple Generators

Did you find this article valuable?

Support Russ Eby by becoming a sponsor. Any amount is appreciated!