Lab 10: Iterators and Generators
Due at 11:59:59 pm on 4/16/2024.
Starter Files
Download lab10.zip. Inside the archive, you will find starter files for the questions in this lab, along with a copy of the OK autograder.
Iterables and Iterators
Remember the for
loop? (We really hope so.) The object the for
loop
iterates over must be an iterable! An iterable is a way of representing
explicit sequences (like lists or strings) as well as implicit sequences (like
the natural numbers 1, 2, 3, ...
).
for elem in iterable:
# do something
for
loops only work with iterables. This means the object you want to use a
for
loop on must implement the iterable interface. To implement the
iterable interface, an object must define an __iter__
method that returns an
object that implements the iterator interface.
To also implement the iterator interface, an object must define a __next__
method to compute and return the next element in the sequence. If the iterator exhausts the sequence, it raises StopIteration
to send a signal to indicate
that it reaches the end.
An iterable object can create an arbitrary amount of iterator objects. In addition, the iterators are independent of each other; in other words they can have a different position in the sequence.
Here is a table summarizing the required methods of the iterable and iterator interfaces/protocols. Python also has more documentation about iterator types.
Iterable | Iterator |
---|---|
__iter__ : return a new iterator |
__iter__ : must return itself |
__next__ : return the next element,
or raise StopIteration |
In Python, an iterator must also be an iterable. In other words, it must have a
__iter__
method that returns itself (with the current state unaltered).
Analogy: an iterable is like a book (one can flip through the
pages) and an iterator is a bookmark (saves the position and can locate
the next page). Calling __iter__
on a book gives you a new bookmark
independent of other bookmarks, but calling __iter__
on a bookmark
gives you the bookmark itself, without changing its position at all.
Here is an example of a class definition for an object that implements the iterator interface:
class AnIterator:
def __init__(self):
self.current = 0
def __next__(self):
if self.current > 5:
raise StopIteration
self.current += 1
return self.current
def __iter__(self):
return self
Let's go ahead and try out our new toy.
>>> for num in AnIterator():
... print(num)
1
2
3
4
5
6
This is somewhat equivalent to running:
t = AnIterator()
t = iter(t) # iter(t) is the same as t.__iter__()
try:
while True:
# next(t) is the same as t.__next__()
print(next(t))
except StopIteration as e:
pass
Please note that Question 1 DOES NOT count towards your Gradescope submission points and is an exercise meant to provide additional practice.
Question 1: Does it work?
Consider the following iterators. Which ones work and which ones don't? Why?
Use OK to test your knowledge with the following conceptual questions:
python3 ok -q does_it_work -u
class IteratorA:
def __init__(self):
self.start = 10
def __next__(self):
if self.start > 100:
raise StopIteration
self.start += 20
return self.start
def __iter__(self):
return self
No problem, this is a beautiful iterator.
class IteratorB:
def __init__(self):
self.start = 5
def __iter__(self):
return self
Oh no! Where is __next__
? This fails to implement the iterator
interface because calling __iter__
doesn't return something that has
a __next__
method.
class IteratorC:
def __init__(self):
self.start = 5
def __next__(self):
if self.start == 10:
raise StopIteration
self.start += 1
return self.start
This also fails to implement the iterator interface. Without the
__iter__
method, the for
loop will error. The for
loop needs to
call __iter__
first because some objects might not implement the
__next__
method themselves, but calling __iter__
will return an
object that does.
class IteratorD:
def __init__(self):
self.start = 1
def __next__(self):
self.start += 1
return self.start
def __iter__(self):
return self
This is technically an iterator, because it implements both __iter__
and
__next__
. Notice that it's an infinite sequence! Sequences like these are
the reason iterators are useful. Because iterators delay computation, we can
use a finite amount of memory to represent an infinitely long sequence.
Generators
A generator function returns a special type of iterator called
a generator object. Such functions can be written using a yield
statement:
def <generator_fn_name>():
<somevariable> = <something>
while <predicate>:
yield <something>
<increment somevariable>
Calling a generator function (a function with a yield statement in it) makes it return a generator object rather than executing the body of the function.
The reason we say a generator object is a special type of iterator is that it has all the properties of an iterator, meaning that:
- Calling the
__iter__
method makes a generator object return itself without modifying its current state. - Calling the
__next__
method makes a generator object compute and return the next object in its sequence. If the sequence is exhausted,StopIteration
is raised. - Typically, a generator should not restart unless it's defined that way. But
calling the generator function returns a brand new generator object (like
calling
__iter__
on an iterable object).
However, they do have some fundamental differences:
- An iterator is a class with
__next__
and__iter__
explicitly defined, but a generator can be written as a mere function with ayield
in it. __iter__
in an iterator usesreturn
, but a generator usesyield
.A generator "remembers" its state for the next
__next__
call. Therefore, the first__next__
call works like this:- Enter the function, run until the line with
yield
. - Return the value in the
yield
statement, but remember the state of the function for future__next__
calls.
And subsequent
__next__
calls work like this:- Re-enter the function, start at the line after
yield
, and run until the nextyield
statement. - Return the value in the
yield
statement, but remember the state of the function for future__next__
calls.
- Enter the function, run until the line with
Use OK to test your knowledge with the following What would Python print questions:
python3 ok -q generators -u
def generator():
print("Starting here")
i = 0
while i < 6:
print("Before yield")
yield i
print("After yield")
i += 1
>>> g = generator()
>>> g # what type of object is this?
______<generator object ...>
>>> g == iter(g) # equivalent of g.__iter__()
______True
>>> next(g) # equivalent of g.__next__()
______Starting here
Before yield
0
>>> next(g)
______After yield
Before yield
1
>>> next(g)
______After yield
Before yield
2
Trace through the code and make sure you know where and why each statement is printed.
You might have noticed from the Iterators section that IteratorB
, which didn't
define a __next__
method, failed to run in the for
loop. However, this is
not always the case.
class IterGen:
def __init__(self):
self.start = 5
def __iter__(self):
while self.start < 10:
self.start += 1
yield self.start
for i in IterGen():
print(i)
Why does this iterable work without defining a __next__
method?
The for
loop only expects the object returned by __iter__
to have a
__next__
method. The __iter__
method is a generator function because of the
yield
statement in the body. Therefore, when __iter__
is called, it returns
a generator object, which you can call __next__
on.
Question 2: Countdown
Write a generator that counts down to 0.
Write it in both ways: using a generator function on its own, and
within the __iter__
method of a class.
def countdown(n):
"""
>>> from types import GeneratorType
>>> type(countdown(0)) is GeneratorType # countdown is a generator
True
>>> for number in countdown(5):
... print(number)
...
5
4
3
2
1
0
"""
"*** YOUR CODE HERE ***"
while n >= 0:
yield n
n = n - 1
Use OK to test your code:
python3 ok -q countdown
class Countdown:
"""
>>> from types import GeneratorType
>>> type(Countdown(0)) is GeneratorType # Countdown is an iterator
False
>>> for number in Countdown(5):
... print(number)
...
5
4
3
2
1
0
"""
"*** YOUR CODE HERE ***"
def __init__(self, cur):
self.cur = cur
def __iter__(self):
return self
def __next__(self):
result = self.cur
if result < 0:
raise StopIteration
self.cur -= 1
return result
# Alternate solution that makes __iter__ a generator function:
class Countdown:
def __init__(self, cur):
self.cur = cur
def __iter__(self):
while self.cur >= 0:
yield self.cur
self.cur -= 1
Hint: Is it possible to not have a __next__
method in Countdown
?
Use OK to test your code:
python3 ok -q Countdown
Question 3: Primes
Write a generator that generates prime numbers.
Fill out the is_prime
helper function and use that
to create your generator.
def is_prime(n):
"""
Return True if n is prime, false otherwise.
>>> is_prime(1)
False
>>> is_prime(2)
True
>>> is_prime(19)
True
"""
"*** YOUR CODE HERE ***"
if n < 2:
return False
counter = 2
while counter <= sqrt(n):
if n % counter == 0:
return False
counter += 1
return True
def primes():
"""
An infinite generator that outputs primes.
>>> p = primes()
>>> for i in range(3):
... print(next(p))
...
2
3
5
"""
"*** YOUR CODE HERE ***"
num = 0
while True:
if is_prime(num):
yield num
num += 1
Use OK to test your code:
python3 ok -q is_prime
Use OK to test your code:
python3 ok -q primes
Submission
When you are done, submit your file to Gradescope. You only need to upload the following files:
lab10.py