Lab 11: Trees, Iterators and Generators
Due at 11:59:59 pm on Tuesday, 11/23/2021.
Starter Files
Download lab11.zip. Inside the archive, you will find starter files for the questions in this lab, along with a copy of the OK autograder.
Submission
By the end of this lab, you should have submitted the lab with python3 ok --submit
. You may submit more than once before the deadline; only the final submission will be graded. Check that you have successfully submitted your code on okpy.org. See this article for more instructions on okpy and submitting assignments.
Iterables and Iterators
Remember the for
loop? (We really hope so.) The object the for
loop
iterates over must be an iterable! An iterable is a way of representing
explicit sequences (like lists or strings) as well as implicit sequences (like
the natural numbers 1, 2, 3, ...
).
for elem in iterable:
# do something
for
loops only work with iterables. This means the object you want to use a
for
loop on must implement the iterable interface. To implement the
iterable interface, an object must define an __iter__
method that returns an
object that implements the iterator interface. To implement the iterator
interface, an object must define a __next__
method to compute and return the
next element in the sequence. If the iterator exhausts the sequence, it raises
StopIteration
to send a signal to indicate that it reaches the end.
An iterable object can create an arbitrary amount of iterator objects. In addition, the iterators are independent of each other; in other words they can have a different position in the sequence.
Here is a table summarizing the required methods of the iterable and iterator interfaces/protocols. Python also has more documentation about iterator types.
Iterable | Iterator |
---|---|
__iter__ : return a new iterator |
__iter__ : must return itself |
__next__ : return the next element,
or raise StopIteration |
In Python, an iterator must also be an iterable. In other words, it must have a
__iter__
method that returns itself (with the current state unaltered).
Analogy: an iterable is like a book (one can flip through the
pages) and an iterator is a bookmark (saves the position and can locate
the next page). Calling __iter__
on a book gives you a new bookmark
independent of other bookmarks, but calling __iter__
on a bookmark
gives you the bookmark itself, without changing its position at all.
Here is an example of a class definition for an object that implements the iterator interface:
class AnIterator:
def __init__(self):
self.current = 0
def __next__(self):
if self.current > 5:
raise StopIteration
self.current += 1
return self.current
def __iter__(self):
return self
Let's go ahead and try out our new toy.
>>> for num in AnIterator():
... print(num)
1
2
3
4
5
6
This is somewhat equivalent to running:
t = AnIterator()
t = iter(t) # iter(t) is the same as t.__iter__()
try:
while True:
# next(t) is the same as t.__next__()
print(next(t))
except StopIteration as e:
pass
Question 1: Does it work?
Consider the following iterators. Which ones work and which ones don't? Why?
Use OK to test your knowledge with the following conceptual questions:
python3 ok -q does_it_work -u
class IteratorA:
def __init__(self):
self.start = 10
def __next__(self):
if self.start > 100:
raise StopIteration
self.start += 20
return self.start
def __iter__(self):
return self
No problem, this is a beautiful iterator.
class IteratorB:
def __init__(self):
self.start = 5
def __iter__(self):
return self
Oh no! Where is __next__
? This fails to implement the iterator
interface because calling __iter__
doesn't return something that has
a __next__
method.
class IteratorC:
def __init__(self):
self.start = 5
def __next__(self):
if self.start == 10:
raise StopIteration
self.start += 1
return self.start
This also fails to implement the iterator interface. Without the
__iter__
method, the for
loop will error. The for
loop needs to
call __iter__
first because some objects might not implement the
__next__
method themselves, but calling __iter__
will return an
object that does.
class IteratorD:
def __init__(self):
self.start = 1
def __next__(self):
self.start += 1
return self.start
def __iter__(self):
return self
This is technically an iterator, because it implements both __iter__
and
__next__
. Notice that it's an infinite sequence! Sequences like these are
the reason iterators are useful. Because iterators delay computation, we can
use a finite amount of memory to represent an infinitely long sequence.
Question 2: Str
Write an iterator that takes a string as input and outputs the letters in order when iterated over.
class Str:
def __init__(self, str):
self.str = str
def __iter__(self):
return iter(self.str)
That works (why?), but just kidding.
class Str:
"*** YOUR CODE HERE ***"
def __init__(self, str):
self.str = str
self.i = -1
def __iter__(self):
return self
def __next__(self):
self.i += 1
if self.i >= len(self.str):
raise StopIteration
return self.str[self.i]
Test your code by either writing your own doc tests, or by running python3 -i lab11.py
There is no doctest for this problem. Test this on your own! We recommend starting with an interactive python shell python3 -i lab11.py
.
Generators
A generator function returns a special type of iterator called
a generator object. Such functions can be written using a yield
statement:
def <generator_fn_name>():
<somevariable> = <something>
while <predicate>:
yield <something>
<increment somevariable>
Calling a generator function (a function with a yield statement in it) makes it return a generator object rather than executing the body of the function.
The reason we say a generator object is a special type of iterator is that it has all the properties of an iterator, meaning that:
- Calling the
__iter__
method makes a generator object return itself without modifying its current state. - Calling the
__next__
method makes a generator object compute and return the next object in its sequence. If the sequence is exhausted,StopIteration
is raised. - Typically, a generator should not restart unless it's defined that way. But
calling the generator function returns a brand new generator object (like
calling
__iter__
on an iterable object).
However, they do have some fundamental differences:
- An iterator is a class with
__next__
and__iter__
explicitly defined, but a generator can be written as a mere function with ayield
in it. __iter__
in an iterator usesreturn
, but a generator usesyield
.A generator "remembers" its state for the next
__next__
call. Therefore, the first__next__
call works like this:- Enter the function, run until the line with
yield
. - Return the value in the
yield
statement, but remember the state of the function for future__next__
calls.
And subsequent
__next__
calls work like this:- Re-enter the function, start at the line after
yield
, and run until the nextyield
statement. - Return the value in the
yield
statement, but remember the state of the function for future__next__
calls.
- Enter the function, run until the line with
Use OK to test your knowledge with the following What would Python print questions:
python3 ok -q generators -u --local
def generator():
print("Starting here")
i = 0
while i < 6:
print("Before yield")
yield i
print("After yield")
i += 1
>>> g = generator()
>>> g # what type of object is this?
______<generator object ...>
>>> g == iter(g) # equivalent of g.__iter__()
______True
>>> next(g) # equivalent of g.__next__()
______Starting here
Before yield
0
>>> next(g)
______After yield
Before yield
1
>>> next(g)
______After yield
Before yield
2
Trace through the code and make sure you know where and why each statement is printed.
You might have noticed from the Iterators section that IteratorB
, which didn't
define a __next__
method, failed to run in the for
loop. However, this is
not always the case.
class IterGen:
def __init__(self):
self.start = 5
def __iter__(self):
while self.start < 10:
self.start += 1
yield self.start
for i in IterGen():
print(i)
Why does this iterable work without defining a __next__
method?
The for
loop only expects the object returned by __iter__
to have a
__next__
method. The __iter__
method is a generator function because of the
yield
statement in the body. Therefore, when __iter__
is called, it returns
a generator object, which you can call __next__
on.
Question 3: Countdown
Write a generator that counts down to 0.
Write it in both ways: using a generator function on its own, and
within the __iter__
method of a class.
def countdown(n):
"""
>>> from types import GeneratorType
>>> type(countdown(0)) is GeneratorType # countdown is a generator
True
>>> for number in countdown(5):
... print(number)
...
5
4
3
2
1
0
"""
"*** YOUR CODE HERE ***"
while n >= 0:
yield n
n = n - 1
Use OK to test your code:
python3 ok -q countdown
class Countdown:
"""
>>> from types import GeneratorType
>>> type(Countdown(0)) is GeneratorType # Countdown is an iterator
False
>>> for number in Countdown(5):
... print(number)
...
5
4
3
2
1
0
"""
"*** YOUR CODE HERE ***"
def __init__(self, cur):
self.cur = cur
def __iter__(self):
return self
def __next__(self):
result = self.cur
if result < 0:
raise StopIteration
self.cur -= 1
return result
# Alternate solution that makes __iter__ a generator function:
class Countdown:
def __init__(self, cur):
self.cur = cur
def __iter__(self):
while self.cur >= 0:
yield self.cur
self.cur -= 1
Hint: Is it possible to not have a __next__
method in Countdown
?
Use OK to test your code:
python3 ok -q Countdown
Question 4: Scale
Implement the generator function scale(s, k)
, which yields elements of the
given iterable s
, scaled by k
.
def scale(s, k):
"""Yield elements of the iterable s scaled by a number k.
>>> s = scale([1, 5, 2], 5)
>>> type(s)
<class 'generator'>
>>> list(s)
[5, 25, 10]
>>> m = scale(naturals(), 2)
>>> [next(m) for _ in range(5)]
[2, 4, 6, 8, 10]
"""
"*** YOUR CODE HERE ***"
for elem in s:
yield elem * k
Use OK to test your code:
python3 ok -q scale
Question 5: Pairs (generator)
Write a generator function pairs
that takes a list and yields all the
possible pairs of elements from that list.
Note that this means that you should be yielding a tuple.
def pairs(lst):
"""
>>> type(pairs([3, 4, 5]))
<class 'generator'>
>>> for x, y in pairs([3, 4, 5]):
... print(x, y)
...
3 3
3 4
3 5
4 3
4 4
4 5
5 3
5 4
5 5
"""
"*** YOUR CODE HERE ***"
for i in lst:
for j in lst:
yield i, j
Use OK to test your code:
python3 ok -q pairs
Submit
Make sure to submit this assignment by running:
python3 ok --submit