Workshop 15

Iterators and generators.

Iterators

Iterators and the "for" loop

In [ ]:
for i in range(5):
    print(i)

for element in [1, 2, 3]:
    print(element)

for element in (1, 2, 3):
    print(element)

for key in {'one': 1, 'two': 2}:
    print(key)

for char in "123":
    print(char)

# A file object is iterable line by line; "with" closes it afterwards
with open("text.txt") as text_file:
    for line in text_file:
        print(line, end='')

Under the hood: functions "iter" and "next"

In [1]:
s = 'abc'
it = iter(s)
print(it)

print(next(it))
print(next(it))
print(next(it))
print(next(it))
<str_iterator object at 0x7f2428940048>
a
b
c
---------------------------------------------------------------------------
StopIteration                             Traceback (most recent call last)
<ipython-input-1-eee11cb7aa73> in <module>()
      6 print(next(it))
      7 print(next(it))
----> 8 print(next(it))

StopIteration: 
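
The StopIteration exception above is how an iterator signals that it has no more items. As a rough sketch of the idea, a "for" loop can be thought of as calling iter once and then calling next until StopIteration is raised. The helper name manual_for is hypothetical and exists only for this illustration:

```python
def manual_for(iterable):
    """Collect items in the order a for loop would visit them (illustrative helper)."""
    collected = []
    it = iter(iterable)          # ask the iterable for an iterator
    while True:
        try:
            item = next(it)      # fetch the next element
        except StopIteration:    # the iterator is exhausted
            break
        collected.append(item)
    return collected


print(manual_for('abc'))

# next() also accepts a default that is returned instead of raising
it = iter('a')
first = next(it, None)
second = next(it, None)
print(first, second)
```

Passing a default to next is often convenient when only the first element is needed and the iterable may be empty.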

Generators

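Generators are a simple and powerful tool for creating iterators. A generator function is written like a regular function but uses the yield statement to produce values one at a time.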

In [ ]:
def reverse(data):
    for index in range(len(data)-1, -1, -1):
        yield data[index]

print(reverse)

for char in reverse('golf'):
    print(char)
<function reverse at 0x7f62f63629d8>
f
l
o
g
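
Note that print(reverse) above shows the function itself, not what it returns. A minimal sketch, assuming the same reverse definition, of what a call produces: a generator object, which is its own iterator and runs the function body only as values are requested.

```python
def reverse(data):
    for index in range(len(data) - 1, -1, -1):
        yield data[index]


gen = reverse('golf')
print(type(gen).__name__)

is_own_iterator = iter(gen) is gen   # a generator is its own iterator
first = next(gen)                    # runs the body up to the first yield
rest = list(gen)                     # consumes the remaining values
leftover = list(gen)                 # the generator is now exhausted

print(is_own_iterator, first, rest, leftover)
```
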
In [ ]:
def squares(n):
    for i in range(n):
        yield i ** 2


for x in squares(10):
    print(x)
In [ ]:
def fact(n):
    """Yield the factorials 1!, 2!, ..., (n-1)!."""
    f = 1
    for i in range(1, n):
        f *= i
        yield f


for x in fact(10):
    print(x)

Generator expressions

In [ ]:
for i in (x*x for x in range(10)):
    print(i)
In [ ]:
for i in (x for x in range(10) if x % 2 == 0):
    print(i)
In [ ]:
import math
for i in (x for x in range(10) if math.sqrt(x) - math.trunc(math.sqrt(x)) == 0):
    print(i)
In [ ]:
s = sum(i*i for i in range(10))
print(s)
In [ ]:
from math import pi, sin
sin_table = {x: sin(x*pi/180) for x in range(0, 91)}
print(sin_table)
In [ ]:
data = 'golf'
letter_list = list(data[i] for i in range(len(data)-1, -1, -1))
print(letter_list)
In [ ]:
print([x + y for x in 'abc' for y in 'lmn'])
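
The cells above mix generator expressions with the closely related comprehension syntax. As a small sketch of the difference: a comprehension in square brackets or braces builds the whole container immediately, while a generator expression in parentheses produces values on demand and can be consumed only once.

```python
squares_list = [x * x for x in range(5)]     # list comprehension: built eagerly
squares_gen = (x * x for x in range(5))      # generator expression: lazy

first_total = sum(squares_gen)    # consumes the generator
second_total = sum(squares_gen)   # nothing is left, so the sum is 0

print(squares_list)
print(first_total, second_total)
print(sum(squares_list), sum(squares_list))  # a list can be reused
```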

Functional tools

filter

In [ ]:
print(*filter(lambda x: x % 2 != 0, range(10)))
1 3 5 7 9
In [ ]:
import math
print(*filter(lambda x: math.sqrt(x) - int(math.sqrt(x)) == 0, range(100)))
0 1 4 9 16 25 36 49 64 81

map

In [ ]:
print(*map(lambda x: x * x, range(10)))
0 1 4 9 16 25 36 49 64 81
In [ ]:
print(*map(lambda c: '_' + c.upper() + '_' , 'hello'))
_H_ _E_ _L_ _L_ _O_
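
In Python 3, filter and map return lazy iterators rather than lists, which is why the examples above unpack them with *. A brief sketch of that laziness, using a hypothetical helper noisy_square that records every call it receives:

```python
calls = []


def noisy_square(x):
    """Square x and remember that we were called (illustrative helper)."""
    calls.append(x)
    return x * x


mapped = map(noisy_square, range(3))
calls_before = list(calls)     # nothing has been computed yet
result = list(mapped)          # consuming the iterator triggers the calls
calls_after = list(calls)

print(calls_before, result, calls_after)
```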

reduce

In [ ]:
from functools import reduce

# Arithmetic series
print(reduce(lambda a, b: a + b, range(1, 5)))

# Factorial
print(reduce(lambda a, b: a * b, range(1, 5)))
10
24
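
The lambdas above can also be replaced by functions from the operator module, and reduce accepts an optional third argument that serves as the starting value. The sketch below assumes nothing beyond the standard library and shows why the starting value matters for empty input:

```python
from functools import reduce
import operator

total = reduce(operator.add, range(1, 5))      # same as the lambda a + b version
product = reduce(operator.mul, range(1, 5))    # same as the lambda a * b version

# With an initial value, an empty iterable simply returns that value
empty_product = reduce(operator.mul, [], 1)

# Without one, reduce has nothing to return and raises TypeError
try:
    reduce(operator.mul, [])
    raised = False
except TypeError:
    raised = True

print(total, product, empty_product, raised)
```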

zip

In [ ]:
x_list = [10, 20, 30]
y_list = [7, 5, 3]
s = sum(x*y for x, y in zip(x_list, y_list))
print(s)
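
One detail worth keeping in mind: zip stops as soon as the shortest input is exhausted. A short sketch with a deliberately shorter y_list (changed here for illustration), alongside itertools.zip_longest, which pads the missing values instead:

```python
from itertools import zip_longest

x_list = [10, 20, 30]
y_list = [7, 5]          # one element shorter than x_list

pairs = list(zip(x_list, y_list))                         # the 30 is silently dropped
padded = list(zip_longest(x_list, y_list, fillvalue=0))   # missing values become 0

print(pairs)
print(padded)
```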

enumerate

In [ ]:
for i, x in enumerate(x * x for x in range(10)):
    print(i, " * ", i, " = ", x)
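
By default enumerate counts from 0. As a small illustration, the optional start argument changes the first index, which is handy for 1-based numbering:

```python
labelled = list(enumerate('abc', start=1))
print(labelled)

for number, letter in enumerate('abc', start=1):
    print(number, letter)
```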

partial

In [ ]:
from functools import partial

binStrToInt = partial(int, base=2)
print(binStrToInt('10010'))
18
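
A partial object remembers the wrapped function and the arguments that were fixed. The sketch below uses a hypothetical name, hex_str_to_int, to show those attributes and to show that a keyword fixed by partial can still be overridden at call time:

```python
from functools import partial

hex_str_to_int = partial(int, base=16)

value = hex_str_to_int('ff')
print(value)

# The partial object exposes what it wraps
print(hex_str_to_int.func, hex_str_to_int.keywords)

# A keyword given at call time takes precedence over the stored one
overridden = hex_str_to_int('777', base=8)
print(overridden)
```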

Task 1

Implement a generator function that takes several sequences as input and returns their elements as a single sequence: first all the elements of the first argument, then those of the second one, and so on.

In [ ]:
sequence1 = '123'
sequence2 = 'ABC'
# implement the function 'combine'
sequence_combined = combine(sequence1, sequence2)
for symbol in sequence_combined:
    print(symbol, end=' ')

# the output should be
# 1 2 3 A B C
1 2 3 A B C 

Task 2

For a number N, output the list of all prime numbers below N. Use the function math.sqrt as opposed to x**0.5 to compute the square root.

Use the function filter to get the output list.

In [ ]:
 

itertools

A standard-library module that provides ready-made iterators for common operations.

https://docs.python.org/3.8/library/itertools.html

islice()

Slices any iterable object, including ones that do not support indexing, such as files and generators.

In [ ]:
text_data = '''First line has data
### Second line doesn't have data
Third line has data
### Every even line does not contain data
Every odd line contains data
### No data
Data here
### No data
Data here
'''

with open('slicefile.txt', 'w') as output_file:
  output_file.write(text_data)
In [ ]:
with open('slicefile.txt') as input_file:
  for line in input_file:
    print(line.strip())
First line has data
### Second line doesn't have data
Third line has data
### Every even line does not contain data
Every odd line contains data
### No data
Data here
### No data
Data here
In [ ]:
import itertools

with open('slicefile.txt') as input_file:
  # itertools.islice(iterable, start, stop, step)
  # Indices 0, 2, 4, ... correspond to the 1st, 3rd, 5th, ... lines of the file
  input_file_even = itertools.islice(input_file, 0, None, 2)
  for line in input_file_even:
    print(line.strip())
First line has data
Third line has data
Every odd line contains data
Data here
Data here
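
The same approach works for generators, which cannot be sliced with square brackets at all. A minimal sketch, assuming a hypothetical endless generator named squares:

```python
from itertools import islice


def squares():
    """Yield 0, 1, 4, 9, ... without end (illustrative helper)."""
    n = 0
    while True:
        yield n * n
        n += 1


# Ordinary slicing is not available on a generator
try:
    squares()[0:5]
    subscriptable = True
except TypeError:
    subscriptable = False

first_five = list(islice(squares(), 5))            # stop only
every_other = list(islice(squares(), 0, 10, 2))    # start, stop, step

print(subscriptable, first_five, every_other)
```

Because islice pulls only as many items as it needs, it is safe to use on an endless generator like this one.
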

combinations()

Yields every combination of a given length from the elements of an iterable.

In [ ]:
import itertools

print(*(itertools.combinations('123456', 2)), sep='\n')
('1', '2')
('1', '3')
('1', '4')
('1', '5')
('1', '6')
('2', '3')
('2', '4')
('2', '5')
('2', '6')
('3', '4')
('3', '5')
('3', '6')
('4', '5')
('4', '6')
('5', '6')

Task 3

You are given a list of data pairs. The first element of a pair is a year, the second one is a data point.

Split the data into groups by year using the itertools.groupby() function. Then, for every year, output the data points greater than 1000. Use the function filter to do it.

Do not use any lists / tuples / sets / dictionaries as variables. Solve everything using groupby, filter, map and other tools for iterable objects.

You can learn how to use groupby here: https://docs.python.org/3.9/library/itertools.html#itertools.groupby

The key part is this:

for k, g in groupby(data, keyfunc):
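
As a hedged illustration of how that loop behaves (on made-up data, not the task's data), the sketch below groups words by their first letter. It also shows a detail that is easy to miss: groupby only merges consecutive items with the same key, so input that is not sorted by key produces repeated groups.

```python
from itertools import groupby

# Hypothetical sample data; note that 'apricot' is separated from the other 'a' words
words = ('apple', 'avocado', 'banana', 'blueberry', 'cherry', 'apricot')

groups = [
    (letter, list(group))
    for letter, group in groupby(words, key=lambda word: word[0])
]

for letter, members in groups:
    print(letter, *members)
```

Each group g is itself a lazy iterator that shares the underlying data with groupby, so it should be consumed before moving on to the next key.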

Your output should be:

Data for the year 1988
4636 1808 1108
Data for the year 1989
3517
Data for the year 1990
2276 2407 1798
In [ ]:
import itertools

data = [
    (1988, 330),
    (1988, 4636),
    (1988, 1808),
    (1988, 1108),
    (1988, 766),
    (1988, 383),
    (1988, 411),
    (1988, 363),
    (1989, 76),
    (1989, 202),
    (1989, 3517),
    (1989, 451),
    (1989, 132),
    (1989, 141),
    (1989, 193),
    (1990, 111),
    (1990, 2276),
    (1990, 2407),
    (1990, 405),
    (1990, 151),
    (1990, 459),
    (1990, 1798)
]

Appendix

The following section is optional background. It covers material that may become the topic of a future workshop.

How iterators are implemented

An iterator that returns the letters of a string in reverse order:

In [ ]:
class Reverse:
    """Iterator for looping over a sequence backwards."""

    def __init__(self, data):
        self.data = data
        self.index = len(data)

    def __iter__(self):
        return self

    def __next__(self):
        if self.index == 0:
            raise StopIteration
        self.index = self.index - 1
        return self.data[self.index]


rev = Reverse('spam')
print(iter(rev))

for char in rev:
    print(char)
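
Because __iter__ returns self, a Reverse object is its own iterator and can be traversed only once. A short sketch of that consequence, repeating the class definition so that the cell is self-contained:

```python
class Reverse:
    """Iterator for looping over a sequence backwards."""

    def __init__(self, data):
        self.data = data
        self.index = len(data)

    def __iter__(self):
        return self

    def __next__(self):
        if self.index == 0:
            raise StopIteration
        self.index = self.index - 1
        return self.data[self.index]


rev = Reverse('spam')
same_object = iter(rev) is rev   # iter() hands back the very same object
first_pass = list(rev)           # walks the string backwards
second_pass = list(rev)          # index is already 0, so nothing is produced

print(same_object, first_pass, second_pass)
```

To iterate again, create a new Reverse object.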

An iterator that returns factorials:

In [ ]:
class Fact:
    """Iterator for calculating factorials."""

    def __init__(self, limit):
        self.n = 1
        self.limit = limit
        self.data = 1

    def __iter__(self):
        return self

    def __next__(self):
        if self.n >= self.limit:
            raise StopIteration
        self.data *= self.n
        self.n += 1
        return self.data


for x in Fact(10):
    print(x)