Sets

In [ ]:
new_set = {1,2,3,1,2,3} # set literal in braces; duplicates are dropped
new_set

In [ ]:
another_set = set([3,1,2])
print(another_set)

In [ ]:
new_set == another_set

In [ ]:
print(set(range(10)))
print(set('qwertyqwerty')) # element order is arbitrary
print(set(map(abs,(1,2,3,-1,-2,-3))))

In [ ]:
mixed_set = {'2',0,False,(3,2,1)} # 0 == False, so only one of them is kept; a tuple is allowed, a list is not
mixed_set

In [ ]:
mixed_set = {1,'2',True,(1,2,3), [1, 2]} # TypeError: a list is unhashable, so it cannot be a set element
mixed_set

In [ ]:



A hash function does not guarantee that if $A>B$ then $h(A)>h(B)$

In [ ]:
print(sorted(set('qwertyqwertyaa')))

In [ ]:
for i in set((1,2,3,4,5,11,2,23,1,2,-1,-1)):
    print(i,end='\t')
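
The claim in the heading can be checked directly. The sketch below relies on a CPython implementation detail (an assumption, not a language guarantee): an integer hashes to its value reduced modulo `sys.hash_info.modulus`, so a very large integer can have a smaller hash than a small one. The names `big`, `small` and `values` are illustrative.

```python
import sys

# CPython detail (assumption): hash(n) for an int is n reduced modulo
# sys.hash_info.modulus, so the modulus itself hashes to 0.
big = sys.hash_info.modulus
small = 1

print(big > small)              # True: A > B
print(hash(big) > hash(small))  # False on CPython: h(A) < h(B)

# Consequence: iterating over a set does not follow the order of the values,
# so sort explicitly when order matters.
values = {big, small, 23, -1}
print(sorted(values) == [-1, 1, 23, big])  # True
```

This is why the cells above call `sorted(...)` whenever a predictable order is needed.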

In [ ]:
if 1 in {1,2,3}:
    print('YES')

In [ ]:
if 1 in {2,3} & {1,5}: # intersection
    print('YES')

In [ ]:
if 1 in {2,3} | {1,5}: # union
    print('YES')

In [ ]:
print({2,3,1} | {1,5})

In [ ]:
if 1 in {2,3} - {1,5}: # difference
    print('YES')

In [ ]:
if 1 in {2,3} ^ {1,5}: # in A|B but not in A&B (xor)
    print('YES')
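
As a small sketch of how the operators above relate to each other, the cell below compares each operator with its method form and checks the xor identity from the last comment. The sets `A` and `B` are illustrative; `B` contains 3 so that the intersection is not empty.

```python
A = {2, 3}
B = {1, 5, 3}

# Each operator has an equivalent method.
print(A & B == A.intersection(B))          # True
print(A | B == A.union(B))                 # True
print(A - B == A.difference(B))            # True
print(A ^ B == A.symmetric_difference(B))  # True

# xor keeps the elements that are in exactly one of the two sets.
print(A ^ B == (A | B) - (A & B))          # True
print(sorted(A ^ B))                       # [1, 2, 5]
```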


Set methods

In [ ]:
t_set = set(range(5))
t_set

In [ ]:
len(t_set)

In [ ]:
t_set.discard(1) # removes the element if present; no error if it is missing


Removing

In [ ]:
t_set.remove(2) # unlike discard, remove raises KeyError if the element is missing
t_set.pop() # removes and returns an arbitrary element
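
The difference between the three removal methods is easiest to see side by side. This is a minimal sketch on a fresh set; the names `t` and `item` are illustrative and independent of `t_set`.

```python
t = set(range(5))   # {0, 1, 2, 3, 4}

t.discard(10)       # missing element: silently does nothing
try:
    t.remove(10)    # missing element: raises KeyError
except KeyError as err:
    print('KeyError:', err)   # KeyError: 10

item = t.pop()      # removes and returns an arbitrary element
print(item in t)    # False
print(len(t))       # 4
```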

In [ ]:
t_set.clear()
t_set

In [ ]:
del t_set


Dictionaries

In [2]:
# JSON-like structure
phones = {"Mom": ["+79999999999","+79999999998"],
          "Bro": "+71111111111",
          "Dad": "+77777777777"}
print(*phones)

Mom Bro Dad

In [ ]:
print(phones['Mom'],phones.get('Mom'))
print(phones.get('Mom','No such number'))
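
The default argument of `get` only matters when the key is missing. The sketch below uses a hypothetical dictionary `book` (not the `phones` dictionary above) to compare `get` with plain indexing for an absent key.

```python
# Hypothetical phone book, separate from the one above.
book = {'Mom': '+79999999999', 'Bro': '+71111111111'}

print(book.get('Mom', 'No such number'))   # +79999999999
print(book.get('Sis', 'No such number'))   # No such number
print(book.get('Sis'))                     # None

try:
    book['Sis']                            # indexing a missing key raises KeyError
except KeyError as err:
    print('KeyError:', err)                # KeyError: 'Sis'
```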

In [ ]:
squares = {}
squares[0] = 0
squares[1] = 1
squares[2] = 4

squares['surprise'] = 'here'
print(squares)

In [ ]:
sqrts = {}
sqrts[1] = 1
sqrts[4] = 2
sqrts[9] = 3
sqrts

In [ ]:
del sqrts

In [1]:
dict([[1,2],(2,3),['no','yes']])

Out[1]:
{1: 2, 2: 3, 'no': 'yes'}
In [4]:
print(phones.items())
my_list = [4124, 1251, 5321]
for index, value in enumerate(my_list):
    print(index, value)
for name,number in phones.items():
    print(name,number,sep=' <<<<<<->>>>>> ')

dict_items([('Mom', ['+79999999999', '+79999999998']), ('Bro', '+71111111111'), ('Dad', '+77777777777')])
0 4124
1 1251
2 5321
Mom <<<<<<->>>>>> ['+79999999999', '+79999999998']
Bro <<<<<<->>>>>> +71111111111
Dad <<<<<<->>>>>> +77777777777

In [5]:
phones.values(),phones.keys()

Out[5]:
(dict_values([['+79999999999', '+79999999998'], '+71111111111', '+77777777777']),
dict_keys(['Mom', 'Bro', 'Dad']))
In [ ]:
phones['Mo'] = '+70000000000'
phones

In [7]:
for i in phones: # iterating over a dict yields its keys
    print(i)
    print(phones[i])

Mom
['+79999999999', '+79999999998']
Bro
+71111111111
Dad
+77777777777

In [8]:
for key, value in phones.items():
    print(key, 'is', value)

Mom is ['+79999999999', '+79999999998']
Bro is +71111111111
Dad is +77777777777

In [ ]:
phones.items()

In [ ]:
deleted = phones.pop('Bro')

In [ ]:
sorted(phones) # keys

In [9]:
Capitals = dict()

Capitals['Russia'] = 'Moscow'
Capitals['Ukraine'] = 'Kiev'
Capitals['USA'] = 'Washington'

Countries = ['Russia', 'France', 'USA', 'Russia']

for country in Countries:
    if country in Capitals:
        print('Capital of ' + country + ': ' + Capitals[country])
    else:
        pass  # country is not in Capitals: nothing is printed

Capital of Russia: Moscow
Capital of USA: Washington
Capital of Russia: Moscow

In [13]:
Capitals = {'Russia': 'Moscow', 'Ukraine': 'Kiev', 'USA': 'Washington'}
Capitals = dict(Russia = 'Moscow', Ukraine = 'Kiev', USA = 'Washington')
Capitals = dict([("Russia", "Moscow"), ("Ukraine", "Kiev"), ("USA", "Washington")])
Capitals = dict(zip(["Russia", "Ukraine", "USA"], ["Moscow", "Kiev", "Washington"]))
print(list(zip(["Russia", "Ukraine", "USA"], ["Moscow", "Kiev", "Washington"])))

[('Russia', 'Moscow'), ('Ukraine', 'Kiev'), ('USA', 'Washington')]

In [ ]:
A = {'ab' : 'ba', 'aa' : 'aa', 'bb' : 'bb', 'ba' : 'ab', 'ac' : 'ca'}

key = 'ac'
if key in A:
    del A[key]  # safe: the key is known to exist
try:
    del A[key]  # the key is already gone, so this raises KeyError
except KeyError:
    print('There is no element with key "' + key + '" in dict')
print(A)

In [ ]:
A = dict(zip('abcdef', list(range(6))))
for key in A:
    print(key, A[key])

In [ ]:
A = dict(zip('abcdef', list(range(6))))
for key, val in A.items():
    print(key, val)


In [ ]:



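You are given two pieces of text, both potentially containing many words. Output the number of unique words that appear in both texts, followed by the sorted list of those words. Words are separated by whitespace, so punctuation stays attached to a word (e.g. 'cyclone,'). (Hint: use sets, then sorted(my_set).)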

Example

Text A

Hurricane Gonzalo was the second tropical cyclone, after Hurricane Fay, to directly strike the island of Bermuda in a one-week time frame in October 2014, and was the first Category 4 Atlantic hurricane since Hurricane Ophelia in 2011. At the time, it was the strongest hurricane in the Atlantic since Igor in 2010. Gonzalo struck Bermuda less than a week after the surprisingly fierce Hurricane Fay; 2014 was the first season in recorded history to feature two hurricane landfalls in Bermuda. A powerful Atlantic tropical cyclone that wrought destruction in the Leeward Islands and Bermuda, Gonzalo was the seventh named storm, sixth and final hurricane and only the second major hurricane of the below-average 2014 Atlantic hurricane season. The storm formed from a tropical wave on October 12, while located east of the Lesser Antilles. It made landfall on Antigua, Saint Martin, and Anguilla as a Category 1 hurricane, causing damage on those and nearby islands. Antigua and Barbuda sustained US\$40 million in losses, and boats were abundantly damaged or destroyed throughout the northern Leeward Islands. The storm killed three people on Saint Martin and Saint Barthélemy. Gonzalo tracked northwestward as it intensified into a major hurricane. Eyewall replacement cycles led to fluctuations in the hurricane's structure and intensity, but on October 16, Gonzalo peaked with maximum sustained winds of 145 mph (230 km/h).

Text B

Hurricane Ophelia was the most intense hurricane of the 2011 Atlantic hurricane season. The seventeenth tropical cyclone, sixteenth tropical storm, fifth hurricane, and third major hurricane, Ophelia originated in a tropical wave in the central Atlantic, forming approximately midway between the Cape Verde Islands and the Lesser Antilles on September 17. Tracking generally west-northwestward, Ophelia was upgraded to a tropical storm on September 21, and reached an initial peak of 65 mph (100 km/h) on September 22. As the storm entered a region of higher wind shear it began to weaken, and was subsequently downgraded to a remnant low on September 25. The following day, however, the remnants of the system began to reorganize as wind shear lessened, and on September 27, the National Hurricane Center once again began advisories on the system. Moving northward, Ophelia regained tropical storm status early on September 28, and rapidly deepened to attain its peak intensity with maximum sustained winds of 140 mph (220 km/h) several days later. The system weakened as it entered cooler sea surface temperatures and began a gradual transition to an extratropical cyclone, a process it completed by October 3.

Output

31
Atlantic Hurricane Islands Lesser October Ophelia The a and as cyclone, hurricane hurricane, in it major maximum mph of on season. storm storm, sustained the to tropical was wave winds with
In [ ]:



Modify the solution to the last task so that the same word written in different case (e.g. 'Appear' and 'appear') is counted as the same word.

When outputting the list of common words, you may use any case.

In [ ]:



Hash functions

In [ ]:
def myHASHer(string):
    # Encode each character as its zero-padded 3-digit code point
    hash_string = ''
    for i in string:
        hash_string += '{:03d}'.format(ord(i))

    return hash_string

In [ ]:
hash_string = myHASHer('Hello, HSE!)_:2020')
hash_string

In [ ]:
def unHASHer(hash_string):
    # Inverse of myHASHer: decode each 3-digit group back into a character
    string = ''
    for i in range(len(hash_string)//3):
        string += chr(int(hash_string[3*i:3*i+3]))
    return string

In [ ]:
unHASHer(hash_string)

In [ ]:
import hashlib
hashlib.algorithms_available

In [ ]:
a ='''
{'blake2b',
'blake2b512',
'blake2s',
'blake2s256',
'md4',
'md5',
'md5-sha1',
'ripemd160',
'sha1',
'sha224',
'sha256',
'sha3-224',
'sha3-256',
'sha3-384',
'sha3-512',
'sha384',
'sha3_224',
'sha3_256',
'sha3_384',
'sha3_512',
'sha512',
'sha512-224',
'sha512-256',
'shake128',
'shake256',
'shake_128',
'shake_256',
'sm3',
'whirlpool'}
'''
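print(hashlib.sha256(a.encode()).hexdigest())        # long input, still a 64-character digest
print(hashlib.sha256('hellp'.encode()).hexdigest())  # one character differs from 'hello'
print(hashlib.sha256('hello'.encode()).hexdigest())
print(hashlib.sha256('hello'.encode()).hexdigest())  # same input, same digest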

In [ ]:
12352356473273


The ideal cryptographic hash function has five main properties:

• it is deterministic so the same message always results in the same hash
• it is quick to compute the hash value for any given message
• it is infeasible to generate a message from its hash value except by trying all possible messages
• a small change to a message should change the hash value so extensively that the new hash value appears uncorrelated with the old hash value
• it is infeasible to find two different messages with the same hash value

https://en.wikipedia.org/wiki/Cryptographic_hash_function
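
Some of these properties can be observed with `hashlib`. The sketch below checks determinism and counts how many output bits change after a one-character edit; it illustrates the properties but does not prove them. The helper `sha256_hex` and the variable names are illustrative.

```python
import hashlib

def sha256_hex(text):
    """Return the SHA-256 digest of a string as hex (helper name is illustrative)."""
    return hashlib.sha256(text.encode()).hexdigest()

h1 = sha256_hex('hello')
h2 = sha256_hex('hello')
h3 = sha256_hex('hellp')

print(h1 == h2)   # True: deterministic
print(h1 == h3)   # False: different messages, different digests
print(len(h1))    # 64 hex characters = 256 bits, whatever the input length

# Count how many of the 256 output bits differ after a one-character change.
diff_bits = bin(int(h1, 16) ^ int(h3, 16)).count('1')
print(diff_bits)  # typically close to half of the 256 bits
```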