Strings

The Python string data type is a sequence of zero or more characters: letters, digits, whitespace, or symbols. Because a string is a sequence, it supports the same access operations as other sequence types, namely indexing and slicing.

r'ABC' - raw string. The backslash (\) is treated as a literal character, so escape sequences such as \n are not interpreted

u'ABC' or just 'ABC' - Unicode string. A sequence of Unicode characters. When encoded, each character may take several bytes

b'ABC' - bytes string. Each element is a single byte (an integer from 0 to 255). This is the form in which data is stored on disk

f'ABC' - format string. Expressions inside curly braces {} are evaluated and inserted into the string
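
The differences between the four prefixes can be sketched in a few lines. This is only an illustration: the variable names below are made up for the example and do not appear elsewhere in this notebook.

```python
# Illustrative names only; each literal uses one of the prefixes listed above.
raw_text = r'a\nb'      # backslash and 'n' stay as two separate characters
plain_text = 'a\nb'     # \n is a single newline character
byte_data = b'ABC'      # bytes: indexing yields integers from 0 to 255
name = 'ABC'
formatted = f'{name}!'  # the expression in braces is evaluated and inserted

print(len(raw_text))    # 4
print(len(plain_text))  # 3
print(byte_data[0])     # 65, the ASCII code of 'A'
print(formatted)        # ABC!
```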

[Figure: string vs. bytes in Python]

Raw String

In [ ]:
print(r'a\nb')  # raw string: the backslash is kept, so this prints a\nb
In [ ]:
print('a\nb')   # regular string: \n is read as the newline symbol

Encode / Decode

In [ ]:
# Unicode to bytes
a = 'Строка на русском'
a.encode()
In [ ]:
# Bytes to Unicode
b'\xd0\x9f\xd1\x80\xd0\xb8\xd0\xb2\xd0\xb5\xd1\x82'.decode()
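
A small round-trip sketch may help show why the length in characters and the length in bytes can differ. The encoding is named explicitly here as an assumption for clarity (UTF-8 is also the default for encode() and decode()), and the variable names are illustrative.

```python
# Illustrative round trip: str -> bytes -> str, with UTF-8 named explicitly.
text = 'Привет'
encoded = text.encode('utf-8')
decoded = encoded.decode('utf-8')

print(len(text))        # 6 characters
print(len(encoded))     # 12 bytes: each Cyrillic letter takes 2 bytes in UTF-8
print(decoded == text)  # True
```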

Format string

In [ ]:
# Formatting a string without str.format() or %. No more code duplication!

a = 'some variable'
print(f'y = {a + " Y"} + x')
print(a + " Y")
y = some variable Y + x
some variable Y
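
Besides plain expressions, the braces of a format string can carry a format specification after a colon. The following is a minimal sketch with made-up variable names, showing three common cases.

```python
# Illustrative values; the part after ':' is the format specification.
price = 3.14159
count = 42
label = 'total'

rounded = f'{price:.2f}'  # two digits after the decimal point
padded = f'{count:05d}'   # pad with zeros to a width of 5
aligned = f'{label:>8}|'  # right-align in a field of width 8

print(rounded)  # 3.14
print(padded)   # 00042
print(aligned)  #    total|
```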

Arithmetic operations

Several arithmetic operators are available when working with strings

In [ ]:
# Concatenation
'abc' + 'def'
In [ ]:
# Repetition
'abc' * 3
Out[ ]:
'abcabcabc'
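
Both operators are strict about operand types: + needs two strings, while * needs a string and an integer. A short sketch (the names are illustrative) of what happens when a number is concatenated without conversion:

```python
# Illustrative example: '+' does not convert numbers to strings automatically.
count = 3
try:
    message = 'abc' + count       # raises TypeError
except TypeError:
    message = 'abc' + str(count)  # convert explicitly first

print(message)   # abc3
print('-' * 10)  # ----------
```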

Size of the string

In [ ]:
len('abcdef')   # 6; not displayed here, because a cell shows only the value of its last expression
list('abcdef')
Out[ ]:
['a', 'b', 'c', 'd', 'e', 'f']

Indexing

In [ ]:
a = 'abcdef'
for symbol in a:
  print(symbol)
In [ ]:
for index in range(len(a)):
  print(index, a[index])
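
When both the index and the character are needed, the built-in enumerate() is a common alternative to range(len(...)). A minimal sketch, with illustrative names:

```python
# Illustrative example: enumerate() yields (index, character) pairs.
word = 'abcdef'
pairs = []
for index, symbol in enumerate(word):
    pairs.append((index, symbol))
    print(index, symbol)
```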

Strings in Python are immutable objects

The result of every string operation is a new string. A string cannot be changed in place

In [ ]:
a = 'abcdef'
#    012345
# a[0] = 'A'   # would raise TypeError, because strings do not support item assignment
b = 'A' + a[1:]  # build a new string instead
print(b)
Abcdef
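
The failed assignment can also be observed directly. The sketch below uses illustrative names and catches the error, so the code runs to the end.

```python
# Illustrative example: item assignment on a string raises TypeError.
original = 'abcdef'
try:
    original[0] = 'A'
    assignment_failed = False
except TypeError:
    assignment_failed = True

capitalized = 'A' + original[1:]  # a new string; the original is untouched

print(assignment_failed)  # True
print(original)           # abcdef
print(capitalized)        # Abcdef
```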

Slices

We can also extract a range of characters from a string. A slice selects several characters at once using a pair of index numbers separated by a colon, [x:y], where x is the first index included and y is the first index excluded

In [ ]:
# Selecting a single character by index
a = 'abcdef'
a[4]
In [ ]:
# First four characters (indices 0 to 3)
a[:4]
In [ ]:
# Negative indices count from the end
a[-3]

We can also specify the step of the slice as a third value, [x:y:step]. This parameter is called the stride

In [ ]:
# Step of the slice
a[1:4:2]
In [ ]:
# Reverse the string
a[::-1]
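
A few more slice forms, collected in one place as a sketch. The variable name is illustrative; the comments show what each expression evaluates to.

```python
# Illustrative example of start:stop:step combinations.
letters = 'abcdef'
#          012345

print(letters[1:4])    # bcd   (index 4 is excluded)
print(letters[2:])     # cdef  (from index 2 to the end)
print(letters[-2:])    # ef    (last two characters)
print(letters[::2])    # ace   (every second character)
print(letters[2:100])  # cdef  (out-of-range slice bounds are clipped)
```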

Sequences and strings

In [ ]:
# joining strings
",".join(['Hello', 'student'])
Out[ ]:
'Hello,student'
In [ ]:
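# splitting a string into words (split() separates on whitespace by default)
"Hello, student".split()
Out[ ]:
['Hello,', 'student']
In [ ]:
# splitting a string into characters
list("Hello, student")
Out[ ]:
['H', 'e', 'l', 'l', 'o', ',', ' ', 's', 't', 'u', 'd', 'e', 'n', 't']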
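
split() also accepts an explicit separator and an optional maximum number of splits, and join() reverses the operation. A minimal sketch with an illustrative comma-separated string:

```python
# Illustrative example: split on an explicit separator, then join back.
csv_line = 'Hello,student,again'

fields = csv_line.split(',')      # split at every comma
limited = csv_line.split(',', 1)  # split at the first comma only
rebuilt = ','.join(fields)        # join restores the original string

print(fields)   # ['Hello', 'student', 'again']
print(limited)  # ['Hello', 'student,again']
print(rebuilt)  # Hello,student,again
```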

Case functions

In [ ]:
# upper
a.upper()
Out[ ]:
'ABCDEF'
In [ ]:
# lower
a.lower()
Out[ ]:
'abcdef'
In [ ]:
# Title case: first letter of each word uppercase, the rest lowercase
a = "mysevEralwordtitle"
a.title()
Out[ ]:
'Myseveralwordtitle'
In [ ]:
# Swap case
a.swapcase()
Out[ ]:
'MYSEVeRALWORDTITLE'

String type

In [ ]:
# True if all characters are letters
a.isalpha()
Out[ ]:
True
In [ ]:
# True if the first character is a digit
a[0].isdigit()
Out[ ]:
False
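
These checks apply to the whole string, so a single character of another kind makes the result False. A short sketch with illustrative names:

```python
# Illustrative example: the checks look at every character of the string.
only_letters = 'abc'.isalpha()      # True
letters_and_digit = 'abc1'.isalpha()  # False: '1' is not a letter
with_space = 'a b'.isalpha()        # False: a space is not a letter
only_digits = '123'.isdigit()       # True
empty = ''.isalpha()                # False: an empty string has no letters

print(only_letters, letters_and_digit, with_space, only_digits, empty)
```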

Find and replace

In [ ]:
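# Number of occurrences of a substring
a.count('e')
In [ ]:
# Index of the first occurrence, or -1 if the substring is not found
a.find('e')
In [ ]:
# Replace every occurrence of a substring; returns a new string
a.replace('e', 'AAAAA')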
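
The expected results for these three calls can be sketched as follows. The variable name is illustrative; the string is the same one used in the case examples.

```python
# Illustrative example: count, find, and replace on one string.
title = 'mysevEralwordtitle'

occurrences = title.count('e')          # lowercase 'e' only; 'E' is not counted
first_index = title.find('e')           # index of the first 'e'
missing = title.find('z')               # -1 when the substring is absent
replaced_once = title.replace('t', 'T', 1)  # third argument limits replacements

print(occurrences)    # 2
print(first_index)    # 3
print(missing)        # -1
print(replaced_once)  # mysevEralwordTitle
print(title)          # mysevEralwordtitle (the original is unchanged)
```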

in / not in

In [ ]:
b_string = a
'a' not in b_string
Out[ ]:
False
In [ ]:
'Z' not in a
Out[ ]:
True

Removing excess symbols

In [ ]:
# Remove leading and trailing whitespace
a = '    some noisy string       \n'
print(a.strip())
some noisy string
In [ ]:
# rstrip('\n') removes only trailing newlines, so this string is unchanged
a = '!!!!!some noisy string!!!!!!!'
a.rstrip('\n')
Out[ ]:
'!!!!!some noisy string!!!!!!!'
In [ ]:
index = a.find('s')
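
To remove characters other than whitespace, the set of characters to strip can be passed as an argument. A minimal sketch with illustrative names, comparing strip(), lstrip(), and rstrip():

```python
# Illustrative example: the argument lists the characters to remove from the ends.
noisy = '!!!!!some noisy string!!!!!!!'

both_sides = noisy.strip('!')   # remove '!' from both ends
left_only = noisy.lstrip('!')   # remove '!' from the start only
right_only = noisy.rstrip('!')  # remove '!' from the end only

print(both_sides)  # some noisy string
print(left_only)   # some noisy string!!!!!!!
print(right_only)  # !!!!!some noisy string
```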

Exercise

Repetition encoding (also known as run-length encoding) is one way to compress strings: a run of identical symbols is replaced by that symbol followed by the number of repetitions.

'A6B5C11' -> 'AAAAAABBBBBCCCCCCCCCCC'

Assume the decoded string contains only the letters A-Z or a-z

Read an input string compressed with repetition encoding and decode it.

Part A

Repetition counts are single digits, 1-9

Part B

Repetition counts may be any positive integers, including multi-digit ones

In [ ]:
a.isalpha() # check if all characters of a string are letters
a.isdigit() # check if all characters of a string are digits

Second maximum

Find the second maximum in a sequence of numbers (as before, the numbers are entered one at a time, 0 marks the end of the sequence, and 0 is not part of the sequence)

Example code for reading the input is shown below

In [ ]:
current = int(input())
minimum = current
while current != 0:
    # your code
    current = int(input())

print(minimum)