import os
os.scandir(), os.listdir()
https://docs.python.org/3/library/os.html#os.scandir
When you loop over os.scandir(), the loop variable is a special os.DirEntry object:
https://docs.python.org/3/library/os.html#os.DirEntry
These objects have attributes .name and .path and methods .is_dir() and .is_file() to make working with them easier. They can also be passed directly to open(file, mode).
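As a small illustration of the attributes and methods listed above, the sketch below builds a throwaway folder with tempfile and inspects each os.DirEntry in it. The names notes.txt and sub are made up for the example and are not part of the workshop files.

```python
import os
import tempfile

# Build a temporary folder with one file and one subfolder (hypothetical names).
with tempfile.TemporaryDirectory() as tmp:
    with open(os.path.join(tmp, 'notes.txt'), 'w') as f:
        f.write('hello')
    os.mkdir(os.path.join(tmp, 'sub'))

    kinds = {}
    with os.scandir(tmp) as entries:
        for entry in entries:
            # entry.name is the bare name, entry.path also includes the folder
            kinds[entry.name] = 'dir' if entry.is_dir() else 'file'
            print(entry.name, entry.path, entry.is_file(), entry.is_dir())

    # A DirEntry can be passed straight to open()
    with os.scandir(tmp) as entries:
        for entry in entries:
            if entry.is_file():
                with open(entry) as f:
                    first_file_text = f.read()
```

Using os.scandir() in a with statement is optional here, but it closes the iterator promptly once the loop is done.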
for file in os.scandir():
    print(file)
for file in os.listdir():
    print(file)
for file in os.listdir('data_folder'):
    print(file)
os.path.isfile(path) and os.path.isdir(path)
These functions work with both strings and os.DirEntry objects.
for file in os.scandir():
    if os.path.isfile(file):
        print(f'{file.name} is a file')
    if os.path.isdir(file):
        print(f'{file.name} is a directory')
os.path.getsize(path) returns the size of the file in bytes. It may not be the same as the number of characters (some characters take more than 1 byte to write).
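To see the difference between bytes and characters, here is a minimal sketch. It assumes the file is written in UTF-8 and uses a temporary file with a made-up name, sample.txt.

```python
import os
import tempfile

# 10 characters, two of them non-ASCII (written with escapes to avoid ambiguity)
text = 'na\u00efve caf\u00e9'

with tempfile.TemporaryDirectory() as tmp:
    sample_path = os.path.join(tmp, 'sample.txt')
    # In UTF-8 each of the two accented letters takes 2 bytes
    with open(sample_path, 'w', encoding='utf-8') as f:
        f.write(text)
    size_in_bytes = os.path.getsize(sample_path)

n_chars = len(text)
print(f'{n_chars} characters, {size_in_bytes} bytes')
```

With a different encoding the byte count would change, while the number of characters stays the same.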
path = './text.txt'
print(f'size of {path} is {os.path.getsize(path)} bytes')
print()
for file in os.scandir():
    if os.path.isfile(file):
        print(f'size of {file.name} is {os.path.getsize(file)} bytes')
        with open(file, 'r') as f:
            try:
                # Reading fails for files that are not valid text in the default encoding
                contents = f.read()
                print(f'length of string of all contents of the file is {len(contents)} characters')
            except UnicodeDecodeError:
                print("Can't decode Unicode in the file")
        print()
Print names of all files in the current folder with size larger than 100 bytes.
Print names of all files in the current folder that contain the word "print".
Additional part: print a line that contains the word "print" after the name of the file. If several lines contain this word, print any one of them.
Print a list of all files in the current folder and its subfolders.
The lists below are formatted for illustration; you can just print a list of names.
You can get the name of the current folder with os.getcwd(). Use os.scandir(subfolder) or os.listdir(subfolder) to get lists of files in folders other than the current one.
You can use the line print(os.path.split(os.getcwd())[-1] + '/') to print the current directory without the full path.
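The hints above can be tried out in isolation. The sketch below is only an illustration: it creates a temporary folder, and the names datafolder and data.txt are placeholders rather than the real workshop files.

```python
import os
import tempfile

# Name of the current folder without the full path
cwd = os.getcwd()
folder_name = os.path.split(cwd)[-1]
print(folder_name + '/')

# Listing a folder other than the current one (temporary, illustrative layout)
with tempfile.TemporaryDirectory() as tmp:
    subfolder = os.path.join(tmp, 'datafolder')
    os.mkdir(subfolder)
    with open(os.path.join(subfolder, 'data.txt'), 'w') as f:
        f.write('1,2,3')

    # os.listdir returns a list of plain strings
    names_from_listdir = os.listdir(subfolder)
    # os.scandir yields os.DirEntry objects
    with os.scandir(subfolder) as entries:
        names_from_scandir = [entry.name for entry in entries]

print(names_from_listdir, names_from_scandir)
```

Note that os.listdir gives only names, so os.path.join(subfolder, name) is needed before checking os.path.isdir on them, while a DirEntry already carries its path.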
In the examples below, this is the actual structure of files:
folder/
|-data.txt
|-Workshop 20. File system. Practice.ipynb
|-my_file2.txt
|-datafolder/
  |-data.txt
  |-moredata.txt
  |-another_folder/
    |-data.csv
    |-folder/
      |-hidden_data.json
|-more_folders/
  |-Workshop 21.ipynb
Version 1: 2 levels. Only the folder and the folders one level down. No formatting. Order doesn't matter. Add / to names of folders.
folder/
data.txt
Workshop 20. File system. Practice.ipynb
my_file2.txt
datafolder/
data.txt
moredata.txt
another_folder/
more_folders/
Workshop 21.ipynb
Version 2: 2 levels. Formatting as above. Order of files in a folder doesn't matter.
folder/
|-data.txt
|-Workshop 20. File system. Practice.ipynb
|-my_file2.txt
|-datafolder/
  |-data.txt
  |-moredata.txt
  |-another_folder/
|-more_folders/
  |-Workshop 21.ipynb
Version 3: all levels. Formatted as the example below. Order of files in a folder doesn't matter.
folder/
|-data.txt
|-Workshop 20. File system. Practice.ipynb
|-my_file2.txt
|-datafolder/
  |-data.txt
  |-moredata.txt
  |-another_folder/
    |-data.csv
    |-folder/
      |-hidden_data.json
|-more_folders/
  |-Workshop 21.ipynb
Write a program that simulates coin flips by generating random numbers. Generate a sequence of coin flips until you get either heads or tails 3 times in a row.
Run the simulation 15 times. Compute the average number of flips it takes to get the same result 3 times in a row.
H T H T T T (6 flips)
H H T H H T H H H (9 flips)
T T T (3 flips)
...
T H H H (4 flips)
Average: 8.1 flips
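One possible way to approach this task is sketched below; it is not the only solution. The helper name flips_until_streak and the fixed seed are assumptions made for the example, and the numbers it prints will differ from the sample output above.

```python
import random

def flips_until_streak(rng, streak=3):
    """Flip a coin until the same side comes up `streak` times in a row.

    Returns the full list of flips as 'H'/'T' strings.
    """
    flips = []
    run = 0  # length of the current run of identical results
    while run < streak:
        flip = rng.choice('HT')
        # Extend the current run or start a new one
        if flips and flips[-1] == flip:
            run += 1
        else:
            run = 1
        flips.append(flip)
    return flips

# Fixed seed (an assumption) so that reruns print the same sequences
rng = random.Random(0)
runs = [flips_until_streak(rng) for _ in range(15)]
for flips in runs:
    print(' '.join(flips), f'({len(flips)} flips)')

average = sum(len(flips) for flips in runs) / len(runs)
print(f'Average: {average:.1f} flips')
```

Passing the random generator in as an argument keeps the helper easy to test; calling random.choice directly would work just as well for the exercise.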
The file sales.csv contains a fictional set of sales data for employees of some company.
The structure of the file is the following:
division,level of education,training level,work experience,salary,sales
computer hardware,some college,1,5,81769,302611
peripherals,bachelor's degree,1,4,89792,274336
peripherals,high school,0,5,70797,256854
The first row is a header, then all rows contain data.
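A file with this layout can be read with the standard csv module. The sketch below is a minimal illustration that parses only the three sample rows shown above from a string; the variable names are made up for the example.

```python
import csv
import io

# The header and the three sample rows shown above
sample = """division,level of education,training level,work experience,salary,sales
computer hardware,some college,1,5,81769,302611
peripherals,bachelor's degree,1,4,89792,274336
peripherals,high school,0,5,70797,256854
"""

# DictReader uses the first row as the header, so each row becomes a dictionary
reader = csv.DictReader(io.StringIO(sample))
rows = list(reader)

for row in rows:
    # All values are read as strings, so numbers need an explicit conversion
    print(row['division'], int(row['salary']), int(row['sales']))

total_sales = sum(int(row['sales']) for row in rows)
print(total_sales)
```

To read the real file instead, the same DictReader can be wrapped around open('sales.csv', newline=''), assuming sales.csv is in the current folder.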
Your task is the following:
Write a function word_count() that takes a string as an input and outputs a dictionary where words are keys and the number of their appearances are values.
In this task, a word is always one of
Additional rules and assumptions:
{'the': 2}.
{'the': 1, 'spacebar': 1, 'is': 1, 'broken': 1}.
You can use the string module to get all punctuation characters.
import string
print(string.punctuation)
import string
def word_count(sentence):
    word_dict = {}
    return word_dict
# Tests
def check(output, answer):
    if output == answer:
        print("OK")
        print(output)
    else:
        print("WA")
        print(f"Expected {answer}, got {output}")
tests = [('word',{'word': 1}),
('a word and another word',{'a': 1, 'word': 2, 'and': 1, 'another': 1}),
('punctuation? You! Ignore it: all^ of*((%@)) %#@%:it',
{'punctuation': 1, 'you': 1, 'ignore': 1, 'it': 2, 'all': 1, 'of': 1}),
("Don't forget about apostrophes. They're tricky",
{"don't": 1, 'forget': 1, 'about': 1, 'apostrophes': 1, "they're": 1, 'tricky': 1}),
("'apostrophes' are just regular words, like apostrophes.",
{'apostrophes': 2, 'are': 1, 'just': 1, 'regular': 1, 'words': 1, 'like': 1}),
('be extra !!! careful *** not to make words out of nothing',
{'be': 1, 'extra': 1, 'careful': 1, 'not': 1, 'to': 1, 'make': 1, 'words': 1, 'out': 1, 'of': 1, 'nothing': 1}),
("''double'' quotation marks", {'double': 1, 'quotation': 1, 'marks': 1}),
("The_spacebar_is_broken,so.i'm_writing.Like.This",
{'the': 1, 'spacebar': 1, 'is': 1, 'broken': 1, 'so': 1, "i'm": 1, 'writing': 1, 'like': 1, 'this': 1})
]
for test in tests:
    check(word_count(test[0]), test[1])