Exercises
A great resource for downloading text files is Project Gutenberg, a library of over 70,000 free eBooks, mostly older works for which U.S. copyright has expired. You can download them (including as plain text files) or read them online.
Here is a small extract from the book "Photographic investigations of faint nebulae" by Edwin Hubble (1920). The complete text is also available on the Project Gutenberg site.
Write a function that reads bookSampleHubble.txt and prints only the words with more than 10 characters (not counting whitespace).
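One possible starting point is sketched below. The function name print_long_words and the min_length parameter are hypothetical, not part of the exercise, and the sketch assumes that a "word" is any whitespace-separated token, so attached punctuation counts toward its length.

```python
def print_long_words(filename, min_length=10):
    """Print every whitespace-separated word longer than min_length characters.

    Assumption: punctuation attached to a word is counted as part of it.
    """
    with open(filename, encoding="utf-8") as f:
        for line in f:
            # split() with no arguments splits on any run of whitespace
            for word in line.split():
                if len(word) > min_length:
                    print(word)
```

With the extract saved next to the script, the call would be print_long_words("bookSampleHubble.txt").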
In 1939 Ernest Vincent Wright published a 50,000 word novel called "Gadsby" that does not contain the letter 'e'. Since 'e' is the most common letter in English, that's not easy to do.
Write a function called has_no_e(filename) that returns True if the given text file doesn't have the letter e in it.
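A minimal sketch of one way to approach this follows. It assumes the check should be case-insensitive (so an uppercase 'E' also counts), which the exercise does not state explicitly.

```python
def has_no_e(filename):
    """Return True if the file contains no letter 'e' (in either case)."""
    with open(filename, encoding="utf-8") as f:
        for line in f:
            # Assumption: 'E' counts as the letter e, so compare in lowercase.
            if "e" in line.lower():
                return False  # stop at the first 'e' found
    return True
```

Reading line by line lets the function return early without loading the whole file into memory.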
Write a function no_e_percentage(filename) that computes the percentage of the words in the file that have no e.
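The counting could be sketched as follows. This version assumes words are whitespace-separated tokens, treats 'E' and 'e' alike, and returns 0.0 for an empty file; all three choices are assumptions rather than requirements of the exercise.

```python
def no_e_percentage(filename):
    """Return the percentage (0-100) of words in the file that contain no 'e'."""
    total = 0
    without_e = 0
    with open(filename, encoding="utf-8") as f:
        for line in f:
            for word in line.split():
                total += 1
                if "e" not in word.lower():
                    without_e += 1
    if total == 0:
        return 0.0  # assumption: an empty file yields 0 rather than an error
    return 100.0 * without_e / total
```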
Write a function named avoids(filename, forbidden) that takes a text file's name and a string of forbidden letters, and that returns the set of words that don't use any of the forbidden letters.
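One way this might look is sketched below. It assumes words should be lowercased and stripped of leading and trailing punctuation before being added to the set; the exercise leaves both decisions open.

```python
import string


def avoids(filename, forbidden):
    """Return the set of words in the file that use none of the forbidden letters."""
    forbidden_letters = set(forbidden.lower())
    result = set()
    with open(filename, encoding="utf-8") as f:
        for line in f:
            for token in line.split():
                # Assumption: normalize by trimming punctuation and lowercasing.
                word = token.strip(string.punctuation).lower()
                # isdisjoint is True when the word shares no letter with forbidden
                if word and forbidden_letters.isdisjoint(word):
                    result.add(word)
    return result
```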
Modify your program to find a combination of 5 forbidden letters that excludes the smallest number of words.
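A brute-force search is one option here, since there are only 65,780 ways to choose 5 letters out of 26. The sketch below is not the only approach; the helper names letter_mask and least_excluding_letters are hypothetical, and it assumes that "number of words" means distinct words and that only the letters a to z matter. Each word is encoded as a 26-bit mask so that testing a combination against a word is a single bitwise AND.

```python
from collections import Counter
from itertools import combinations
import string


def letter_mask(letters):
    """Encode the a-z letters in an iterable of characters as a 26-bit integer."""
    mask = 0
    for ch in letters:
        if "a" <= ch <= "z":
            mask |= 1 << (ord(ch) - ord("a"))
    return mask


def least_excluding_letters(words, k=5):
    """Return (letters, excluded_count) for the k letters excluding the fewest distinct words."""
    # Words with the same letter set behave identically, so count each mask once.
    mask_counts = Counter(letter_mask(w.lower()) for w in set(words))
    best_combo, best_count = None, None
    for combo in combinations(string.ascii_lowercase, k):
        forbidden = letter_mask(combo)
        # A word is excluded if it shares at least one bit with the forbidden mask.
        count = sum(n for m, n in mask_counts.items() if m & forbidden)
        if best_count is None or count < best_count:
            best_combo, best_count = "".join(combo), count
    return best_combo, best_count
```

The word list can come from any earlier step, for example avoids(filename, "") from a solution that returns every word when nothing is forbidden. Ties are broken in favor of the alphabetically first combination.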
Write a function named redact_uses_only(inputfile, outputfile, letters) that reads the text file inputfile, redacts every word that is not composed only of characters from the string letters, and writes the redacted text to outputfile. For example, if letters == 'ehlo', the text 'Hello, I am in hell' should be redacted to 'Hello, _ __ __ hell'.
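A possible sketch using a regular expression is shown below. Based on the example, it assumes that matching is case-insensitive, that punctuation and whitespace are left untouched, and that each redacted word is replaced by one underscore per character. Treating a "word" as a run of ASCII letters is a further assumption.

```python
import re


def redact_uses_only(inputfile, outputfile, letters):
    """Copy inputfile to outputfile, replacing words that use other letters with underscores."""
    allowed = set(letters.lower())

    def redact(match):
        word = match.group()
        # Keep the word only if every one of its letters is in the allowed set.
        if set(word.lower()) <= allowed:
            return word
        return "_" * len(word)

    with open(inputfile, encoding="utf-8") as src:
        text = src.read()
    with open(outputfile, "w", encoding="utf-8") as dst:
        # Assumption: a word is a run of ASCII letters; everything else passes through.
        dst.write(re.sub(r"[A-Za-z]+", redact, text))
```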