Nov 2: Sequences
Learning Objectives
After today's class, you should be able to:
- Understand string slicing and indexing
- Learn more string methods
- Split and join strings
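As a quick illustration of the first objective, here is a minimal sketch of indexing and slicing. The sample string and variable names are made up for this example and are not part of the course materials.

```python
# Indexing and slicing a string (the sample text is made up for illustration).
title = "Frankenstein"

first = title[0]             # index 0 is the first character
last = title[-1]             # negative indexes count from the end
prefix = title[:5]           # from the start up to, but not including, index 5
suffix = title[-5:]          # the last five characters
every_other = title[::2]     # a step of 2 skips every other character
reversed_title = title[::-1] # a step of -1 walks the string backwards

print(first, last, prefix, suffix)
print(every_other)
print(reversed_title)
```

Slices never raise an IndexError when they run past the end of the string, but a single index such as `title[50]` does.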
Lesson Outline
Quiz 4B: 25 minutes
- Log in as student on the lab machine
- Open a web browser, go to Canvas, and log in with your eid
- In Canvas, go to Modules, Quiz4B
- Do part one of the Module
- The programming portion is given on a sheet of paper
- Open Thonny
- Submit work using Canvas with Gradescope, Module Quiz4B, Programming
This week
- String Manipulation
  - Python String Methods
  - splitting and joining
  - example joining

    # string.join(iterable), returns a string
    myTuple = ("Sharon", "Kathryn", "Sydney")
    x = "&".join(myTuple)
    print(x)

    myDict = {"name": "John", "country": "Norway"}
    mySeparator = "____"
    x = mySeparator.join(myDict)
    print(x)
- Natural Language Processing
  - Python's NLTK Package
    - install as Thonny package
    - Example: parts of speech
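For the splitting and joining item in the list above, the following sketch shows `split()` as the counterpart of `join()`. The sample strings and variable names are assumptions chosen for illustration. It also shows that joining a dictionary joins its keys, not its values.

```python
# Splitting a string into a list, then joining it back.
line = "Sharon,Kathryn,Sydney"   # made-up sample data

names = line.split(",")          # split on a comma, giving a list of strings
rejoined = " & ".join(names)     # join with a different separator

# split() with no argument splits on any run of whitespace
words = "  to be   or not  ".split()

# Joining a dictionary joins its keys, not its values
my_dict = {"name": "John", "country": "Norway"}
keys_joined = "____".join(my_dict)
values_joined = "____".join(my_dict.values())

print(names)
print(rejoined)
print(words)
print(keys_joined)
print(values_joined)
```

Note that `join()` only accepts strings, so a list of numbers would need to be converted first, for example with `str()`.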
The following program analyzes the book Frankenstein using the Natural Language Toolkit.
from nltk.tokenize import word_tokenize
from nltk.corpus import stopwords

# Note: NLTK's tokenizer and stop-word data must be downloaded once (see nltk.download)

# Read the book into a string
with open("pg84.txt", encoding="utf-8-sig") as file:
    text = file.read()

# Count how many times each word occurs
words = {}  # dictionary mapping word -> count
for word in word_tokenize(text):  # word_tokenize() returns a list of strings
    if word.isalpha():  # Ignore non-words
        word = word.lower()  # Ignore case
        # Update dictionary
        if word in words:
            words[word] += 1
        else:
            words[word] = 1

# Remove stop words (like "the" and "you")
stop_words = set(stopwords.words("english"))
for word in sorted(words.keys()):  # sorted() returns a new list, so popping from the dict is safe
    if word in stop_words:
        count = words.pop(word)
        print(f"Removed {count} instances of {word}")

# Display other frequently used words
print()
for word in sorted(words.keys()):
    count = words[word]
    if count >= 50:
        print(f"Found {count} instances of {word}")
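The counting pattern in the program above can also be tried without NLTK. The sketch below is a rough approximation, not the program's method: it uses `collections.Counter` from the standard library, tokenizes with `str.split()` instead of `word_tokenize()`, and works on a made-up sample sentence with a tiny hand-picked stop-word set instead of pg84.txt and NLTK's English list.

```python
from collections import Counter

# Made-up sample text; the real program reads pg84.txt instead.
text = "The monster saw the light. The light was warm, and the monster wept."

# A tiny hand-picked stop-word set (an assumption; NLTK's list is much longer).
stop_words = {"the", "and", "was"}

# Strip surrounding punctuation from each token and lowercase it.
tokens = [w.strip(".,;:!?\"'").lower() for w in text.split()]

# Keep alphabetic words that are not stop words, and count them.
counts = Counter(w for w in tokens if w.isalpha() and w not in stop_words)

for word, count in counts.most_common():
    print(f"Found {count} instances of {word}")
```

Filtering stop words before counting avoids the separate removal loop, at the cost of not reporting how many instances were removed.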
In-class Activity
Your To-Do List
By today
- Start Programming Assignment 2: PA 2
  - Three parts
    - Part A: 10 points Readiness Quiz
      - Due Monday Nov 6th
    - Part B: 30 points PA 2
      - Due Wednesday Nov 8th
    - Part C: 60 points PA 2
      - Due Wednesday Nov 15th
    - PA 2 Attribution
      - Due Wednesday Nov 15th
Complete by Monday November 6th
- Complete the Chapter 10 "orange" textbook activities. You must do this through CANVAS to receive credit.