Nov 2: Sequences

Learning Objectives

After today's class, you should be able to:

  • Index and slice strings
  • Use additional string methods
  • Split and join strings
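
As a quick illustration of the first objective, the sketch below shows indexing and slicing on a single string. The variable names and the sample string are chosen here for illustration and are not part of the course materials.

```python
# A minimal sketch of indexing and slicing; the sample string is illustrative only.
title = "Frankenstein"

first = title[0]          # index 0 is the first character
last = title[-1]          # negative indexes count from the end
prefix = title[:5]        # characters at indexes 0 through 4
suffix = title[-5:]       # the last five characters
backwards = title[::-1]   # a step of -1 walks the string in reverse

print(first, last, prefix, suffix, backwards)
```

A slice never raises an error when it runs past the end of the string, whereas a single index such as title[50] would raise IndexError.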

Lesson Outline

Quiz 4B: 25 minutes

  • Log in as student on the lab machine
  • Open a web browser, go to Canvas, and log in with your EID
  • In Canvas, go to Modules, Quiz4B
  • Do part one of the Module
  • The programming portion is given on a sheet of paper
  • Open Thonny
  • Submit work using Canvas with Gradescope, Module Quiz4B, Programming

This week

  • String Manipulation

    • Python String Methods
    • splitting and joining
    • example joining
      # string.join(iterable), returns a string
      
      myTuple = ("Sharon", "Kathryn", "Sydney")
      x = "&".join(myTuple)
      print(x)
      
      # Joining a dictionary joins its keys, not its values
      myDict = {"name": "John", "country": "Norway"}
      mySeparator = "____"
      x = mySeparator.join(myDict)
      print(x)
      
  • Natural Language Processing

The following program analyzes the book Frankenstein using the Natural Language Toolkit.

# The tokenizer and stop word data must be downloaded once with nltk.download()
from nltk.tokenize import word_tokenize
from nltk.corpus import stopwords

# Read the book into a string (the with statement closes the file afterward)
with open("pg84.txt", encoding="utf-8-sig") as file:
    text = file.read()

# Count how many times each word occurs
words = {}   # dictionary mapping each word to its count
for word in word_tokenize(text):   # word_tokenize() returns a list of strings
    if word.isalpha():       # Ignore non-words
        word = word.lower()  # Ignore case
        # Update dictionary
        if word in words:
            words[word] += 1
        else:
            words[word] = 1

# Remove stop words (like "the" and "you")
stop_words = set(stopwords.words("english"))
# sorted() returns a new list of the keys, so popping from words inside the loop is safe
for word in sorted(words.keys()):
    if word in stop_words:
        count = words.pop(word)
        print(f"Removed {count} instances of {word}")

# Display other frequently used words
print()
for word in sorted(words.keys()):
    count = words[word]
    if count >= 50:
        print(f"Found {count} instances of {word}")
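
For comparison, the same counting idea can be tried without NLTK by combining split() and join(). This is only a rough sketch: split() breaks on whitespace and is not equivalent to word_tokenize(), and the sample sentence, the punctuation set, and the threshold of 2 are assumptions made up for this example.

```python
# Rough tokenization with str.split(); NOT equivalent to word_tokenize().
# The sample text below is invented for illustration.
sample = "The monster spoke. The monster wept; the creator fled."

counts = {}
for token in sample.split():               # split on runs of whitespace
    word = token.strip(".,;:!?").lower()   # trim punctuation at the edges, ignore case
    if word.isalpha():                     # ignore non-words
        counts[word] = counts.get(word, 0) + 1

# Collect the words seen at least twice and join them into one string
repeated = [w for w in sorted(counts) if counts[w] >= 2]
summary = ", ".join(repeated)
print(summary)
```

Here counts.get(word, 0) replaces the if/else update used in the program above; both approaches produce the same dictionary.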

In-class Activity

Your To-Do List

By today

Complete by Monday, November 6th