Python Data Structures
Univ of Michigan, Dr. Chuck
w1: 3:45
Autograder
Assignments in this course will not work in the Coursera app. You must use a web browser (like Chrome, Firefox, or Safari) to launch the course assignments.
- Assignments are pass/fail. There is no partial credit.
- https://code.labstack.com/python
string
[ ], index brackets, pronounced "sub" (fruit[0] is "fruit sub zero")
looping through strings
for letter in fruit:
while loop:
iteration
in, logical operator
for letter in 'banana':
s[6:7]
starts at index 6, up to but not including 7
- print(s[6:100]) for len(s)=10
it is ok, the slice stops at the end of the string
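A small sketch of the slicing notes above. The sample string is made up for illustration; any 10-character string behaves the same way.

```python
# Hypothetical sample string with len(s) == 10.
s = "Monty Pyth"

one_char = s[6:7]    # index 6 only; the stop index 7 is excluded
to_end = s[6:100]    # a stop index past the end is clipped to len(s)
same = s[6:]         # leaving out the stop also runs to the end

print(one_char, to_end, same)
```

Only slices are forgiving like this; a plain index such as s[100] raises IndexError.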
string compare
if word < 'banana':
string function: methods
greet = "abc"
greet.lower()
functions built into every string are called methods
dir(str) lists the methods, including str.lower(), str.upper(), str.strip()
open python3 on desktop to try
- methods do not modify the original string, they return a new string
string library
capitalize() -- returns a copy with the first letter uppercase and the rest lowercase
replace(old, new) -- returns a new string with every old substr replaced by new substr
find() -- returns the position of a substr, or -1 if it is not found
at_pos = data.find('@')  # position is an index
sp_pos = data.find(' ', at_pos)  # find the first space at or after at_pos
- upper(), lower()
strip(), lstrip(), rstrip() -- remove whitespace at the ends, such as space, tab, \n
strip() does not remove whitespace in the middle
line.startswith(prefix) -- True if line begins with prefix
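The string library notes above can be tried out like this. It is only a sketch: the line is in the mbox "From " format, but the address is a placeholder, not real course data.

```python
# Placeholder line in the mbox "From " format (address is made up).
data = "From someone@example.com Sat Jan  5 09:14:16 2008"

at_pos = data.find("@")            # index of the at-sign
sp_pos = data.find(" ", at_pos)    # first space at or after at_pos
host = data[at_pos + 1:sp_pos]     # text between the at-sign and that space

missing = data.find("#")           # find() returns -1 when nothing matches

padded = "  hello  world \n"
trimmed = padded.strip()           # only the ends are stripped

greet = "abc"
shout = greet.upper()              # returns a new string; greet is unchanged

print(host, missing, repr(trimmed), greet, shout)
```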
string is immutable
build a new string
python3, all str are unicode
object: string, integer, file
w2 : 2:43
anaconda
install Python using the Anaconda distribution
run python on command line
search command
- it is important to know which folder you are running things from
- c:\users\susan>python
- >>> chevron prompt is the Python interpreter waiting for Python commands
- ctrl-Z then Enter to quit python (Windows)
- c:\users\susan>dir // list folders and files
- install atom
- cd desktop, chdir
window snipping
- unassociate program, associate program
command prompt font size 24 // Properties, Font
up arrow to repeat the previous command
w3:3:26 File
file handle: a connection to the file, not the file's data
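fhand = open('m.txt')
inp = fhand.read()  # read the whole file into one string
print(len(inp))  # 94626 characters
print(inp[:20])  # first 20 characters
fhand = open('m.txt')  # reopen: read() already consumed the handle
for line in fhand:
    if line.startswith('From:'):
        print(line)
# print adds a newline after the line's own \n, so a blank line follows each line
# fix: call line = line.rstrip() before printing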
mbox.txt, 1797 subject lines
mbox-short.txt, 27 subject lines
fname = input('Enter file name:')
try:
    fh = open(fname)
except OSError:
    print('not a valid file:', fname)
    quit()
for line in fh:
    line = line.rstrip()
    line = line.upper()
    print(line)
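A self-contained sketch of the same open/try/except pattern, written as a function so it can be tried without typing a file name. The helper name upper_lines and the temporary sample file are assumptions for illustration, not part of the course code.

```python
import os
import tempfile


def upper_lines(fname):
    """Return the file's lines, right-stripped and uppercased.

    Returns None when the file cannot be opened (hypothetical helper).
    """
    try:
        fh = open(fname)
    except OSError:          # a missing file raises FileNotFoundError, an OSError
        return None
    with fh:                 # close the handle when the loop is done
        return [line.rstrip().upper() for line in fh]


# Try it on a throwaway file and on a name that does not exist.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "sample.txt")
    with open(path, "w") as out:
        out.write("From: a@example.com\nhello\n")
    result = upper_lines(path)
    missing = upper_lines(os.path.join(tmp, "no-such-file.txt"))

print(result, missing)
```

Catching OSError instead of a bare except keeps typos and other bugs from being hidden as "not a valid file".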
w4 list:3:17
list constants
[1,'blue',[1,76]]
- element any python object,
even another list
- can be empty
mutable
(an immutable object can't be changed, a change needs a new instance;
a mutable object can be changed in place with the index operator)
- string -- not mutable
- list - mutable //lotto[2]=28
- tuples - not mutable
- dictionary - mutable
concatenating
- a = [1,2,3]
b=[4,5,6]
c=a+b
Slicing
t[1:3], t[:4],t[:]
methods
append(), count() // count() returns the number of matching items
extend, index, insert, pop,
remove, reverse, sort
constructor
- stuff = list()
- stuff = []
- stuff.append(item) // don't do stuff = stuff.append(item): append() returns None, so stuff is lost
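A quick check of the append() warning above; the item values are arbitrary.

```python
stuff = list()
stuff.append("book")
stuff.append(99)

# append() changes the list in place and returns None,
# so assigning its result back to stuff would throw the list away.
result = stuff.append("cookie")

print(stuff, result)
```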
favorite operator is IN
some = [1,9,11,20]
9 in some, True
15 in some, False
21 not in some, True
ordered
- insert
- sort
- indexed
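Built-in functions
- len()
- min(), max(), sum()
average: sum(lst)/len(lst) instead of a running total/count
two ways: the algorithm way (loop with total and count) and the data
structure way (build a list, then use sum() and len())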
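The two ways of averaging, side by side. The numbers are made up; in the course they would come from input().

```python
numbers = [3, 41, 12, 9, 74, 15]   # hypothetical sample values

# Algorithm way: keep a running total and count inside the loop.
total = 0
count = 0
for value in numbers:
    total = total + value
    count = count + 1
average_loop = total / count

# Data structure way: the values are already in a list, so use built-ins.
average_builtin = sum(numbers) / len(numbers)

print(average_loop, average_builtin)
```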
list work with string
split()
// read a line of text, split
it into words, look at each char
- delimiter: ',', ':' // default is any whitespace
strip(), rstrip(), startswith()
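A short sketch of split() with and without a delimiter. The line is a placeholder in the mbox "From " format, not real data.

```python
line = "From someone@example.com Sat Jan  5 09:14:16 2008"

words = line.split()              # no argument: any run of whitespace separates words
email = words[1]
host = email.split("@")[1]        # split again on a specific delimiter

fields = "first;second;third".split(";")

print(words, host, fields)
```

Note that the double space before the 5 does not produce an empty word when split() is called with no argument.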
list comprehension
- print(sorted([(v, k) for k, v in c.items()])) //
dictionary -> a sorted list of (v, k) tuples
list sort
- list method: list.sort(reverse=True) // sorts in place, returns None
- function: sorted(list, reverse=True) // returns a new sorted list
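The difference between the sorted() function and the sort() method, with arbitrary sample values:

```python
scores = [3, 41, 12, 9]

ordered = sorted(scores, reverse=True)   # new list; scores is not touched
snapshot = list(scores)                  # copy taken before sorting in place

returned = scores.sort(reverse=True)     # sorts scores itself and returns None

print(ordered, snapshot, scores, returned)
```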
w5 dictionary:2:46
Dictionary
- key, value
- Python's most powerful data collection
- associative array in Perl/PHP
properties or Map or HashMap -- Java
property bag -- C#/.NET
- no positional index (Python 3.7+ keeps insertion order), mutable
- similar to a list, except values are looked up by key
constructor
- purse = dict() or {}
- purse['money']=12
purse['candy']=3
purse['tissues']=75
- print(purse)
print(purse['candy'])
- purse['candy'] = purse['candy']+2
solve problem/use
count name
if name not in counts:
    counts[name] = 1
else:
    counts[name] = counts[name] + 1
get Method
x = counts.get(name, 0)
// equals:
if name in counts:
    x = counts[name]
else:
    x = 0
building a histogram
get()
- for name in names:
      counts[name] = counts.get(name, 0) + 1
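The histogram idiom run on a small list of names chosen for illustration:

```python
names = ["csev", "cwen", "csev", "zqian", "cwen", "csev"]   # sample data

counts = dict()
for name in names:
    # get() supplies 0 the first time a name is seen, so no if/else is needed.
    counts[name] = counts.get(name, 0) + 1

print(counts)
```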
Retrieving lists of keys and values
- jj = {'Chuck': 1, 'Fred': 2, 'Jan': 100}
print(list(jj))
// ['Chuck', 'Fred', 'Jan']
// only the keys (insertion order in Python 3.7+)
print(jj.values()), print(jj.keys()), print(jj.items())
traceback
- ccc={}
print(ccc['csev'])
KeyError: 'csev'
- max_sender = None
max_count = 0
for sender, count in counts.items():
    if max_count < count:
        max_sender = sender
        max_count = count
print(max_sender, max_count)
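The maximum loop above, run on a small made-up counts dictionary. The last line uses the built-in max() with a key function as an alternative; that shortcut is not from the notes.

```python
counts = {"csev": 3, "cwen": 2, "zqian": 1}   # sample data

max_sender = None
max_count = 0
for sender, count in counts.items():
    if max_count < count:
        max_sender = sender
        max_count = count

# Alternative (not from the notes): let max() compare keys by their counts.
alt_sender = max(counts, key=counts.get)

print(max_sender, max_count, alt_sender)
```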
w6 tuples:2:18
tuples methods
- l = list()
dir(l) // append, count, extend, index...
t = tuple()
dir(t) //['count', 'index']
a tuple can't be sorted, appended to, or reversed in place
Tuples and Assignment
- (x, y)=(4, 'fred')
Tuples are comparable
- compare the first elements, then the 2nd, 3rd, until they differ
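A few comparisons showing the left-to-right rule; the values are arbitrary.

```python
# The first elements differ, so the rest is never looked at.
first_decides = (0, 1, 2) < (5, 1, 2)

# The first elements tie, so the second elements decide.
second_decides = (0, 1, 2000000) < (0, 3, 4)

# Strings inside tuples are compared the same way.
names_compare = ("Jones", "Sally") < ("Jones", "Sam")

print(first_decides, second_decides, names_compare)
```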
sorting lists of tuples
dictionary.items() gives a list-like view of (key, value) tuples
use sorted() function
sorted(d.items())
dictionary keys can't be duplicated
for k, v in sorted(d.items()):  # sorted by k
- make (value, key) tuples
tmp = list()
for k, v in c.items():
    tmp.append((v, k))
tmp = sorted(tmp, reverse=True)
big to small order
top 10 most common words
- fname = input('give a file name:')
if len(fname) < 1:
    fname = 'romeo.txt'
fhand = open(fname)  # loop over the file's lines, not the characters of the name
- word_histogram = {}
for line in fhand:
    words = line.split()
    for word in words:
        word_histogram[word] = word_histogram.get(word, 0) + 1
new_l = list()
for word, count in word_histogram.items():
    new_t = (count, word)
    new_l.append(new_t)  # append() returns None, so do not reassign new_l
- new_l = sorted(new_l, reverse=True)
top_10 = tuple(new_l[:10])
print(top_10)
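The same top-words idea as a function that takes lines directly, so it can be tried without romeo.txt. The function name top_words and the sample lines are assumptions for illustration.

```python
def top_words(lines, n=10):
    """Return the n most common words as (count, word) tuples, biggest first."""
    histogram = {}
    for line in lines:
        for word in line.split():
            histogram[word] = histogram.get(word, 0) + 1
    # Put the count first so sorting orders by count, then by word.
    pairs = sorted(((count, word) for word, count in histogram.items()),
                   reverse=True)
    return pairs[:n]


sample = ["the quick brown fox", "the lazy dog", "the fox"]   # made-up text
top = top_words(sample, n=2)
print(top)
```

Any iterable of lines works here, including an open file handle.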
tuple advantage
- good for a temporary list of items
that will not be modified
JavaScript is undeniably better than Python for website development for one simple reason: JS runs in the browser while Python is a backend server-side language. While Python can be used in part to create a website, it can't be used alone. ... JavaScript is the better choice for desktop and mobile websites.
working code
- 10.2 read mbox-short.txt, get the hour for each message, accumulate the counts for each hour, print out the counts, sorted by hour
name = input("Enter file:")
if len(name) < 1:
    name = "mbox-short.txt"
handle = open(name)
lst = list()
for line in handle:
    # keep only 'From ' lines (skip 'From:' and everything else)
    if not line.startswith('From '): continue
    words = line.split()
    lst.append(words[5])  # time field, e.g. hh:mm:ss
hour_hist = {}
for time in lst:
    hour = time[:2]
    hour_hist[hour] = hour_hist.get(hour, 0) + 1
lst2 = sorted([(k, v) for k, v in hour_hist.items()])  # list comprehension
"""
lst2 = list()
for k, v in hour_hist.items():
    lst2.append((k, v))
lst2.sort()
"""
for hour, count in lst2:
    print(hour, count)
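A version of the exercise that can be checked without the data file. It is a sketch: the function name hour_counts is hypothetical and the sample lines use placeholder addresses in the mbox "From " format.

```python
def hour_counts(lines):
    """Count messages per hour from mbox 'From ' lines, sorted by hour."""
    hist = {}
    for line in lines:
        if not line.startswith("From "):   # skips 'From:' headers and body text
            continue
        words = line.split()
        hour = words[5][:2]                # words[5] is the hh:mm:ss field
        hist[hour] = hist.get(hour, 0) + 1
    return sorted(hist.items())            # (hour, count) tuples, sorted by hour


sample = [
    "From first@example.com Sat Jan  5 09:14:16 2008",
    "From: first@example.com",
    "From second@example.com Fri Jan  4 18:10:48 2008",
    "Subject: not a From line",
    "From third@example.com Fri Jan  4 09:05:31 2008",
]

for hour, count in hour_counts(sample):
    print(hour, count)
```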