CSC401: Dictionaries

Dictionaries

Another name for maps

Also called hashes and associative arrays

Built in to the language

Literal syntax makes it very handy to write them directly in code

Creating and Indexing

Create by putting key/value pairs inside {}

{"Newton":1642, "Darwin":1809}

Empty dictionary written {}

Index a dictionary using []

birthday = {"Newton":1642, "Darwin":1809}
print birthday["Darwin"], birthday["Newton"]
1809 1642

Can only access keys that are present

birthday = {"Newton":1642, "Darwin":1809}
print birthday["Turing"]
KeyError: 'Turing'

Test for presence of key using k in d

birthday = {"Newton":1642, "Darwin":1809}
if "Turing" in birthday:
    print birthday["Turing"]
else:
    print "Who?"
Who?
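The two ways of coping with a missing key can be sketched side by side: test with in before indexing, or index anyway and catch the KeyError. This is only a sketch; the names checked and caught are illustrative and not from the slides, and it is written to run under either Python 2 or Python 3.

```python
birthday = {"Newton": 1642, "Darwin": 1809}

# Option 1: look before you leap, using "k in d".
if "Turing" in birthday:
    checked = birthday["Turing"]
else:
    checked = None

# Option 2: index directly and handle the KeyError if the key is absent.
try:
    caught = birthday["Turing"]
except KeyError:
    caught = None

print("checked: %s, caught: %s" % (checked, caught))
```

Both approaches leave None in the result when the key is missing; which one reads better depends on how often a miss is expected.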

Iterating

for k in d iterates over keys, not values

Inconsistent with lists (where iteration yields values), but useful

birthday = {"Newton" : 1642,
            "Darwin" : 1809,
            "Turing" : 1912}
for name in birthday:
    print name, birthday[name]
Turing 1912
Newton 1642
Darwin 1809
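As a rough illustration of the iteration choices, the sketch below loops over keys, over values, and over key/value pairs. The list names (names, years, pairs) are made up for this example, and the results are sorted only so that they are predictable, since a dictionary's own ordering should not be relied on.

```python
birthday = {"Newton": 1642, "Darwin": 1809, "Turing": 1912}

# Looping over the dictionary itself yields keys.
names = []
for name in birthday:
    names.append(name)

# values() yields the values only.
years = []
for year in birthday.values():
    years.append(year)

# items() yields (key, value) tuples, which can be unpacked in the loop.
pairs = []
for name, year in birthday.items():
    pairs.append((name, year))

# Sort so the results do not depend on the dictionary's internal order.
names.sort()
years.sort()
pairs.sort()
print(names)
print(years)
print(pairs)
```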

Adding Information

Assigning to a dictionary key:

Creates a new entry if key not in dictionary

Overwrites value if key already in dictionary

birthday = {}
birthday["Darwin"] = 1809
birthday["Newton"] = 1942  # oops
birthday["Newton"] = 1642
print birthday
{'Darwin': 1809, 'Newton': 1642}
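One way to see the difference between creating and overwriting is to watch the dictionary's length after each assignment. This is a small sketch; the size_* variable names are illustrative only.

```python
birthday = {}

birthday["Darwin"] = 1809          # new key: an entry is created
size_after_darwin = len(birthday)

birthday["Newton"] = 1942          # new key: another entry is created
size_after_newton = len(birthday)

birthday["Newton"] = 1642          # existing key: the value is replaced
size_after_fix = len(birthday)

print("%d %d %d" % (size_after_darwin, size_after_newton, size_after_fix))
```

The length grows only when a key is new; overwriting leaves it unchanged.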

Counting Frequency

words = ['my','dog','ate','my','tutorial','notes']
freq = {}
for w in words:
    if w in freq:
        freq[w] += 1
    else:
        freq[w] = 1

print freq
{'notes': 1, 'ate': 1, 'my': 2, 'dog': 1, 'tutorial': 1}

Common Dictionary Methods

d.clear()          Empty the dictionary.
                   Use del d[k] to delete specific items.
d.get(k, default)  Get value associated with k,
                   or default if k not present.
d.has_key(k)       Test whether k is in d.
d.keys()           Get list of d's keys.
d1.update(d2)      Merge keys and values from d2 into d1.
d.values()         Get list of d's values.
d.items()          Get list of d's keys and values (in tuples).
len(d)             Get d's length.
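A short walk through several of these methods might look like the following. It is a sketch rather than a complete tour: has_key is left out because it exists only in Python 2 (k in d does the same job), and the intermediate variable names are assumptions made for the example.

```python
d = {"Newton": 1642, "Darwin": 1809}

# get() returns the default instead of raising KeyError.
missing = d.get("Turing", 0)

# update() merges another dictionary into d.
d.update({"Turing": 1912})
size_after_update = len(d)

# del removes one specific entry.
del d["Darwin"]
remaining_keys = sorted(d.keys())
remaining_items = sorted(d.items())

# clear() removes everything.
d.clear()
size_after_clear = len(d)

print(remaining_keys)
print(remaining_items)
```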

Nested Dictionaries

calories = {'fruit': {'apple':80, 'banana':112}, 'dairy':{'milk':125, 'cheese':108}}
print calories['dairy']['milk']
125
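Nested dictionaries can be walked one level at a time. The sketch below totals the calories in each food group and shows one possible way to look up a group that may not exist, by chaining get with an empty dictionary as the default. The 'meat' and 'ham' keys are hypothetical and deliberately absent from the data.

```python
calories = {'fruit': {'apple': 80, 'banana': 112},
            'dairy': {'milk': 125, 'cheese': 108}}

# The outer loop visits each group; the inner dictionary holds its foods.
totals = {}
for group in calories:
    totals[group] = sum(calories[group].values())

# Chained get(): a missing group falls back to {}, so the second get()
# returns None instead of raising KeyError.
ham = calories.get('meat', {}).get('ham')

print(sorted(totals.items()))
```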

A Better Way to Count

words = ["I","will","finish","what","I","star"]
freq = {}
for w in words:
    freq[w] = freq.get(w, 0) + 1
print freq
{'I': 2, 'will': 1, 'what': 1, 'finish': 1, 'star': 1}

Printing in Sorted Order

# ...build up freq as before...
keys = freq.keys()
keys.sort()
for k in keys:
    print k, freq[k]
I 2
finish 1
star 1
what 1
will 1
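A related variation, not shown in the slides, is to print the most frequent words first. One possible sketch uses the built-in sorted with a key function that orders by descending count and breaks ties alphabetically; the name by_count is illustrative.

```python
words = ["I", "will", "finish", "what", "I", "star"]
freq = {}
for w in words:
    freq[w] = freq.get(w, 0) + 1

# Sort (word, count) pairs: highest count first, then by word.
by_count = sorted(freq.items(), key=lambda pair: (-pair[1], pair[0]))

for word, n in by_count:
    print("%s %d" % (word, n))
```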

Inverting a Dictionary

seq = "GATTAATGCCATTGCTTA"
freq = {}
for c in seq:
    freq[c] = freq.get(c, 0) + 1
count = {}
for (k, v) in freq.items():
    count[v] = count.get(v, '') + k
print count
{3: 'CG', 5: 'A', 7: 'T'}
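The inversion above works because the keys are single characters that can be glued into one string. For longer keys, a variation is to collect them in a list instead. The sketch below uses setdefault for that; it is an alternative to the slide's approach, not a replacement for it.

```python
seq = "GATTAATGCCATTGCTTA"
freq = {}
for c in seq:
    freq[c] = freq.get(c, 0) + 1

# Map each count to the list of keys that have that count.
count = {}
for (k, v) in freq.items():
    # setdefault returns the existing list, or stores and returns a new one.
    count.setdefault(v, []).append(k)

# Sort each list so the result does not depend on dictionary order.
for v in count:
    count[v].sort()

print(sorted(count.items()))
```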

Slides originally created by Greg Wilson. Initial adaptation for CSC401 by David James. Revisions by Michelle Craig, Michael Szamosi, Karen Reid, and David James. Revisions for CSC401 Winter 2006 by Cosmin Munteanu.