Dictionaries and sets
Dictionaries
Dictionaries store key-value pairs. Keys must be hashable — strings, numbers, or tuples of hashable types. Values can be any type. Python 3.7 and later preserve insertion order.
Creating dictionaries
The following examples show the standard ways to create dictionaries:
# With curly braces:
empty_dict = {}
beatles = {
"John": "Lennon",
"Paul": "McCartney",
"George": "Harrison",
"Ringo": "Starr",
}
# With dict() — keyword arguments become string keys:
poet = dict(first='Edgar', middle='Alan', last='Poe')
# {'first': 'Edgar', 'middle': 'Alan', 'last': 'Poe'}
Convert two-value sequences into a dictionary by passing a list of pairs to dict():
# Two-item tuples:
tups = [('a', 'z'), ('b', 'y'), ('c', 'x')]
dict(tups)
# {'a': 'z', 'b': 'y', 'c': 'x'}
# Tuple of two-item lists:
tup_of_lis = (['a', 'z'], ['b', 'y'], ['c', 'x'])
dict(tup_of_lis)
# {'a': 'z', 'b': 'y', 'c': 'x'}
# Two-character strings — first character becomes the key, second the value:
two_char_str = ['ab', 'cd', 'ef']
dict(two_char_str)
# {'a': 'b', 'c': 'd', 'e': 'f'}
Adding and changing items
If the key does not exist, the assignment adds it. If it does, the assignment replaces the value:
beatles = {
'Lennon': 'John',
'McCartney': 'Paul',
'Harrison': 'George',
'Starr': 'Ringo',
}
# Add a new key:
beatles['Preston'] = 'Billy'
# Add, then correct a value:
beatles['Clapton'] = 'Erik'
beatles['Clapton'] = 'Eric' # replaces the typo
Getting items
The following examples cover the common ways to read from a dictionary:
# Get by key — raises KeyError if missing:
beatles['Lennon'] # 'John'
# Get with a default — returns the default instead of raising:
beatles.get('Martin') # None
beatles.get('Martin', 'Not a Beatle') # 'Not a Beatle'
beatles.get('Harrison') # 'George'
# Get all keys:
beatles.keys() # dict_keys(['Lennon', 'McCartney', 'Harrison', 'Starr', ...])
# Get all values:
beatles.values() # dict_values(['John', 'Paul', 'George', 'Ringo', ...])
# Store keys or values in a list:
list(beatles.values()) # ['John', 'Paul', 'George', 'Ringo']
# Get as a list of (key, value) tuples:
list(beatles.items())
# [('Lennon', 'John'), ('McCartney', 'Paul'), ...]
# Length:
len(beatles) # 4
# Combine two dicts with ** — later keys override earlier ones:
stones = {'Jagger': 'Mick', 'Richards': 'Keith', 'Watts': 'Charlie', 'Wyman': 'Bill'}
supergroup = {**beatles, **stones}
# Combine with update() — modifies the dict in place:
ruttles = {}
ruttles.update(beatles)
# Delete by key:
del beatles['Preston']
# Remove and return with pop():
beatles.pop('Clapton') # returns 'Eric'
# Delete all items:
ruttles.clear()
# Check for a key:
'Lennon' in beatles # True
# Copy:
ruttles = beatles.copy()
# Compare:
beatles == ruttles # True
Iterating through dictionaries
A for/in loop over a dictionary iterates over keys by default. Apply .values() to iterate over values, and .items() to iterate over key-value pairs:
for guy in beatles:
print(guy)
# Lennon McCartney Harrison Starr
for guy in beatles.values():
print(guy)
# John Paul George Ringo
# Unpack each item into named variables:
for last, first in beatles.items():
print(f"{first}'s last name is {last}")
# John's last name is Lennon
# Paul's last name is McCartney
# George's last name is Harrison
# Ringo's last name is Starr
Building grouped structures with defaultdict
A defaultdict from the collections module creates missing keys automatically using a factory function. This eliminates the KeyError that occurs when you append to a key that does not yet exist. Apply it when grouping items by a category:
from collections import defaultdict
# Group log entries by severity level:
raw_logs = [
("ERROR", "Disk quota exceeded"),
("INFO", "Health check passed"),
("ERROR", "Connection pool exhausted"),
("WARNING", "Retry attempt 3/5"),
("INFO", "Scheduled backup completed"),
]
by_level: defaultdict[str, list[str]] = defaultdict(list)
for level, message in raw_logs:
by_level[level].append(message) # no KeyError — missing keys get an empty list
# {'ERROR': ['Disk quota exceeded', 'Connection pool exhausted'],
# 'INFO': ['Health check passed', 'Scheduled backup completed'],
# 'WARNING': ['Retry attempt 3/5']}
Counting with Counter
Counter from the collections module counts occurrences of each element in an iterable. It returns a dictionary subclass where keys are elements and values are counts:
from collections import Counter
# Count word frequencies in a document:
text = "the quick brown fox jumps over the lazy dog the fox"
word_counts = Counter(text.split())
# Counter({'the': 3, 'fox': 2, 'quick': 1, ...})
word_counts.most_common(3)
# [('the', 3), ('fox', 2), ('quick', 1)]
# Count HTTP status codes from a request log:
status_codes = [200, 404, 200, 500, 200, 404, 200]
code_counts = Counter(status_codes)
# Counter({200: 4, 404: 2, 500: 1})
Dictionary comprehensions
Dictionary comprehensions build a dictionary from an iterable in a single expression. The format is:
{key_expression: value_expression for expression in iterable}
The following example counts character occurrences in a word:
word = 'better'
better_count = {letter: word.count(letter) for letter in word}
# {'b': 1, 'e': 2, 't': 2, 'r': 1}
Add a condition after the iterable to filter which items are included:
{key_expression: value_expression for expression in iterable if condition}
vowels = 'aeiou'
word = 'superpower'
vowel_counts = {letter: word.count(letter) for letter in set(word) if letter in vowels}
# {'o': 1, 'u': 1, 'e': 2}
Sets
A set is an unordered collection of unique, hashable values. Sets are useful for membership testing, deduplication, and computing relationships between groups — union, intersection, and difference.
Creating sets
Apply curly braces to create a set, or pass any iterable to set(). Sets eliminate duplicates automatically:
empty_set = set() # {} creates an empty dict, not an empty set
evens = {2, 4, 6, 8}
odds = {1, 3, 5, 7, 9}
# Convert other data structures to sets — duplicates are dropped:
set('letters')
# {'e', 's', 't', 'r', 'l'}
set(['Leonardo', 'Donatello', 'Raphael', 'Michaelangelo'])
# {'Leonardo', 'Michaelangelo', 'Donatello', 'Raphael'}
set(('dog', 'cat', 'fish'))
# {'cat', 'fish', 'dog'}
# From a dict — only keys are included:
set({'John': 'Lennon', 'Paul': 'McCartney', 'George': 'Harrison', 'Ringo': 'Starr'})
# {'John', 'George', 'Ringo', 'Paul'}
Set functions
The following examples cover adding, removing, and measuring set contents:
evens = {2, 4, 6, 8}
len(evens) # 4
evens.add(10) # {2, 4, 6, 8, 10}
evens.remove(10) # raises KeyError if not present
evens.discard(99) # no error if not present
Deduplication and set comparison
Converting a list to a set removes duplicates. The following example finds unique signups and identifies which users already exist in the system:
new_signups = ["alice@example.com", "bob@example.com", "alice@example.com", "carol@example.com"]
existing_users = ["bob@example.com", "dave@example.com"]
unique_signups = list(set(new_signups))
# ['alice@example.com', 'bob@example.com', 'carol@example.com']
already_exists = set(new_signups) & set(existing_users)
# {'bob@example.com'}
truly_new = set(new_signups) - set(existing_users)
# {'alice@example.com', 'carol@example.com'}
Iterating and filtering with sets
Sets are often nested inside dictionaries when values represent groups. The following pizza menu example demonstrates iteration and set-based filtering:
menu = {
'classic': {'pepperoni', 'cheese'},
'italian': {'sausage', 'peppers', 'onions'},
'veggie': {'peppers', 'onions', 'mushrooms', 'olives'},
'supreme': {'pepperoni', 'ham', 'beef', 'sausage', 'peppers', 'onions', 'mushrooms', 'olives'}
}
# Find pizzas that include onions:
for pizza, toppings in menu.items():
if 'onions' in toppings:
print(pizza)
# italian, veggie, supreme
# Find pizzas with onions but without sausage:
for pizza, toppings in menu.items():
if 'onions' in toppings and not ('sausage' in toppings):
print(pizza)
# veggie
Set operators
The following examples demonstrate the set operators. Each operator has both a symbolic form and an equivalent method:
fave = menu['italian'] # {'peppers', 'onions', 'sausage'}
worst = menu['veggie'] # {'peppers', 'mushrooms', 'onions', 'olives'}
best = menu['supreme']
# Intersection (&) — items in both sets:
fave & worst # {'peppers', 'onions'}
fave.intersection(worst) # {'peppers', 'onions'}
# Union (|) — all items from both sets:
fave | worst # {'peppers', 'olives', 'onions', 'sausage', 'mushrooms'}
fave.union(worst) # same result
# Difference (-) — items in the left set but not the right:
fave - worst # {'sausage'}
# Symmetric difference (^) — items in one set but not both:
fave ^ worst # {'olives', 'sausage', 'mushrooms'}
# Intersection with the & operator to filter a dict loop:
for pizza, toppings in menu.items():
if toppings & {'peppers', 'onions'}: # any overlap triggers True
print(pizza)
# italian, veggie, supreme
# Subset (<=) — True if all items in fave are also in best:
fave <= best # True
fave.issubset(best) # True
# Proper subset (<) — True if fave is a subset and not equal to best:
fave < best # True
# Superset (>=):
fave.issuperset(worst) # False
Set comprehensions
Set comprehensions build a set from an iterable. The format is:
{expression for expression in iterable}
Add a condition after the iterable to filter items:
{expression for expression in iterable if condition}
even_set = {number for number in range(1, 20) if number % 2 == 0}
# {2, 4, 6, 8, 10, 12, 14, 16, 18}
Immutable sets
Apply frozenset() to create a set whose contents cannot change after creation. Because they are hashable, frozensets can be used as dictionary keys:
frozen = frozenset([1, 2, 3])
# frozenset({1, 2, 3})
frozen.remove(3) # AttributeError: 'frozenset' object has no attribute 'remove'
frozen.add(4) # AttributeError: 'frozenset' object has no attribute 'add'
# Frozensets as dictionary keys — useful for representing unordered groups:
topology = {
frozenset({"web", "api"}): "front-tier",
frozenset({"db", "cache"}): "data-tier",
}