Python: Dictionary Comprehensions

In this post we'll go over Dictionary Comprehensions.

Please Note: We will be using 2 lists that are available at the end of the post so you can quickly copy them over and follow along in your own editor.

First, let's take a quick look at what a Python dictionary is before we start building them.

A dictionary uses {} to mark the start and end of the dictionary. Inside is a set of key-value pairs. The key can be almost anything, as long as it is hashable (strings, numbers, and tuples of those work; lists don't). One thing it can't be is duplicated: each key must be unique. The value has fewer restrictions. The biggest difference between a Dictionary and a List comprehension is that instead of adding a single value to the List, we're adding a Key:Value pair. The Key and Value are separated by a : and there is a , between each Key:Value pair. In a Dictionary Comprehension the , between pairs is handled for us, but we still write the : to separate the Key and Value.

my_dict = {
  'key1' : 1,
  'key2' : 2
}
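To see the uniqueness rule in action, here's a small sketch. The names duplicate_keys and list_key_failed are made up for this illustration and aren't used anywhere else in the post.

```python
# Writing the same key twice is not an error; the last value simply wins.
duplicate_keys = {
  'key1': 1,
  'key2': 2,
  'key1': 3,  # overwrites the earlier 'key1'
}

print(duplicate_keys)       # {'key1': 3, 'key2': 2}
print(len(duplicate_keys))  # 2

# Keys must be hashable, so a list can't be used as a key.
list_key_failed = False
try:
  bad_dict = {['a', 'list']: 1}
except TypeError:
  list_key_failed = True

print(list_key_failed)  # True
```

This matters for comprehensions too: if two iterations produce the same key, only the last one survives.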

Let's look at how we can use a for loop to go over a pair of lists to create a dictionary. Starting with an empty dictionary, we'll use a for loop to iterate over the pair of lists, adding each pair to the dictionary.

my_state_dict = {}

for name, abv in zip(state_names, states_abv):
  my_state_dict[abv] = name

This would create a dictionary that would start like this.

{
  'AK' : 'Alaska',
  'AL' : 'Alabama',
  'AR' : 'Arkansas',
  ...
}

Now let's convert this into a Dictionary Comprehension.

Side Note: I'm spreading out the comprehension across multiple lines to make it a bit easier to see what's happening. While you are welcome to put the entire comprehension on a single line, you'll find this makes it harder to read.

my_state_dict = {
  abv : name
  for name, abv in zip(state_names, states_abv) 
}
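If you'd like to check that both versions give the same result, here's a minimal sketch. It assumes a shortened sample (the first three entries of the lists at the end of the post), and the names loop_dict and comp_dict are just for this comparison.

```python
# Shortened sample: the first three entries of the full lists.
state_names = ["Alaska", "Alabama", "Arkansas"]
states_abv = ["AK", "AL", "AR"]

# for loop version: start empty, add one pair per iteration
loop_dict = {}
for name, abv in zip(state_names, states_abv):
  loop_dict[abv] = name

# comprehension version: same for clause, key:value written first
comp_dict = {
  abv : name
  for name, abv in zip(state_names, states_abv)
}

print(comp_dict)               # {'AK': 'Alaska', 'AL': 'Alabama', 'AR': 'Arkansas'}
print(loop_dict == comp_dict)  # True
```

For a plain pairing like this with no changes to the key or value, dict(zip(states_abv, state_names)) would also do the job. The comprehension starts to pay off once we modify or filter the pairs.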

In the first line we're telling Python that the key:value pair will be built from the variables abv and name, and in the next line we're telling Python where the data is coming from. The for loop in the Dictionary Comprehension is exactly the same as the for loop above. We're just ordering the steps in a slightly different way.

An advantage is that we don't need to initialize an empty dictionary first. Also, since this pattern is built into Python, the interpreter can optimize it under the hood.

That was a pretty simple example, but what if we want to make changes while the dictionary is being created? For example, let's change North to N. (we'll chain on South for good measure).

my_state_dict = {
  abv : name.replace("North", "N.").replace("South", "S.")
  for name, abv in zip(state_names, states_abv) 
}
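Here's a runnable sketch of that chained replace() on a three-state sample, so the effect is easy to see. The sample lists and the name short_names are assumptions made for this illustration.

```python
# Small sample: one "North", one "South", and one name that stays the same.
state_names = ["North Carolina", "South Dakota", "Ohio"]
states_abv = ["NC", "SD", "OH"]

short_names = {
  abv : name.replace("North", "N.").replace("South", "S.")
  for name, abv in zip(state_names, states_abv)
}

print(short_names)  # {'NC': 'N. Carolina', 'SD': 'S. Dakota', 'OH': 'Ohio'}
```

Each replace() returns a new string, which is why the calls can be chained. Names that contain neither word pass through unchanged.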

This next change might seem silly, but let's say that as the dictionary is being created we want to change the name of Minnesota to its state motto, L'etoile du Nord. It's French for Star of the North. In a regular for loop it would just be a matter of adding an if statement; in a comprehension we use a conditional expression on the value instead.

For the time being we'll take the replace() out, but we'll put it back. 😀

my_state_dict = {
  abv : (name if name != 'Minnesota' else "L'etoile du Nord")
  for name, abv in zip(state_names, states_abv) 
}

We're saying use name if the name isn't Minnesota otherwise use L'etoile du Nord.
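For comparison, this is roughly how the same logic might look as a for loop with an if statement. It's a sketch on a three-state sample, and the names loop_dict and comp_dict are just for the comparison.

```python
state_names = ["Michigan", "Minnesota", "Missouri"]
states_abv = ["MI", "MN", "MO"]

# for loop version with an if/else statement
loop_dict = {}
for name, abv in zip(state_names, states_abv):
  if name != 'Minnesota':
    loop_dict[abv] = name
  else:
    loop_dict[abv] = "L'etoile du Nord"

# comprehension version with a conditional expression on the value
comp_dict = {
  abv : (name if name != 'Minnesota' else "L'etoile du Nord")
  for name, abv in zip(state_names, states_abv)
}

print(comp_dict)               # {'MI': 'Michigan', 'MN': "L'etoile du Nord", 'MO': 'Missouri'}
print(loop_dict == comp_dict)  # True
```

The parentheses around the conditional expression aren't required, but they make it clearer that the whole if/else belongs to the value.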

What if we want to change New York's name to its state motto as well? Don't worry, we won't do all fifty. But we are going to flip some things around.

Here we're saying: make the value L'etoile du Nord if the name is Minnesota, otherwise make the value Excelsior if the name is New York, otherwise use name. We could go on and on, but we do want to keep the code readable. If you needed to do all 50, the length would make the dictionary comprehension unreadable.

my_state_dict = {
  abv : ("L'etoile du Nord" if name == 'Minnesota' else "Excelsior" if name == "New York" else name)
  for name, abv in zip(state_names, states_abv) 
}
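If the list of special cases keeps growing, one possible alternative (not the approach used in this post) is to put the overrides in their own dictionary and look each name up with dict.get(). The state_mottos table below is a hypothetical helper made up for this sketch.

```python
state_names = ["Minnesota", "New York", "Ohio"]
states_abv = ["MN", "NY", "OH"]

# Hypothetical lookup table: only the names we want to override.
state_mottos = {
  "Minnesota": "L'etoile du Nord",
  "New York": "Excelsior",
}

my_state_dict = {
  # get() returns the motto if there is one, otherwise the name itself
  abv : state_mottos.get(name, name)
  for name, abv in zip(state_names, states_abv)
}

print(my_state_dict)  # {'MN': "L'etoile du Nord", 'NY': 'Excelsior', 'OH': 'Ohio'}
```

Adding another motto then means adding one line to state_mottos, and the comprehension itself stays the same length.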

We've looked at modifying the value. The key would work the same way, if we needed to modify it.
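As a quick sketch of changing the key instead, here we lowercase the abbreviation. The two-state sample and the name lower_key_dict are assumptions for this illustration.

```python
state_names = ["Alaska", "Alabama"]
states_abv = ["AK", "AL"]

lower_key_dict = {
  abv.lower() : name  # the expression before the : becomes the key
  for name, abv in zip(state_names, states_abv)
}

print(lower_key_dict)  # {'ak': 'Alaska', 'al': 'Alabama'}
```

One thing to watch for: if the change makes two keys identical, the later pair overwrites the earlier one.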

But what if we wanted to skip any states that start with A?

my_state_dict = {
  abv : ("L'etoile du Nord" if name == 'Minnesota' else "Excelsior" if name == "New York" else name)
  for name, abv in zip(state_names, states_abv)
  if not name.startswith("A")
}

We're adding an if clause after the for loop. This acts like a filter. In this case we're only including iterations where the name doesn't start with A.
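Here's the filter on its own with a four-state sample, so it's easier to see which pairs get dropped. The sample lists and the name no_a_states are made up for this sketch.

```python
state_names = ["Alaska", "Arizona", "California", "Colorado"]
states_abv = ["AK", "AZ", "CA", "CO"]

no_a_states = {
  abv : name
  for name, abv in zip(state_names, states_abv)
  if not name.startswith("A")  # pairs that fail this test are skipped entirely
}

print(no_a_states)  # {'CA': 'California', 'CO': 'Colorado'}
```

Note the difference between the two kinds of if: an if/else before the for changes what the value is, while an if after the for decides whether the pair is added at all.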

Examples

In this example we're building a dictionary that has a letter as the key and a number as the value.

import string

alphabet_code = {
  letter: code
  for code, letter in enumerate(string.ascii_lowercase, start=1)
}

This is the dictionary that is created.

alphabet_code = {
  'a': 1, 
  'b': 2, 
  'c': 3,
  ... 
  'y': 25,
  'z': 26
}
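To show what a dictionary like this could be used for, here's a small sketch that turns a word into numbers and back. The word, and the names encoded, code_alphabet, and decoded, are assumptions made for this example.

```python
import string

alphabet_code = {
  letter: code
  for code, letter in enumerate(string.ascii_lowercase, start=1)
}

# Look up each letter of a (hypothetical) word.
word = "python"
encoded = [alphabet_code[letter] for letter in word]
print(encoded)  # [16, 25, 20, 8, 15, 14]

# Flip keys and values with another dictionary comprehension.
code_alphabet = {code: letter for letter, code in alphabet_code.items()}
decoded = ''.join(code_alphabet[code] for code in encoded)
print(decoded)  # python
```

Flipping the dictionary is safe here because every value is unique, so no keys collide in the reversed version.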

This next example is a bit more complicated. We're using a list comprehension that iterates over the users list and creates a dictionary for each element in the list. The data_points list supplies the keys for the key-value pairs inside each dictionary, and enumerate() supplies the id.

import random
import string

data_points = ["id", "username", "password"]
users = ['russ', 'vish']

data = [
  {
    data_point: (
      index if data_point == "id" else
      user if data_point == "username" else
      # 10 random characters drawn from string.printable
      ''.join(random.choices(string.printable, k=10)) if data_point == "password"
      else None
    )
    for data_point in data_points
  }
  for index, user in enumerate(users)
]

Below is an example of the list of dictionaries that would be created.

data = [
  {
    'id': 0, 
    'username': 'russ', 
    'password': 'i%)j[/;s>,'
  }, 
  {
    'id': 1, 
    'username': 'vish', 
    'password': 'b%Woo\rR+v9'
  }
]
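If the chained conditions inside the comprehension feel hard to read, one option is to move them into a small function. This is only a sketch of that idea. The helper build_value and the PASSWORD_ALPHABET constant are hypothetical names, and the alphabet is narrowed to letters and digits on the assumption that we'd rather avoid the whitespace characters (such as the \r in the sample output above) that string.printable includes.

```python
import random
import string

data_points = ["id", "username", "password"]
users = ['russ', 'vish']

# Assumption: letters and digits only, so no whitespace or control characters.
PASSWORD_ALPHABET = string.ascii_letters + string.digits

def build_value(data_point, index, user):
  """Return the value for one data point of one user (hypothetical helper)."""
  if data_point == "id":
    return index
  if data_point == "username":
    return user
  if data_point == "password":
    return ''.join(random.choices(PASSWORD_ALPHABET, k=10))
  return None

data = [
  {data_point: build_value(data_point, index, user) for data_point in data_points}
  for index, user in enumerate(users)
]

print(data)
```

The passwords will differ on every run. The random module is fine for sample data like this, but for real credentials Python's secrets module is the safer choice.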

Conclusion

As you can imagine, these comprehensions can become very complicated. Comprehensions can tighten the code and make it obvious what is being done, but they can also make it very hard to follow. In my opinion, the most important rule is to make the code readable. One tactic with dictionary comprehensions is to avoid k, v or key, value as the variable names and instead pick names that fit what is inside. I like the idea of spreading the comprehension out over several lines. While you lose the single-line awesomeness, the improvement in readability is massive.

Data we are using

states_abv = [ 'AK', 'AL', 'AR', 'AZ', 'CA', 'CO', 'CT', 'DE', 'FL', 'GA', 'HI', 'IA', 'ID', 'IL', 'IN', 'KS', 'KY', 'LA', 'MA', 'MD', 'ME', 'MI', 'MN', 'MO', 'MS', 'MT', 'NC', 'ND', 'NE', 'NH', 'NJ', 'NM', 'NV', 'NY', 'OH', 'OK', 'OR', 'PA', 'RI', 'SC', 'SD', 'TN', 'TX', 'UT', 'VA', 'VT', 'WA', 'WI', 'WV', 'WY'
]

state_names = ["Alaska", "Alabama", "Arkansas", "Arizona", "California", "Colorado", "Connecticut", "Delaware", "Florida", "Georgia", "Hawaii", "Iowa", "Idaho", "Illinois", "Indiana", "Kansas", "Kentucky", "Louisiana", "Massachusetts", "Maryland", "Maine", "Michigan", "Minnesota", "Missouri", "Mississippi", "Montana", "North Carolina", "North Dakota", "Nebraska", "New Hampshire", "New Jersey", "New Mexico", "Nevada", "New York", "Ohio", "Oklahoma", "Oregon", "Pennsylvania", "Rhode Island", "South Carolina", "South Dakota", "Tennessee", "Texas", "Utah", "Virginia", "Vermont", "Washington", "Wisconsin", "West Virginia", "Wyoming"
]

Resources

Python Simplified - Dictionary Comprehension - Create Complex Data Structures Step by Step (Video)

GeeksForGeeks - Comprehensions in Python (Post)

PEP 274 - Dict Comprehensions

Dictionary on Python.org
