Python __str__ & __repr__ & __format__

Python gives every object an assortment of methods. Some of them have two underscores before and after the name. The double underscores are sometimes called dunders, and the methods are called dunder methods (or special methods). To find a list of them, search for the term Python data model, because that's where they are listed in the Python documentation. These methods have some magic to them: Python calls them automatically when certain operations need to happen. For example, the __str__ method is called when the object is passed to the print() function. The power of the dunder methods is that we can override them in our own classes and supply our own version.

Using the dir() function we can look at the attributes of an object. Even the number 7 is an object with many methods attached.

print(dir(7))

['__abs__', '__add__', '__and__', '__bool__', '__ceil__', '__class__', '__delattr__', '__dir__', '__divmod__', '__doc__', '__eq__', '__float__', '__floor__', '__floordiv__', '__format__', '__ge__', '__getattribute__', '__getnewargs__', '__gt__', '__hash__', '__index__', '__init__', '__init_subclass__', '__int__', '__invert__', '__le__', '__lshift__', '__lt__', '__mod__', '__mul__', '__ne__', '__neg__', '__new__', '__or__', '__pos__', '__pow__', '__radd__', '__rand__', '__rdivmod__', '__reduce__', '__reduce_ex__', '__repr__', '__rfloordiv__', '__rlshift__', '__rmod__', '__rmul__', '__ror__', '__round__', '__rpow__', '__rrshift__', '__rshift__', '__rsub__', '__rtruediv__', '__rxor__', '__setattr__', '__sizeof__', '__str__', '__sub__', '__subclasshook__', '__truediv__', '__trunc__', '__xor__', 'as_integer_ratio', 'bit_length', 'conjugate', 'denominator', 'from_bytes', 'imag', 'numerator', 'real', 'to_bytes']

For this post, we'll be looking at three methods that are somewhat related. The __repr__, __str__, and __format__ dunder methods are all expected to return some text representation of the object.

Order of fallback

They are also related in that Python falls back from one method to the next in a specific order if the expected one has not been defined. Let's assume we created a new class called Person and made an instance called Rob, just for this walk-through.

class Person:
  def __init__(self, name, age):
    self.name = name
    self.age = age

Rob = Person('Rob', 57)
print('{}'.format(Rob))   # This will try to call `__format__`
print(Rob)                # This will try to call `__str__`
print(repr(Rob))          # This will try to call `__repr__`

<__main__.Person object at 0x7f8ab9642dc0>
<__main__.Person object at 0x7f8ab9642dc0>
<__main__.Person object at 0x7f8ab9642dc0>

All of them output the exact same thing. We didn't add any of the methods, so Python had to keep falling back: __format__ to __str__ to __repr__. More precisely, the default __format__ inherited from object calls str(), and the default __str__ calls repr(). Python finally reached the default __repr__ at the object level, which just returns the class name and the memory address of the object. So far, not exciting.
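
The fallback chain is easier to see when only one of the three methods is defined. The following is a minimal sketch using a hypothetical Point class (not part of the Person example) that defines only __repr__:

```python
class Point:
    """Hypothetical class that defines only __repr__."""

    def __init__(self, x, y):
        self.x = x
        self.y = y

    def __repr__(self):
        return f"Point({self.x}, {self.y})"


p = Point(1, 2)

print(repr(p))         # calls Point.__repr__ directly
print(str(p))          # default __str__ falls back to __repr__
print(format(p))       # default __format__ falls back to str(p)
print('{}'.format(p))  # same path as format(p)
```

All four lines print Point(1, 2), because each missing method hands the work to the next one in the chain.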

__repr__

__repr__ is intended to be the "official" string representation of the object. The idea is that, where possible, it can be used to recreate the object quickly. This isn't a hard rule, and there are different ideas about what it means in practice. I'll give you two examples.

  def __repr__(self):
    return f"Person('{self.name}', {self.age})"

Adding this to the Person class will allow us to do this.

print(repr(Rob))

Person('Rob', 57)

Not exciting enough? Well, notice it's the same expression we used to create the object at the start.

Rob2 = eval(repr(Rob))

The eval function evaluates the string returned by repr(Rob) as a Python expression, giving us a new object with the same data as the first. Only do this with strings you trust, since eval runs whatever code it is given.
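
Here is a self-contained sketch of that round trip. It differs from the version above in one detail, which is my own assumption about what is desirable: it uses the !r conversion so that names containing quotes are escaped correctly.

```python
class Person:
    def __init__(self, name, age):
        self.name = name
        self.age = age

    def __repr__(self):
        # !r applies repr() to each value, so quotes inside the name are escaped
        return f"Person({self.name!r}, {self.age!r})"


rob = Person("Rob", 57)
rob2 = eval(repr(rob))  # acceptable here only because we built the string ourselves

print(repr(rob))             # Person('Rob', 57)
print(rob2.name, rob2.age)   # Rob 57
print(rob2 is rob)           # False: a new, separate object

# A name with an apostrophe still survives the round trip
pat = eval(repr(Person("O'Brien", 40)))
print(pat.name)              # O'Brien
```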

The next example returns a string that looks like a dictionary of the data in the object.

  def __repr__(self):
    return f"{{'name':'{self.name}','age':'{self.age}'}}"

{'name':'Rob','age':'57'}

You might be familiar with another dunder name: __dict__. It is an attribute rather than a method, and it holds a dictionary much like the one we just created (although here age stays an integer).

print(Rob.__dict__)

{'name': 'Rob', 'age': 57}

So what's the difference? __dict__ is a mapping object that is used to store the object's writable attributes. What we created with the second version of __repr__ is just a read-only string, but we control what is presented. Let's say that instead of just age, the class had a birthday that we used to fill in the age, and only one of the two is needed to recreate the object. So while __dict__ would contain both age and birthday, we can set __repr__ to show only one of them.

What goes into __repr__ depends entirely on the project you're working on. In some cases, __repr__ will include only the data that developers need for debugging. This is a very common use as well.
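
The birthday idea can be sketched as follows. The constructor signature, the today parameter (added so the result does not depend on the current date), and the choice to show birthday rather than age are all assumptions made for illustration:

```python
from datetime import date


class Person:
    def __init__(self, name, birthday, today=None):
        self.name = name
        self.birthday = birthday
        today = today or date.today()
        # Derived value: stored on the instance, so it shows up in __dict__
        had_birthday = (today.month, today.day) >= (birthday.month, birthday.day)
        self.age = today.year - birthday.year - (0 if had_birthday else 1)

    def __repr__(self):
        # age can be recomputed from birthday, so it is left out here
        return f"Person({self.name!r}, {self.birthday!r})"


rob = Person("Rob", date(1966, 3, 14), today=date(2023, 6, 1))

print(rob.__dict__)  # contains name, birthday, and age
print(repr(rob))     # shows only name and birthday
```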

__str__

While __repr__ is mainly for developers, __str__ is for end users. When creating the method, you should be thinking, when this is printed to the screen, what is expected and useful?

  def __str__(self):
    return f"My name is {self.name}, and my age is {self.age} years old."

My name is Rob, and my age is 57 years old.

This can simplify an app that needs to output the same information in different places in the code, such as a list on the Employee selection page and a summary form that includes all of their data. The __str__ method can be very complicated, if need be. Keep in mind that it will be used any time the object is printed, passed to str(), or interpolated into a string without a format spec, and always in the same way.

As a reminder about order, if the __str__ is not included, the __repr__ would be used instead.
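
When both methods are defined, which one runs depends on the context. This short sketch reuses the Person class with the two methods shown earlier (with !r added in __repr__ as a small variation) and compares print(), a list, and the !r conversion in an f-string:

```python
class Person:
    def __init__(self, name, age):
        self.name = name
        self.age = age

    def __repr__(self):
        return f"Person({self.name!r}, {self.age!r})"

    def __str__(self):
        return f"My name is {self.name}, and my age is {self.age} years old."


rob = Person("Rob", 57)

print(rob)         # My name is Rob, and my age is 57 years old.
print([rob])       # [Person('Rob', 57)]  (containers use repr() for their items)
print(f"{rob!r}")  # Person('Rob', 57)    (!r asks for repr() explicitly)
```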

__format__

Having a common output for a class is great, but we don't always want it to look the exact same way every time. As with other object types, we can add formatting to the output. For example:

print(f"{3657:,}")
print(f"{4332.8123456:,.6}")

3,657
4,332.81

In both of these cases we've added formatting specifications that change how the numbers are displayed. So, where is that happening? If we look at a different (older) style, we might see how.

print(format(3657, ','))
print(format(4332.8123456, ',.6'))

3,657
4,332.81

In both of these versions, the format() function uses the __format__ method of whatever type the value being formatted is. In the case of 3657, since it's an int, the method used is int.__format__. This is the beauty of the dunder methods. When we create these methods for our classes, Python will automatically use them exactly as it would for built-in object types.

What can we do? Well, we could create our own formatting styles. Maybe for the Person class we want different styling for a list versus a summary page. So let's do that.
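
As a quick check of that dispatch, the sketch below formats the same integer four ways, including a direct call to the type's __format__ method. The variable names are only for illustration:

```python
value = 3657

via_builtin = format(value, ",")                # the format() built-in
via_dunder = type(value).__format__(value, ",")  # what format() looks up on the type
via_fstring = f"{value:,}"                      # f-string with a format spec
via_str_format = "{:,}".format(value)           # str.format with a format spec

print(via_builtin, via_dunder, via_fstring, via_str_format)
# 3,657 3,657 3,657 3,657
```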

  def __format__(self, format_spec=None):
    if format_spec in [None, '']:
      return self.__str__()
    if format_spec == 'summary':
      return f"{self.name}\n\tAGE:{self.age}"
    if format_spec == 'list_person':
      return f"{self.name} -- {self.age} years old"
    raise Exception(f"Invalid Format Syntax: {format_spec}")

print(f"{Rob:summary}")
print(f"{Rob:list_person}")

Rob
    AGE:57
Rob -- 57 years old

A lot is happening here, so let's break it down.

def __format__(self, format_spec=None):

The self parameter is the object itself, and format_spec holds whatever formatting options were passed in. Python always passes format_spec as a string, using an empty string when no options are given, so the None default only matters if someone calls __format__ directly without an argument.

  if format_spec in [None, '']:
    return self.__str__()

If no format_spec is passed in, I'm having the method fall back to the __str__ method. Of course, if you didn't write one, you could call __repr__ here instead. But because Python will fall back to __repr__ anyway, it makes sense to call __str__: if one is added later, __format__ is already set up to use it.
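
To see exactly what arrives in format_spec, here is a small sketch with a hypothetical Probe class whose __format__ simply reports the value it received:

```python
class Probe:
    """Hypothetical class that echoes the format_spec it is given."""

    def __format__(self, format_spec):
        return f"format_spec={format_spec!r}"


p = Probe()

print(f"{p}")              # format_spec=''
print(format(p))           # format_spec=''
print(f"{p:summary}")      # format_spec='summary'
print("{:>10}".format(p))  # format_spec='>10'
```

Everything after the colon is handed over untouched as a string, which is why the method is free to define its own names such as summary.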

  if format_spec == 'summary':
    return f"{self.name}\n\tAGE:{self.age}"
  if format_spec == 'list_person':
    return f"{self.name} -- {self.age} years old"

We're allowing only two options; you can of course make this as complicated as you need to. For each option we create a custom string to return. We may also want to provide some error handling.

  raise Exception(f"Invalid Format Syntax: {format_spec}")

And finally, if some type of formatting did get passed, but the method wasn't able to break it down, we'll raise an error.

There is the option that instead of raising an error we just default to __str__ instead. Just put return self.__str__() instead of the raise Exception and you're good to go. This should be based on your requirements.

Adding your own formatting spec

Adding formatting specifications to your class will bring your code to a new level. An important idea is that, as you add formatting, you follow what is already being done in Python. For example, if you are adding alignment to your format spec, please use <, ^, and >. Why? Because these are already used in Python's formatting specs. Using the same symbols will make it much easier for developers to jump in and use your code quickly. Find the spec at Python Docs: Format Specification Mini-Language.
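
One simple way to get the standard alignment symbols without parsing them yourself is to hand the spec to the built-in string formatting. This is a minimal sketch of that design choice, not the only approach; the short __str__ used here is an assumption chosen to keep the output compact:

```python
class Person:
    def __init__(self, name, age):
        self.name = name
        self.age = age

    def __str__(self):
        return f"{self.name} ({self.age})"

    def __format__(self, format_spec):
        # Delegate standard string specs (fill, align, width) to str's own __format__
        return format(str(self), format_spec)


rob = Person("Rob", 57)

print(f"[{rob:<12}]")  # [Rob (57)    ]
print(f"[{rob:^12}]")  # [  Rob (57)  ]
print(f"[{rob:>12}]")  # [    Rob (57)]
```

Custom names such as summary could be checked first, with anything else delegated this way, so that both styles are supported by the same method.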

Resources

Python Docs: Format Specification Mini-Language

Python Data Model documentation
