Classes and Object-Oriented Designs

a Tutorial about Python 3.x Syntax

Hiya!

On this page we provide some tutorials and examples about Object-Oriented Programming in Python, and how we can design and use classes.

Python is an interpreted language, so you can write code "live" in a shell-like interface to the interpreter, or create scripts, or you can create your own well-organized modules and packages. At every "level" of complexity in the code structure, though, you're going to be highly dependent on classes and objects.

And if you want to get the most out of Python, and enhance or optimise your code for readability and reuse, you're going to want to design and create your own classes. And you're going to want to aim for terse, highly automated code.

We won't go so far as to say "no code is the best code" -- that makes no sense -- but we will say that when you develop in Python, especially working with classes, you really do want to be careful about not creating huge, huge classes.

If you find yourself doing a lot of the same thing, and typing a lot of code in Python, you should probably question whether your design needs some editing.

Could a base class remove a lot of repeated functions?

Could you collect a lot of @property methods into sub-objects like a dict or list?

Do you have a lot of similar but unrelated classes? Could you use a "factory" class to create the code for you?

Is there something you're repeating in lots of functions? Could wrappers and decorators help? What about creating a "wrapper" class instead of passing around lots of objects to lots of functions?

Hopefully this page gives you some helpful examples and ramblings that can clarify how to get the most out of Python classes as you develop your own code and software. Good luck!


Before working with, and designing, Python Classes, it's important to have a clear understanding about aliases, objects, and the underlying design of the Python interpreter.

Python is a Dynamically-Typed, Object-Oriented, Interpreted Programming-Language where every "construct" is an object -- with "objects" being instances of "Classes".

In Python, there are no "variables", there are aliases. An alias is the access-handle to a stored/accessible object in memory as managed by the interpreter.

The Python interpreter is the python commandline application itself -- which is manageable through the importable sys (Python System) module.

In a statically typed language like C++, there are primitive types, custom types (classes), function pointers, and (data) pointers as the different kinds of variables. A C++ class creates a custom data type that can be used to create new variables ("objects") that can hold and interact with other data types.

In Python, everything is an object, everything is an instance of a class. But, all the aliases are "dynamic", in the sense that they're not statically connected to the underlying memory. They're aliases not variables. An alias in Python can be reassigned to objects of different types without any type-casting. So, in Python when you create a new Class, you're not really defining a new data-type, you're defining a new blueprint for objects.
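To make the alias idea concrete, here's a minimal sketch. The names first and second are just placeholders we made up for the illustration. It shows two aliases bound to the same object, and then one of them being rebound to an object of a different type without any casting.

```python
# Two aliases bound to the same list object.
first = [1, 2, 3]
second = first

second.append(4)           # mutate the object through one alias
print(first)               # [1, 2, 3, 4]
print(first is second)     # True

# Rebinding an alias to an object of a different type needs no cast.
second = "now a string"
print(type(first).__name__, type(second).__name__)   # list str
```

The list never changed its type; only the alias second was pointed at a different object.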

All Python methods are "generic" in the sense that they operate based on object-methods, not based on declared types of the objects. All dyadic (two-argument) operators function as left-hand method calls: the left-side (first) argument is checked first to see if it has a method associated with the operator (for example, __add__() for +). There are also "right-hand" (reflected) methods, such as __radd__(), but these are only used by the interpreter under special conditions.

The process for the interpreter is such: if the Class of the left-side argument has no method for the operator, or that method returns NotImplemented, then the right-side (second) argument is checked for the corresponding reflected method. If it has one, that method is called; if not, a TypeError is raised because neither argument can handle the operator. The one exception to this ordering is that when the right-side argument's class is a subclass of the left-side argument's class and provides its own reflected method, the right-side method is tried first. Apart from that, the "type" of either object is never checked -- at least not in the way that you'd expect in C++.
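Here's a rough sketch of that lookup order in action. The Metres class is hypothetical and exists only for this illustration; the special method names __add__() and __radd__() and the NotImplemented value are the standard Python ones.

```python
class Metres:
    """Hypothetical length wrapper, used only for this illustration."""

    def __init__(self, value):
        self.value = value

    def __add__(self, other):
        # Called for `Metres + something`.
        if isinstance(other, Metres):
            return Metres(self.value + other.value)
        if isinstance(other, (int, float)):
            return Metres(self.value + other)
        return NotImplemented   # tell the interpreter to try the other operand

    def __radd__(self, other):
        # Called for `something + Metres` when the left operand gives up.
        return self.__add__(other)


a = Metres(2) + Metres(3)   # handled by Metres.__add__
b = 5 + Metres(1)           # int.__add__ gives up, so Metres.__radd__ runs
print(a.value, b.value)     # 5 6

try:
    Metres(1) + "one"       # neither operand can handle this
except TypeError as error:
    print("TypeError:", error)
```

The isinstance() checks are our own choice inside the method; the interpreter itself only looks for the methods and the NotImplemented result.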

The other sections on this page go further in-depth with examples and tutorials about how to create, manage, and use Classes in Python. Since everything in Python is an object, Classes are extremely meaningful and are a huge part of creating sensible, reusable code -- so we hope this information is helpful.

One thing you've probably heard about related to Object-Oriented Programming is "encapsulation" or "compartmentalisation", where it's suggested that well-designed code tries to "hide" the inner workings in reusable, inheritable classes. This is the optimal design approach in a language like C++, where you have to write a lot of code and you want to be able to interact with a lot of different layers and "modules" of your software, so you need to have a rational approach to internal coordination in the codebase.

We're not saying that doesn't apply to Python, but an important thing to keep in mind is that nothing is ever really "out of reach" in Python. As long as there's an alias, everything is public in Python. And this changes the assumptions about how and why we design classes the way that we do.

In Python, it's preferable to be terse and non-repetitive, especially because it's a high-level, interpreted language. We're further removed from the CPU, so we need to truly make sure we aren't trying to do the computer's job for it. We want to be thoughtfully sparing in how much code we write, because the more repetitious work can and should be done by the computer.

So, in Python, we really want to use classes to reduce duplication, not necessarily to restrict access or tacitly enforce "contracts", as we would in C++, for example. Reducing code duplication in C++ is often handled by functions and templates, and even then we're often limited in how much we can hand-off to the computer.

By design, Python does not support templating like C++ does. Generic functionality is built into the language because none of the function calls (innately) do any type-checking, so there's no need to create C++-style templates that would otherwise generate multi-typed versions of the same code.
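As a small sketch of what "generic by default" means, the function below (total is a name we made up for this example) works on any objects that support the + operator, with no template or type declaration involved.

```python
def total(items):
    """Add up a non-empty sequence of anything that supports `+`."""
    result = items[0]
    for item in items[1:]:
        result = result + item
    return result


print(total([1, 2, 3]))          # 6
print(total(["a", "b", "c"]))    # abc
print(total([[1], [2, 3]]))      # [1, 2, 3]
```

In C++ this would typically be a function template; in Python the same code object handles all three calls.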

Note, though, that Python does support Multiple Inheritance, similar to C++, but Python resolves the "Diamond Inheritance Problem" for you: attribute lookups follow a single, well-defined Method Resolution Order (MRO).

All aliases in Python support overriding, while Python does not support overloading -- there's more about this in our Python Functions article. Within a single namespace, every time you redefine an alias, you're overriding any previous version of it, so the last definition wins. Across base classes, the interpreter searches the classes in MRO order (the class itself, then its bases from left to right as listed in the class statement, with a shared base searched only once, after all of its subclasses), and the first class that defines the alias wins. Either way there's never any ambiguity, and you can inspect the order through the class's __mro__ attribute.
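A small sketch of how a diamond-shaped hierarchy gets resolved. The class names Base, Left, Right, and Bottom are made up for the illustration; __mro__ is the standard attribute that lists the order in which classes are searched.

```python
class Base:
    def who(self):
        return "Base"


class Left(Base):
    def who(self):
        return "Left"


class Right(Base):
    def who(self):
        return "Right"


class Bottom(Left, Right):   # both parents share the same Base
    pass


print([cls.__name__ for cls in Bottom.__mro__])
# ['Bottom', 'Left', 'Right', 'Base', 'object']

print(Bottom().who())   # Left
```

The lookup walks that list from the front and stops at the first class that defines who(), so swapping the order of the bases to Bottom(Right, Left) would print Right instead.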

Python, R, and C++ are more similar in their Object-Oriented designs than other languages like C, D, Lua, JavaScript, and Go. That's important to keep in mind, because what's optimal here for Python isn't going to translate to these other languages.

Hopefully this page helps provide some clarity to Python Classes. The other sections have concrete coding examples to help, and you can always visit our GitHub account to access free, open-source software and coding examples in Python and other languages:

github.com/tommypkeane

First, let's just make some simple, operational classes that provide generic functionality and show how the syntax works.

These aren't going to be great designs, and they aren't going to necessarily be super useful. But, sometimes, when you're just trying to do some prototyping or play around with the features of the language, it can be good to make simple, "ugly" classes that help you get the job done.

So first, let's just make a simple example of a NameTag class, where we construct it with a given name and then when we print the class, we use its __str__ method to show the name.

class NameTag(object):
  name = None;
  def __init__(self, myname):
    self.name = myname;
    return (None);
  # fed
  def __str__(self,):
    return (str(self.name));
  # fed
# class

Super simple.

All classes in Python derive (implicitly) from the base-class called object, so, personally, out of explicitness, we like to just put this in the declaration to show what the base class is. You can leave it out, and nothing will change, because it will be added-in for you, by the interpreter.

We argue that the benefit of showing the base-class name is that it clarifies where a lot of the functionality of a Python class is coming from. It could help clarify what to search for in the documentation for anyone new to Python, and being implicit in Software Engineering just always feels like being on the wrong side of history. Someone's gonna look back on your code, including yourself, and if your crime is that it's overly-explicit and overly-well-documented, then that's probably not a bad thing.

(Ignoring the fact that we didn't document this class ... for the sake of this article's already obliterated brevity.)

Ok, so now our NameTag class exists, we have the initializer method (__init__()), which runs right after a new object has been constructed, and the string "conversion" method __str__().

So let's put those two to use and make a couple objects and show what would happen if we pass them to the print() function:

nametag_a = NameTag("What");
nametag_b = NameTag("Who");

name_intro = "Hi, my name is";

print(name_intro, nametag_a);
# Hi, my name is What

print(name_intro, nametag_b);
# Hi, my name is Who

Perfect! Exactly what we said. We made a class, we used it to construct two new objects that we assigned aliases to, and then we used those aliases in some print() statements which called str() on each instance (object), which in turn called the internal __str__() method to do the conversion.

Let's just clarify a few things, to show what we've done.

What if we redefine the class and take away the __str__() method?

class NameTag(object):
  name = None;
  def __init__(self, myname):
    self.name = myname;
    return (None);
  # fed
# class

What happens if we made the same objects and printed them all the same?

nametag_a = NameTag("What");
nametag_b = NameTag("Who");

name_intro = "Hi, my name is";

print(name_intro, nametag_a);
# Hi, my name is <__main__.NameTag object at 0x11028e880>

print(name_intro, nametag_b);
# Hi, my name is <__main__.NameTag object at 0x11023d970>

Blorp! What's all that??

So, every class inherits both a __str__() method and a __repr__() method from object, but the default __str__() simply returns whatever __repr__() returns.

The special method __repr__() refers to the object "representation", which is meant to be used to indicate the "unique" representation of the instance -- the object.

When a class doesn't define its own __str__() method, converting an instance to a printable string falls back on the result of the __repr__() method. Since both methods are defined in the object class, which is an implicit base-class to every custom and built-in class in Python, every object can be passed to str() and repr().

That's good, because that means everything can be printed.

The problem is that for most classes, __repr__() isn't overridden, and the default version is used. The default, as you can see in the code listing above, just prints out the name of the class, scoped to wherever it resides, and then (in CPython) the hexadecimal memory address of the object. That address makes it so that every representation will be unique for differing objects, and so that you as the developer can tell if your aliases are pointing at the same object or different ones. This is a very useful debugging tool.
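Here's a quick sketch of that debugging use. The Plain class is a hypothetical empty class that keeps the default __repr__(); the is operator is the standard way to ask whether two aliases refer to the same object.

```python
class Plain:
    """Hypothetical empty class that keeps the default __repr__()."""


first = Plain()
alias = first      # a second alias for the same object
other = Plain()    # a different object of the same class

print(first is alias)                 # True
print(first is other)                 # False
print(repr(first) == repr(alias))     # True
print(repr(first) == repr(other))     # False
```

The two live objects have different addresses, so their default representations differ, while both aliases of the same object print identically.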

However, a lot of people came to realize that since all Python class methods and members are public, we can override the __repr__() method, and actually a conventional way to override it has come-up.

Now, this isn't prescriptive -- you don't have to do this, and there are reasons not to -- but, a common use for the __repr__() method is to construct and return a string that equates to the __init__() call that would reconstruct the current object.

Let's show an example with our NameTag class. So instead of creating a __str__() method, we're going to override __repr__() from object -- and remember that there's no special syntax for an override or reassignment of an alias (the last one to be defined wins!).

class NameTag(object):
  name = None;
  def __init__(self, myname):
    self.name = myname;
    return (None);
  # fed
  def __repr__(self,):
    return (
      self.__class__.__name__
      + "(\""
      + self.name
      + "\")"
    );
  # fed
# class

So now, let's print just the objects, to see what __repr__() provides:

nametag_a = NameTag("What");
nametag_b = NameTag("Who");

print(nametag_a);
# NameTag("What")

print(nametag_b);
# NameTag("Who")

Cool! As we said, this is a convention (not a rule) that you'll see a lot of code follow, where we use the __repr__() method to basically print out the valid Python code needed to recreate the current object.

The downside to this approach, is that now we can't see the memory address and find-out if we're looking at the same object or just multiple deep-copies of it. For debugging purposes, it would require extra work to find out something that we could've known explicitly.

So, again, this is a convention that some people use, but you don't have to copy the approach. If you're debugging issues of copied objects or losing track of your aliases, this could actually make things worse for you. But, if you're trying to provide a class that gives you a handy reference on how to recreate it, this can be a great way to do that. You could even go a few steps further and use keyword arguments in the printed-out "constructor" call to show what each argument means, and give yourself a bit of referential documentation in the process. There's a lot that could be done here. But that's the basics of how to construct and print a class.
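As a sketch of the keyword-argument idea, here's a variant of NameTag (rewritten compactly for this example, so it's not the exact class from above) whose representation names its argument. We're assuming the constructor argument is called name; the !r conversion inside the f-string applies repr() to the value so that strings come out quoted.

```python
class NameTag:
    def __init__(self, name):
        self.name = name

    def __repr__(self):
        # Build the constructor call, naming the argument explicitly.
        return f"{type(self).__name__}(name={self.name!r})"


tag = NameTag("What")
print(repr(tag))           # NameTag(name='What')

# The printed text is valid Python, so it can rebuild an equivalent object.
# (Only evaluate strings like this when you trust where they came from.)
clone = eval(repr(tag))
print(clone.name)          # What
print(clone is tag)        # False
```

Note that the clone is a new object with the same contents, not another alias for the original.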

Again, __str__() is called if the class defines it, but if not, then __repr__() will be called. Note that if you put objects into a built-in container like a list object, then call str() (or print()) on that list, the list's default behavior is to call repr() on the inner objects (which calls their __repr__() method, even if they have a __str__() method).
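A short sketch of that container behavior, using a hypothetical Badge class that defines both methods so we can tell which one was called:

```python
class Badge:
    """Hypothetical class with distinct __str__() and __repr__() results."""

    def __init__(self, name):
        self.name = name

    def __str__(self):
        return self.name

    def __repr__(self):
        return f"Badge({self.name!r})"


badge = Badge("Who")

print(badge)      # Who
print([badge])    # [Badge('Who')]
```

Printing the object directly uses __str__(), while printing the list that holds it shows the __repr__() result.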

If you're familiar with other Object-Oriented Programming-Languages, you may be expecting that Python would also have "access restrictions" for its member elements and functions.

By design, all members (data and functions) of classes are public in Python.

One of the problems with public members in classes is that if you don't establish good practices and design patterns straight-away, you're going to run the risk of having derived classes that don't gain a lot of the polymorphic reduction in code duplication.

By design, you want to embed as much foundational functionality into the base class as possible, and then only relatively-incrementally add functionality in derived classes. The issue of using a public member instead of a public accessor-function, is that you risk missing-out on well-defined, encapsulated functionality that goes along with the accessor methods. Things like logging, scaling, and conversions all become much easier to handle through base accessor functions passed along to derived classes.
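To illustrate, here's a minimal sketch where the base class owns the scaling logic in its accessor and the derived class only changes one number. The Sensor and MillivoltSensor classes, and the scale attribute, are hypothetical names invented for this example.

```python
class Sensor:
    """Hypothetical base class: the accessor owns the conversion logic."""

    scale = 1.0   # multiplier from raw units to reported units

    def __init__(self, raw):
        self._raw = raw

    @property
    def value(self):
        return self._raw * self.scale

    @value.setter
    def value(self, new_value):
        self._raw = new_value / self.scale


class MillivoltSensor(Sensor):
    scale = 0.001   # the only thing the derived class has to add


sensor = MillivoltSensor(1500)
print(sensor.value)    # 1.5

sensor.value = 2.0     # the inherited setter converts back to raw units
print(sensor._raw)     # 2000.0
```

If callers had been reading and writing _raw directly, every derived class would have needed its own conversion code wherever the value was used.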

The following sub-sections talk about how to create "protected" and "private" members. This is not necessary in Python, because they don't really truly do anything; but it can be good if you feel compelled to follow a C++ style approach and you want to obscure things, even just a little bit.

Documentation is always the best way to convey a design contract in your software, but these stylistic conventions can of course be used in tandem with good documentation to make your code more readable and more "standardized".

Protected Members

There is no such thing as a "protected" member in Python, though by convention it is suggested that protected elements be named with a single underscore as a prefix for the alias. Again, this does not actually "protect" the member from being accessed publicly. It's still accessible and modifiable through the dot operator on any class instance.

Here's a quick example to show this syntax convention:

class Vector2D(object):
  _x = None;
  _y = None;
  def __init__(self, new_array,):
    self._x = new_array[0];
    self._y = new_array[1];
    return (None);
  # fed
  @property
  def x(self,):
    return (self._x);
  # fed
  @x.setter
  def x(self, x,):
    self._x = x;
    return (None);
  # fed
  @property
  def y(self,):
    return (self._y);
  # fed
  @y.setter
  def y(self, y,):
    self._y = y;
    return (None);
  # fed
# ssalc

In the above design, we've created a custom class for a 2D Vector, which holds the x and y values as "protected" members _x and _y. Of course, we can still access the "protected" members however we want, but what the class design shows is a preferred "contract" for how developers should use the class. The @property decorators provide setter/getter access to _x and _y through the public accessors x and y. The idea should be that we leave _x and _y alone, and we really shouldn't need to touch them.

Again, they're not actually "protected", but the underscore indicates that for developer purposes they should be treated as such. It's just a syntax convention, so you can follow it or not, but it tends to add extra readability by using a single underscore.
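Here's a compact sketch of why going through the accessor matters. This is a cut-down variant of Vector2D written just for this example; the float() coercion in the setter is our own assumption, added so that there's a visible difference between the two access paths.

```python
class Vector2D:
    def __init__(self, new_array):
        self._x, self._y = new_array

    @property
    def x(self):
        return self._x

    @x.setter
    def x(self, new_x):
        # Assumption for this sketch: the accessor coerces to float.
        self._x = float(new_x)


vec = Vector2D([9, 3])

vec.x = "4"       # goes through the setter
print(vec.x)      # 4.0

vec._x = "4"      # bypasses the setter; nothing stops this
print(vec.x)      # 4
```

The second assignment quietly stores a string, which is exactly the kind of thing the underscore convention asks developers not to do.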

Private Members

Now, despite the fact that all members are public by default, there is a "special" syntax that is supported by the interpreter to create "private" members.

Again, it's probably not surprising to find-out that these "private" members aren't actually private.

In this case, we prefix any alias that we want to treat as a private member by using double-underscores. Only as a prefix though. Remember, double-underscores as a prefix and suffix are specially reserved names that should only be established by the interpreter.

Once we create our "private" member-alias inside a class definition, the interpreter will do some alias-shuffling (usually called "name mangling") and store the member under a new alias, based on what we originally provided. In this way, if you look at the code as is, you'll see a member that starts with a double underscore, but if you try to access it by that name from outside the class, it will be reported as "not found".

Let's recreate the same Vector2D class, and change _x and _y to "private" members __x and __y:

class Vector2D(object):
  __x = None;
  __y = None;
  def __init__(self, new_array,):
    self.__x = new_array[0];
    self.__y = new_array[1];
    return (None);
  # fed
  @property
  def x(self,):
    return (self.__x);
  # fed
  @x.setter
  def x(self, new_x,):
    self.__x = new_x;
    return (None);
  # fed
  @property
  def y(self,):
    return (self.__y);
  # fed
  @y.setter
  def y(self, new_y,):
    self.__y = new_y;
    return (None);
  # fed
# ssalc

This new class is still valid because __x and __y can be used in our code within the definition(s) of the class: when the class body is compiled, every such alias inside it is rewritten in the same way, so the references stay consistent. This avoids any reference issues, but that's the esoteric nature of how "private" members are created.

As written we can actually call the following:

vec = Vector2D([9, 3,],);

print("x:", vec._Vector2D__x)
# x: 9

print("x:", vec.__x)
# AttributeError: 'Vector2D' object has no attribute '__x'

As you can see, we actually have public access to a _Vector2D__x member, but when we try to access the __x member, per how we had originally written the class, it doesn't work.
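Because the class name becomes part of the stored alias, a base class and a derived class can each use the same double-underscore name without colliding. Here's a small sketch of that; Account, SavingsAccount, and the __token member are hypothetical names for the illustration.

```python
class Account:
    def __init__(self):
        self.__token = "base"        # stored as _Account__token

    def base_token(self):
        return self.__token          # reads _Account__token


class SavingsAccount(Account):
    def __init__(self):
        super().__init__()
        self.__token = "derived"     # stored as _SavingsAccount__token


account = SavingsAccount()

print(sorted(vars(account)))
# ['_Account__token', '_SavingsAccount__token']

print(account.base_token())   # base
```

The derived class did not overwrite the base class's member, which is about the only practical thing this renaming buys you.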

What the interpreter did was rewrite every __x that appears inside the class definition into a new alias with a prefixed underscore followed by the name of the class. The alias __x itself was never created on the object.

The new alias name is built essentially like this:

mangled_name = "_" + vec.__class__.__name__ + "__x";
print(mangled_name, getattr(vec, mangled_name));
# _Vector2D__x 9

If you wanted to undo the privatizing, you could call the following:

exec(
  "vec.__x = "
  + "vec." + "_" + vec.__class__.__name__ + "__x;"
);

Note that we're using exec, not eval, because we are doing an assignment statement, not providing a standalone expression.

While this may not be useful, let's just go one step further and create a deprivatize() function that uses the object's __dict__ to find all members matching the "private" member alias pattern, and recreate their original aliases.

def deprivatize(obj):
  """Deprivatize the implicitly private member aliases of a class instance.
  We take a snapshot of the member names first so that we don't step on
  our toes by trying to iterate through a dictionary that we are
  modifying in each iteration.
  """
  prefix = "_" + obj.__class__.__name__;
  for key in list(obj.__dict__):
    if (key.startswith(prefix + "__")):
      # strip the "_ClassName" prefix to recover the original "__name"
      setattr(obj, key[len(prefix):], getattr(obj, key));
    # fi
  # rof
  return (None);
# fed

And here's what you'll see if you use this method:

vec = Vector2D([9, 3,],);

print(vec.__dict__)
# {'_Vector2D__x': 9, '_Vector2D__y': 3}

deprivatize(vec)

print(vec.__dict__)
# {'_Vector2D__x': 9, '_Vector2D__y': 3, '__x': 9, '__y': 3}

We could update the deprivatize() method to be fancy and delete the "mangled" aliases as we go, to entirely undo the interpreter's alias-shuffling that's gone on, but this whole thing was just an illustrative example.
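For the curious, here's a rough sketch of what that "fancy" version might look like. The function name deprivatize_and_remove is made up, and the Vector2D here is a cut-down variant without the property accessors, so this is an illustration under those assumptions rather than a drop-in replacement.

```python
class Vector2D:
    def __init__(self, new_array):
        self.__x, self.__y = new_array   # stored as _Vector2D__x, _Vector2D__y


def deprivatize_and_remove(obj):
    """Recreate the original "__name" aliases and delete the mangled ones."""
    prefix = "_" + type(obj).__name__
    for key in list(vars(obj)):          # snapshot, since we modify as we go
        if key.startswith(prefix + "__"):
            setattr(obj, key[len(prefix):], getattr(obj, key))
            delattr(obj, key)


vec = Vector2D([9, 3])
deprivatize_and_remove(vec)

print(vars(vec))   # {'__x': 9, '__y': 3}
```

Keep in mind that any method written inside the class still refers to the mangled aliases, so on the full Vector2D the x and y properties would raise an AttributeError after this runs.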

You should really have a reason to do this. If you don't like this aspect of Python, then the easiest option is to avoid using aliases in classes that start with double-underscore, and then you'll have really explicit code.

Personally, we find this whole alias shuffling thing to be a bit too much, and definitely too esoteric. For readability's sake, we would not suggest using the double-underscore "private" aliases. It's probably going to cause more trouble than it's worth, especially since nothing really ends-up private anyways.

We would suggest you be explicit and document well. If you want to impose a design contract on your class, don't rely on the code to be "self-explanatory" -- it rarely ever is. You should always document any expectations for usage of the class, its members, and its methods, especially if you have certain design expectations. Even the @property decorator can be seen as optional by someone who doesn't like using it. So, it's good to establish a pattern, to show the usage explicitly, but also to document it.

Worst case scenario, someone doesn't read any of the documentation and they mess it all up. And if they complain that your code is broken, you can point to the documentation and provide the Manager's Mantra: "read the friggin' manual". No need to worry about what you know, as long as you know how to read, and it's been written down, then you're all set. So never skimp on the documentation.

(Again, our code here is very very lacking in documentation, but that's only because there's so much rambling all around it; we want the code itself to stay clear and concise so that you're able to just copy-paste it into your interpreter and run it as is. The "production" version of this code would all be heavily documented, and should also have unit-tests to prevent feature regressions with any future development.)
