Object-oriented programming is very bad for high-performance programs
like graphics- and simulation-intensive video games. Or, at least,
traditional object-oriented design is.
The main culprit is the cache. If I have some thing in a game—say,
an enemy character—then it needs a lot of data associated with it: its
3D meshes and textures, its physics information, its current position, its
inventory, its AI state… But not all that information is necessarily
being used at any given time. For example, when I am running the
AI processing step, or calculating physical forces acting on
characters, or drawing things in the game, then I'm only using a small
part of all the information associated with a thing.
In a proper object-oriented design, I would take all the information relevant
to a thing and package it up in an object: for example, in a Game_Unit
object. That means, however, that in order
to operate on just a subset of that information, I need to pull
the object into the cache, which puts all the object's information
in the cache. This means the cache gets full of extraneous data,
and consequently cache misses become much more common, making the
program slower overall.
One way of avoiding this problem is using so-called Data-Oriented Design,
which involves pulling apart the data stored in objects and storing it
all separately. This has the advantage of being much more cache-friendly,
but it generally doesn't have language support, and so has to be done
manually. The advantages of typical object-oriented design—like encapsulation
and data hiding—are much more difficult to retain, and the language you
use might actively fight against you in these cases.
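
To make the contrast concrete, here is a minimal Python sketch of the two layouts. The GameUnit class, its fields, the sample values, and the apply_gravity function are all hypothetical names invented for illustration. Python won't show the real cache behavior of either layout; the point is only the shape of the data.

```python
from array import array
from dataclasses import dataclass


# Object-oriented layout: every unit carries all of its data together.
@dataclass
class GameUnit:  # hypothetical class, for illustration only
    x: float
    y: float
    health: int
    ai_state: str


units = [GameUnit(0.0, 10.0, 100, "idle"), GameUnit(5.0, 20.0, 80, "patrol")]

# Data-oriented layout: one packed array per field, indexed by unit number.
xs = array("d", [0.0, 5.0])
ys = array("d", [10.0, 20.0])
healths = array("i", [100, 80])
ai_states = ["idle", "patrol"]


def apply_gravity(ys, dt):
    """Physics step that reads and writes only the y coordinates."""
    for i in range(len(ys)):
        ys[i] -= 9.8 * dt


apply_gravity(ys, 0.5)
```

In the second layout, the physics step never touches health or AI data at all, which is the property the cache cares about.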
Following a similar path, some programmers have started using and advocating
Component-Entity Systems,
which are a few steps beyond Data-Oriented Design in roughly the same direction,
but also a radically different way of structuring programs. At a high level,
Component-Entity Systems work like this:
- An entity is an abstract identifier. It has no other data directly associated with it.
- A component is a table which maps entities to a collection of data, ideally to pieces of simple scalar data. Not every entity needs to appear in a given component.
- An entity can be associated with several components. That means the entity can be used as an index into the component, allowing the programmer to retrieve or modify the associated data stored in the component.
- Operations are written in terms of one or more components. These are sometimes called systems.
This all sounds very abstract, so what does this look like in practice? Let's
describe a basic game. For the sake of simplicity, let's assume
we have four salient pieces of information for an in-game unit:
its position, its current health, its appearance, and the state of
its AI (including current goals and actions):
component Position(int x, int y);
component Health(int x);
component Appearance(Image img);
component AI_State(State st);
Every entity in the game will be associated with one or more
of these components: scenery will have Position and Appearance
data; invulnerable characters might have Position, Appearance,
and AI_State, but no Health; a player character will have Position,
Health, and Appearance, but no AI_State; and so forth.
Well: I've been saying that entities have this data, but it's more
appropriate to say they're associated with that data. All
the relevant data is stored in the components, and the entity
is simply the index used to access that data.
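
As a rough sketch of that idea in Python: entities are bare integers, and each component is a dictionary keyed by entity. The new_entity helper and the sample data are assumptions made up for this example; only the component names follow the pseudocode above. A real implementation would likely use packed arrays rather than dictionaries.

```python
import itertools

_next_id = itertools.count()


def new_entity():
    """Hypothetical helper: an entity is just a fresh integer, nothing more."""
    return next(_next_id)


# Each component is a table mapping entities to that component's data.
position = {}    # entity -> (x, y)
health = {}      # entity -> hit points
appearance = {}  # entity -> image name (stand-in for real image data)
ai_state = {}    # entity -> current AI state

rock = new_entity()  # scenery: Position and Appearance only
position[rock] = (3, 4)
appearance[rock] = "rock.png"

player = new_entity()  # player: Position, Health, Appearance, no AI_State
position[player] = (0, 0)
health[player] = 100
appearance[player] = "hero.png"

# The entity is the index used to read or update the associated data.
health[player] -= 10
```

Note that the rock simply has no row in the health table; there is no null or default health value stored anywhere for it.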
If this is hard to visualize, think of components as database
tables, and your entity as the primary key used in all your tables,
which you can use to access or update that data.
You of course wouldn't want to actually implement a game like
that, but it's similar in spirit.
Now, to write the salient operations of our game, we can
write them in terms of one or more components: when we draw
our game, we write the draw operation in terms of Appearance,
which allows us to loop over everything that has image data
associated with it. On the other hand, when we want to move units
around, we'll write an operation in terms of both the AI_State
and Position components, because we need to know what
the unit plans to do in order to update its position.
The advantages of Data-Oriented Design that I described earlier are
still in effect, because all the data associated with a component
can be stored packed together: looping over the Appearance
component won't bring any non-Appearance data into the cache.
But Component-Entity Systems have an extra advantage
over pure Data-Oriented Design: you gain compositionality in
a way that's not necessarily present in other designs.
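
The draw and movement operations described above might be sketched like this in Python. The dictionaries stand in for components, and the draw and move functions, the "march_right" state, and the sample entities are all hypothetical; this illustrates the shape of a system, not a real engine.

```python
# Components as entity -> data tables (entities are plain integers).
position = {0: (3, 4), 1: (0, 0), 2: (7, 7)}
appearance = {0: "rock.png", 1: "hero.png", 2: "guard.png"}
ai_state = {2: "march_right"}  # only entity 2 has any AI


def draw(appearance):
    """Draw system: written in terms of the Appearance component alone."""
    return [f"drawing {image}" for image in appearance.values()]


def move(ai_state, position):
    """Movement system: runs over entities present in both components."""
    for entity in ai_state.keys() & position.keys():
        x, y = position[entity]
        if ai_state[entity] == "march_right":
            position[entity] = (x + 1, y)


drawn = draw(appearance)
move(ai_state, position)
```

Adding an ai_state entry for entity 0 would make the rock start marching too, without touching either function, which is a small taste of the compositionality mentioned above.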
This blog post about applying component-entity design to Bomberman
has a section called
"Consider the possibilities of your new Components", in which
the author explores the compositions of components as interesting avenues
of discovering new gameplay. It also goes into a lot more detail
about what a component-entity approach would look like.
For example: you might design a game with a Health_Pickup component for
items that restore health when a player interacts with them. An
entity that is associated with both the Health_Pickup and
AI_State components will act as a mobile health powerup
that can choose how to move around using some kind of AI routine.
On the other hand, an entity associated with both
the Health_Pickup and Health components is a health pickup
which can be destroyed, perhaps so that it cannot be used by an
opposing player. In both those cases, no extra implementation work
would need to be done for these conjunctions of components: the new,
interesting functionality falls out naturally from implementing
each feature in isolation.
While Component-Entity Systems are interesting,
there are no languages that are inherently component-entity-oriented.
At least, if there are, I don't know about them.
Component-Entity Systems are usually implemented using existing
object-oriented languages.
So my pseudocode examples above which used the component keyword
were all pure fiction. But what would a component-entity language
look like?
I'm going to change pace a bit and discuss an old and sadly
mostly-forgotten paper:
Harrison and Ossher's 1993 OOPSLA paper,
Subject-Oriented Programming: A Critique of Pure Objects.
The primary motivation behind the paper is addressing
what they see as a flaw in object-oriented design: most
object-oriented languages include inheritance, and therefore involve
a tree of is-a relationships. However, objects don't exist in
just a single place in a single ontology: a set of objects can be seen
as occupying multiple places in multiple ontologies.
As a concrete but slightly fanciful example: do we choose a culinary
ontology for our program and write Tomato extends Vegetable,
or do we choose a biological ontology and write Tomato extends Fruit?
What if one part of the program needs one and the other needs
the other?
More concisely, as Jorge Luis Borges said:
It is clear that there is no classification of the Universe that is not arbitrary and full of conjectures. The reason for this is very simple: we do not know what kind of thing the universe is.
(From the Jorge Luis Borges essay The Analytical Language of John Wilkins.)
Harrison and Ossher propose that, instead of dealing with objects
that are instances of classes, we deal with subjects, which
can be seen as instances of classes. Any time an object is
interacted with, it is interacted with in a subjective context,
in which the data and operations associated with it can be different
depending on the subjective context. The only thing shared between
the contexts is the identity of the objects:
The essential characteristic of subject-oriented programming is that different subjects can separately define and operate upon shared objects, without any subject needing to know the details associated with those objects by other subjects. Only object identity is necessarily shared.
Harrison and Ossher's proposed system is in some respects very
similar to component-entity systems, but is in other respects quite
different. There is certainly commonality: what a component-entity
system calls an entity, a subject-oriented language calls an
object-identifier or an oid, and both consider entities
or oids to be abstract identifiers with no directly associated
information.
In a subject-oriented language, an operation must exist within a
given subject activation, which exposes pieces of information
associated with the entity and a set of operations. An entity can,
within a given subject activation, have fields and methods, and
those fields and methods can be entirely distinct from the fields
and methods exposed by a different subject activation. Every
method invocation, therefore, has to exist within a subject
activation, so that we know the actions and fields available
within that subjective frame.
A salient difference is the way that subject-oriented programming allows
certain pieces of information, or certain operations, to be shared
among different subject activations. Harrison and Ossher's repeated
example involves a Tree object shared by, among others,
a Woodcutter subject and a Bird subject. A Woodcutter's
view of the tree has an estimated value, which the woodcutter
might use to determine whether the tree is worth cutting down. On
the other hand, a Bird's view of the tree involves its
suitability for building a nest in. In both cases, though, they might
care about a piece of information like the tree's height.
However, Harrison and Ossher's approach to this issue seems awkward:
they suggest that,
rather than straightforwardly sharing the height between the subjects,
the Bird subject and the Woodcutter
subject should both have their own copy of a field representing
the tree's height, and
that the two must be made to agree: they must return
the same value, or some compatible value (for example, by returning
some value which is commutative.) If they fail to agree, the
program throws an exception. This is almost certain to be a
source of frustration in practice, or at least a major source of
gotchas.
The Harrison and Ossher approach also describes how to mediate
two distinct object hierarchies, so that different subject
activations can use inheritance over the same set of classes
in very different ways. (The Cook subject, for example,
could use Tomato extends Vegetable, while the Botanist
subject could use Tomato extends Fruit, so a given
oid can be seen by both as being situated within different
hierarchies.) They then go on to describe how one subject's
class hierarchy might be incomplete with respect to another
subject's hierarchy, and describe how to match those hierarchies
together, or infer class hierarchies based on interfaces or other
mechanisms.
I would argue that the best thing to do is to combine the high-level
details of Harrison and Ossher's subject-oriented language design
with the specific mechanisms used in component-entity systems.
They clearly have a common starting
point and a similar approach to modeling the world, in which abstract
entities can be viewed in some context as having associated operations
and information. The Harrison and Ossher approach unfortunately gets caught
in a quagmire of hierarchies and modeling, but much of that complexity can
be alleviated if we treat subject activations like
sets of components: suddenly, the Bird and the Woodcutter
subjects/systems can simply share the TreeHeight component,
without having to resort to awkward and complicated agreement
strategies on the hierarchies or results involved.
As for the specifics of what a component-entity language might
look like, I leave that as a creative exercise for the reader.
I do have ideas. Someday I will implement them.
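
In the meantime, the shared TreeHeight idea can at least be approximated in an existing language. Here is a minimal Python sketch, in which every name, threshold, and value is an assumption invented for illustration and not something taken from Harrison and Ossher's paper: each subject is a function that reads the shared height component plus the one component private to its own view.

```python
# Components: tables keyed by tree entity (plain integers).
tree_height = {0: 12.0, 1: 30.5}      # shared by both subjects
timber_value = {0: 40.0, 1: 250.0}    # only the Woodcutter cares about this
nest_suitability = {0: 0.9, 1: 0.2}   # only the Bird cares about this


def woodcutter_should_cut(tree, min_height=20.0, min_value=100.0):
    """Woodcutter subject: sees TreeHeight and its own value estimate."""
    return tree_height[tree] >= min_height and timber_value[tree] >= min_value


def bird_should_nest(tree, min_height=10.0, min_suitability=0.5):
    """Bird subject: sees the same TreeHeight plus nest suitability."""
    return (tree_height[tree] >= min_height
            and nest_suitability[tree] >= min_suitability)


cut_tall_tree = woodcutter_should_cut(1)
nest_in_short_tree = bird_should_nest(0)

# One update to the shared component is visible to both subjects at once,
# so there are no per-subject copies that need to be made to agree.
tree_height[0] = 5.0
nest_after_pruning = bird_should_nest(0)
```

Because there is only one height table, the agreement problem from the paper never arises in this sketch.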