Sending data over a network connection in the enterprise space leaves us with many options. We’re essentially spoiled for choice and can quickly set up a schema for sending and receiving messages with little to no in-depth knowledge of how the data gets converted into bits. XML, JSON and Protocol Buffers, to name only a few, provide tools that let you quickly and easily define a schema and take care of serialising your data into it before it gets converted into a byte stream. What makes these tools shine, in my view, isn’t that they are “fast”, but rather that they are fast to use as a developer. We accept a bit of overhead in our data transmission because it makes debugging easier, whether by letting us quickly convert the binary data to text or by sparing us from handrolling custom conversions for the many different data types that might be present in a single message.
Depending on the game they’re making, game developers have a slightly different set of constraints compared to your average enterprise application. Performance and bandwidth tend to play a more critical role than debuggability or easily readable messages. I’m not saying these considerations are absent in the enterprise space, but rather that they’re not the first thing we think of when building an API.
The Case Study
To make the demonstration a little more visual, I will be using a small simulation to demonstrate the concepts I’ll be using. The simulation will consist of a single entity type, and the small environment the entity can move in will drive its state.
Rules for the entities are as follows:
- Entities can spawn every 1 second
- A max of 16 entities can exist
- Entities spawn in a zone at the top left of the screen
- Entities spawn with a random velocity whose X and Y components each range from 1 to 7 (inclusive)
- Each frame an entity’s position is incremented by its velocity
- An entity cannot fly off the edge of the screen
  - The entity will “bounce” away by having either the X or Y component of its velocity multiplied by -1
  - When the entity bounces away, its colour will also be inverted
- Entities moving into a zone at the bottom right of the screen will be destroyed
- The simulation space is 800 x 480 units
- The simulation is framerate-dependent, which allows int to be used as the numeric type of choice
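Although the project itself is C#/MonoGame, the rules above can be sketched as a small update routine. Here is a Python rendering; the class shape, names and colour values are my own, not the article's:

```python
import random

WIDTH, HEIGHT = 800, 480  # the 800 x 480 unit simulation space

class Entity:
    def __init__(self):
        # Entities spawn at the top left with velocity components of 1-7.
        self.x, self.y = 0, 0
        self.vx = random.randint(1, 7)
        self.vy = random.randint(1, 7)
        self.r, self.g, self.b = 200, 50, 50  # arbitrary starting colour

    def invert_colour(self):
        self.r, self.g, self.b = 255 - self.r, 255 - self.g, 255 - self.b

    def update(self):
        # Each frame the position is incremented by the velocity.
        self.x += self.vx
        self.y += self.vy
        # "Bounce" off an edge by flipping the offending velocity
        # component, and invert the colour when that happens.
        if self.x < 0 or self.x > WIDTH:
            self.vx *= -1
            self.invert_colour()
        if self.y < 0 or self.y > HEIGHT:
            self.vy *= -1
            self.invert_colour()
```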
I chose some of these rules to demonstrate specific concepts, so while some of the numbers seem a little arbitrary, they serve a purpose. These rules also highlight how game developers tend to think creatively while working within the platform’s constraints.
I’m slightly altering the simulation loop compared to how I’d usually set one up. Typically, I’d update the simulation state and then render on each frame, but we’re dipping our toes into the realm of sending network messages here. Showing how to set up a server and client might be interesting, but it’s outside the scope of this exercise. So instead of trying to synchronise a simulation loop across two separate processes, I’m joining the two together with an in-memory byte array as my Frankenstein suture of choice.
It’s going to change my simulation loop from a two-step process to a four-step process:
1. Update game state
2. Serialise game state to a byte array
3. Deserialise game state from the byte array
4. Render game state
Steps 1 & 2 serve as the server’s simulation loop, and steps 3 & 4 serve as the client’s simulation loop.
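The four steps can be glued together as a single per-frame function. This Python sketch uses placeholder callables rather than the article's actual types:

```python
def run_frame(state, serialize, deserialize, render):
    """One frame of the joined server/client loop, with an in-memory
    byte buffer standing in for the network connection."""
    state.update()                  # 1. update game state (server side)
    buffer = serialize(state)       # 2. serialise to a byte array (server side)
    received = deserialize(buffer)  # 3. deserialise the byte array (client side)
    render(received)                # 4. render game state (client side)
```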
Serialization & Deserialization
Data serialization & deserialization is handled by two interfaces:
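The article's C# interface definitions aren't reproduced here, but their likely shape can be sketched with Python protocols. The names ISerializer and IDeserializer are my own guesses, not the project's:

```python
from typing import Any, Protocol, runtime_checkable

@runtime_checkable
class ISerializer(Protocol):
    # Turn the current game state into a byte array for "sending".
    def serialize(self, state: Any) -> bytes: ...

@runtime_checkable
class IDeserializer(Protocol):
    # Rebuild a renderable game state from a received byte array.
    def deserialize(self, data: bytes) -> Any: ...
```

The simulation can then be handed any matching serializer/deserializer pair, which is what lets the JSON, binary, and delta-encoded strategies later in the article slot in interchangeably.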
These interfaces allow the simulation to pick which strategy to use and remove the need to write a bunch of glue code. The overall architecture isn’t critical here, but I’m highlighting the two interfaces to make it easier to find where the important code lives.
My baseline approach is to use the built-in .NET JSON framework. It’s a robust tool that’s tightly integrated into many aspects of ASP.NET, and I’m sure most developers who’ve worked with ASP.NET recently have encountered it.
I was a little naughty and allowed the use of some MonoGame types in my “domain”, but I chose to do so because I can’t be arsed to handroll mathematical types when the framework already provides a great baseline. It does mean I couldn’t serialise and deserialise directly to the “domain types”, but introducing a simple EntityDTO made short work of that problem.
A glance at the GameStateSerializer will show that the majority of the code consists of configuring the serialiser and mapping the game entities to the EntityDTO type. The configuration highlights the tool’s power, and you can trim it down further if you’re not as pedantic about avoiding JSON attributes as I am.
The same is true of the GameStateDeserializer. Converting the byte array back to an array of EntityDTO objects is a mere two lines of code.
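As a rough stand-in for the .NET JSON round trip, this Python sketch shows the same shape of work: map entities to flat DTOs, serialise the list, and parse it back. The DTO field names here are assumptions, not the project's actual ones:

```python
import json

def to_dto(entity):
    # Flatten a domain entity into a plain, serialiser-friendly shape.
    return {
        "X": entity["x"], "Y": entity["y"],
        "VelX": entity["vx"], "VelY": entity["vy"],
        "R": entity["r"], "G": entity["g"], "B": entity["b"],
    }

def serialize(entities):
    # The property names and JSON syntax all travel with the data.
    return json.dumps([to_dto(e) for e in entities]).encode("utf-8")

def deserialize(data):
    return json.loads(data.decode("utf-8"))
```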
Running the sample should show the simulation state is serialised to a minimum packet size of 171 B to represent a simulation with only one entity, and it grows to 1.5 KB for a total of 16 entities. Right now, these numbers mean nothing, but I hope to show the impact a different approach can have.
The JSON data is currently a very “dumb” approach. Depending on how your JSON framework handles null objects, you might see them omitted wholly from the resulting JSON, or a null entry representing the entity. I’m also moving away from representing data as text. Text brings a lot of overhead, especially if the majority of your data is numeric. Let’s compare the impact of representing the number 127 in ASCII text vs binary. ASCII text represents each character using a single byte[1], causing 127 to take 3 bytes if converted to text before being converted to bytes. On the other hand, 127 still falls between 0 and 255, so converting the number directly to binary uses only a single byte.
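The 127 example is easy to verify; in Python:

```python
import struct

# The same number, as text and as a single unsigned byte.
as_text = str(127).encode("ascii")  # b'127': one byte per digit
as_binary = struct.pack("B", 127)   # one byte covers any value 0-255

assert len(as_text) == 3
assert len(as_binary) == 1
```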
The example in the previous paragraph shows there’s a difference but doesn’t quite explain why the binary protocol saves bandwidth. It comes down to the fact that a JSON string embeds the entire schema: the property names and syntax symbols all add extra fluff that gets sent across the wire. The reality is that I’m still sending int values, and they’ll use 4 bytes regardless of how small the number is; a binary int only starts saving bandwidth over text once your numbers push into five or more digits.
I’m leveraging knowledge like this and using the BitConverter helper class to convert the various numeric types. I’ve also omitted the null entities to simplify the process a little; the rendering side of the simulation doesn’t care about destroyed entities. You’ll note I’m moving away from the comfort of a tool that can spit out a byte array, and I have to dig a little into using a MemoryStream to take care of the process.
This serialiser is still very compact, and it gets the job done. The BitConverter helper class is an enormous help in not having to figure out byte conversions myself, though doing so wouldn’t be outside the realm of possibility if I ever needed to. Once you’ve built up a library of helper functions covering the majority of the numeric types, your life gets a lot easier.
Deserialisation is the really scary bit; you’re a little blind if you don’t know how to interrogate a byte array. What’s important is that your deserialising code stays in sync with the code that serialises. I’m matching the steps I took to serialise the data exactly, the only difference being that I’m reading instead of writing. The caveat is to consider the sizes of the numeric types[2] you’re serialising. Because I’m following the same process to deserialise, I don’t have to worry about how many entities there are; I can keep reading from the byte array until I reach the end. It’s also one way to spot a bug in your deserialising code: you’ll notice you’re either stopping short of the end or overshooting the size of the array.
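To make the mirrored write/read passes concrete, here's a Python sketch where struct plays the role of BitConverter and io.BytesIO the role of MemoryStream. The field layout is illustrative, not the article's exact one:

```python
import io
import struct

FMT = "<iiiiBBBB"  # x, y, vx, vy as int32; colour as 4 bytes (RGBA)

def serialize(entities):
    stream = io.BytesIO()
    for e in entities:
        # Write the fields in a fixed order; the reader must match it exactly.
        stream.write(struct.pack(FMT, e["x"], e["y"], e["vx"], e["vy"],
                                 e["r"], e["g"], e["b"], e["a"]))
    return stream.getvalue()

def deserialize(data):
    entities = []
    offset, size = 0, struct.calcsize(FMT)
    # Read until the end of the array; stopping short or overshooting
    # means the reader has drifted out of sync with the writer.
    while offset < len(data):
        x, y, vx, vy, r, g, b, a = struct.unpack_from(FMT, data, offset)
        entities.append({"x": x, "y": y, "vx": vx, "vy": vy,
                         "r": r, "g": g, "b": b, "a": a})
        offset += size
    return entities
```

Note that the reader carries no entity count; like the article's deserialiser, it simply consumes fixed-size records until the buffer is exhausted.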
The simple step of moving away from a text representation of the message has an immediate impact. The minimum packet size is now down to 21 B, growing to 336 B for a total of 16 entities. On the surface this is a significant gain, but it comes at the cost of having to handroll the reading/writing code whenever your schema needs to change. You’re also unable to run a client and server that are out of sync on the schema, even if you’d consider the schema backwards compatible.
If you need to bring down bandwidth usage, this is a powerful tool to have in your toolbelt. We’ve also taken a step toward omitting data that isn’t relevant to the receiving side, but this is contextual, and that’s why it’s important to have a good understanding of your domain.
Here is where we flex our muscles as software engineers and identify some further optimisations. A few pieces of data can have their footprints reduced further, and some pieces of data can be omitted completely:
- The simulation space of 800 x 480 units is well within the range of representing each axis as a 2-byte short value. The small simulation space halves the footprint of the position components, which we previously represented as 4-byte int values, saving a total of 4 bytes per entity.
- The rendering code doesn’t care about the entity’s velocity, and we can omit it altogether, saving a further 8 bytes per entity.
- The entity’s colour is represented by 4 bytes, but 1 of those is the alpha component, which is never touched. Extracting only the Red, Green & Blue components as byte values gives us a further 1-byte reduction per entity.
In total, we get a 13-byte reduction per entity! At the small scale we’re simulating, that isn’t much, but scaling up the simulation means these returns could be valuable. Huffman coding can be a powerful tool if your data has a handful of patterns that compress nicely, but finding a way not to send unnecessary bytes in the first place will have a more significant impact.
Notice that omitting the velocity saves us a few lines of code here, but it’s also important to notice the cast to short. Casting to short ensures that BitConverter outputs a byte array containing only 2 bytes, instead of the 4 bytes you’d get converting the position components as int values.
This is the point where it’s important to mention that if you’re omitting some data, the receiving side should know how to fill in the blanks. It’s OK not worrying about velocity, but with colour, we were lucky and could rely on the provided Color type in the MonoGame framework to initialise correctly with only the Red, Blue and Green components.
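The trimmed-down layout works out to 7 bytes of entity data in this Python sketch (the article's 8 B minimum presumably includes a byte of framing; that is an assumption on my part). The "h" format code enforces the same 2-byte range that the cast to short does:

```python
import struct

COMPACT = "<hhBBB"  # x, y as int16; r, g, b as single bytes

def pack_entity(e):
    # Velocity is omitted entirely; the renderer never needs it.
    # "h" rejects values outside the int16 range, just as a checked
    # cast to short would before handing the value to BitConverter.
    return struct.pack(COMPACT, e["x"], e["y"], e["r"], e["g"], e["b"])
```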
Notice how we’re squeezing out more and more by building a lightweight data-transfer protocol, but we are coupling the server and client very tightly to each other. It’s important to consider these trade-offs; it’s part of the process of writing your own binary serialisation code.
The simulation size is now down to only sending 8 B for a single entity and scales up to 128 B sent for 16 entities. The basic binary and domain-specific binary implementations here scale pretty linearly, but the gains over JSON are clear.
Delta Encoded Binary
Delta encoding takes your knowledge of the domain and stretches it almost as far as it can go. It’s not a silver bullet, but the idea is as follows: send data that changes by small amounts as the difference from a previously known state. I’ll admit I’ve carefully chosen the rules to make the gains here as large as possible.
Knowing that an entity’s velocity along a single axis can’t be greater than 7 in either direction opens up some interesting possibilities. Representing a number between -7 and 7 can be done with only 4 bits (a nibble), but I opted to use a byte instead. The velocity also makes a handy delta, seeing as it’s already the amount an entity’s position gets incremented by on each frame.
The colour is the next piece of data we can inspect. When created, an entity gets assigned a colour, which gets inverted whenever the entity hits the edge of the screen. An entity can only have two colours associated with it during its lifecycle, making it the perfect candidate to be represented by a single bool flag.
All this fun does introduce a problem: you need to synchronise the initial state to the client. You can’t send a delta to the client if the client doesn’t even know what the starting state is. I opted to break the protocol up into three sections:
- Spawned Entities
- Destroyed Entities
- Updated Entities
For a spawned entity, all the relevant state is sent so that the client can cache it to use as a reference for future deltas. Updated entities arrive as a list of deltas to apply, and lastly the destroyed entities are listed so that the client knows to stop rendering them.
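One hypothetical way to lay out the three sections, sketched in Python. The field choices (a 1-byte entity id, signed-byte deltas, a colour-flipped flag) are my own, not the article's exact wire format:

```python
import struct

SPAWN = struct.Struct("<BhhBBB")  # id, x, y, r, g, b: full state
UPDATE = struct.Struct("<BbbB")   # id, dx, dy, colour-flipped flag
DESTROY = struct.Struct("<B")     # id only

def encode_frame(spawned, updated, destroyed):
    out = bytearray()
    # Each section is prefixed with its entry count so the reader
    # knows where one section ends and the next begins.
    out.append(len(spawned))
    for e in spawned:
        out += SPAWN.pack(e["id"], e["x"], e["y"], e["r"], e["g"], e["b"])
    out.append(len(destroyed))
    for eid in destroyed:
        out += DESTROY.pack(eid)
    out.append(len(updated))
    for e in updated:
        out += UPDATE.pack(e["id"], e["dx"], e["dy"], e["flipped"])
    return bytes(out)
```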
It should be immediately apparent that delta encoding is quite a bit more complex, and it becomes important to capture the previous state because of the need to make comparisons to calculate the delta. There are various approaches to this, and it might require a bit of experimentation to find a good balance before you get a solution that works reliably.
I’m only scratching the surface. I’ve often struggled to find some inheritance hierarchy that would nicely serialise to JSON and allow me to send “different” types of messages to the same endpoint. Exposing the serialisation process will enable me to exploit it and inject different strategies for creating a message and rebuilding the data on the other side. It opens up the possibility to include discriminators along with the size of what you’re trying to deserialise, allowing you to put multiple messages into a single byte array. I’ll leave that as an exercise for the reader to go and discover, but it’s something that I’ve found helpful when designing a protocol for pushing a high volume of messages, even if I’m using a more traditional tool like JSON.
This approach only addresses half of how a game developer might send data for a multiplayer game, but it’s more than sufficient to do some interesting things. We tend to send data via TCP/IP, a reliable messaging protocol, but it’s possible to establish a reliable messaging protocol on top of UDP and benefit from very low latency. TCP delivers messages in order, which can introduce an unnecessary amount of latency and jitter into your data. Establishing a lightweight acknowledgement protocol allows the server to send delta-encoded messages relative to the latest acknowledged message. The client doesn’t have to care about receiving messages in order; it can simply take the latest message received, apply the delta, update its record of the latest message and continue with its business, discarding any messages older than the latest received.
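The keep-the-latest client logic can be sketched with a sequence number stamped on each message; apply_delta here is a placeholder for rebuilding state from a received delta:

```python
def apply_delta(state, delta):
    # Placeholder: merge a dict of changed fields into the cached state.
    return {**state, **delta}

class DeltaClient:
    def __init__(self, initial_state):
        self.state = initial_state
        self.latest_seq = 0

    def on_message(self, seq, delta):
        # Discard anything older than what we've already applied;
        # out-of-order UDP delivery then costs us nothing.
        if seq <= self.latest_seq:
            return False
        self.state = apply_delta(self.state, delta)
        self.latest_seq = seq
        return True
```

This only works because the server encodes each delta against the last state the client acknowledged, so skipping a stale message never leaves the client with a gap it can't recover from.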
Maybe one day, I’ll take a stab at doing a write-up that delves into more detail, but I’ve hopefully equipped you with enough knowledge to experiment a little and see for yourself that it’s not as daunting as you might’ve thought.
Glenn Fiedler has a great set of articles on his website, Gaffer On Games, that I’d recommend as supplemental reading. He goes into more depth about the networking side and shows some other strategies for compressing data.
[1] An ASCII table is essentially an agreed-upon table that maps byte values to characters. The one-to-one mapping means each character is represented by a single byte. ↩︎
[2] The built-in C# integral numeric types each have a specific size. It’s handy to be familiar with these when you’re manually serialising to binary, because you can easily introduce bugs by reading too few or too many bytes when converting back to your expected numeric type. ↩︎