Data-Oriented Tech Stack

How useful is Unity's Data-Oriented Tech Stack as it nears release? I'll take a look at what DOTS provides and how useful that can be for your own game development endeavours.

13 October 2019 - 8 min read

DISCLAIMER: I started this way back and I’m giving up on trying to edit it into a coherent piece. There’s probably still some value in what I wrote here, so I felt it better to release it as is rather than stagnate by constantly wanting to rewrite it.

I started writing this shortly after Unity’s keynote at GDC, knowing that some awesome things are coming to Unity. I think the coolest piece of tech among them is the new Data-Oriented Tech Stack, or DOTS for short. It consists of the C# Job System, the Burst Compiler and an Entity-Component-System (ECS) framework. I’ll cover the C# Job System and the Burst Compiler only briefly: ultimately they support the ECS framework in upholding the performance-by-default mantra, and I don’t feel qualified to delve deeper into what those two components provide. Initially I had wanted to provide a bunch of examples of using ECS, but I’ve realised that would be nothing more than a manual. My main focus will be the ECS framework, how nicely it’s integrated with the rest of the Unity engine, and the different bits and bobs available right now.

C# Job System

The C# Job System takes a somewhat different approach to multithreading compared to most approaches I’ve come across in my day-to-day. It’s based on a much more data-oriented way of understanding how pieces of data can be processed without requiring write access to some shared bit of memory. I would go as far as comparing the bulk of what the C# Job System does to Parallel.ForEach, except that the work doesn’t start immediately. Instead, a scheduler ensures the work is started at some later stage that makes sense.
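To make the "schedule now, run later" idea a bit more concrete, here is a rough sketch in Python rather than Unity's C#. None of this is the real Job System API: the Scheduler class and its schedule and complete methods are names I made up purely for illustration.

```python
from concurrent.futures import ThreadPoolExecutor


class Scheduler:
    """Hypothetical scheduler: collects jobs first, runs them later."""

    def __init__(self):
        self._pending = []  # (function, items) pairs that have not started yet

    def schedule(self, func, items):
        # Nothing runs here; the work is only recorded.
        self._pending.append((func, list(items)))

    def complete(self, workers=4):
        # Run every recorded job, spreading each job's items over worker threads.
        results = []
        with ThreadPoolExecutor(max_workers=workers) as pool:
            for func, items in self._pending:
                results.append(list(pool.map(func, items)))
        self._pending.clear()
        return results


def square(x):
    return x * x


scheduler = Scheduler()
scheduler.schedule(square, [1, 2, 3])
pending_before = len(scheduler._pending)  # the job is queued, not executed
results = scheduler.complete()            # only now does the work run
```

The point is only the split between describing the work and running it. A real scheduler would also decide when and in what order to run things, which this sketch skips.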

The job system is also very clever at detecting when two jobs would write to the same piece of memory, and it warns the developer about this. It’s essentially catching race conditions before they even happen, which can be a lifesaver. There are some very strict rules on the types of data that can be used, but these exist to make sure that you as a developer are always writing the most optimal code.
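A minimal way to picture that kind of check, assuming every job declares up front which data it reads and writes, is sketched below in Python. The Job class and the conflicts and check_parallel helpers are hypothetical names of mine and say nothing about how Unity implements its safety checks.

```python
class Job:
    """Hypothetical job description: a name plus the data it reads and writes."""

    def __init__(self, name, reads=(), writes=()):
        self.name = name
        self.reads = set(reads)
        self.writes = set(writes)


def conflicts(a, b):
    # Two jobs clash if either one writes data the other reads or writes.
    return bool(a.writes & (b.reads | b.writes) or b.writes & a.reads)


def check_parallel(jobs):
    """Raise if any two jobs meant to run side by side would race."""
    for i, a in enumerate(jobs):
        for b in jobs[i + 1:]:
            if conflicts(a, b):
                raise RuntimeError(f"{a.name} and {b.name} touch the same data")


move = Job("move", reads=["velocity"], writes=["position"])
render = Job("render", reads=["position"])
damage = Job("damage", writes=["health"])

check_parallel([move, damage])  # fine: no shared data is written
try:
    check_parallel([move, render])
    error = None
except RuntimeError as exc:
    error = str(exc)  # move writes position while render reads it
```

Two jobs that only read the same data never conflict here, which is why declaring read-only access matters so much.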

Burst Compiler

This is a piece of technology that blows my mind. It’s a special compiler that takes the IL code output by the C# compiler and turns it into native machine code. One thing to keep in mind is that Burst requires you to use a very specific subset of C# that Unity has dubbed High-Performance C#, or HPC# for short.

One thing I like is that the Burst Compiler is also working on vectorizing your code to leverage SIMD operations available on processors these days. They believe that when code doesn’t vectorize it shouldn’t just compile and be 4 or more times slower, it shouldn’t compile at all. Now I’m not completely sure how true this is of the Burst Compiler yet, but I do think this will be a great addition to the compiler technology available today.

ECS

Now, this is the meat and potatoes of DOTS. I’ll quickly explore how the ECS framework is put together, but first a quick recap of what ECS is. It’s essentially an architecture that decouples the state of the application from its behaviour. The gist is as follows:

An entity is built up from a bunch of components, but that doesn’t mean an entity should be anything more than an index. The components attached to the entity are nothing more than data containers, each a logical grouping of certain attributes. Systems then operate on these components and apply behaviour that updates their state.
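As a bare-bones illustration of that gist, here is a toy ECS in Python. It is only a sketch of the architecture, not Unity's framework: World, create_entity, query and movement_system are all names I picked for the example.

```python
from dataclasses import dataclass


@dataclass
class Position:
    x: float
    y: float


@dataclass
class Velocity:
    dx: float
    dy: float


class World:
    """Hypothetical world: entities are plain integers, components live in per-type tables."""

    def __init__(self):
        self._next_id = 0
        self.components = {}  # component type -> {entity id -> component}

    def create_entity(self, *components):
        entity = self._next_id
        self._next_id += 1
        for component in components:
            self.components.setdefault(type(component), {})[entity] = component
        return entity

    def query(self, *types):
        # Yield every entity that has all of the requested component types.
        tables = [self.components.get(t, {}) for t in types]
        for entity in tables[0]:
            if all(entity in table for table in tables[1:]):
                yield entity, tuple(table[entity] for table in tables)


def movement_system(world, dt):
    # Behaviour lives in the system, not on the components.
    for _, (pos, vel) in world.query(Position, Velocity):
        pos.x += vel.dx * dt
        pos.y += vel.dy * dt


world = World()
mover = world.create_entity(Position(0.0, 0.0), Velocity(2.0, 1.0))
rock = world.create_entity(Position(5.0, 5.0))  # no Velocity, so the system skips it
movement_system(world, dt=0.5)
```

Notice that the entity itself never holds anything. It is just the key that ties rows in the different component tables together.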

ECS building blocks

The DOTS ECS framework is built with a few crucial building blocks in mind. The centre of everything is the EntityManager, which takes care of the memory management side of storing entity data and provides an API to query for that data. Alongside this there are two system types:

ComponentSystem - systems that operate on data on the main thread
JobComponentSystem - systems that provide a way to schedule jobs while maintaining a dependency chain of jobs that have been scheduled by other systems.

Because systems express the data they’re interested in through the EntityManager, the job scheduler can figure out how to schedule jobs in an order that prevents race conditions.

Depending on the job type, data is either manipulated in place or through an EntityCommandBuffer, which records changes so they can be played back at a later point in the system execution order.
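The deferred-change idea behind a command buffer can be sketched in a few lines of Python. This is my own simplified stand-in, assuming a single kind of command (destroying an entity) and a plain dictionary as the entity data. The CommandBuffer class here is hypothetical and is not Unity's EntityCommandBuffer API.

```python
class CommandBuffer:
    """Hypothetical buffer that records structural changes for later playback."""

    def __init__(self):
        self._commands = []

    def destroy(self, entity):
        self._commands.append(("destroy", entity))

    def playback(self, health):
        # Apply the recorded changes in the order they were queued.
        for command, entity in self._commands:
            if command == "destroy":
                health.pop(entity, None)
        self._commands.clear()


health = {0: 10, 1: 0, 2: -5}  # entity id -> hit points
buffer = CommandBuffer()

for entity, points in health.items():
    if points <= 0:
        # Deleting from `health` here would break the iteration, so defer it.
        buffer.destroy(entity)

buffer.playback(health)  # only entity 0 is left afterwards
```

The same reasoning applies when the loop is running on several threads: nobody changes the shape of the data while it is being processed, and all the structural changes happen at one well-defined point afterwards.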

Tables of data

Thinking about this, one could easily make the analogy that ECS treats your data as a SQL table, breaking your data processing into row-by-row steps. This is something that promotes a good memory layout to leverage CPU prefetching and it’s a layout that can have multithreaded principles applied without the large risk of race conditions.
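To picture the table analogy, here is a small Python sketch that stores each attribute as its own contiguous column and processes the rows front to back. It only illustrates the layout: the column names and the integrate function are made up, and Python itself won't show the cache benefits a native implementation would.

```python
from array import array

# One contiguous column per attribute, like columns in a table.
# Row i is made up of xs[i] and vxs[i].
xs = array("d", [0.0, 10.0, 5.0])
vxs = array("d", [1.0, -2.0, 0.5])


def integrate(xs, vxs, dt):
    # Walk the columns front to back; processing row i only touches index i.
    for i in range(len(xs)):
        xs[i] += vxs[i] * dt


integrate(xs, vxs, dt=2.0)
```

Because each row only touches its own index, the range could be split into chunks and handed to different threads without two of them ever writing to the same element.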

One thing to remember though (and I’ll admit I’ve fallen into this trap) is that not all problems fit into this mindset. ECS favours linear memory access to achieve significant performance gains, but some problems access memory randomly merely because of their nature. Something like an AI planner that needs to look up arbitrary information on an entity is a good example of this. Unity has made sure that the Burst Compiler still optimises this code, so using the ComponentDataFromEntity API doesn’t cause you to lose out on performance.

System State

The video embedded below is a very good example of an ECS framework that Blizzard had implemented and my understanding is that it’s gotten quite mature. I’d advise watching it at some stage, but I’ll summarize some of the key takeaways I had from the talk.

An interesting problem that I had to deal with when using the DOTS ECS framework was how to handle system state. Unity has introduced the concept of a singleton component which provides the same read-write checks, but there’s still a hard constraint on having to use HPC#.

Now, this is not the only way one can go about this. The other way would be to maintain state on the system itself, get a reference to that system and read the state from there. This is still very usable, but one has to remember that such state doesn’t always support being used from a job context, and there’s little to no seamless support for letting the job scheduler know when systems are reading and writing that data. Ideally one should avoid altering the state of one system from another system, but that still doesn’t solve the problem of having to manually maintain dependencies when job-friendly containers are used.

This is probably my biggest critique of the framework at the moment. There’s likely a big need to share system state between systems, but there isn’t yet an API that is easy to use and manages the dependencies to make sure data isn’t read while it’s being written.

Engine integration

With only the core API out of preview, there’s still a long way to go for this framework. Unity’s built a multitude of engine components that aren’t able to leverage the power of DOTS yet, but they’ve laid out a roadmap that’s aiming to deliver quite a lot in the next few years. Unfortunately, it means that not everything is there yet.

They’ve partnered with Havok to provide an ECS physics engine that’s currently in preview, and it’s possible to render 3D objects. Using Project Tiny it’s possible to do quite a bit in 2D, but from what I understood there’s still more work to be done there, and the platforms Project Tiny supports are also limited.

Lastly, there has been work on integrating certain components to enable using Jobs for these, but it still feels very removed from the ECS framework as the majority of it is concerned with providing a way to manipulate the components using the job system.

All that said, it’s still very possible to have systems interact with regular Unity components on the main thread; there just isn’t necessarily much of a performance gain. It also presents some odd cases when handling the more “event”-driven architecture that they’ve opted for in the current 2D and 3D physics engines, but that might be my lack of understanding of how to approach problems like this.

Final thoughts

I’ve been following the development of DOTS since I caught wind of it around the end of 2017. It’s pushed me to become a much better developer and given me a deeper respect for the hardware that our software runs on. With the core API out of preview now, I’m very excited to start seeing more of a focus on writing game code in this manner, enabling smaller teams to achieve larger-scale games knowing that they have an engine team backing them.

It’s also opening up Unity to building a decent visual scripting tool that I’m super excited to start using. As with anything, there’s a big balance to strike with its usability, but it’s something that might enable someone to do a first pass on an idea that can then be more easily optimized at a later stage. I’m going to be thinking about using DOTS more regularly for my projects.