C# Mutability Hacks


I’ve recently been working on a little roguelike framework/library/package for Unity to make creating roguelikes much simpler than starting from scratch every time. I am following a very good Rust tutorial by Herbert Wolverson. If you’re interested in reading more, have a look at his blog.

The process of porting tutorial code that’s in Rust using a library very similar to Unity’s Entities package has been a very interesting one. I’ve come to develop a great deal of respect for the Rust language, but I’ve also fallen in love with some of the capabilities that C# has. Unity has done some cool things to leverage C# features to make a cool API.

Unity being clever

My main focus for this post will be the ref and in keywords in C#. When writing a system for some process in Unity you can use these keywords to indicate how you intend to use the data the system operates on. I’ll use a simple example below:

public class MoveSystem : JobComponentSystem
{
    protected override JobHandle OnUpdate(JobHandle inputDeps)
    {
        return this.Entities
            .ForEach((ref Position position, in Move move) =>
            {
                position.Value += move.Value;
            })
            .Schedule(inputDeps);
    }
}

I’ll quickly unpack what this system does in case it’s not obvious. My game world has a bunch of entities, each of which can have various components attached. This MoveSystem is expressing that it wants all entities that have both a Position component and a Move component. It then takes the value of the move and adds it to the position’s value. The cool thing here is that ref and in express whether I’m writing to a component (ref) or only reading it (in).

Unity’s Entities package is using this information to cleverly schedule a multi-threaded job that can split this work across multiple cores. If I have a few thousand entities that are moved using this system that can become quite a workload, but we know it’ll be safe to split the work across cores without having race conditions. The important part is knowing if subsequent or prior systems can operate on the same data. This is why expressing read/write intent is so important.

C# Compiler niceties

This post is focused specifically on the semantics of working with value types and how they are passed into methods. Let’s start with an example that shows how value types are copied when passed as a parameter:

using System;
					
public class Program
{
	public static void Main()
	{
		var test = new Test { Value = 1 };
		
		Console.WriteLine("Before static increment: {0}", test);
		
		Increment(test);
		
		Console.WriteLine("After static increment: {0}", test);
	}
	
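	private static void Increment(Test test)
	{
		Console.WriteLine("Inside static increment before: {0}", test);
		
		test.Value++;
		
		Console.WriteLine("Inside static increment after: {0}", test);
	}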
}

public struct Test
{
	public int Value;
	
	public override string ToString() => $"Value: {this.Value}";
}

Running this code the output is as follows:

Before static increment: Value: 1
Inside static increment before: Value: 1
Inside static increment after: Value: 2
After static increment: Value: 1

The output is due to the Increment method getting a separate copy of the test variable. It correctly increments the integer stored in the structure, but the variable in the Main method isn’t affected.

Enter ref

The ref keyword is clever in that it’s more or less syntactic sugar for a pointer without the dangers of using a pointer. When a method declares that it accepts a parameter by reference, it is very likely going to want to mutate the value. Updating the above example to use ref, we get the following:

using System;
					
public class Program
{
	public static void Main()
	{
		var test = new Test { Value = 1 };
		
		Console.WriteLine("Before static increment: {0}", test);
		
		Increment(ref test);
		
		Console.WriteLine("After static increment: {0}", test);
	}
	
	private static void Increment(ref Test test)
	{
		Console.WriteLine("Inside static increment before: {0}", test);
		
		test.Value++;
		
		Console.WriteLine("Inside static increment after: {0}", test);
	}
}

public struct Test
{
	public int Value;
	
	public override string ToString() => $"Value: {this.Value}";
}

The output is as follows:

Before static increment: Value: 1
Inside static increment before: Value: 1
Inside static increment after: Value: 2
After static increment: Value: 2

It shows clearly that the original variable had been affected, but what’s nice is that ref kind of makes it clear this could happen.

Enter in

The in keyword can be of great help if you want to mark a parameter as “read only” while potentially getting the same pointer-like behaviour as ref. I’ll explore why I say there’s only the potential of getting pointer-like behaviour, but let’s first focus on what the in keyword can give us from a compilation standpoint. If I try to mutate a simple structure passed to a method using the in keyword there will be compilation errors. This is nicely demonstrated below:

using System;
					
public class Program
{
	public static void Main()
	{
		var test = new Test { Value = 1 };
		
		Console.WriteLine("Before static increment: {0}", test);
		
		Increment(in test);
		
		Console.WriteLine("After static increment: {0}", test);
	}
	
	private static void Increment(in Test test)
	{
		Console.WriteLine("Inside static increment before: {0}", test);
		
		test.Value++;
		
		Console.WriteLine("Inside static increment after: {0}", test);
	}
}

public struct Test
{
	public int Value;
	
	public override string ToString() => $"Value: {this.Value}";
}

When trying to compile this I get the following compilation error: Compilation error (line 20, col 3): Cannot assign to a member of variable 'in Test' because it is a readonly variable. Line 20 is the test.Value++ statement inside the static Increment method. It’s cool that the compiler can protect me from myself here, but after playing around with this a little I realised there are some inconsistencies in how the compiler does these checks.

ref and in goals

Reading the documentation on writing safe and efficient code, it became clear to me that these keywords exist more as a way for the compiler to figure out how to do fewer copy operations when calling methods. There are some caveats to how in functions, but they are mentioned in the documentation.

Looking back to the Unity example above you’ll notice all of the structure manipulations are defined within the system. The compiler can protect us here, but if we want to have some reusable functions you might bump into some strange behaviour.

Encapsulated manipulation

There might be some cases where you’d go and write a method within your struct so that it’s easier to share across your codebase. Take the following example:

using System;
					
public class Program
{
	public static void Main()
	{
		var test = new Test { Value = 1 };
		
		Console.WriteLine("Before static increment: {0}", test);
		
		Increment(in test);
		
		Console.WriteLine("After static increment: {0}", test);
	}
	
	private static void Increment(in Test test)
	{
		Console.WriteLine("Inside static increment before: {0}", test);
		
		test.Increment();
		
		Console.WriteLine("Inside static increment after: {0}", test);
	}
}

public struct Test
{
	public int Value;
	
	public void Increment()
	{
		Console.WriteLine("Inside struct increment before: {0}", this);
		
		this.Value++;
		
		Console.WriteLine("Inside struct increment after: {0}", this);
	}
	
	public override string ToString() => $"Value: {this.Value}";
}

This compiles successfully, but the output is now not quite what one would expect:

Before static increment: Value: 1
Inside static increment before: Value: 1
Inside struct increment before: Value: 1
Inside struct increment after: Value: 2
Inside static increment after: Value: 1
After static increment: Value: 1

The in keyword is suggested to work best with readonly structures, but that’s out of scope for this post. If you are interested in knowing more, the documentation linked above does explore the concept. What’s important to be aware of is that the in keyword can still result in a copy, while still enforcing that you don’t mutate the value that was passed. The compiler only checks direct writes to fields, properties and indexers. Calling a method on the type itself compiles fine and looks like it mutates the value, but because the compiler can’t tell whether the method mutates the structure, it runs the method against a hidden defensive copy instead of the value from the calling method.

Let’s change the above example to rather use ref in the static Increment method:

using System;
					
public class Program
{
	public static void Main()
	{
		var test = new Test { Value = 1 };
		
		Console.WriteLine("Before static increment: {0}", test);
		
		Increment(ref test);
		
		Console.WriteLine("After static increment: {0}", test);
	}
	
	private static void Increment(ref Test test)
	{
		Console.WriteLine("Inside static increment before: {0}", test);
		
		test.Increment();
		
		Console.WriteLine("Inside static increment after: {0}", test);
	}
}

public struct Test
{
	public int Value;
	
	public void Increment()
	{
		Console.WriteLine("Inside struct increment before: {0}", this);
		
		this.Value++;
		
		Console.WriteLine("Inside struct increment after: {0}", this);
	}
	
	public override string ToString() => $"Value: {this.Value}";
}

This yields the following output:

Before static increment: Value: 1
Inside static increment before: Value: 1
Inside struct increment before: Value: 1
Inside struct increment after: Value: 2
Inside static increment after: Value: 2
After static increment: Value: 2

This time the result differs from the in version. The ref keyword ensures the variable is passed by reference, so the Increment method on the Test type now affects the original value. I expected that it would, but I then realised the same method call behaves differently depending on how the structure was passed, which isn’t quite how we’re used to value types functioning.

My main motivation for looking into this is to find a way to express this intent and still have the compiler show me where I screw up. I need a way to have encapsulated code that can be re-used and immediately warn me if I need to tell a system to expect a value by ref instead of just by in.

Using extension methods

One way I’ve found that does enable the compiler to help identify possible issues is with extension methods:

using System;
					
public class Program
{
	public static void Main()
	{
		var test = new Test { Value = 1 };
		
		Console.WriteLine("Before static increment: {0}", test);
		
		Increment(ref test);
		
		Console.WriteLine("After static increment: {0}", test);
	}
	
	private static void Increment(ref Test test)
	{
		Console.WriteLine("Inside static increment before: {0}", test);
		
		test.Increment();
		
		Console.WriteLine("Inside static increment after: {0}", test);
	}
}

public struct Test
{
	public int Value;
	
	public override string ToString() => $"Value: {this.Value}";
}

public static class TestImpl
{
	public static void Increment(ref this Test test)
	{
		Console.WriteLine("Inside extension increment before: {0}", test);
		
		test.Value++;
		
		Console.WriteLine("Inside extension increment after: {0}", test);
	}
}

This outputs the following:

Before static increment: Value: 1
Inside static increment before: Value: 1
Inside extension increment before: Value: 1
Inside extension increment after: Value: 2
Inside static increment after: Value: 2
After static increment: Value: 2

Now I can change the static Increment method to use in:

using System;
					
public class Program
{
	public static void Main()
	{
		var test = new Test { Value = 1 };
		
		Console.WriteLine("Before static increment: {0}", test);
		
		Increment(in test);
		
		Console.WriteLine("After static increment: {0}", test);
	}
	
	private static void Increment(in Test test)
	{
		Console.WriteLine("Inside static increment before: {0}", test);
		
		test.Increment();
		
		Console.WriteLine("Inside static increment after: {0}", test);
	}
}

public struct Test
{
	public int Value;
	
	public override string ToString() => $"Value: {this.Value}";
}

public static class TestImpl
{
	public static void Increment(ref this Test test)
	{
		Console.WriteLine("Inside extension increment before: {0}", test);
		
		test.Value++;
		
		Console.WriteLine("Inside extension increment after: {0}", test);
	}
}

This yields a compilation error: Compilation error (line 20, col 3): Cannot use variable 'in Test' as a ref or out value because it is a readonly variable. Line 20 is the test.Increment() call inside the static Increment method. This is great news! I can now have my cake and eat it. It does mean the extension methods can only touch public members of my structures, but for my purposes that’s not the end of the world.
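The same trick can also cover the read-only side. The sketch below is my own illustration rather than code from the project: it pairs a ref this extension with an in this extension, so a helper's signature says whether it mutates. The names TestExtensions, Doubled, ReadOnlyUse and MutatingUse are made up for the example, and it assumes a compiler that supports C# 7.2 or later.

```csharp
using System;

public struct Test
{
	public int Value;
}

// Hypothetical helper class, named for this sketch only.
public static class TestExtensions
{
	// Read-only helper: 'in this' takes the receiver by readonly reference,
	// so it can be called on an 'in' parameter.
	public static int Doubled(in this Test test) => test.Value * 2;

	// Mutating helper: 'ref this' needs a writable receiver,
	// so calling it on an 'in' parameter is a compilation error.
	public static void Increment(ref this Test test) => test.Value++;
}

public static class Program
{
	public static int ReadOnlyUse(in Test test)
	{
		// test.Increment(); // does not compile: 'test' is a readonly variable
		return test.Doubled();
	}

	public static void MutatingUse(ref Test test)
	{
		test.Increment();
	}

	public static void Main()
	{
		var test = new Test { Value = 1 };

		Console.WriteLine(ReadOnlyUse(in test)); // prints 2
		MutatingUse(ref test);
		Console.WriteLine(test.Value);           // prints 2
		Console.WriteLine(test.Doubled());       // prints 4
	}
}
```

If a system only ever calls the in this helpers, it can keep declaring its component as in. The moment it needs a ref this helper, the compiler forces the parameter to change to ref.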

C# 8 to the rescue (sort of)

C# 8 introduced the ability to mark property getters and methods on a structure as readonly, as follows (this is also explored in the documentation linked earlier):

public struct Test
{
	public int Value;
	
	public readonly override string ToString() => $"Value: {this.Value}";
}

This is a good first step, but it still doesn’t fix the problem that an in parameter could have a mutating method available that won’t cause an error when compiling. It’s going to be interesting to see if the language further develops these concepts to include mutability checks on methods that aren’t marked as readonly. The other problem is that Unity doesn’t support C# 8 yet so it’s not useful for my current use-case.
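To make that gap concrete, here is a minimal sketch that puts a readonly method and a non-readonly method on the same structure. The Doubled and Inspect names are invented for the illustration, and it needs a C# 8 compiler, so it won't build in a Unity version that lacks C# 8 support.

```csharp
using System;

public struct Test
{
	public int Value;

	// C# 8 readonly member: the compiler rejects any mutation of 'this' in here,
	// so it can be called through an 'in' parameter without a defensive copy.
	public readonly int Doubled() => this.Value * 2;

	// Not marked readonly: calling this through an 'in' parameter still compiles,
	// but it runs against a hidden copy of the structure.
	public void Increment() => this.Value++;
}

public static class Program
{
	// Hypothetical method for this sketch.
	public static int Inspect(in Test test)
	{
		test.Increment();      // compiles, but only mutates a defensive copy
		return test.Doubled(); // reads the caller's value, which is unchanged
	}

	public static void Main()
	{
		var test = new Test { Value = 1 };

		Console.WriteLine(Inspect(in test)); // prints 2
		Console.WriteLine(test.Value);       // prints 1
	}
}
```

The readonly modifier protects the body of Doubled, but nothing stops Inspect from calling Increment, which is exactly the silent copy described earlier.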

Conclusion

Now you might think I’m mad for worrying about these things so much, but when building games, having your tools help you as much as possible is crucial. It’s even more important if you’re doing dumb things like screwing around with pointers in structures. This isn’t even necessarily a bad thing, because sometimes some manual memory management is just a more performant solution to a problem.

I am very happy with having the extension method approach as an option, but it does come with its own set of constraints that might cause some issues. The Rust language has some superb rules that help you as a developer to reason about data ownership. The plus side is that when ownership is easier to reason about, multi-threaded programming becomes easier too. Here’s to hoping more tools become available that can do this in other ecosystems!