As programmers, we often take for granted the mind-blowing layered complexity of the systems we use. A great example of this is the Unity game engine.
Often the only thing we Unity users think about is the C# scripts we write for our game, but we don’t realize that the “impressive” systems we build from those scripts are nothing compared to the complexity of the platform underneath them. After all, the Unity engine is an enormous codebase, most of which is C++. And C++ was itself built on top of C, which in turn was built on top of other things.
And keep in mind that there are bunches and bunches of abstractions, compatibility adapters (so that the different systems can work together) and libraries that go in between all these layers, not to mention memory management systems, threading mechanics and whatnot. Technically, one could trace the development of the Unity game engine all the way back to the development of C, which began around 1969.
And yet all those deep layers and abstractions rarely cross our minds when we type in code. That’s mainly because…well…we’re not really supposed to care (most of the time). They’re abstractions and hidden layers because the people who built them made them specifically for that purpose: to make life simpler.
But sometimes the inner workings do matter, and one good example is how the two kinds of data types in C# behave.
Back on Track
So in C#, any variable you use is one of two types: a value type or a reference type.
With the exception of strings, all basic built-in C# data types like int, char, float and bool are value types. That means whenever we are dealing with ints, we are dealing directly with raw values. So for example:

    int numberOne = 1;
    int numberTwo = 2;
    numberOne = numberTwo;
    numberOne++;
This code would only change the value of numberOne, not numberTwo.
Even though we wrote numberOne = numberTwo; those variables are still completely unrelated and unlinked. They have their own distinct locations in memory; the value of numberTwo was just copied into numberOne, which then went on to change by itself.
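To see the copy in action, here is a minimal sketch that wraps the snippet above in a small program. The class name ValueTypeCopyDemo and the Run method are only there for illustration; they are not part of any Unity or .NET API.

```csharp
using System;

public static class ValueTypeCopyDemo
{
    // Runs the assignment and increment from the text and returns both values.
    public static (int numberOne, int numberTwo) Run()
    {
        int numberOne = 1;
        int numberTwo = 2;

        numberOne = numberTwo; // copies the value 2 into numberOne
        numberOne++;           // changes only numberOne's own copy

        return (numberOne, numberTwo);
    }

    public static void Main()
    {
        var (numberOne, numberTwo) = Run();
        Console.WriteLine($"numberOne = {numberOne}, numberTwo = {numberTwo}");
        // prints "numberOne = 3, numberTwo = 2"
    }
}
```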
Of course this is what you’d expect from a standard variable, but reference type variables behave differently.
In C#, classes, arrays and strings are all reference types. That means that when we use them, we are not dealing directly with data. The variables being used are actually just references (pointers, if you like) to data somewhere else in memory.
For instance, suppose we had a Character class that had fields like age, name and health.

    Character bobby = new Character();
    Character jason = new Character();
    bobby.age = 3;
    jason = bobby;
    jason.age = 2;
By the end of this piece of code, you’d expect the age of bobby to be 3, but in fact it would change to 2. That’s because these variables aren’t actual data; they’re just references to data located somewhere else. So by writing jason = bobby; we weren’t saying “copy the data of bobby into jason”. We were saying “make jason point to the same data that bobby is pointing to.”
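If you actually want a separate copy, you have to build one yourself. Below is a rough sketch of the difference between an alias and a copy. The Character class here is a stand-in with the fields mentioned above, and the Copy method is a hypothetical helper I made up for the example, not something C# gives you for free.

```csharp
using System;

// Hypothetical Character class with the fields mentioned in the text.
public class Character
{
    public int age;
    public string name;
    public int health;

    // Hypothetical helper: builds a separate object with the same field values.
    public Character Copy()
    {
        return new Character { age = age, name = name, health = health };
    }
}

public static class ReferenceCopyDemo
{
    public static void Main()
    {
        Character bobby = new Character();
        bobby.age = 3;

        Character alias = bobby;            // second reference to the same object
        Character duplicate = bobby.Copy(); // reference to a separate object

        alias.age = 2;      // visible through bobby, because it is the same object
        duplicate.age = 10; // not visible through bobby

        Console.WriteLine(bobby.age);                                // prints 2
        Console.WriteLine(object.ReferenceEquals(bobby, alias));     // prints True
        Console.WriteLine(object.ReferenceEquals(bobby, duplicate)); // prints False
    }
}
```

Note that Copy only duplicates the fields one level deep. If a field were itself a reference type (say, an inventory list), the copy would still share that list with the original unless you copied it too.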
…So what?
As I’m sure you can imagine, this is a pretty important distinction to know! If you make a program without knowing about this fundamental difference, you could end up messing with class instances and arrays that you weren’t meant to be touching. A classic example:
    // This is supposed to check if a game character (myCharacter) has enough
    // space to hold an item in their inventory.
    public bool canCarry(Character myCharacter, Thing item)
    {
        myCharacter.AddItem(item);
        return !myCharacter.isTooHeavy();
    }
Obviously this developer expected the canCarry method to make a copy of whatever game character it was given and manipulate the copy to avoid messing with the real game character. But instead, this method would end up adding the item to the actual game character, not a duplicate. If Character were a struct (a lightweight, class-like construct that is a value type instead of a reference type), then indeed the canCarry method would have worked on its own copy of myCharacter. But since it’s dealing with a reference to the game character itself, it’s a completely different story. The method receives a reference to the same object the caller has, and modifies that object through the reference.
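One way around this is to answer the question without modifying anything at all. The sketch below is just one possible approach, and nearly everything in it is assumed for illustration: the weight and maxWeight fields, the CarriedWeight method and the ItemCount property are hypothetical, since the original Character and Thing classes aren’t shown.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical item type with a weight.
public class Thing
{
    public float weight;
}

// Hypothetical character with a weight limit and an inventory.
public class Character
{
    public float maxWeight = 10f;
    private readonly List<Thing> inventory = new List<Thing>();

    public int ItemCount { get { return inventory.Count; } }

    public void AddItem(Thing item) { inventory.Add(item); }

    // Sums the weight of everything currently in the inventory.
    public float CarriedWeight()
    {
        float total = 0f;
        foreach (Thing item in inventory)
        {
            total += item.weight;
        }
        return total;
    }

    public bool isTooHeavy() { return CarriedWeight() > maxWeight; }
}

public static class InventoryRules
{
    // Checks the weight limit without ever adding the item to the inventory.
    public static bool canCarry(Character myCharacter, Thing item)
    {
        return myCharacter.CarriedWeight() + item.weight <= myCharacter.maxWeight;
    }

    public static void Main()
    {
        Character hero = new Character();
        Thing sword = new Thing { weight = 4f };

        Console.WriteLine(canCarry(hero, sword)); // prints True
        Console.WriteLine(hero.ItemCount);        // prints 0, the inventory is untouched
    }
}
```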
Another example of the implications of value types and reference types is the behavior of the return keyword, and how it can end up conflicting with the private keyword:

    private int number;
    private Character jimmy;

    public int getNumber()
    {
        return number;
    }

    public Character getJimmy()
    {
        return jimmy;
    }
Obviously this programmer intended for number and jimmy to be read-only variables. However, even though number would be read-only, jimmy wouldn’t. If another class were to call Character michael = className.getJimmy(); and later call michael.health = 0; that would actually affect jimmy, because what the return keyword did in the getJimmy() method was return a reference to the real jimmy object, not a copy. In effect, it removes the whole point of making jimmy a private variable in the first place, because it exposes the object to changes from external classes.
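If you really do want jimmy to be protected, one option is to hand out a copy, or to expose only the individual values that callers need. Here is a hedged sketch of both ideas. The Party class, the Copy method and getJimmyHealth are names invented for this example.

```csharp
using System;

// Hypothetical Character with a single field, to keep the example short.
public class Character
{
    public int health = 100;

    // Hypothetical helper that builds a separate object with the same values.
    public Character Copy() { return new Character { health = health }; }
}

public class Party
{
    private int number = 7;
    private Character jimmy = new Character();

    // Value type: the caller gets a copy of the int automatically.
    public int getNumber() { return number; }

    // Hands out a separate object, so callers cannot reach the private one.
    public Character getJimmy() { return jimmy.Copy(); }

    // Alternative: expose only the value that callers need to read.
    public int getJimmyHealth() { return jimmy.health; }
}

public static class DefensiveCopyDemo
{
    public static void Main()
    {
        Party party = new Party();

        Character michael = party.getJimmy();
        michael.health = 0; // only changes the copy

        Console.WriteLine(party.getJimmyHealth()); // prints 100
    }
}
```

The trade-off is that every call to getJimmy() allocates a new object, which is worth keeping in mind if it gets called every frame.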
Nullified!
So I bet a few of you are wondering…what if I had a piece of code like this?

    void makeNull(Character randomCharacter)
    {
        randomCharacter = null;
    }

    Character henry = new Character();
    makeNull(henry);
Would henry become null?
The answer is no, and that’s because the randomCharacter variable in makeNull() is a copy of the argument it was given (that’s how all method parameters behave by default). In this case, randomCharacter is a copy of a reference. So when randomCharacter = null; runs, only the method’s local copy of the reference is made null and points to nothing. henry itself remains untouched.
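The subtle part is that reassigning the parameter and modifying the object through the parameter are two different things. Here is a small sketch of both side by side. The hurt method and the health field are made up for the example.

```csharp
using System;

// Hypothetical Character with a single field.
public class Character
{
    public int health = 100;
}

public static class ParameterCopyDemo
{
    // Reassigns the local copy of the reference; the caller's variable is unaffected.
    public static void makeNull(Character randomCharacter)
    {
        randomCharacter = null;
    }

    // Follows the reference and changes the object itself; the caller sees this.
    public static void hurt(Character randomCharacter)
    {
        randomCharacter.health = 0;
    }

    public static void Main()
    {
        Character henry = new Character();

        makeNull(henry);
        Console.WriteLine(henry == null); // prints False

        hurt(henry);
        Console.WriteLine(henry.health);  // prints 0
    }
}
```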
If I wanted to make henry null, I’d have to use the ref keyword:

    void makeNull(ref Character character)
    {
        character = null;
    }

    Character henry = new Character();
    makeNull(ref henry);
In this case, henry itself would become null. That’s because the ref keyword makes the method work on the caller’s variable itself, not a copy of it. This works for value type variables too, like int or bool.
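Here is a quick sketch of ref with a value type. The method names AddOneToCopy and AddOne are invented for the example.

```csharp
using System;

public static class RefValueTypeDemo
{
    // Receives a copy of the int, so the caller's variable never changes.
    public static void AddOneToCopy(int value)
    {
        value++;
    }

    // Receives the caller's variable itself, so the change sticks.
    public static void AddOne(ref int value)
    {
        value++;
    }

    public static void Main()
    {
        int score = 5;

        AddOneToCopy(score);
        Console.WriteLine(score); // prints 5

        AddOne(ref score);
        Console.WriteLine(score); // prints 6
    }
}
```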
Note, however, that if I have a piece of code like:
    Character henry = new Character();
    Character bobby = henry;
    makeNull(ref henry);
henry would become null, but bobby wouldn’t. henry just stopped pointing to the same data that bobby was, but that doesn’t mean bobby would magically become null; bobby still holds its reference to the data, and the data itself is not destroyed. Only henry’s reference is cleared.
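As a runnable sketch of that last point (the Character class and its health field are stand-ins for illustration):

```csharp
using System;

// Hypothetical Character with a single field.
public class Character
{
    public int health = 100;
}

public static class AliasAfterRefDemo
{
    // Clears the caller's variable, not the object it used to point to.
    public static void makeNull(ref Character character)
    {
        character = null;
    }

    public static void Main()
    {
        Character henry = new Character();
        Character bobby = henry; // second reference to the same object

        makeNull(ref henry);

        Console.WriteLine(henry == null); // prints True
        Console.WriteLine(bobby == null); // prints False
        Console.WriteLine(bobby.health);  // prints 100, the object is still alive
    }
}
```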
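This type of behavior is common among all reference type data, even strings. The one twist with strings is that they are immutable, so you can share a string between variables but you can never change it through any of them.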
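Strings are worth a quick sketch of their own, because they are reference types that can’t be modified in place. Any method that seems to change a string, like ToUpper, actually returns a new one. The class name StringAliasDemo is just for illustration.

```csharp
using System;

public static class StringAliasDemo
{
    public static void Main()
    {
        string first = "hello";
        string second = first;     // both variables refer to the same string object

        second = second.ToUpper(); // ToUpper returns a new string; the old one is unchanged

        Console.WriteLine(first);  // prints hello
        Console.WriteLine(second); // prints HELLO
    }
}
```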
Calling it a day
I think that’s a long enough explanation of this! In two weeks, we’ll go more in-depth and learn about the two memory structures behind all of this nonsense: the stack and the heap.
See you next time.
Image credit: Freepik.com