Countdown to C# 8: Range and Index

Akos Nagy

Mar 18, 2019

In my previous post, I discussed one of the new features of C# 8: non-nullable reference types. That feature has been out for a while, so I think most of what can be said has already been said. But this time, I'm taking a look at something even newer: the new range expressions in C# 8. Check it out!

What are ranges?

Ranges are just that: ranges. These have been around in many programming languages (e.g. Python) for a while now, but in C#, if you wanted to have a slice of your data structure, you either had to write your own, or use a method like CopyTo(), if you are lucky enough that your source data structure supports it.

C# 8 introduces a new syntax for extracting a range from your data source:

int[] a = { 1, 2, 3, 4, 5, 6, 7 };
int[] b = a[3..5];

And this gives you a new array with the elements at indices 3 and 4 (zero-based, of course); the end of the range is exclusive. There is another syntax that you can use if you want to specify a range from the end of the collection:

int[] a = { 1, 2, 3, 4, 5, 6, 7 };
int[] b = a[3..^1];

Here, the ^ sign means that the index is counted from the end of the collection, so ^1 refers to the last element. Pretty cool. You also have the option to omit one end of the range: if you omit the first index, the range starts at the beginning, and if you omit the end, it runs till the end. You can omit both :)
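To make the variations concrete, here is a small sketch that runs each form against the same array. It assumes a compiler and runtime with range support for arrays (C# 8 on .NET Core 3.0 or later); RangeSyntaxDemo and Show are hypothetical names used only for this illustration.

```csharp
using System;

public static class RangeSyntaxDemo
{
    // Hypothetical helper: formats an array so the slices are easy to compare.
    public static string Show(int[] values) => string.Join(", ", values);

    public static void Main()
    {
        int[] a = { 1, 2, 3, 4, 5, 6, 7 };

        Console.WriteLine(Show(a[3..5]));   // 4, 5       (end index is exclusive)
        Console.WriteLine(Show(a[3..^1]));  // 4, 5, 6    (stops before the last element)
        Console.WriteLine(Show(a[..2]));    // 1, 2       (start omitted)
        Console.WriteLine(Show(a[5..]));    // 6, 7       (end omitted)
        Console.WriteLine(Show(a[..]));     // 1, 2, 3, 4, 5, 6, 7 (both omitted)
        Console.WriteLine(a[^1]);           // 7          (a single index from the end)
    }
}
```

Note that the end of a range is exclusive, which is why 3..^1 leaves out the last element.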

How do ranges work?

It's a nice feature. Not very crucial, but nice. And when you have a programming language that's been out there for almost 20 years, introducing nice features is not a bad thing. But how do they work under the hood?

Well, when you create a range using this syntax, it is actually translated to an instance of the new Range type. Here's how it looks:

public struct Range : IEquatable<Range>
{
  public Range(Index start, Index end);

  public static Range All { get; }
  public Index End { get; }
  public Index Start { get; }

  public static Range EndAt(Index end);
  public static Range StartAt(Index start);
  public override bool Equals(object value);
  public bool Equals(Range other);
  public override int GetHashCode();
  public OffsetAndLength GetOffsetAndLength(int length);
  public override string ToString();

  public struct OffsetAndLength
  {
    public OffsetAndLength(int offset, int length);

    public int Offset { get; }
    public int Length { get; }

    public void Deconstruct(out int offset, out int length);
  }
}

So this is the type behind the range syntax. Couple of interesting things to note here:

It is a value type. Makes sense: it basically holds two integers and is used often.
The Range instance is constructed from two Index instances. That's another new type that we'll look at later.
The type has a method GetOffsetAndLength() that returns a structure that contains the offset and the length of the range. That's a nice feature, but the best part is that it also has a Deconstruct() method so you can deconstruct it as a tuple!
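The points above can be sketched in a few lines. This is only an illustration, written so that it relies on nothing more than deconstructing the result of GetOffsetAndLength(); it assumes the Range type as it ships in .NET Core 3.0 or later, and RangeTypeDemo and Resolve are hypothetical names.

```csharp
using System;

public static class RangeTypeDemo
{
    // Hypothetical helper: resolves a range against a collection of the given
    // length and returns where the slice starts and how many items it has.
    public static (int Offset, int Length) Resolve(Range range, int collectionLength)
    {
        // The result can be deconstructed like a tuple.
        var (offset, length) = range.GetOffsetAndLength(collectionLength);
        return (offset, length);
    }

    public static void Main()
    {
        Range r = 3..^1;                       // same as new Range(3, Index.FromEnd(1))
        Console.WriteLine(r.Start.Value);      // 3
        Console.WriteLine(r.End.IsFromEnd);    // True

        Console.WriteLine(Resolve(r, 7));                  // (3, 3)
        Console.WriteLine(Resolve(Range.StartAt(5), 7));   // (5, 2)
        Console.WriteLine(Resolve(Range.EndAt(2), 7));     // (0, 2)
        Console.WriteLine(Resolve(Range.All, 7));          // (0, 7)
    }
}
```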

And what is Index? Well, that's another new type that supports the feature. An index is basically just an integer with additional metadata indicating whether it is counted from the beginning or from the end.

public struct Index : IEquatable<Index>
{
  public Index(int value, bool fromEnd = false);

  public static Index Start { get; }
  public static Index End { get; }
  public bool IsFromEnd { get; }
  public int Value { get; }

  public static Index FromEnd(int value);
  public static Index FromStart(int value);
  public bool Equals(Index other);
  public override bool Equals(object value);
  public override int GetHashCode();
  public int GetOffset(int length);
  public override string ToString();

  public static implicit operator Index(int value);
}

Again, a couple of key points of interest:

Again, a value type. Makes sense.
It has an implicit conversion from int implemented. Cool :) I would probably have created one to int as well. But again, this is just a preview :)
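A short sketch of how an Index behaves on its own, assuming the type as it ships in .NET Core 3.0 or later. IndexDemo and ToOffset are hypothetical names; the length of 7 stands in for the array used throughout this post.

```csharp
using System;

public static class IndexDemo
{
    // Hypothetical helper: turns an Index into a plain zero-based offset
    // for a collection of the given length.
    public static int ToOffset(Index index, int length) => index.GetOffset(length);

    public static void Main()
    {
        Index fromStart = 2;   // the implicit conversion from int
        Index fromEnd = ^2;    // the ^ syntax, same as Index.FromEnd(2)

        Console.WriteLine(fromStart.IsFromEnd);                // False
        Console.WriteLine(fromEnd.IsFromEnd);                  // True
        Console.WriteLine(fromEnd.Value);                      // 2
        Console.WriteLine(fromEnd.Equals(Index.FromEnd(2)));   // True

        Console.WriteLine(ToOffset(fromStart, 7));    // 2
        Console.WriteLine(ToOffset(fromEnd, 7));      // 5
        Console.WriteLine(ToOffset(Index.Start, 7));  // 0
        Console.WriteLine(ToOffset(Index.End, 7));    // 7 (one past the last element)
    }
}
```

Since there is no conversion back to int, GetOffset() is the way to get a plain integer out of an Index once the length is known.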

Range and Index internals

So with that in mind, here's how a little piece of code like this:

int[] a = { 1, 2, 3, 4, 5, 6, 7 };
int[] b = a[3..5];

is actually compiled by the C# compiler (more-or-less, I guess):

int[] a = { 1, 2, 3, 4, 5, 6, 7 };
Index i1 = 3; // here, the implicit conversion operator is called
Index i2 = 5; // here, the implicit conversion operator is called
Range r = new Range(i1, i2);
int[] b = a[r];

Quite understandable. So basically all they have to do is add another overload of the indexer property for the types that they want this feature to support.
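For a type you own, such an overload could look roughly like the sketch below. Playlist is a made-up class, not a framework type, and the code assumes the Range API as it ships in .NET Core 3.0 or later.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical collection type, used only to illustrate a Range indexer.
public sealed class Playlist
{
    private readonly List<string> _tracks;

    public Playlist(IEnumerable<string> tracks) => _tracks = new List<string>(tracks);

    public int Count => _tracks.Count;

    // The regular int-based indexer.
    public string this[int index] => _tracks[index];

    // The additional overload: takes a Range and returns a new Playlist.
    public Playlist this[Range range]
    {
        get
        {
            // Resolve the range against the current number of tracks.
            var (offset, length) = range.GetOffsetAndLength(_tracks.Count);
            return new Playlist(_tracks.GetRange(offset, length));
        }
    }
}

public static class PlaylistDemo
{
    public static void Main()
    {
        var all = new Playlist(new[] { "a", "b", "c", "d", "e" });
        Playlist middle = all[1..^1];       // drops the first and the last track

        Console.WriteLine(middle.Count);    // 3
        Console.WriteLine(middle[0]);       // b
    }
}
```

Whether the indexer returns a copy (as here) or a view over the same storage is a design choice for each type.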

Range internal-internals

Well, not quite. Adding a new indexer overload might be possible for most types. But the array is a special case. Whenever you call the array indexer, it is actually translated to the ldelem IL instruction and not the regular property-based indexer. So arrays don't have indexers that can be overloaded. What now?

This was bugging me, so I checked the actual IL code for arrays. Here's a transcribed version of what happens when you use a range with arrays:

Range range = new Range(3, 5);
int[] array = a; // this is the source array
int num = range.Start.IsFromEnd ? (array.Length - range.Start.Value) : range.Start.Value;
int num2 = (range.End.IsFromEnd ? (array.Length - range.End.Value) : range.End.Value) - num;
int[] array2 = new int[num2];
Array.Copy(array, num, array2, 0, num2);
int[] b = array2; // this is the target variable

So basically, when it comes to arrays, everything is handled differently. Of course, this is not possible for every type (note the special Array.Copy() call near the end), but not necessary either, because everything else has an overloadable indexer. I checked spans and List<T> in the latest preview. While List<T> doesn't support this yet, spans do. And of course, remember, when you create a slice from a span, that's not a copy — everything's coming together pretty nicely.
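The difference between the copying array slice and the non-copying span slice is easy to observe. The sketch below assumes .NET Core 3.0 or later, where both arrays and spans accept ranges; SliceCopyDemo and its two methods are hypothetical names.

```csharp
using System;

public static class SliceCopyDemo
{
    // Writes through an array slice and reports whether the source changed.
    public static bool ArraySliceSharesMemory()
    {
        int[] source = { 1, 2, 3, 4, 5, 6, 7 };
        int[] slice = source[3..5];     // a new array holding copies of 4 and 5
        slice[0] = 42;
        return source[3] == 42;
    }

    // Writes through a span slice and reports whether the source changed.
    public static bool SpanSliceSharesMemory()
    {
        int[] source = { 1, 2, 3, 4, 5, 6, 7 };
        Span<int> slice = source.AsSpan()[3..5];   // a view over the same memory
        slice[0] = 42;
        return source[3] == 42;
    }

    public static void Main()
    {
        Console.WriteLine(ArraySliceSharesMemory());  // False
        Console.WriteLine(SpanSliceSharesMemory());   // True
    }
}
```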

Verdict

I think this is a very nice feature, even if not the most crucial one. But here's the sad news: Range and Index will only be part of .NET Core, not .NET Framework, which means that this feature will only be available when programming for .NET Core. That's not a tragedy, because it's been clear to all of us that the age of the Framework is basically over and the age of Core has come. Still, a lot of people are using .NET Framework and will be for years to come, and I'm not sure why these two types would be that hard to add to it. As far as I can tell, no modifications to the CLR or other runtime parts are needed to support them. They did it with the tuples feature of C# 7, and we also have spans as a separate NuGet package, so I'm a little disappointed that this is not going to be part of the full .NET Framework. But I do like this feature.

If you have thoughts on this, feel free to share them below in the comments, and see you next time with the next feature.
