Create a filtered index using Entity Framework Code First

Akos Nagy

Apr 1, 2018

Using Entity Framework Code First, you have the option of specifying indices on columns that are translated to migrations and from that you can generate the SQL script or update the database.

With EF 6.2 you can use the new HasIndex() method to define your indices. Before 6.2, you had to apply a special annotation like this:

modelBuilder.Entity<Person>()
            .Property(p => p.Ssn)
            .HasMaxLength(10)
            .HasColumnAnnotation( IndexAnnotation.AnnotationName, 
                                  new IndexAnnotation(new IndexAttribute { IsUnique = true }));

Here's the problem: none of these approaches allows for specifying filtered indices. And here's the challenge: provide a convenient way for specifying filtered indices using EF CF.

Find a way to specify the filtered index in the migration

First I had to define a way to describe how the filtered index should be described in the migration file.

Indices are specified using the CreateIndex() method in the migration:

 CreateIndex("dbo.People", "Ssn", unique: true, name: "IX_IndexName");

CreateIndex has an optional extra parameter called anonymousArguments of type object. These can contain extra information that may be processed by providers. As far as I know the SqlClient provider doesn't do anything with it, but that doesn't mean I can't use them :). I imagined something like this:

CreateIndex("dbo.People", "Ssn", 
            unique: true, 
            name: "IX_IndexName", 
            anonymousArguments: new { Where = "Ssn IS NOT NULL" });

And then all I have to do is write my own processor to process this extra anonymousArguments and translate it to an SQL operation that amends the string specified in the Where property to the create index statements.

Writing a custom SQL generator to process the extra information

To extend/modify how the SQL statements are generated from the migration, you can write your own SQL Server generator, you just have to implement the right base class (I did something like this in my post about soft delete and removing index generation).

If you take a look at the original component built into EF that generates SQL queries for the MS SQL database, SqlServerMigrationSqlGenerator, you'll find an overload of the Generate method that takes a CreateIndexOperation as parameter at line 599. This is the method responsible for translating the result of the CreateIndex() method call in the migration file to SQL statements. Unfortunately, this cannot be modified, just overwritten altogether, so in order to be able to add my own one line of code I had to copy this original solution, amend it with my conditional generation and put this into the override:

public class FilteredIndexSqlGenerator : SqlServerMigrationSqlGenerator
{
  protected override void Generate(CreateIndexOperation createIndexOperation)
  {
    using (var writer = Writer())
    {
      writer.Write("CREATE ");
      if (createIndexOperation.IsUnique)
      {
        writer.Write("UNIQUE ");
      }
      if (createIndexOperation.IsClustered)
      {
        writer.Write("CLUSTERED ");
      }
      writer.Write("INDEX ");
      writer.Write(Quote(createIndexOperation.Name));
      writer.Write(" ON ");
      writer.Write(Name(createIndexOperation.Table));
      writer.Write("(");
      writer.Write(createIndexOperation.Columns.Join(Quote));
      writer.Write(")");
      
      // This condition applies only if the anonymousArguments are specified
      if (createIndexOperation.AnonymousArguments.ContainsKey("Where"))
      {
        writer.Write(" WHERE " + createIndexOperation.AnonymousArguments["Where"]);
      }
      Statement(writer);
    }
  }
}

Now all you have to do is enable migrations, and then apply this custom built generator in the Configuration class:

internal sealed class Configuration : DbMigrationsConfiguration<MyCtx>
{
  public Configuration()
  {
    AutomaticMigrationsEnabled = false;
    this.SetSqlGenerator("System.Data.SqlClient", new FilteredIndexSqlGenerator());
   }
}

And now if you create a migration and amend the CreateIndex() method call to specify the anonymousArguments parameter, the generated index will be amended with the where clause you specify in the Where parameter of the anonymous object. Cool :)

Setting up a new annotation for the filtered index

As cool as it is, it's not very useful. After all, if you have to modify the generated migration, you could just call the Sql() method yourself and pass in the SQL required for the generation of filtered indices. What we need for this to be really useful is an automated way to generate the migration itself without having to modify it later.

I had to experiment quite a lot to make this happen, and I'm not a hundred percent content with the result, but close enough. First, I wanted to mimic how you create an index using EF 6.1.3 or earlier:

modelBuilder.Entity<Person>()
            .Property(p => p.Ssn)
            .HasMaxLength(10)
            .HasColumnAnnotation( IndexAnnotation.AnnotationName, 
                                  new IndexAnnotation(new IndexAttribute { IsUnique = true }));

So I created a new class called FilteredIndexAnnotation inheriting fromIndexAnnotation with a new AnnotationName and a FilteredIndexAttribute inheriting from IndexAttribute. Unfortunately this didn't work, somewhere along the line EF swallows the extra information in the annotation.

So the custom annotation is out, but the the custom attribute can work as a custom annotation in itself. So I created the my custom filter inheriting from IndexAttribute, with a new property, Where, to store the where clause for the filtered index.

public class FilteredIndexAttribute : IndexAttribute
{
  public FilteredIndexAttribute(string name) : base(name) { }
  public string Where { get; set; } 
        
  public override string ToString()
  {
    var x = JObject.Parse(JsonConvert.SerializeObject(this));
    x.Remove("TypeId");
    x.Remove("Order");
    return JsonConvert.SerializeObject(x);
  }
}

One interesting thing might jump out to you: my ugly overriding of ToString(). Well, turns out EF serializes the value of the annotations to store them in the migration, and for that serialization the ToString() method is used. It is overridden in the base class to write out the properties in a key-value format, but my descendant class has an extra property that must be included as well, so I had to override the ToString(). I'm simply using Json.NET because at this point, I was lazy and frustrated as hell :), a nicer, more cost-effective implementation can be given for this.

Why did I omit the TypeId? Because the TypeId changes every time you generate a migration. If the TypeId is part of the serialized annotation, then EF detects this as a model change, even though it is the same, just the TypeId (which is some random generated value serving some administrative purpose as far as I can make out) changes by design.

Why did I omit the Order? Because I have found no way that I can support them. This may be an improvement possibility for the future.

Now you can go ahead and add this attribute as an annotation to the column you want to index:

modelBuilder.Entity<Person>()
            .Property(p => p.Ssn)
            .IsOptional()
            .HasMaxLength(10)
            .HasColumnAnnotation("IndexExtensions", 
new FilteredIndexAttribute("IX_IndexName") { IsUnique = true, Where = "Ssn IS NOT NULL" });

If you generate a migration now, you'll see that the serialized value of the attribute is included in the migration file. Which is not compiling now, because the string literal is riddled with the " characters from the Json :) but at least, you can now see why it was necessary to override the ToString() method.

Writing a custom migration scaffolder to process the new annotation

If you want to create a custom migration scaffolder, you have to again just subclass the right component and override the right methods (again, I did something similar here). The right methods to override in this case are:

Generate(IEnumerable<MigrationOperation> operations, string @namespace, string className): This is sort of a "base" method that gets called when the scaffolding starts. The operations parameter contains all the operation that need to be translated into C# code. My idea was that I should find the CreateTableOperations, look through the columns, and if any column contains the special annotation, deserialize the annotation value and create a new CreateIndexOperation based on the deserialized annotation. Finally, I remove the annotation so it doesn't show up in the scaffolded migration code file. Note that you probably have to do the same thing for the AddColumnOperations, not just the CreateTableOperations if you want to support filtered indices on columns added to the model later, I was just to lazy :)
Generate(CreateIndexOperation createIndexOperation, IndentedTextWriter writer): This is responsible for generating the index from the CreateIndexOperation created in the previous overload. Again, this is not very modular, so I just copied the original version and added a condition to use my version in case the annotation is there.

protected override string Generate(IEnumerable<MigrationOperation> operations, string @namespace, string className)
{
  var createTableOperations = operations
                             .OfType<CreateTableOperation>()
                             .Select(t => new { 
                                        Op = t, 
                                        IndexedColumns = t.Columns.Where(c => c.Annotations.ContainsKey("IndexExtensions")) 
                                               }
                             );
  var newCreateIndexOperations = new List<CreateIndexOperation>();
  foreach (var createTableOperation in createTableOperations)
  {
    foreach (var indexedColumn in createTableOperation.IndexedColumns)
    {
      var jobject = JObject.Parse(indexedColumn.Annotations["IndexExtensions"].NewValue.ToString());
      var x = new CreateIndexOperation
      {
        Table = createTableOperation.Op.Name,
        IsClustered = jobject["IsClustered"].Value<bool>(),
        IsUnique = jobject["IsUnique"].Value<bool>(),
        Name = jobject["Name"].Value<string>(),
      };
      x.Columns.Add(indexedColumn.Name);
      x.AnonymousArguments.Add("Where", jobject["Where"].Value<string>());
      newCreateIndexOperations.Add(x);                    
      indexedColumn.Annotations.Remove("IndexExtensions");
    }
  }
  var newList = new List<MigrationOperation>(operationc.Count+newCreateOperations.Count);
  newList.AddRange(operations);
  newList.AddRange(newCreateIndexOperations);
  return base.Generate(newList, @namespace, className);
}

private void WriteIndexParameters(CreateIndexOperation createIndexOperation, IndentedTextWriter writer)
{
  if (createIndexOperation.IsUnique)
  {
    writer.Write(", unique: true");
  }

  if (createIndexOperation.IsClustered)
  {
    writer.Write(", clustered: true");
  }

  if (!createIndexOperation.HasDefaultName)
  {
    writer.Write(", name: ");
    writer.Write(Quote(createIndexOperation.Name));
  }
  // This is my extra condition for the extra processing
  if (createIndexOperation.AnonymousArguments.ContainsKey("Where")) 
  {
    writer.Write(", anonymousArguments: new { Where = \"");
    writer.Write(createIndexOperation.AnonymousArguments["Where"]
                 .ToString());
    writer.Write("\" }");
  }
}
                 
protected override void Generate(CreateIndexOperation createIndexOperation, IndentedTextWriter writer)
{
  if (createIndexOperation.AnonymousArguments.ContainsKey("Where"))
  {
    writer.Write("CreateIndex(");
    writer.Write(Quote(createIndexOperation.Table));
    writer.Write(", ");

    var compositeIndex = createIndexOperation.Columns.Count() > 1;
    if (compositeIndex)
    {
      writer.Write("new[] { ");
    }
    writer.Write(createIndexOperation.Columns.Join(Quote));
    if (compositeIndex)
    {
      writer.Write(" }");
    }

    WriteIndexParameters(createIndexOperation, writer);
    writer.WriteLine(");");
  }
  else
  {
    base.Generate(createIndexOperation, writer);
  }
}

And finally, you just have to register the code generation in the configuration class:

internal sealed class Configuration : DbMigrationsConfiguration<MyCtx>
{
  public Configuration()
  {
    AutomaticMigrationsEnabled = false;   
    this.CodeGenerator = new FilteredIndexCodeGenerator();
    this.SetSqlGenerator("System.Data.SqlClient", new FilteredIndexSqlGenerator());
  }
}

And this generates the migration file with the extra anonymousArguments parameter. Which is later in turn automatically processed by the SQL generator I wrote before. Now this is cool :)

It's not perfect, but I like it. Maybe you can also introduce an extension method to add an annotation, to make it prettier. Of course the serialization is unnecessarily heavy-weight, but details, am I right? And of course specifying the filter clause as a string is not that fun — maybe expressions could be used and translated to SQL, just like with Linq2Entities?

Like always, you can find the whole code on GitHub.

Akos Nagy

Posted in Entity Framework .NET