Entity Framework Core Cosmos DB provider

Akos Nagy
Apr 11, 2019

The Cosmos DB provider for Entity Framework Core has been available in preview for a while. I've wanted to check it out ever since I read the announcement, but I haven't really had the time (and to be honest, I was worried it would not be worth the time because of the limited feature set). But now I have had some time, and according to the roadmap the provider is only one release away from RTM, so it's time to prepare.

Creating the basic model

So to start things off, I created a very basic model of two entities:

public class Person
{
  public Person()
  {
    Cars = new HashSet<Car>();
  }
  public Guid PersonId { get; set; }
  public string Name { get; set; }
  public ICollection<Car> Cars { get; set; }
}

public class Car
{
  public Guid CarId { get; set; }
  public string LicensePlate { get; set; }
  public Guid OwnerId { get; set; }
  public Person Owner { get; set; }
}

Next, I added the Microsoft.EntityFrameworkCore.Cosmos NuGet package to the project (which of course downloads the Microsoft.EntityFrameworkCore package as a dependency).
Then I created a simple context:

public class PeopleContext : DbContext
{
  public DbSet<Person> People { get; set; }
  public DbSet<Car> Cars { get; set; }
}        

Next up, I added the appropriate configuration to the OnConfiguring() method. It's refreshingly simple compared to what you might be used to with the regular SDK: you specify the service endpoint, the primary key of the account and the database name. I also added some logging (yes, I know this way of setting it up is deprecated, but for now it's fine) to see what happens in the background.

protected override void OnConfiguring(DbContextOptionsBuilder optionsBuilder)
{
  // Log everything to the console (deprecated way of creating the provider).
  optionsBuilder.UseLoggerFactory(
    new LoggerFactory(new[] { new ConsoleLoggerProvider((_, __) => true, true) }));

  // Endpoint and well-known key of the local Cosmos DB emulator, then the database name.
  optionsBuilder.UseCosmos(
    "https://localhost:8081",
    "C2y6yDjf5/R+ob0N8A7Cgv30VRDJIWEHLM+4QDU5DE2nQ9nDuVTqobD4b8mGGyPMbIZnqyMsEcaGQy67XIw/Jw==",
    "PeopleDatabase",
    options =>
    {
      options.ExecutionStrategy(d => new CosmosExecutionStrategy(d));
    });
}

You might notice one extra thing in this code: the CosmosExecutionStrategy. Whenever you are dealing with cloud-based resources, it is very important to implement a proper retry strategy to overcome transient failures. The provider comes with a built-in one: for example, it knows how to handle the 429 status code and the x-ms-retry-after-ms header that you get when you go over the available RUs. That's very neat :)
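
To illustrate the idea behind such a retry strategy, here is a minimal sketch of a retry loop that waits for the server-suggested delay after a throttled request. This is not the provider's implementation: ThrottledException and RetrySketch are hypothetical names made up for this example, and the exception simply stands in for a 429 response carrying a retry-after value.

```csharp
using System;
using System.Threading.Tasks;

// Hypothetical exception standing in for a throttled (HTTP 429) response.
public class ThrottledException : Exception
{
    public TimeSpan RetryAfter { get; }

    public ThrottledException(TimeSpan retryAfter) : base("429 Too Many Requests")
    {
        RetryAfter = retryAfter;
    }
}

public static class RetrySketch
{
    // Runs the operation and, after each throttled attempt, waits for the
    // suggested delay before trying again. Once maxRetries retries have been
    // used up, the exception is no longer caught and reaches the caller.
    public static async Task<T> ExecuteAsync<T>(Func<Task<T>> operation, int maxRetries)
    {
        for (var attempt = 0; ; attempt++)
        {
            try
            {
                return await operation();
            }
            catch (ThrottledException ex) when (attempt < maxRetries)
            {
                await Task.Delay(ex.RetryAfter);
            }
        }
    }
}
```

A real execution strategy also has to decide which other failures count as transient; the sketch only covers the throttling case mentioned above.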

I also added two basic value generators to the context:

protected override void OnModelCreating(ModelBuilder modelBuilder)
{
  modelBuilder.Entity<Person>().Property(p => p.PersonId).HasValueGenerator<GuidValueGenerator>();
  modelBuilder.Entity<Car>().Property(p => p.CarId).HasValueGenerator<GuidValueGenerator>();                        
}

And then we are good to go. I used the EnsureCreatedAsync() method to create the database and added one entity to each DbSet:

using (var context = new PeopleContext())
{
  await context.Database.EnsureCreatedAsync(); 
  var person = new Person { Name = "Akos" };
  var car = new Car { LicensePlate = "ABC-123" };
  car.Owner = person;
  person.Cars.Add(car);
  context.People.Add(person);
  context.Cars.Add(car);
  await context.SaveChangesAsync();
}

And it works like a charm.

Why the provider will be awesome

Of course, this post wouldn't be much with that conclusion. There are a couple of very cool things to note here.
First off, this is the view of the emulator that shows the created database:

As you can see, everything goes into one collection (or container, if you'd like). This is very cool. Microsoft has been trying to teach people ever since DocumentDB that you shouldn't think about this as a relational datastore. Storing every entity type in a separate collection would be very expensive, and it's totally needless. You can store most things in one collection and add an extra field to the documents to distinguish between the entities by type. Then you can query based on that field, since every field is indexed by default. And this is exactly how the provider works. If you turn on logging like I did in the configuration above, you can actually see the SQL query that's executed on the server, and it works exactly like this.

Nice to see that Microsoft actually follows its own guidelines :)
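
The single-collection idea can be illustrated with a small in-memory sketch. It is only an analogy, not the provider's code: DiscriminatorSketch and its members are hypothetical names, and the query in the comment is just one way to express such a filter in Cosmos DB SQL, not necessarily the exact text the provider generates.

```csharp
using System.Collections.Generic;
using System.Linq;

public static class DiscriminatorSketch
{
    // One shared "container": documents of different entity types side by side.
    public static readonly List<Dictionary<string, string>> Container =
        new List<Dictionary<string, string>>
        {
            new Dictionary<string, string> { ["Discriminator"] = "Person", ["Name"] = "Akos" },
            new Dictionary<string, string> { ["Discriminator"] = "Car", ["LicensePlate"] = "ABC-123" },
        };

    // The filter that a query for a single entity type needs, comparable to:
    //   SELECT * FROM c WHERE c.Discriminator = "Person"
    public static List<Dictionary<string, string>> OfType(string discriminator) =>
        Container.Where(d => d["Discriminator"] == discriminator).ToList();
}
```

Calling DiscriminatorSketch.OfType("Person") returns only the person document, even though both documents live in the same list.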

So here are the actual entities that are inserted into the database:

{
  "PersonId": "d3f04f00-dd17-4a7e-943b-58bc60252487",
  "Discriminator": "Person",
  "Name": "Akos",
  "id": "Person|d3f04f00-dd17-4a7e-943b-58bc60252487",
  "__partitionKey": "0",
  "_rid": "zrwRAJgjj7kDAAAAAAAAAA==",
  "_self": "dbs/zrwRAA==/colls/zrwRAJgjj7k=/docs/zrwRAJgjj7kDAAAAAAAAAA==/",
  "_etag": "\"00000000-0000-0000-e7c5-68dfe94301d4\"",
  "_attachments": "attachments/",
  "_ts": 1554038815
}

{
  "CarId": "a89a52ca-a6a6-4ecd-b06e-96569d9c23ab",
  "Discriminator": "Car",
  "LicensePlate": "ABC-123",
  "OwnerId": "d3f04f00-dd17-4a7e-943b-58bc60252487",
  "id": "Car|a89a52ca-a6a6-4ecd-b06e-96569d9c23ab",
  "__partitionKey": "0",
  "_rid": "zrwRAJgjj7kEAAAAAAAAAA==",
  "_self": "dbs/zrwRAA==/colls/zrwRAJgjj7k=/docs/zrwRAJgjj7kEAAAAAAAAAA==/",
  "_etag": "\"00000000-0000-0000-e7c5-68e3656c01d4\"",
  "_attachments": "attachments/",
  "_ts": 1554038815
}

As you can see, each document has all the fields that Cosmos DB requires, plus the Discriminator that contains the name of the entity. You can also see that the id is generated from the discriminator and the primary key. Finally, there is __partitionKey: that's the partition key of the collection, and its value is 0 by default. As far as I could see in the source code, there is currently no way to change this (a straightforward option would be to use Discriminator as the partition key). If you turn on logging, you can see how these extra fields and their values are handled.
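
To make the id convention concrete, here is a minimal sketch that reproduces the shape of the id values in the documents above. DocumentIdSketch and BuildId are hypothetical names; the format is inferred only from the stored documents, not taken from the provider's source.

```csharp
using System;

public static class DocumentIdSketch
{
    // Joins the entity name and the key with '|', matching the shape of the
    // "id" values in the stored documents.
    public static string BuildId(string discriminator, Guid key) =>
        $"{discriminator}|{key}";
}
```

For example, BuildId("Person", Guid.Parse("d3f04f00-dd17-4a7e-943b-58bc60252487")) yields "Person|d3f04f00-dd17-4a7e-943b-58bc60252487", the same value as the id of the person document.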

I also tested whether ETag-based (or _ts-based) optimistic concurrency works, but for now it doesn't.
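
For context, ETag-based optimistic concurrency means that a write only succeeds if the document has not changed since it was read. The sketch below shows that check against an in-memory dictionary. InMemoryStore is a hypothetical class written for this example and has nothing to do with the provider or the Cosmos DB SDK.

```csharp
using System;
using System.Collections.Generic;

public class InMemoryStore
{
    private readonly Dictionary<string, (string Body, string ETag)> _docs =
        new Dictionary<string, (string Body, string ETag)>();

    // Inserts or overwrites unconditionally and returns the new ETag.
    public string Upsert(string id, string body)
    {
        var etag = Guid.NewGuid().ToString();
        _docs[id] = (body, etag);
        return etag;
    }

    // Replaces the document only if the caller's ETag is still the current one.
    // Returns false when the document is missing or was changed in the meantime.
    public bool TryReplace(string id, string body, string expectedETag, out string newETag)
    {
        newETag = null;
        if (!_docs.TryGetValue(id, out var current) || current.ETag != expectedETag)
        {
            return false;
        }

        newETag = Upsert(id, body);
        return true;
    }
}
```

A writer that holds a stale ETag gets false back and has to re-read the document before trying again, which is the behavior one would hope to get from the provider eventually.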

There is another nice feature that I have tried. If you configure an owned entity, it is automatically embedded into the owning type:

public class PeopleContext : DbContext
{

  public DbSet<Person> People { get; set; }  

  protected override void OnConfiguring(DbContextOptionsBuilder optionsBuilder)
  {
    // ... same goes here
  }

  protected override void OnModelCreating(ModelBuilder modelBuilder)
  {
    modelBuilder.Entity<Person>().Property(p => p.PersonId).HasValueGenerator<GuidValueGenerator>();
    modelBuilder.Entity<Car>().Property(p => p.CarId).HasValueGenerator<GuidValueGenerator>();            
    modelBuilder.Entity<Person>().OwnsMany<Car>(p => p.Cars);
  }
}

And if you do this, you can see in the logs that it automatically recognizes how these entities are connected (even though the name of the entity and the name of the property don't match up exactly; I guess that's an EF Core feature):

And now, the JSON looks like this:

{
  "PersonId": "3bd07831-4107-4665-9022-22638b3d2a4c",
  "Discriminator": "Person",
  "Name": "Akos",
  "id": "Person|3bd07831-4107-4665-9022-22638b3d2a4c",
  "Cars": [
  {
    "OwnerId": "3bd07831-4107-4665-9022-22638b3d2a4c",
    "CarId": "635dc1c5-a5f0-401c-ab51-b77edc0aacd0",
    "Discriminator": "Car",
    "LicensePlate": "ABC-123"
  }],
  "__partitionKey": "0",
  "_rid": "zrwRAJgjj7kCAAAAAAAAAA==",
  "_self": "dbs/zrwRAA==/colls/zrwRAJgjj7k=/docs/zrwRAJgjj7kCAAAAAAAAAA==/",
  "_etag": "\"00000000-0000-0000-e7c5-230a999601d4\"",
  "_attachments": "attachments/",
  "_ts": 1554038698
}

That's also a very cool feature. Embedding entities like this is a key concept that you should use when designing a document-based data solution, and having this supported is crucial.

Verdict

The Cosmos DB EF Core provider will be awesome. It already has some very cool features, and based on the roadmap, there are a lot more coming. For now, it's still too early to use it, because there is no control over the partition key and no optimistic concurrency, but I can't wait to have the RTM available.
