Using the Specification pattern with Repository and Unit of work

Akos Nagy

May 22, 2019

The age-old question of whether you should use the Repository and Unit of work patterns comes up pretty often in my life. That's probably because I'm for and not against :) and whenever I teach or consult, I tell people to use them. And then come the usual arguments about why that is wrong because... I don't want to go into the details here; I have already done that in another post in defense of the pattern. In that post, I claimed that this pattern enables you to change the ORM in your system if you want to. And of course this was not completely true :), so let's dig a little deeper.

Returning IQueryable might be a bad idea

Let's take a sample project that I usually use to illustrate how the pattern works. The sample project can be found in my GitHub repository. It is a little web application built on the Northwind database that lets you manage products and categories. Let's take a look at the products:

public partial class Product
{
  public int ProductID { get; set; }  
  public string ProductName { get; set; }
  public int? CategoryID { get; set; }        
  public decimal? UnitPrice { get; set; }
  public bool Discontinued { get; set; }
  public virtual Category Category { get; set; }
  
}

This class has a couple of properties that you can use when you want to access the data; I guess most of them are pretty self-explanatory. Onto the repository interface then:

public interface IProductRepository
{
  void Add( Product p );
  IQueryable<Product> List();
  Product Get( int productId );
  void Update( Product p );
  void Delete( Product p );
}

A pretty standard interface with the basic CRUD operations. The only interesting part is that the List() method returns an IQueryable<T>. This lets the caller specify filtering and other operations on the whole table that are then performed on the server side. The implementation of the List() method is simply this:

public IQueryable<Product> List() => ctx.Products;

And with this design you can define the filtering in the business logic, via the unit of work, and still have it run on the server side (a couple of details are omitted, but you probably get the idea; if not, check out the GitHub repo):

await uow.ProductRepository.List().Where(p => !p.Discontinued).ToListAsync();

The ability to do the filtering from the business logic is very important. Remember that the business logic holds the business value and therefore must go through rigorous unit testing. To test the business logic, you mock its dependencies, one of which is the repository and the unit of work. So if you did the filtering inside the repository, you would mock away the filtering as well and have no way of testing whether or not your logic is correct. The ability to run the query server-side is just as important, for performance reasons.
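To make the testing argument a bit more concrete, here is a minimal sketch of business logic exercised against an in-memory fake. Note that ProductService, GetActiveProducts and FakeProductRepository are names I made up for this illustration (they are not in the sample project), the unit of work is left out, and the interface is trimmed down to List() only.

```csharp
using System.Collections.Generic;
using System.Linq;

public class Product
{
    public int ProductID { get; set; }
    public string ProductName { get; set; }
    public bool Discontinued { get; set; }
}

// Trimmed-down repository interface: only List() is needed for this sketch.
public interface IProductRepository
{
    IQueryable<Product> List();
}

// Hypothetical business-logic class. The filter lives here,
// so a unit test of this class also tests the filter.
public class ProductService
{
    private readonly IProductRepository repository;

    public ProductService(IProductRepository repository) => this.repository = repository;

    public List<Product> GetActiveProducts() =>
        repository.List().Where(p => !p.Discontinued).ToList();
}

// Hand-written fake standing in for whatever mocking library you use.
public class FakeProductRepository : IProductRepository
{
    private readonly List<Product> products;

    public FakeProductRepository(IEnumerable<Product> products) =>
        this.products = products.ToList();

    public IQueryable<Product> List() => products.AsQueryable();
}
```

A test then feeds a couple of products into the fake and checks that GetActiveProducts() only returns the ones that are not discontinued. Had the Where() lived inside the real repository, the fake would have replaced it and the test would prove nothing about the filter.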

But this design has an inherent flaw. For example, ToListAsync() compiles against any IQueryable<T>, but it only works at runtime if the underlying query provider supports asynchronous execution. EF's does, but a simple List<T> exposed through AsQueryable(), for example, doesn't. Or take another example:

uow.ProductRepository.List().Where(p => p.ProductName.Contains('s')).ToList();

This runs just fine against an in-memory implementation. But for EF, this overload of Contains() (the one with a character parameter) is not supported, and the query throws an exception. The whole point of these patterns is to separate the ORM from your BLL, but IQueryable<T> is actually a cheat: you do have to be aware of the specific implementation behind your business logic, because that determines which queries you can run against the resulting queryable object.

Now I have to admit, I usually don't care, because I do this only to make my business logic testable and I don't really expect the ORM to change (except maybe moving from EF to EF Core, but honestly, who in their right mind would switch the ORM to a whole new implementation mid-project?). But if you are really into clean code, you should care.

Shifting the focus of the problem

So you have to make sure that your filters run on the server-side and are specified in the unit testable layers of your code, but cannot use the evaluation semantics of IQueryable<T>.

Here's a trick that I like: when you have a problem, try to shift the focus of the problem. Don't try to solve the problem directly; first transform it into a different version, and then try to solve that. The problem is that you cannot return IQueryable<T>, because it is not specific enough. Instead, I suggest returning IReadOnlyList<T>. That's a lot more specific and has none of the problems that IQueryable<T> has. And instead of allowing the user of the repository to build a query after calling the repository, change the signature so that they have to build the query first and then pass it to the repository. So basically, the interface method becomes:

IReadOnlyList<Product> List(Expression<Func<Product,bool>> predicate = null);

And the implementation for that when using EF:

public IReadOnlyList<Product> List(Expression<Func<Product,bool>> predicate = null)
{
  IQueryable<Product> query = ctx.Products;
  if (predicate != null)
  {
    query = query.Where(predicate);
  }
  return query.ToList();
}

Basically, nothing's changed: you can still specify your query in the business logic, and the query is still executed on the server side. But now you pass your complete, constructed query to the repository. With this, it becomes the responsibility of the interface implementor to handle every possible input value (as per the Liskov substitution principle). So the responsibility is no longer on you to know what you can do with the resulting queryable, but on the interface implementor to handle all the input values, as it should be. And this solves the problem, mostly.
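For comparison, here is a rough sketch of what a second, in-memory implementor of the same signature could look like, for example as a test double. InMemoryProductRepository is a hypothetical name of mine, and the interface is again reduced to the one method under discussion.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Linq.Expressions;

public class Product
{
    public int ProductID { get; set; }
    public string ProductName { get; set; }
    public bool Discontinued { get; set; }
}

// Reduced interface: only the List() method discussed above.
public interface IProductRepository
{
    IReadOnlyList<Product> List(Expression<Func<Product, bool>> predicate = null);
}

// Hypothetical in-memory implementor of the same contract.
public class InMemoryProductRepository : IProductRepository
{
    private readonly List<Product> products;

    public InMemoryProductRepository(IEnumerable<Product> products) =>
        this.products = products.ToList();

    public IReadOnlyList<Product> List(Expression<Func<Product, bool>> predicate = null)
    {
        IEnumerable<Product> query = products;
        if (predicate != null)
        {
            // Compile() turns the expression tree into a delegate
            // that LINQ to Objects can execute.
            query = query.Where(predicate.Compile());
        }
        return query.ToList();
    }
}
```

The caller writes repo.List(p => !p.Discontinued) in exactly the same way against either implementation, and never gets hold of a queryable whose capabilities depend on the provider behind it.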

Using the specification pattern

And with that, the problem is more or less solved. Of course, this makes the repository a little harder to mock and the whole testing a bit trickier. It could also be argued that passing in all these expressions makes the code a little harder to read and understand (even though we do it all the time with LINQ anyway). And if the repositories want to add some extra magic to the query building, they have to manipulate the expressions dynamically, which is also a bit cumbersome.

But of course there is a pattern to help with that as well: specification. The pattern was described by Eric Evans and Martin Fowler in this excellent whitepaper. The basic idea is that you encapsulate the logic that decides whether or not an object meets a given requirement (or specification) into a separate object, and then use this object in your business logic for validation or filtering. Basically, it is just an encapsulated predicate.

Here's my version for the .NET implementation:

public abstract class FilterSpecification<T>
{
  public abstract Expression<Func<T, bool>> SpecificationExpression { get; }  
}

And that's it. That's your specification: an encapsulated predicate. Now if you want to filter for the products that are not discontinued, you can create an implementation:

public class NotDiscontinuedProductSpecification : FilterSpecification<Product>
{
  public override Expression<Func<Product, bool>> SpecificationExpression
      => p => !p.Discontinued;
}

And then, you can modify your repository to accept something like this:

IReadOnlyList<Product> List(FilterSpecification<Product> spec = null);

And the implementation for that when using EF:

public IReadOnlyList<Product> List(FilterSpecification<Product> spec = null)
{
  IQueryable<Product> query = ctx.Products;
  if (spec != null)
  {
    query = query.Where(spec.SpecificationExpression);
  }
  return query.ToList();
}

And finally use it like this:

uow.ProductRepository.List(new NotDiscontinuedProductSpecification());

Why is this better than simply using an expression? Well, now you can use the same object for viewmodel validation in the user interface as well. With the encapsulation, your logic becomes reusable. Or just imagine: you want some client side validation in the browser and some server-side validation in ASP.NET. If you use Blazor on the client-side, now your validation code can be shared between the client and the server-side. How cool is that?
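As a rough sketch of that reuse, the same specification can be evaluated against a single in-memory object by compiling its expression. The IsSatisfiedBy method below is my own hypothetical addition for this illustration; it is not part of the minimal class shown above.

```csharp
using System;
using System.Linq.Expressions;

public class Product
{
    public int ProductID { get; set; }
    public bool Discontinued { get; set; }
}

public abstract class FilterSpecification<T>
{
    public abstract Expression<Func<T, bool>> SpecificationExpression { get; }

    // Hypothetical helper: compiles the expression and evaluates it
    // against one in-memory object, e.g. for validating a view model.
    // Compiling on every call is slow; a real version would likely cache the delegate.
    public bool IsSatisfiedBy(T candidate) =>
        SpecificationExpression.Compile()(candidate);
}

public class NotDiscontinuedProductSpecification : FilterSpecification<Product>
{
    public override Expression<Func<Product, bool>> SpecificationExpression
        => p => !p.Discontinued;
}
```

With this, new NotDiscontinuedProductSpecification().IsSatisfiedBy(product) answers the same question in memory that the repository answers on the server, and both are driven by the one expression.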

Also, you can add some extra features to the filter. To enforce encapsulation better, you can hide the expression entirely by making it protected, and create an operator to convert it implicitly when needed:

public abstract class FilterSpecification<T>
{
  protected abstract Expression<Func<T, bool>> SpecificationExpression { get; }
  public static implicit operator Expression<Func<T, bool>>(FilterSpecification<T> spec) 
       => spec.SpecificationExpression;
}

And now, you can simply create your specification and inside the repository, pass it to the Where() method.

Pimping out the specifications

I like to add some other cool features to my specifications. Basically they are predicates, so they should support logical operators. Here is an implementation that I like, which supports &, | and !:

public abstract class FilterSpecification<T>
{        
  private class ConstructedSpecification<TType> : FilterSpecification<TType>
  {
    private readonly Expression<Func<TType, bool>> specificationExpression;
    public ConstructedSpecification(Expression<Func<TType, bool>> specificationExpression)
    {
      this.specificationExpression = specificationExpression;
    }

    // Must match the accessibility of the abstract member it overrides.
    protected override Expression<Func<TType, bool>> SpecificationExpression => specificationExpression;
  }
        
  protected abstract Expression<Func<T, bool>> SpecificationExpression { get; }
  
  public static implicit operator Expression<Func<T, bool>>(FilterSpecification<T> spec) => spec.SpecificationExpression;
  
  public static FilterSpecification<T> operator &(FilterSpecification<T> left, FilterSpecification<T> right) => CombineSpecification(left, right, Expression.AndAlso);
  
  public static FilterSpecification<T> operator |(FilterSpecification<T> left, FilterSpecification<T> right) => CombineSpecification(left, right, Expression.OrElse);

  private static FilterSpecification<T> CombineSpecification(FilterSpecification<T> left, FilterSpecification<T> right, Func<Expression, Expression, BinaryExpression> combiner)
  {
    var expr1 = left.SpecificationExpression;
    var expr2 = right.SpecificationExpression;
    var arg = Expression.Parameter(typeof(T));
    var combined = combiner.Invoke(
        new ReplaceParameterVisitor { { expr1.Parameters.Single(), arg } }.Visit(expr1.Body),
        new ReplaceParameterVisitor { { expr2.Parameters.Single(), arg } }.Visit(expr2.Body));
   return new ConstructedSpecification<T>(Expression.Lambda<Func<T, bool>>(combined, arg));
  }
  
  // Expression.Not is the logical negation; Expression.Negate is arithmetic and fails for bool.
  public static FilterSpecification<T> operator !(FilterSpecification<T> original)
     => new ConstructedSpecification<T>(Expression.Lambda<Func<T, bool>>(Expression.Not(original.SpecificationExpression.Body), original.SpecificationExpression.Parameters));
        
}

It's pretty simple, you just have to know some expression-magic. Basically, when you want to negate, you take the body of the original expression, negate it and create a new expression with the original parameters. So if you have p => p.Discontinued, you first take p.Discontinued (the body), negate it (!p.Discontinued), then take p (the parameter) and build a new expression (hence p=>!p.Discontinued).

When you want to combine two specifications, you take the bodies of the two expressions, apply the appropriate operator, then create a new expression. But you have to be careful: the two original expressions each have their own ParameterExpression object representing their parameter, so you have to create a new parameter and substitute it into both bodies. You can do that with a simple expression visitor like this:

internal class ReplaceParameterVisitor : ExpressionVisitor, IEnumerable<KeyValuePair<ParameterExpression, ParameterExpression>>
{

  private readonly Dictionary<ParameterExpression, ParameterExpression> parameterMappings = new Dictionary<ParameterExpression, ParameterExpression>();
  
  protected override Expression VisitParameter(ParameterExpression node)
  {
    if (parameterMappings.TryGetValue(node, out var newValue))
       return newValue;

     return node;
  }
  
  public void Add(ParameterExpression parameterToReplace, ParameterExpression replaceWith) => parameterMappings.Add(parameterToReplace, replaceWith);

  public IEnumerator<KeyValuePair<ParameterExpression, ParameterExpression>> GetEnumerator() => parameterMappings.GetEnumerator();
  
  IEnumerator IEnumerable.GetEnumerator() => GetEnumerator();
}

This basically traverses the expression, and if it finds a parameter that's in the dictionary as a key, replaces it with the value mapped to it. IEnumerable<T> is only implemented so that I can initialize the visitor with a collection initializer.
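If you want to see the parameter substitution in isolation, here is a small self-contained sketch that negates one lambda and ANDs it with another. SwapParameterVisitor is a simplified, single-mapping variant of the visitor above, and CombineDemo is a hypothetical name; both exist only for this illustration.

```csharp
using System;
using System.Linq;
using System.Linq.Expressions;

public class Product
{
    public decimal? UnitPrice { get; set; }
    public bool Discontinued { get; set; }
}

// Simplified variant of the visitor: replaces exactly one parameter.
internal class SwapParameterVisitor : ExpressionVisitor
{
    private readonly ParameterExpression from;
    private readonly ParameterExpression to;

    public SwapParameterVisitor(ParameterExpression from, ParameterExpression to)
    {
        this.from = from;
        this.to = to;
    }

    protected override Expression VisitParameter(ParameterExpression node) =>
        node == from ? to : node;
}

public static class CombineDemo
{
    // Builds x => !x.Discontinued && x.UnitPrice < 30 out of two separate lambdas.
    public static Expression<Func<Product, bool>> NotDiscontinuedAndCheap()
    {
        Expression<Func<Product, bool>> discontinued = p => p.Discontinued;
        Expression<Func<Product, bool>> cheap = q => q.UnitPrice < 30;

        // One fresh parameter shared by both rewritten bodies.
        var arg = Expression.Parameter(typeof(Product), "x");

        var left = Expression.Not(
            new SwapParameterVisitor(discontinued.Parameters.Single(), arg).Visit(discontinued.Body));
        var right =
            new SwapParameterVisitor(cheap.Parameters.Single(), arg).Visit(cheap.Body);

        return Expression.Lambda<Func<Product, bool>>(Expression.AndAlso(left, right), arg);
    }
}
```

Compiling the result with Compile() and calling it on a few products is a quick way to check the rewrite. Without the substitution, the combined body would still reference p and q, which are not parameters of the new lambda, and compiling it would fail.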

And finally, the expression that's created is wrapped into another specification object, ConstructedSpecification. I made this class private and nested to enforce stronger encapsulation. So now, you can create two specifications like this:

public class CheapProductSpecification : FilterSpecification<Product>
{
  protected override Expression<Func<Product, bool>> SpecificationExpression => p => p.UnitPrice < 30;
}

public class DiscontinuedProductSpecification : FilterSpecification<Product>
{
  protected override Expression<Func<Product, bool>> SpecificationExpression => p => p.Discontinued;
}

And then, you can write queries like this:

uow.ProductRepository.List(!(new DiscontinuedProductSpecification()) & new CheapProductSpecification());

That's kinda cool. And it has all the attributes of loosely-coupled, clean code. The specifications are testable basically on their own, the queries are still executed on the server-side. And if you are really clever, you can even create something like this:

public static class ProductFilterSpecifications
{
  public static FilterSpecification<Product> IsCheap => new CheapProductSpecification();
  public static FilterSpecification<Product> IsDiscontinued => new DiscontinuedProductSpecification();
}

And simplify your code like this:

uow.ProductRepository.List(!ProductFilterSpecifications.IsDiscontinued & ProductFilterSpecifications.IsCheap);

And what about the other operations?

This is cool, but if you notice, I have cheated and only discussed filtering. What if you need other operations? Well, this is a good news-bad news kinda situation. You can naturally extend this concept to other operations as well. Let's say you need paging. Then you can do something like this:

public class PagingSpecification
{
  public PagingSpecification(int skip, int take)
  {
    this.Skip = skip;
    this.Take = take;
  }

  public int Skip { get; }
  public int Take { get; }
}

IReadOnlyList<Product> List(FilterSpecification<Product> filterSpecification = null, PagingSpecification pagingSpecification = null);

public IReadOnlyList<Product> List(FilterSpecification<Product> filterSpecification = null, PagingSpecification pagingSpecification = null)
{
  IQueryable<Product> query = ctx.Products;
  if (filterSpecification != null)
    query = query.Where(filterSpecification);

  // Classic EF (EF6) only supports Skip() on sorted input, so order by the key before paging.
  if (pagingSpecification != null)
    query = query.OrderBy(p => p.ProductID).Skip(pagingSpecification.Skip).Take(pagingSpecification.Take);

  return query.ToList();
}

uow.ProductRepository.List(!ProductFilterSpecifications.IsDiscontinued, new PagingSpecification(10, 10));

You can implement a specification for paging, modify your repository and then process the parameter inside the List() method. Or if you need projection:


public abstract class TransformSpecification<TSource, TTarget>
{
  protected abstract Expression<Func<TSource, TTarget>> SpecificationExpression { get; }
  
  public static implicit operator Expression<Func<TSource, TTarget>>(TransformSpecification<TSource, TTarget> spec) => spec.SpecificationExpression;
}

public IReadOnlyList<T> List<T>(TransformSpecification<Product, T> transform, FilterSpecification<Product> filterSpecification = null, PagingSpecification pagingSpecification = null)
{
  IQueryable<Product> query = ctx.Products;
  if (filterSpecification != null)
    query = query.Where(filterSpecification);

  // Page before projecting, so the rows can still be ordered by the entity key
  // (classic EF only supports Skip() on sorted input).
  if (pagingSpecification != null)
    query = query.OrderBy(p => p.ProductID).Skip(pagingSpecification.Skip).Take(pagingSpecification.Take);

  return query.Select<Product, T>(transform).ToList();
}

If you need ordering or eager loading or anything else, you can support that with another specification. That's the good news. The bad news is that you basically have to wrap the whole of LINQ with appropriate specifications, and that's tiresome. But the choice is yours. On the bright side, you only have to do this once and after a while, you'll end up with a fairly complete, ready-to-be-used specification-repository-uow framework. If I have the time, maybe one day I'll clean mine up and publish it for all to see. In the meantime, keep your code clean ;)
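As a footnote to the ordering case mentioned above, here is one possible shape for such a specification. This is only a sketch under my own assumptions: OrderSpecification, KeySelector, Descending, Apply and ByPriceDescendingSpecification are all hypothetical names, and Apply() is meant to be called inside the repository, so no IQueryable<T> leaks out to the business logic.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Linq.Expressions;

public class Product
{
    public int ProductID { get; set; }
    public decimal? UnitPrice { get; set; }
}

// Hypothetical ordering specification: a key selector plus a direction.
public abstract class OrderSpecification<T, TKey>
{
    protected abstract Expression<Func<T, TKey>> KeySelector { get; }

    public virtual bool Descending => false;

    // Intended to be called by the repository on its own IQueryable<T>.
    public IOrderedQueryable<T> Apply(IQueryable<T> source) =>
        Descending ? source.OrderByDescending(KeySelector) : source.OrderBy(KeySelector);
}

public class ByPriceDescendingSpecification : OrderSpecification<Product, decimal?>
{
    protected override Expression<Func<Product, decimal?>> KeySelector => p => p.UnitPrice;

    public override bool Descending => true;
}
```

Inside List(), the repository would call something like orderSpecification.Apply(query) before paging, which also takes care of giving Skip() a sorted input.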