Entity Framework 4 Enum support in Linq

As many of you might know, Entity Framework 4 still lacks support for mapping enum properties.
There are countless more or less worthless workarounds, everything from exposing constant integers in a static class to make it look like an enum, to totally insane generics tricks with operator overloading.

None of those are good enough IMO. I want to be able to expose real enum properties and write Linq queries against those properties, so I’ve decided to fix the problem myself.

My approach is to rewrite the Linq expression tree using the ExpressionVisitor that now ships with .NET 4.
With the ExpressionVisitor I can clone an entire expression tree and replace any node in that tree that represents a comparison between a property and an enum value.

In order to make this work, the entities still need to have an O/R mapped integer property, so I rewrite the query from using the enum property and enum constant to using the mapped integer property and a constant integer value.
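To see what the rewriter has to work with, it helps to look at the tree the C# compiler builds for an enum comparison. The following is a minimal sketch without any Entity Framework involved; the Order and OrderStatus types are stand-ins made up for illustration.

```csharp
using System;
using System.Linq.Expressions;

// Hypothetical stand-in types, only here to make the sketch self-contained.
public enum OrderStatus { New = 0, Shipped = 1, Cancelled = 2 }

public class Order
{
    public OrderStatus Status { get; set; }
}

public static class Program
{
    public static void Main()
    {
        Expression<Func<Order, bool>> predicate =
            order => order.Status == OrderStatus.Cancelled;

        var comparison = (BinaryExpression)predicate.Body;

        // The compiler compares the underlying integers: the enum property
        // is wrapped in a Convert node and the enum constant is folded
        // into an integer constant.
        Console.WriteLine(comparison.NodeType);       // Equal
        Console.WriteLine(comparison.Left.NodeType);  // Convert
        Console.WriteLine(comparison.Right.NodeType); // Constant
        Console.WriteLine(comparison.Right.Type);     // System.Int32
    }
}
```

So the two things that need replacing are the Convert node and the enum member access beneath it; the constant on the other side is already an integer.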

For me this solution is good enough; I can make the integer property private so that it is invisible from the outside.

Example

public class Order
{
    // this is the backing integer property that is mapped to the database;
    // the rewriter finds it by name: "e" + the name of the enum property
    private int eStatus { get; set; }

    // this is our unmapped enum property
    public OrderStatus Status
    {
        get { return (OrderStatus)eStatus; }
        set { eStatus = (int)value; }
    }

    // ...other code
}

This code is sort of iffy and it does violate some POCO principles, but it is still plain code, nothing magic about it.

So how do we get our linq queries to translate from the enum property to the integer property?

The solution is far simpler than I first thought. Using the new ExpressionVisitor base class, I can use the following code to make it all work:

using System;
using System.Linq;
using System.Linq.Expressions;
using System.Reflection;

namespace Alsing.Data.EntityFrameworkExtensions
{
    public static class ObjectSetEnumExtensions
    {
        private static readonly EnumRewriterVisitor visitor = new EnumRewriterVisitor();

        private static Expression<Func<T, bool>> ReWrite<T>(this Expression<Func<T, bool>> predicate)
        {
            var result = visitor.Modify(predicate) as Expression<Func<T, bool>>;
            return result;
        }

        public static IQueryable<T> Where<T>(this IQueryable<T> self,
            Expression<Func<T, bool>> predicate) where T : class
        {
            return Queryable.Where(self, predicate.ReWrite());
        }

        public static T First<T>(this IQueryable<T> self,
            Expression<Func<T, bool>> predicate) where T : class
        {
            return Queryable.First(self, predicate.ReWrite());
        }
    }

    public class EnumRewriterVisitor : ExpressionVisitor
    {
        private const BindingFlags InstanceMembers =
            BindingFlags.Instance | BindingFlags.NonPublic | BindingFlags.Public;

        public Expression Modify(Expression expression)
        {
            return Visit(expression);
        }

        // removes the Convert node that the compiler wraps around enum operands
        protected override Expression VisitUnary(UnaryExpression node)
        {
            if (node.NodeType == ExpressionType.Convert && node.Operand.Type.IsEnum)
            {
                var operand = Visit(node.Operand);

                // drop the conversion only if the rewritten operand already has
                // the target type; otherwise keep it around the new operand
                return operand.Type == node.Type ? operand : node.Update(operand);
            }

            return base.VisitUnary(node);
        }

        // replaces access to an enum property with access to its backing integer property
        protected override Expression VisitMember(MemberExpression node)
        {
            // node.Expression is null for static members
            if (node.Type.IsEnum && node.Expression != null)
            {
                var newName = "e" + node.Member.Name;
                var backingIntegerProperty = node.Expression.Type
                    .GetMember(newName, InstanceMembers)
                    .FirstOrDefault();

                // leave the node alone when there is no backing property,
                // e.g. for a captured local variable of an enum type
                if (backingIntegerProperty != null)
                    return Expression.MakeMemberAccess(node.Expression, backingIntegerProperty);
            }

            return base.VisitMember(node);
        }
    }
}

The first class is an extension method class that replaces the default “Where” and “First” extensions of IQueryable of T.
The second class is the actual Linq expression rewriter.

By including this and adding the appropriate using clause to your code, you can now make queries like this:

var cancelledOrders = myContainer.Orders.Where(order => order.Status == OrderStatus.Cancelled).ToList();

You can of course make more complex where clauses than that since all other functionality remains the same.

This is all for now. I will write a follow-up on how to wrap this up in a Linq query provider so that you can also use the standard Linq query syntax.

Hope this helps.

//Roger

Two flavors of DDD

I have been trying to practice domain driven design for the last few years.
During this time, I have learnt that there are almost as many ways to implement DDD as there are practitioners.

After studying a lot of different implementations I have seen two distinct patterns.

I call the first pattern “Aggregate Graph”:

When applying aggregate graphs, you allow members of one aggregate to have direct associations to another aggregate.
For example, an “Order” entity which is part of an “Order aggregate” might have a “Customer” property which leads directly to a “Customer” entity that is part of a “Customer aggregate”.

[Figure: aggregate-graph]

According to Evans’s book this is completely legal; any member of an aggregate may point to the root of any other aggregate.
Evans is very clear on the matter: aggregate root identities are global, while identities of non-root entities are local to the aggregate itself.

The opposite pattern would be what I call “Aggregate Documents”:

Here the aggregates never relate _directly_ to other aggregate roots.
Instead, the associations may be designed as “snapshots” where you store lightweight value object clones of the related aggregate roots.
An “Order” entity would have a “Customer” property which leads to a “CustomerSnapshot” value object instead of a Customer entity.
This way each aggregate instance becomes more of a free-floating document.
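As a rough illustration of the snapshot idea, here is a minimal sketch. All types in it (Customer, CustomerSnapshot, Order and their members) are made up for this example and are not taken from any particular framework.

```csharp
using System;

// Root of a hypothetical "Customer aggregate".
public class Customer
{
    public int Id { get; set; }
    public string Name { get; set; }
}

// Immutable value object: a lightweight copy of the customer data the order needs.
public sealed class CustomerSnapshot
{
    public CustomerSnapshot(int customerId, string name)
    {
        CustomerId = customerId;
        Name = name;
    }

    public int CustomerId { get; private set; }
    public string Name { get; private set; }

    public static CustomerSnapshot From(Customer customer)
    {
        return new CustomerSnapshot(customer.Id, customer.Name);
    }
}

// Root of a hypothetical "Order aggregate".
public class Order
{
    public Order(Customer customer)
    {
        // Store a copy, not a reference to the Customer aggregate root.
        Customer = CustomerSnapshot.From(customer);
    }

    public CustomerSnapshot Customer { get; private set; }
}

public static class Program
{
    public static void Main()
    {
        var customer = new Customer { Id = 42, Name = "Acme" };
        var order = new Order(customer);

        // Renaming the customer later does not change the order's snapshot.
        customer.Name = "Acme Ltd";
        Console.WriteLine(order.Customer.Name); // prints "Acme"
    }
}
```

The order still knows which customer it belongs to through CustomerId, but it carries no live reference into the other aggregate.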

[Figure: aggregate-document]

Since I have been applying both of these patterns, I will try to highlight the pros and cons of them in the rest of this post.

Aggregate Graph

The Aggregate Graph pattern is the approach I used when I first started doing DDD and I think that it is the most common way to implement DDD.
Since I was an O/RM developer (NPersist) this felt very natural to me; I could design my object graph in our design tool and then draw a few boxes on top of it and claim that those were my aggregates.
I most often used eager load inside the aggregates and lazy load between aggregates, in order to avoid fetching the entire database when one aggregate instance was loaded.

This had a very nice “OOP” feel to it, I was working with objects and associations and I could ignore that there even was a database involved.

My “Repositories” were mere windows into my object graph. I could ask a repository to give me one or more aggregate roots, and from those objects I could pretty much navigate to any other object in the graph due to the spider web nature of the aggregate graph.

[Figure: repository-window]

The advantage of this approach is that it is easy to understand; you design your domain model just like any other class model.
It also works very well with O/R mappers; features like Lazy Load and Dirty Tracking make it all work for you.

However, there are a few problems with this approach too.
Firstly, Lazy Load in O/R mappers is an implicit feature; there is no way for a developer to know, just by reading the code, at what point a roundtrip to the database will be triggered.
It always looks like you are traversing a fully loaded object graph while you are in fact not.
This often leads to severe performance problems if your development team doesn’t fully understand this.

I have seen reports on this kind of domain model where the implicit nature of Lazy Load has caused some 700 round trips to the database for a single web page.

This is what you get when you try to solve an explicit problem in an implicit way.

If you are going to use Lazy Load, make sure your team understands how it works and where you use it.
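The effect can be simulated without an O/R mapper. In the sketch below, FakeDatabase, its RoundTrips counter and the Order class are hypothetical; Lazy<T> from .NET 4 plays the role of the mapper’s lazy load proxy.

```csharp
using System;
using System.Collections.Generic;

// Stands in for the database and counts how often it is hit.
public static class FakeDatabase
{
    public static int RoundTrips;

    public static string LoadCustomerName(int customerId)
    {
        RoundTrips++;
        return "Customer " + customerId;
    }
}

public class Order
{
    private readonly Lazy<string> customerName;

    public Order(int customerId)
    {
        customerName = new Lazy<string>(
            () => FakeDatabase.LoadCustomerName(customerId));
    }

    // Looks like a plain property, but the first read triggers a "query".
    public string CustomerName
    {
        get { return customerName.Value; }
    }
}

public static class Program
{
    public static void Main()
    {
        var orders = new List<Order>();
        for (var id = 1; id <= 100; id++)
            orders.Add(new Order(id));

        // Innocent looking loop: nothing here hints at database access.
        foreach (var order in orders)
            Console.WriteLine(order.CustomerName);

        Console.WriteLine(FakeDatabase.RoundTrips); // prints 100
    }
}
```

Nothing in the loop tells the reader that each iteration costs a round trip, which is exactly the problem with implicit loading.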

Another problem with this approach arises when you need to fill your entities with data from multiple sources.
Many of the applications I build nowadays rely on data from multiple sources; it could be a combination of services and internal databases.

When using Lazy Load to get related aggregates, there is no natural point where you can trigger calls to the other data sources and fill additional properties.
You will most likely have to hook into your O/R mapper in order to intercept a lazy load and call the services from there.

Nowadays, I mostly use the second approach, Aggregate Documents.

Aggregate Document

The Aggregate Document approach is much more explicit in its design.
For example, if you want to find the orders for a specific customer, you don’t navigate the “Orders” collection of “Customer”; instead you call a “FindOrdersByCustomer” query on the “OrderRepository”.

While I do agree that this looks less object oriented than the first approach, this allows developers to reason about the code in a different way.
They can see important design decisions and hopefully avoid pitfalls like ripple loading.
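A minimal sketch of such an explicit query could look like this. The repository below is backed by an in-memory list purely so that the example runs on its own; in a real application the method would issue a database query, and the member names other than “FindOrdersByCustomer” and “OrderRepository” are assumptions.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class Order
{
    public int Id { get; set; }
    public int CustomerId { get; set; }
}

public class OrderRepository
{
    // Stands in for the database in this sketch.
    private readonly List<Order> orders;

    public OrderRepository(IEnumerable<Order> orders)
    {
        this.orders = orders.ToList();
    }

    // The data access is visible at the call site: one call, one query.
    public IList<Order> FindOrdersByCustomer(int customerId)
    {
        return orders.Where(o => o.CustomerId == customerId).ToList();
    }
}

public static class Program
{
    public static void Main()
    {
        var repository = new OrderRepository(new[]
        {
            new Order { Id = 1, CustomerId = 10 },
            new Order { Id = 2, CustomerId = 20 },
            new Order { Id = 3, CustomerId = 10 },
        });

        foreach (var order in repository.FindOrdersByCustomer(10))
            Console.WriteLine(order.Id); // prints 1, then 3
    }
}
```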

Another benefit is that since you only work with islands of data, it becomes much easier to aggregate data from multiple sources.
You can simply let your repositories aggregate the data into your entities.
(Whether you do this inside the actual repository, or let the repository use some data access class that does it, is up to you.)

[Figure: repo-prism]

You don’t have to hook into any O/RM infrastructure since you no longer rely on lazy load between aggregates.
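Here is a small sketch of a repository that fills an entity from two sources. The interfaces IOrderDataAccess and IShippingService, the ShippingStatus property and the in-memory fakes are all hypothetical and only exist to make the example self-contained.

```csharp
using System;
using System.Collections.Generic;

public class Order
{
    public int Id { get; set; }
    public decimal Total { get; set; }
    public string ShippingStatus { get; set; } // filled from an external service
}

// Hypothetical abstractions over the two data sources.
public interface IOrderDataAccess { Order Load(int orderId); }
public interface IShippingService { string GetStatus(int orderId); }

public class OrderRepository
{
    private readonly IOrderDataAccess dataAccess;
    private readonly IShippingService shippingService;

    public OrderRepository(IOrderDataAccess dataAccess, IShippingService shippingService)
    {
        this.dataAccess = dataAccess;
        this.shippingService = shippingService;
    }

    // The one explicit place where both sources are combined into the entity.
    public Order GetById(int orderId)
    {
        var order = dataAccess.Load(orderId);
        if (order != null)
            order.ShippingStatus = shippingService.GetStatus(orderId);
        return order;
    }
}

// In-memory fakes so the sketch runs on its own.
public class InMemoryOrderDataAccess : IOrderDataAccess
{
    private readonly Dictionary<int, decimal> totals =
        new Dictionary<int, decimal> { { 1, 250m } };

    public Order Load(int orderId)
    {
        decimal total;
        return totals.TryGetValue(orderId, out total)
            ? new Order { Id = orderId, Total = total }
            : null;
    }
}

public class FakeShippingService : IShippingService
{
    public string GetStatus(int orderId) { return "Shipped"; }
}

public static class Program
{
    public static void Main()
    {
        var repository = new OrderRepository(
            new InMemoryOrderDataAccess(), new FakeShippingService());

        var order = repository.GetById(1);
        Console.WriteLine(order.ShippingStatus); // prints "Shipped"
    }
}
```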

Personally I use eager load inside my aggregates, that is, I fetch “Order” and “Order Detail” together as a whole.
A side effect of this is that since I don’t use Lazy Load between aggregates and don’t use Lazy Load inside my aggregates, my need for O/R mapping frameworks drops.
I can apply this design without using a full-fledged O/R mapper framework.
I’m not saying that you should avoid O/R mapping, just that it is much easier to apply this pattern if you can’t use an O/R mapper for some reason.

This also makes it easier to expose your domain model in an SOA environment.
You can easily expose your entities or DTO versions of them in a service.

Lazy Load and services don’t play that well together.

It may look like I dislike the first approach, but this is not the case. I may very well consider it in a smaller project where there is just one data source and where the development team is experienced with O/R mapping.
You can also create hybrids of the two approaches.
For example, in Jimmy Nilsson’s book “Applying Domain-Driven Design and Patterns” there are examples where an “Order” aggregate has a direct relation to the “Product” aggregate while the same “Order” aggregate uses snapshots instead of direct references to the “Customer” aggregate.

Snapshots also come with the benefit of allowing you to store historical data.
The snapshot can for example store both the CustomerId and the name of the customer at the time the order was placed.

That’s all for now.

//Roger

Entity Framework 4 – Entity Dependency Injection

When dealing with a fat domain model, there is often a need to be able to inject different services into your entities.
For example, you might want to inject a domain service like “ITaxCalculatorService” or an infrastructure service like “IEmailNotificationService” into your entities.

If we rely completely on eager loading, this is not a big problem; we can simply let our repositories iterate over the entities once we have fetched them from the DB and inject our services.
But when it comes to Lazy Load, we can no longer do this. In this case we need to get notified by the O/R mapper when an entity has been materialized so that we can inject our services into it.

If you are aiming to use Entity Framework 4 once it is released, you can accomplish this with the following code snippet:


..inside your own EF container class..

// requires "using System.ComponentModel;" for CollectionChangeEventArgs

public MyContext()
    : base("name=MyContext", "MyModelContainer")
{
    ...
    ObjectStateManager.ObjectStateManagerChanged +=
        ObjectStateManagerChanged;
}

// this handler gets called each time the
// container's state manager is changed
void ObjectStateManagerChanged(object sender,
                               CollectionChangeEventArgs e)
{
    // we are only interested in entities that
    // have been added to the state manager
    if (e.Action != CollectionChangeAction.Add)
        return;

    var state = ObjectStateManager
                    .GetObjectStateEntry(e.Element).State;

    // we are only interested in entities that
    // are unchanged (that is; loaded from DB)
    if (state != System.Data.EntityState.Unchanged)
        return;

    OnEntityMaterialized(e.Element);
}

// this method gets called each time
// an entity has been materialized
private void OnEntityMaterialized(object entity)
{
    var order = entity as Order;
    if (order != null)
    {
        // use property injection to assign
        // a tax calculator service to the order
        order.TaxCalculatorService =
            SomeDIContainer.GetObject<ITaxCalculatorService>();
    }
}

The above is a very naïve example, but it does show how you can catch the materialization of a specific entity type and then configure that entity.

This allows us to add complex domain logic to our entities.
We can for example call a method like: “order.CalculateTotals()” where the CalculateTotals method now uses the ITaxCalculatorService.
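To make that concrete, here is a minimal sketch of what the entity side might look like. Only the names Order, TaxCalculatorService, ITaxCalculatorService and CalculateTotals come from the text above; the CalculateTax method, the NetAmount, Tax and Total properties and the FlatRateTaxCalculator class are assumptions made for this example.

```csharp
using System;

public interface ITaxCalculatorService
{
    decimal CalculateTax(decimal amount);
}

// Hypothetical implementation; a real one would hold actual tax rules.
public class FlatRateTaxCalculator : ITaxCalculatorService
{
    private readonly decimal rate;

    public FlatRateTaxCalculator(decimal rate)
    {
        this.rate = rate;
    }

    public decimal CalculateTax(decimal amount)
    {
        return amount * rate;
    }
}

public class Order
{
    // Assigned through property injection when the entity is materialized.
    public ITaxCalculatorService TaxCalculatorService { get; set; }

    public decimal NetAmount { get; set; }
    public decimal Tax { get; private set; }
    public decimal Total { get; private set; }

    public void CalculateTotals()
    {
        if (TaxCalculatorService == null)
            throw new InvalidOperationException("No tax calculator has been injected.");

        Tax = TaxCalculatorService.CalculateTax(NetAmount);
        Total = NetAmount + Tax;
    }
}

public static class Program
{
    public static void Main()
    {
        // In the EF scenario the service would be assigned by the context.
        var order = new Order
        {
            NetAmount = 100m,
            TaxCalculatorService = new FlatRateTaxCalculator(0.25m)
        };

        order.CalculateTotals();
        Console.WriteLine(order.Total == 125m); // prints "True"
    }
}
```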

HTH.

//Roger

Entity Framework 4 – “Almost” POCO

This is a short rant..

I have been very impressed with EF4 so far, but I’ve now found out that EF4 will NOT support enums.
I find this very strange; I can’t see how Microsoft can claim POCO support and not support one of the most common code constructs.

More info here:

http://social.msdn.microsoft.com/Forums/en-US/adonetefx/thread/7659feab-d348-4367-b2cd-0456b20262fe

Someone might claim that you can create a private property containing the mapped integer value and then make a public property exposing the enum.
But this comes with two major drawbacks:

1) You can’t create Linq queries that are executed at DB level if you use unmapped properties.
The Linq query would have to use the integer property, thus losing its semantics.

2) That is not POCO, that is mapper requirements leaking all over the place.

Entity Framework 4 – Using Eager Loading

When Linq To Sql was released we were told that it supported eager loading.
That was a bit misleading: it did allow us to fetch the data we wanted up front, but it did so by issuing one database query per object in the result set.
That is, one query per collection per object, which is a complete performance nightmare (ripple loading).

Now in Entity Framework 4, we can actually do true eager loading.
EF4 will issue a single query that fetches the data for all the objects in a graph.
This has been possible in other mappers for a long time, but I still think it is awesome that Microsoft has finally listened to the community and created a framework that, from what I’ve seen so far, does exactly what we want.

So how do you use eager loading in EF4?

Eager loading is activated by calling “ObjectSet[of T].Include(“Details.Product”)”, that is, a dot separated property path.
You can also call Include multiple times if you want to load different paths in the same query.

There are also a few attempts out in the blog world to try to make it easier to deal with eager loading, e.g. by trying to remove the untyped string and use lambda expressions instead.

I personally don’t like the lambda approach since you can’t traverse a collection property that way; there is no way to write “Orders.Details.Product” as a short and simple lambda.

My own take on this is to use extension methods instead.
I always use eager loading on my aggregates, so I want a simple way to tell my EF context to add the load spans for my aggregates when I issue a query.
(Aggregates are about consistency, and Lazy Load causes consistency issues within the aggregate, so I try to avoid that)

Here is how I create my extension methods for loading complete aggregates:

public static class ContextExtensions
{
  public static ObjectQuery<Order> 
           AsOrderAggregate(this ObjectSet<Order> self)
  {
    return self
        .Include("Details.ProductSnapshot")
        .Include("CustomerSnapshot");
  }
}

This makes it possible to use the load spans directly on my context without adding anything special to the context itself.
(You can of course add this very same method inside your context if you want, I simply like small interfaces that I can extend from the outside)

This way, you can now issue a query using load spans like this:

var orders = from order in context.OrderSet.AsOrderAggregate()
             select order;

And if you want to make a projection query you can simply drop the “AsOrderAggregate” and fetch what you want.

HTH.

//Roger

Entity Framework 4 – Managing inverse properties

[EDIT]
I was wrong!

It is perfectly possible to do one directional associations in EF4 and POCO mode.
You simply have to manually remove the “ <NavigationProperty ..” tags from your mapping files.

Awesome work EF4 design team :-)

[/EDIT]

Original post:
To my surprise I’ve found out that Entity Framework 4 doesn’t support one directional collection properties.
That is, if you have the entity “Order” which has a “Details” property, then the “OrderDetail” entity _must_ have an “Order” property.

To make things worse, those properties do not have any auto sync mechanism if you are using POCO entities.
They could very well have supported this by adding an inverse management aspect to their run-time proxies that they use for lazy loading in POCO.

While I do think this is a lacking feature, it is not really a show stopper for me.
We can work around this problem by applying the “Law of Demeter” principle.

We can design our entities like this:

OrderDetail:

public class OrderDetail
{
    ...properties...

    [Obsolete("For EF4 Only!",true)]
    public OrderDetail()
    { }

    public OrderDetail(Order order)
    {
        this.Order = order;
    }
}

Order:

public class Order
{
    ...properties...

    public void AddProduct(Product product,
                                   double quantity,
                                   double itemPrice)
    {
        var detail = new OrderDetail(this)
        {
            // offtopic: you might want to associate
            // the product via ID or via a snapshot instead,
            // depending on how you deal with cross aggregate references
            Product = product,

            Quantity = quantity,
            ItemPrice = itemPrice,
        };

        Details.Add(detail);
    }
}

This way, we get a whole bunch of positive effects:

We solve the problem with inverse properties, inverse management is handled inside the “AddProduct” method in the order.

We get a nice way to handle consistency in our aggregate roots; the methods can easily update any accumulated values in the order or change the status of the order when we add or remove order details.
This is what aggregates in DDD are all about, so you should probably do this anyway, regardless of whether EF4 supports inverse property management or not.

We add domain semantics to the model; “AddProduct” or “ChangeQuantity” have meaning in our model, and thus we get a more self-explanatory model.
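As a sketch of the consistency point, the aggregate root below keeps an accumulated Total in sync whenever details are added or changed. It is a simplified variation of the classes above and makes a few assumptions: the product is reduced to a name, and the Total property, the ChangeQuantity signature and RecalculateTotal are invented for the example.

```csharp
using System;
using System.Collections.Generic;

public class OrderDetail
{
    public OrderDetail(Order order, string productName, double quantity, double itemPrice)
    {
        Order = order;
        ProductName = productName;
        Quantity = quantity;
        ItemPrice = itemPrice;
    }

    public Order Order { get; private set; }
    public string ProductName { get; private set; }
    public double Quantity { get; internal set; }
    public double ItemPrice { get; private set; }
}

public class Order
{
    private readonly List<OrderDetail> details = new List<OrderDetail>();

    public IEnumerable<OrderDetail> Details { get { return details; } }
    public double Total { get; private set; }

    // Sets the inverse reference and keeps the total consistent.
    public OrderDetail AddProduct(string productName, double quantity, double itemPrice)
    {
        var detail = new OrderDetail(this, productName, quantity, itemPrice);
        details.Add(detail);
        RecalculateTotal();
        return detail;
    }

    public void ChangeQuantity(OrderDetail detail, double quantity)
    {
        if (!details.Contains(detail))
            throw new ArgumentException("The detail does not belong to this order.", "detail");

        detail.Quantity = quantity;
        RecalculateTotal();
    }

    private void RecalculateTotal()
    {
        double total = 0;
        foreach (var detail in details)
            total += detail.Quantity * detail.ItemPrice;
        Total = total;
    }
}

public static class Program
{
    public static void Main()
    {
        var order = new Order();
        var keyboard = order.AddProduct("Keyboard", 2, 10.0);
        order.AddProduct("Mouse", 1, 5.0);

        order.ChangeQuantity(keyboard, 3);
        Console.WriteLine(order.Total); // prints 35
    }
}
```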

This is a quite nice example of how lacking framework features can force you to write better code.
If we did have support for inverse property management, we might get sloppy and just go the path of least resistance.

//Roger

Entity Framework 4 – Where Entity.Id in Array

Here is a little trick if you want to issue a query to the database and select a batch of entities by ID only:

//assemble an array of ID values
int[] customerIds = new int[] { 1, 2, 3 };

var customers = from customer in context.CustomerSet
                where customerIds.Contains(customer.Id)
                select customer;

This will make Entity Framework issue an SQL query with the where clause “where customerId in (1,2,3)”, and thus, you can batch select specific entities with a single query.
I will get back to this idea in a later post because this is related to how I design my entities and aggregates in DDD.
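The shape of the query can be tried out against in-memory data as well. The sketch below uses AsQueryable() on an array instead of an EF context, so no SQL is generated here; the Customer class and the sample data are made up for the example.

```csharp
using System;
using System.Linq;

public class Customer
{
    public int Id { get; set; }
    public string Name { get; set; }
}

public static class Program
{
    public static void Main()
    {
        // Stands in for context.CustomerSet in this sketch.
        var customerSet = new[]
        {
            new Customer { Id = 1, Name = "A" },
            new Customer { Id = 2, Name = "B" },
            new Customer { Id = 3, Name = "C" },
        }.AsQueryable();

        int[] customerIds = { 1, 3 };

        var customers = from customer in customerSet
                        where customerIds.Contains(customer.Id)
                        select customer;

        foreach (var customer in customers)
            Console.WriteLine(customer.Name); // prints "A", then "C"
    }
}
```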

//Roger