We're in the middle of proof-of-concepting some stuff at the moment and decided to play around with writing a business layer and data layer which is totally implementation and data-agnostic. In some implementations they will be tightly integrated at the object-level, and in others, not so integrated. Our goal is to be able to run code within an environment that can talk to a database directly, and then ship it somewhere else that might have to do it via XML web services, .Net Remoting or whatever. We set ourselves the task of creating 4 competing implementations for creating a 'person' in our database, wiring them all through the same common business layer methods.
Initially, we wanted to focus on the implementations that can interface directly with the data store - the SQL database. Therefore we've decided to do the following implementations:
ADO.Net - using stored procedure wrapping
Linq-To-SQL - (argh! - recently retired - bad move Microsoft)
nHibernate
Entity Framework
Initially, we thought that the fact that our 'person' is represented by two records, as a one-to-one relationship, was going to be the biggest challenge for the data layer parts of the Business Layers. We were wrong. Our 'person' ID is assigned to the record by incrementing an ID held in another table. We have a stored procedure which does this in a safe manner - which uses an integer output parameter to pass back your new record's ID. You set the ID into the record(s), and insert into the table.
Okay, okay, I hear you scream, this is dreadful design. Yes, we know that, but we're in the position that our DB is just so vast that it cannot easily be changed over to use identity columns - so we need to be able to make our data layer fit. And in any case, any decent data layer solution should be able to work with any database design - not just best-practice. To enforce anything else is just naive, given that most databases will predate the code used to access them.
Anyway, the upshot is that we cannot use the very helpful OR/M auto-insert or pure-SP insert options, but have something in between.
There's also another problem - there are three columns in one of our records that are bound to DB-driven defaults. There's a couple of DateTime values that are used to timestamp the insert and modification (a common pattern), and another that is an integer state value. So, our datalayer has to be able to understand that these three columns can be NULL on insert, even though are marked as not-nullable in the schema, and that the values must be read back after the insert/update takes place.
Linq-To-Sql
I'm impressed with Linq-To-Sql - it's simple to use and understand, it's easy to intermingle with straight ADO.Net if you need to, it's easy to override default behaviours and, despite my initial misgivings, it's fast. It also makes it's MetaModel easily accessible - that is the data which tells you which properties are mapped to which columns, and which types are mapped to which tables - which I have used to great effect in another project. Non-disclosure prevents me from specific details here I'm afraid!
Because of the very helpful number of hooks that it provides - you can step in and execute custom code either when a record is being validated (partial method 'OnValidate' generated by the designer for your table-mapped class), or you can implement the partial method InsertRecordType(RecordType instance). There are similar hooks for Delete and Update as well.
Also, bringing a stored procedure into your code is simple, just drag and drop it from the Server Explorer on to the design surface and, hey presto, you get a nice friendly .Net method to do it for you. If you want to map the return of the SP to an L2S-generated type you can, or you can leave them as low-level data objects as well. I've found a couple of problems with parameter types (ever tried writing a code-generator to wrap SPs? parameters are a nightmare!), but nothing that I couldn't solve. It does handle output parameters nicely, though - turning an 'int OUTPUT' parameter into a 'ref int'.
So, we drag and drop our magic ID-giving SP onto the surface, implement the Person.OnValidate partial method:
//here the auto-generated DataContext type is PersonDataContext
//The SP is generated as a method on that.
partial void OnValidate(System.Data.Linq.ChangeAction action)
{
switch (action)
{
case System.Data.Linq.ChangeAction.Insert:
{
//execute the SP to get the ID
int newid = 0;
new PersonDataContext().GetNextID(ref newid);
ID = newid;
}
}
}
I like this kind of model, also - where validating a record is done inside the object that represents that record.
Now to those auto-assigned columns. On the designer, we can change the behaviour of any of the properties that have been generated for the schema. I select one of the DateTime columns, and in the properties window you see one of the properties will give you what you want in a single-click:
Now, Linq-To-Sql will not try to set that DateTime value when it auto-generates it's insert SQL. To ensure that I get the new value after the operation, I also set the 'Auto-Sync' property to Always.
I do this also for the other two auto-generated values.
Now I can 'new' an instance of this Linq-To-Sql generated type, fill out some of the properties, tell a new Context object to 'InsertOnSubmit' that object, and then call submit changes.
Now that was easy - total time to implement, about an hour with learning some stuff along the way.
Entity Framework
Well, as I said to a colleague this morning - the phrase 'sucks donkey balls' comes to mind.
Let's focus on what it can't do, compared to L2S has let us do:
Stored Procedures
Stored procedures are expected to be for Insert, Update or Delete operations on entities. Beyond that, they are only 'truly' supported (as auto-generated methods) when they return an entity result. If the result is a scalar, then the code is not auto-generated, and you have to resort to using Entity SQL. That is - if my stored procedure does a SELECT of some data that I can map to an entity, then I get an auto-generated method.
So, this means that we cannot simply drag and drop our GetNextID SP into the project. Instead, we have to add it into the project, create a 'Function Import', and then write some Entity SQL, which might as well be ADO.Net:
//PersonEntities is the ObjectContext-inheriting object that was generated
//by the EF custom tool.
//This method is an instance method of that type.
public int GetNextID()
{
//re-use existing connection, opening it if necessary
if (Connection.State == ConnectionState.Closed)
Connection.Open();
//this is annoying - having to use dotted notation to refer to the SP
//why make me do more work, when I've already had to first add the SP,
//then import it into the workspace as well!?
using (EntityCommand c = new EntityCommand("PersonEntities.GetNextID", (EntityConnection)this.Connection))
{
c.CommandType = CommandType.StoredProcedure;
EntityParameter outID = new EntityParameter("ID", DbType.Int32);
outID.Direction = ParameterDirection.Output;
c.Parameters.Add(outID);
int retval = c.ExecuteNonQuery();
return (int)c.Parameters["ID"].Value;
}
}
Okay, I might be able to handle that. I'm not happy about it, but if it can handle the DB-assigned default values, then I might forgive it.
Default Values
You guessed it - it can't. Not only does EF appear to ignore the DB schema on this front (remember, L2S detects the default, and then also gives you the ability to either use it, or force the calling code to supply a value anyway), it doesn't even give you the ability to tell it to use a DB driven-default. As a result, the SQL that EF eventually generates will always be passing explicit values on it's insert statements for those columns which should be auto-assigned by the DB on insert - causing our DB defaults to be ignored.
So, our date/time columns, which are non-nullable, have to be supplied in code. If we just leave them as default-initialised values, they are set to DateTime.MinValue. SQL Server then complains that this value is not supported, of course (out of range for the SQL type). I thought I'd be clever, and flip them to Nullable on the EF designer - thinking that if it gives me the ability to do so, then it must work, right? Wrong - EF then simply complains that I've mapped a nullable property to a non-nullable column! Why give me the ability to set the nullable-status then?
So what do I do - do we add triggers to the DB that replace the insert behaviour just to accommodate the fact that there's going to be some stupid code that's sending in useless defaults for default-driven columns?
Do I change the schema? Do I construct a stored procedure to act as the insert operation for this entity? I'm sure these are all MS-suggested workarounds. But why should any of that be necessary? My data layer should not be dictating the data! It should be able to be moulded and shaped to fit my data, now matter how wacky it is!
So, then, I'm forced to reproduce the same default-value behaviour from the database in my code. This is great - if we decide to change that default at a later date, our Linq-to-Sql and ADO.Net implementations (I dare say nHibernate too - but I wasn't working on that proof of concept!) will continue on happily, while our EF implementation would need to be rebuilt - or we have to abstract away a whole code layer for getting default values for column-values.
Also - EFs half-arsed attempt at providing a nice factory method for constructing new records (by default, on each entity type that is generated, it creates a static 'Create' method, whose parameters reflect what it deems to be its required values, based on those columns in the DB that are not-nullable. As a result, I have three parameters in this method that are actually not required, because they are automatically assigned. I can't even customise this method, and tell EF not to include these parameters! ARGH! - what is the point?
So far so bad.
Insert/Update customisation
Okay, calm down Mr Zoltan...
[Deep breath]
Now on to the last area where I can draw a comparison between Linq-To-Sql and EF. Remember, we need to get into our new record and make sure that we have set the new ID before the insert is completed. Now, because of the above lack of thought from the EF designers, we also have to make sure we set our other columns to values that the DB would otherwise do for us. So, I start looking round for equivalents of the Object.OnValidate, or InsertObject partial methods that Linq to SQL gives us.
Oh dear - there aren't any [sobs].
In fact, there's practically no externally visible overridable functionality that we get from EntityObject (the required base class for all entities. Hmmm... L2S provides it's functionality without imposing any object-level framework on us - another failure for EF, because it means if I want to these types into our own frameworks, I have only got interfaces at my disposal, whereas with L2S I can use anything). One has to ask then - what is the point of EntityObject? What does it give me? Or does it simply exist to allow everything else to work? If so, then we have bad class design.
Sure - we've got all the business-logic hooks that let us validate per-property changes on our entity. Woop-de-doo - no '10/10 - have a cigar' for EF, just a '5/10 - could do better'.
So what is there? Oh there's an Event on our ObjectContext-deriving object: 'SavingChanges'. MSDN says that to hook into this, you do two things (ARGH again, two steps to achieve one thing!):
//implement the OnContextCreated partial method to add an event handler.
//Personally, I HATE this programming model. I don't see why an object
//should ever be subscribing to one of it's own events - totally contradictory.
//This should be a virtual method that can be overriden.
partial void OnContextCreated()
{
this.SavingChanges += new EventHandler(PersonEntities_SavingChanges);
}
//then the event handler. It'll pass through the changes, right?
//wrong...
void PersonEntities_SavingChanges(object sender, EventArgs e)
{
//hmm... where ARE those changes!?
}
Where are those changes - oh, there's another object that you have to query to get them. And you are forced to get the changes as an enumerable, grouped together by type.
Let's look at my implementation of this event handler:
void PersonEntities_SavingChanges(object sender, EventArgs e)
{
//get all the added objects for this save
var inserted = ObjectStateManager.GetObjectStateEntries(EntityState.Added);
Person person = null;
foreach (var obj in inserted)
{
//have to filter out only the 'people'
//hmm, can see this becoming a huge if/else if
//branch in the future, after we've got a bunch of
//entities in this project.
if ((person = obj.Entity as Person) != null)
{
//set the ID - calling our eSQL implementation of the SP.
//for clarity in this blog post, I've turned the ref
//parameter to a return value,
person.ID = GetNextID();
//now set the other columns that are not nullable, but
//have to be set because EF doesn't understand default-valued
//columns.
person.DateCreated = person.DateModified = DateTime.Now;
person.Status = 1;
}
}
}
Wow, what a load of rubbish! What happened to the object-level and propery-level validation that the L2S designers thought to put in? At the very least, I would have expected a method on the context that let me customise the Insert for each entity type that's been generated!
Instead I'm reduced to iterating a collection of ObjectStateEntry objects, where I have to do a cast to determine the type of pre-insert validation and modification that I have to perform. Quickly, this code will become unmaintainable and unreadable.
Another nail in the coffin for EF, then, and I'm afraid it's the final one for me.
EF Failures - in a roll-up
At first I had high hopes for EF - I found the process of building an 'entity' whose properties are mapped to two tables very simple. At this point it was ahead of L2S, because with L2S you are forced to separate your entities at the table level, meaning that they're not really entities as such. However, L2S makes it so easy to construct your related records (and indeed validate that those required related records have been newed when the insert operation occurs) that I'm not too bothered by it. I figure it would be quite easy to build a compound entity out of your L2S objects anyway.
But then it was downhill from there. It's as if EF has it's own view of what a database is, and if you've not designed your DB in that way, then you can go hang (I actually have a horrible feeling that all full-blown OR/Ms have the same narrow view, but we shall see). The word that crops up time and again with EF is naivety:
Naivety leads the designer to think that SPs are used mostly for CRUD, in fact they are used as often for other purposes, and sometimes you want to be able to call them from code. It's not so unreasonable!
Naivety leads a designer to ignore default values assigned by the DB for a column - and not consider that a non-nullable column might still be able to be null on the insert statement.
Naivety in the object framework that gives us the bare minimum room for maneuver when it comes to customising the activities within the EF model itself. Naivety also that enforces a base class that provides no real benefit to the consumer of the framework other than what you already expect of it. Consider the number of overridable methods in System.Windows.Form, or System.Web.Form - that's what I expect from a base class in a framework that wants to be as fundamental as EF pretends to be.
Even the designer is lame - tried removing a SP that you no longer need to map? You have to drop out of the designer, and hack the XML to do it. Okay, it's simple XML, but it's simply not acceptable to offer me a unified development experience, and then fail to offer fundamental features within it.
Conclusion
With EF, Microsoft has set it's goals high, which is no bad thing, but at the same time it has simply not managed the community's expectations of what it was going to provide. In my opinion EFv1 should have been straight-to-video as a separate package and gone through another few versions before MS took so bold a statement as to have integrated into the Framework. There are better frameworks already produced by MS that deserve candidacy for such integration - like those in the Enterprise Framework - that will forever be ignored. To push businesses towards a half-baked solution such as this which, as far as I can see, is going to have to change quite fundamentally in order to fill the huge chasms of functionality that is missing (but present in its little brother and, of course present in nHibernate), is simply reckless.
To invest in EF now is to acknowledge that a large slice of your code is going to have to change as the framework evolves to accommodate real-world database usage - can you really afford to do that?
If you're working on a new project that needs a brand new database, then by all means consider EF. In the meantime, stick to what you're already doing. L2S is an option, also, but unless MS actually deign to acknowledge that it is in fact a superior product that has a lot less distance to go in order to be a full-blown OR/M that EF pretends to be, then you are faced with the fact that it's going to remain as it currently is for a while yet.
I'm sure that all these shortcomings and more will be fixed and improved upon in future versions of EF, but the fact is it's not really fit for purpose in its current form, and as a result it should never have been let out of the stable. At the moment, instead of a race-horse it's just a nag that's only just fit for the glue-factory.
So I, for one, am not going to be using this near-spent Pritt-Stick for a totally fundamental part of our business' framework, it's looking more likely that we're going to go with ADO.Net and possibly some L2S as well (nH might creep in - but development is slow on that one - all that manual XML hacking is a little bit cumbersome for my liking) - because it's light and doesn't force us to re-design everything else around it - but we know already that it'll be as strong as Cyanoacrylate.
I might look at EF again in 12 months, if I remember.
Of course, I could set myself a reminder.
Right now I just can't be bothered.