Sorrow and Elation – Why Reflection Isn't Always Your Friend

2 Comments

Over the past two weeks I've begun moving a lot of my hosting infrastructure onto Microsoft Azure. This has consisted of all sorts of configuration – virtual networks, VPN's, backups, servers, databases and sites. One of my sites uses Code First Migrations at it's core, and during the migration simply nothing would work. After much pain and frustration I made the worst mental leap a developer can make: I came to the simple conclusion that is was everyone else's fault. Like most of these kinds of problems, I was to blame all along.

What follows is a tale of how even though managed languages like c# are awesome 99% of the time at telling you what's wrong – the other times you're sweet out of luck and need to fall back on stabs in the dark.

The Exception that started it all

When using Code First Migrations in a website where I'm not running migrations through the migrate.exe, I often call the database initialiser on Application_Start(). There's a few ways to do this but it often looks something like this.

public class MvcApplication : HttpApplication
{
    protected void Application_Start()
    {
        DataContextInitializer.Initialize();
    }
}

public class DataContextInitializer
{
    public static void Initialize()
    {
        var initializeMigrations = new MigrateDatabaseToLatestVersion<MyDbContext, Migrations.Configuration>();
        initializeMigrations.InitializeDatabase(DataContext.CurrentContext);
    }
}

This works fine 99% of the time (Side note: if you want it to still throw your friendly error page if it fails this should really go on the first Begin_Request() so your application can still start).

If you've ever used Code First Migrations before (with Automatic Migrations turned off) you'll know that when this doesn't work as expected it'll explode in one of a few ways:

  1. Throw a DbEntityValidationException because your database and code first model are different and EF needs a new migration to tell it what to do. This usually contains information about the model properties that need help.
  2. Throw a AutomationMigrationDisabledException because it can't figure out what to do and thinks that giving it power to make decisions may help.

The latter is one that I don't see very often as usually EF is able to tell me what field is wrong in the exception and therefore it throws the DbEntityValidationException with a whole bunch of model property information carried along for the ride to help you find the problem and fix it.

"To the Cloud!"

As my trial migrations to Azure wasn't a paying job (read: being carried out gung-ho after midnight), what I was setting out to do was a combination of things all at the same time ("what could possibly go wrong!").

  • Deploying a website to the new Windows Server 2012 R2 and IIS 8.5.
    • Setting up holding DNS records.
    • Configuring IIS 8.5 and all the new toys.
    • Installing and configuring Web Deploy.
    • Getting deployments from my build server working seamlessly.
  • Moving my database to SQL Azure.
    • Exporting a database from a working copy as a BACPAC file.
    • Uploading this file to Azure Blob Storage.
    • Importing this into a new database server in Azure.

All of the above would become vectors and dark alleys that nearly lead to my insanity…

"It's Alive… ALIIIIIVE!"

So I've completed everything and everything seems to be working on it's own.

I can connect to the SQL database and it all looks good.

I've tested connectivity to the SQL database from the web server using the awesome UDL SQL connectivity trick.

And then I attempt to load the website.

It explodes…

Exception information: 
    Exception type: AutomaticMigrationsDisabledException 
    Exception message: Unable to update database to match the current model because there are pending changes and automatic migration is disabled. Either write the pending model changes to a code-based migration or enable automatic migration. Set DbMigrationsConfiguration.AutomaticMigrationsEnabled to true to enable automatic migration.
   at System.Data.Entity.Migrations.DbMigrator.Upgrade(IEnumerable`1 pendingMigrations, String targetMigrationId, String lastMigrationId)
   at System.Data.Entity.MigrateDatabaseToLatestVersion`2.InitializeDatabase(TContext context)

Of course I read this exception and thought

"Ono, something is wrong with my code first models, everything was fine before using SQL Azure"

Of course I was oh-so wrong.

But this didn't stop me from Googling for about an hour or two to figure out what was wrong.

And then Googling some more…

(As Scott Hanselman would say, of course this was "Googling on Bing").

The investigation went into full swing

imageTo be honest I searched and battled this for about 6 hours.

And then went to bed and did it all over again the next day.

What on Earth could be wrong?

Things I tried:

Using a legacy "SQL Azure" SQL provider for EF

At first I'd jumped to the conclusion that his was all SQL Azure's fault. I felt that this was the main change in my application so it had to be some magic incantation to get EF Code First to work with Azure.

So I looked up the Azure SQL provider I'd seen floating round the net before:

internal sealed class Configuration : DbMigrationsConfiguration<MyContext>
{
    public Configuration()
    {
        SetSqlGenerator("System.Data.SqlClient", new AzureSqlGenerator());
    }
    protected override void Seed(MyContext context)
    {
    }
}
public class AzureSqlGenerator : SqlServerMigrationSqlGenerator
{
    protected override void Generate(CreateTableOperation createTableOperation)
    {
        if ((createTableOperation.PrimaryKey != null)
            && !createTableOperation.PrimaryKey.IsClustered)
        {
            createTableOperation.PrimaryKey.IsClustered = true;
        }
        base.Generate(createTableOperation);
    }
}

I didn't really stop to think that the above code really only helps when actually running a migration, and my exception was not from actually doing this at all.

Comparing the database schemas

After adding my local IP address to the allowed IP addresses in SQL Azure, I used the SQL Schema compare tools in Visual Studio to see if there was a difference.

image

There was no differences between the schema of my local backup and the SQL Azure DB. I was thinking maybe there would be some clustered index differences

Writing out the EDMX model to a file.

When Entity Framework Code First compares itself between your database and your model, it's really creating an in-memory EDMX model from your Code First Classes and then comparing it with the serialised model stored in the latest row in your __MigrationHistory table.

You can reproduce this by saving both to EDMX files and comparing their output.

For your code model:

using (var context = new MyDbContext())
{
    using (var writer = new XmlTextWriter(@"c:\temp\CurrentContextAsEDMX.edmx", Encoding.Default))
    {
        EdmxWriter.WriteEdmx(context, writer);
    }
}

And then from your DB stored model can also be seen by decompressing and serialising the latest model row stored in the _MigrationHistory table in your database.

First to get this out of SQL:

declare @modelAsBinary varbinary(max);
select @modelAsBinary = Model from __MigrationHistory
where migrationId = '201311121254161_LatestMigrationName'
select cast('' as xml).value('xs:base64Binary(sql:variable("@modelAsBinary"))',
   
'varchar(max)') as migrationContent

Then to decode it into an EDMX:

var base64ModelString = "H4sIAAAAAAAEAO09y24jOZL3B.......eYfBJ1mBxirXDM+Z//B3qMS2h/eAEA";
var compressedModel = Convert.FromBase64String(base64ModelString);
using (var memorySteam = new MemoryStream(compressedModel))
{
    using (var gzipStream = new GZipStream(memorySteam, CompressionMode.Decompress))
    {
        using (var stream = new MemoryStream())
        {
            gzipStream.CopyTo(stream);
            File.WriteAllText(@"C:\code\mymodel.edmx", 
                Encoding.UTF8.GetString(stream.ToArray()));
        }
    }
}

There was no difference between the local (on my machine) EDMX model generated from the context and the one in the database.

Take note of this: My local DbContext running in the bin directory of my local machine.

Finally a breakthrough

The last thing that I did was actually to turn on Automatic Migrations to see what EF thought needed changing.

I figured if I let it wreak havoc, at least I could see what it thought the end state of my database should be.

"What could possibly be out of wack?"

And this is where the plot thickened.

The schema changes that Code First Automation Migrations completed were all wrong.

Any model properties that had the [JsonIgnore] attribute from JSON.Net was losing any of it's Entity Framework attributes as well.

For some reason Code First had a problem with JSON.Net

This:

[Required]
[Column("SKU")]
[JsonIgnore]
public string ProductSKU { get; set; }

Was resulting in this:

image

On these affected model classes Code First also ignored any other properties that existed, such as the [Column] attribute, the [ForeignKey] attribute and any others that just happened to be in my model on the same properties as the [JsonIgnore].

After seeing this I was upset that I'd have to rejig my models to remove JSON.net, but I thought removing JSON attributes were a small price to pay to get closer to the solution.

So I removed the attributes and redeployed.

My application was at last happy…

And then it all fell into place…

I loaded my site and the first page I visited that actually did use JSON.Net failed because it couldn't load the JSON binary.

I'd found my real problem.

For some reason I hadn't deployed the JSON.Net binaries when I'd repointed my build server to Azure. For the life of me I didn't know why my build server had changed its mind when copying binaries to the bin directory of my published website, but to be honest I didn't care I was so over the moon to have found the problem.

So I turned on Copy Local and then reverted the code changes I'd made to try and fix the issue by removing JSON.Net. I then redeployed my database and all was right with the world.

Root cause analysis

What this was caused by:

I was using the [Attribute] based approach to code first migrations and at the same time I was using attributes from other libraries, more specifically the [JsonIgnore] attribute.

This requires a dependency on the JSON.Net binary.

I didn't have this binary in my website's Bin directory.

So why did I get a AutomationMigrationDisabledException?

Entity Framework, like most libraries that use class or property Attributes, loads Attributes at runtime using what's called a Reflection-Only Context so that it can get all of your model properties when building it's model.

When this fails because one of your attribute's binary dependencies aren't met, all the .Net CLR would return EF was the property name.

This caused EF to constantly think that my model was different as it never saw the attributes I had specifying "Name this model property's column name something else".

EF then thought my table column should be named one thing, and my database already had it named something else.

This caused Entity Framework to show an exception of type AutomationMigrationDisabledException instead of DbEntityValidationException.

Summary

In closing, I've been reminded again that when working with runtime executed parts of .Net you should take exceptions with a grain of salt. Always investigate deeper, turn on fusion logging and generally be suspicious of everything.

In my case this included [Attributes] as when reflection fails it can make your code fail in unexpected or hard to debug ways, similar to when trying to debug Assembly.Load.