A few days ago I finally discovered why rake db:migrate:redo consistently angers me nearly as much as watching Paula Dean deep fry the vegetable kingdom. As any devoted connoisseur of the db rake tasks in Rails knows, db:migrate:redo always leaves your schema.rb file in the wrong state. The reason, as mentioned in our standup blog, is that rake will only invoke a given task once in a particular run.
To trivially test this try running a single task twice:
rake db:rollback db:rollback
You’ll find that your database only rolls back one migration. Now, you can set the STEP environment variable when calling db:rollback, but this is, as I said, a trivial example. It gets worse.
Take a look at the implementation of the db:migrate:redo task. The part we’re interested in looks like this:
namespace :migrate do task :redo => :environment do ... Rake::Task["db:rollback"].invoke Rake::Task["db:migrate"].invoke end end
That looks fine; db:migrate:redo just verifies that your new migration will properly run down and up without blowing up. Sweet.
But, here’s what db:migrate looks like:
task :migrate => :environment do # Do migratey stuff Rake::Task["db:schema:dump"].invoke if ActiveRecord::Base.schema_format == :ruby end
task :rollback => :environment do # Do rollbacky stuff Rake::Task["db:schema:dump"].invoke if ActiveRecord::Base.schema_format == :ruby end
Both db:migrate and db:rollback dump the schema after they run, as they should. If you were to migrate or rollback your database and not dump the schema, then your schema would be in an invalid state. So, of course you can see where this is going, when you run db:migrate:redo the task performs the rollback, dumps the schema, performs the migrate, and then doesn’t dump the schema, because that task has already run. Boom, your schema is one migration behind, db:test:prepare loads the invalid schema into your test database, and all your tests fail (or, worse, pass inappropriately)
Now, I assumed this was a bug in Rake, and so I went on a little investigatory safari through the jungles of the Rake code to find it and kill it. I found the culprit, but invoking each task at most one time is, somewhat surprisingly, the expected behavior; it’s tested and everything. Now I can only wonder why. Why prevent invocation of a task more than once in a given rake run? The code contains unrelated guards against circular task dependencies, so that’s not it. Is this an example of overly-speculative defensive coding, or is there an actual use case for which this behavior is desirable? I’d like to hear from anyone who has written tasks that depend on this behavior, as well as anyone who (like me) considers this behavior unexpected and has run into problems because of it.
Assuming no one steps forward with a compelling reason that Rake should behave this way, I’d suggest that this be changed. I could see the value of it (perhaps as a performance optimization?) if rake tasks were guaranteed to not change the state of anything they operate on, or even were guaranteed to be idempotent; but neither is the case. This behavior severely limits the composability of tasks, since a task writer has to know which atomic tasks have run, and avoid any task that might try to run them again.
In the meantime, Rake provides a way to explicitly re-enable tasks that have run once, but it doesn’t seem to work. The db:schema:dump definition looks like this:
namespace :schema do task :dump => :environment do # Do dumpy stuff Rake::Task["db:schema:dump"].reenable end end
That #reenable call is meant to tell the task “hey, task, you can run again.” I tried calling #reenable on the db:schema:dump task inside the db:migrate and db:rollback tasks as well, but without any luck.