Lee Byrd's blog
In my last post, I modified a Mongoid::Document during a migration in order to access fields that were no longer defined in the class. This time I am using the Mongo Ruby driver directly to migrate data.
Since I am using Mongoid in this project, I will use it to get at the Mongo driver it depends on. That saves me from having to give the driver a connection string in my migration. I am also using mongoid_rails_migrations to roll out the change.
Initial Design
class User
  include Mongoid::Document
  field :first_name
  field :last_name
end
New Design
class User
  include Mongoid::Document
  field :name
end
Migration
class MergeUsersFirstAndLastName < Mongoid::Migration
  def self.up
    # get the mongo database instance from the Mongoid::Document
    mongo_db = User.db
    # query the collection for the fields needed for the migration
    user_hashes = mongo_db.collection("users").find({}, :fields => ["first_name", "last_name"])
    user_hashes.each do |user_hash|
      new_name = "#{user_hash['first_name']} #{user_hash['last_name']}"
      # update the new field
      mongo_db.collection("users").update({"_id" => user_hash["_id"]}, {"$set" => {"name" => new_name}})
      # remove old fields from collection
      mongo_db.collection("users").update({"_id" => user_hash["_id"]}, {"$unset" => {"last_name" => 1, "first_name" => 1}})
    end
  end
end
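One case the string interpolation in the migration does not cover is a user document that is missing first_name or last_name: the merged name would then carry a stray leading or trailing space. Here is a minimal sketch of one way to guard against that, in plain Ruby with no database involved. The helper name merged_name is my own invention for illustration and is not part of Mongoid or the Mongo driver.

```ruby
# Hypothetical helper (not part of the original migration): joins the two
# name parts with a single space, skipping parts that are nil or blank.
def merged_name(first_name, last_name)
  [first_name, last_name]
    .map { |part| part.to_s.strip }   # nil becomes "", whitespace is trimmed
    .reject(&:empty?)                 # drop missing parts
    .join(" ")
end

puts merged_name("Lee", "Byrd")
puts merged_name(nil, "Byrd")
```

Inside the migration, this could stand in for the interpolated string when computing new_name, assuming blank names should be skipped rather than preserved.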
When first starting out with MongoDB, it's easy to make the wrong decision about whether to embed a document. Even if you made the correct decision at the time, changing requirements may force you into a migration. So how do you migrate existing data when moving from a standalone document to an embedded document? This is what I came up with.
Initial Data Structure
class User
  include Mongoid::Document
  field :name
  references_many :sales
end
class Sale
  include Mongoid::Document
  field :price, :type => Integer
  referenced_in :user
end
Now with Sale embedded in User
class User
  include Mongoid::Document
  field :name
  embeds_many :sales
end
class Sale
  include Mongoid::Document
  field :price, :type => Integer
  embedded_in :user, :inverse_of => :sales
end
Migrating Sales Data
class EmbedSalesInUsers < Mongoid::Migration
  def self.up
    # pull your existing data into memory
    # consider batching for large data sets
    # Note that you must call query methods on the object you are migrating
    # for this method to work (i.e. you cannot pull via User#sales)
    sales_attributes = while_stand_alone_doc(Sale) do
      Sale.all.map(&:attributes)
    end
    # now when you save your data, your fields will be embedded
    sales_attributes.each do |attributes|
      user = User.find(attributes[:user_id])
      user.sales << Sale.new(:price => attributes[:price])
    end
    # remove all the documents from the original collection
    while_stand_alone_doc(Sale) do
      Sale.destroy_all
    end
  end

  def self.while_stand_alone_doc(klass)
    # by changing the Mongoid::Document.embedded you can temporarily
    # modify which collection Mongoid looks to for your model's data store
    begin
      klass.embedded = false
      yield
    ensure
      klass.embedded = true
    end
  end
end
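The flip-and-restore idea behind while_stand_alone_doc can be tried outside of Mongoid. Below is a minimal sketch in plain Ruby. FakeModel is a hypothetical stand-in that only mimics a class-level embedded flag; it says nothing about how Mongoid implements that flag internally. The sketch differs from the helper above in one respect: it restores whatever value the flag had before, rather than always setting it back to true.

```ruby
# FakeModel is a hypothetical stand-in for a document class. It only mimics
# a class-level `embedded` accessor and has no connection to Mongoid.
class FakeModel
  class << self
    attr_accessor :embedded
  end
  self.embedded = true
end

# Temporarily clears the flag, runs the block, and restores the previous
# value even if the block raises. Returns the block's result.
def while_stand_alone_doc(klass)
  previous = klass.embedded
  klass.embedded = false
  yield
ensure
  klass.embedded = previous
end

seen_inside = while_stand_alone_doc(FakeModel) { FakeModel.embedded }
puts "inside the block: #{seen_inside.inspect}"
puts "after the block:  #{FakeModel.embedded.inspect}"
```

Remembering the previous value means the helper would also behave sensibly if it were ever called on a class that was not embedded to begin with.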
There are a couple of things to note here.
- The embedded flag in Mongoid::Document is not documented, so it could easily change. This was working as of 2.0.0.beta.20.
- When you create the new embedded document, make sure you pass only the attributes you care about. Passing all attributes will add things that you no longer need, like user_id in this case. (To be clear, any attribute you assign will be persisted, though you will only have setters and getters for the fields you explicitly define in your document.)
- I am using mongoid_rails_migrations in this example.
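To illustrate the point about passing only the attributes you care about, here is a minimal sketch in plain Ruby. The sample hash, the EMBEDDED_SALE_FIELDS constant, and the embedded_attributes helper are all hypothetical names made up for this example; the hash is only shaped like what a standalone Sale might hold.

```ruby
# Hypothetical attribute hash, shaped like a standalone Sale document.
old_attributes = { "_id" => 1, "user_id" => 42, "price" => 1999 }

# Hypothetical list of the fields worth carrying over to the embedded document.
EMBEDDED_SALE_FIELDS = ["price"]

# Returns a new hash holding only the allowed keys. Keys are compared as
# strings, so symbol-keyed input is handled as well.
def embedded_attributes(attributes, allowed)
  attributes.select { |key, _value| allowed.include?(key.to_s) }
end

kept = embedded_attributes(old_attributes, EMBEDDED_SALE_FIELDS)
puts kept.inspect
```

With a single field, writing :price => attributes[:price] by hand as in the migration is just as clear; an explicit list starts to pay off once the embedded document has more than a few fields.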
