Mac OS X Essentials • 56:13
As your application evolves, typically so does your data model. In Leopard, Core Data provides support for creating versioned managed object models and for migrating data from one version to another. Learn how you can use these new features to make it easier for you and your users to transition to the latest version of your application.
Speakers: Miguel Sanchez, Ron Lue-Sang
Unlisted on Apple Developer site
Transcript
This transcript has potential transcription errors. We are working on an improved version.
Good Afternoon everyone; welcome to session 111 on Managing Schema Versioning and Data Migration in your Core Data Application. My name is Miguel Sanchez, I'll be doing the first half of the presentation and then I'll be joined by Ron Lue-Sang who...and we're both engineers in the Core Data team. So, seventeen days until the iPhone ships.
( Laughter )
( Applause )
This session is actually about Core Data versioning and migration. I hope you guys had a chance to go to the previous 2 sessions. How many of you were in sessions this morning or have prior Core Data experience? Yea, given the limited crowd I'm assuming you guys are the hard core, Core Data fans.
So how will this session help you? This is the situation you might find yourself in as you're developing your Core Data applications. You'll have a particular version of your application, it's great but you continue to improve it and before you deploy it you have to face the following issue.
Your customer data or whoever is using your application is still bound to the previous version of the application; so as you've improved your application you've decided to improve the schema...you basically decided to manage your data in a slightly different way and it's like a different shape and form, so we have to find a way to migrate your data over from your old version of your application into the new format that you require for your new application; so this is what this session is about.
We'll be talking about 3 features; the first is a new way we have in Leopard for tagging your models and your stores with a particular version ID; the second is a new model we're introducing, a mapping model that allows you to...this is where you tell us how you want us to migrate your data from one to the other; and the third is, of course, new runtime classes that perform the migration for you and which you can customize.
A quick Core Data review for those that are new to the technology or skipped this morning's sessions; these are the high level items you need to know for this particular talk. So Core Data manages data for you; to do that we need the description of the data you want us to manage in a model. Once we have that description we maintain a file in your file system, that's your store, that's where your data lives but you'll actually be interacting with your data as an object graph, as an...in-memory object graph of Objective-C instances; so this is what Core Data is about.
Now of course a Core Data application has more; it's got UI, it's got your custom logic, for purposes of this presentation we really won't be talking about migration issues relating to moving to new UI or even API for example; let me make clear that this has nothing to do with new Core Data APIs that you have to migrate to; this is about you guys, right; so like my wife says, "Honey, it's not me, it's you." So I think what she means by that is that I have chosen to move, to evolve my model over and I need to take care of moving the data over.
So why do we need migration? As long as we have a model description we're able to create the persistent store file in which we're going to use...in which we're going to store your data; so that works fine; we're able to persist the data that you asked us to do.
Now, the moment you decide that your customers are really asking for you to give them an application that manages stars and hexagons then you're breaking the schema with your store, right? I just want...you're giving us a model, you're pointing us to a store that is not described by the model you're giving us so we have to do something about that; this is where migration becomes an issue.
So this can happen for 2 reasons; either you're deploying a whole new version of your application, a major version change, but realize this also happens internally as you're developing your own Core Data applications; you're modifying your schema; you're experimenting with things so the migration issue is still the same; this is not just a deployment time kind of thing; you'll also have to deal with it within your own mini versions that you have in your development cycle.
So once you realize that you have to migrate your data you need 3 things. You need to give us...we don't do magic, we still need the model, the Core Data model descriptions to both versions of your data; so you need to give us that. You also need to give us a mapping model that tells us how to get from one to the other and of course you need to point us to the stores that you want us to migrate.
Most of the time you will be doing this within Xcode; we now have a new Core Data model which is a versioned model; it's now a group and a directory in the file system and inside that you can put in the multiple versions, your multiple generations for each particular model that you're working with and we now have a new file type which is the xcmappingmodel and you can see the editor there on the right and I'll be showing these shortly in a demo.
So this is what you need to begin the migration; what happens behind the scenes to actually perform the migration? So we're now starting to talk in terms of sources and destinations, this is the terminology I'll be using throughout the talk. So we get to migration by you giving us a model which is incompatible with your source store; so the first thing we have to do is find the source model that we need to open the source store and the mapping model that tells us how to migrate your data over; this is all given to the migration manager class, one of our new classes.
The migration manager's responsibility is to create a new Core Data stack with the proper data format. So to do that it instantiates a whole source Core Data stack; it knows how to do that because you've given it the source model. We instantiate a source stack, we fetch the instances and one by one we migrate them over to the destination.
We do this in multiple stages. During the first stage we only create instances, during the second stage we recreate the relationships and during the third stage we make sure that the migration was correct and we do the validation. So at the end of this process you have a new Core Data persistent store with the data that is compatible with the new version of the model that you gave us.
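The three stages can be pictured with a small Python toy model. This is only a sketch, not Core Data API: plain dicts stand in for managed objects, the `parent` relationship and the sample records are made up for illustration, and the validation rule is an assumption. It shows why relationships are wired in a second pass: the target of a relationship may not exist yet when the first object is created.

```python
# Toy model of a three-stage migration. Plain dicts stand in for managed
# objects; nothing here is Core Data API.

def migrate(source_objects):
    # Stage 1: create destination instances and copy attributes only.
    # The association table remembers which destination came from which source.
    association = {}
    for src in source_objects:
        association[src["id"]] = {"name": src["name"], "parent": None}

    # Stage 2: recreate relationships. Every destination object exists by now,
    # so a reference can be resolved regardless of creation order.
    for src in source_objects:
        if src["parent"] is not None:
            association[src["id"]]["parent"] = association[src["parent"]]

    # Stage 3: validate the finished destination graph (hypothetical rule).
    destination = list(association.values())
    for obj in destination:
        if not obj["name"]:
            raise ValueError("migrated object is missing a name")
    return destination


source = [
    {"id": 1, "name": "Junior", "parent": 2},  # refers to an object created later
    {"id": 2, "name": "Miguel", "parent": None},
]
migrated = migrate(source)
```

If stages 1 and 2 were merged into a single pass, the first record could not be linked to its parent, because the parent's destination instance would not have been created yet.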
So let me show you how this all fits together so that you know what I'm talking about. So here we have 2 simple versions of the same application; on the left is my source application, on the right is my new application; so you can see on the left we have a very, very simple Core Data model; it's just a person with age, name, state, and street address fields and the new improved application not only has a slightly different UI but I've also decided to improve my schema by introducing a new relationship and a new entity. So let's begin here by inputting some data in my original application.
( Pause )
I'm so glad I'll be able to drink at the beer bash this year. And my son...let's add 3 records; he turned 2 last week; he still lives with me, and then Ron...
( Pause )
So that's your version 1 of your application; now you deploy your new application and we don't have it set up to do versioning yet but I just want to show you what happens if you try to open the store we just created with a new application. So, we have the demo store and we get an error panel where Core Data detects that there's a version incompatibility, so we need to do something about that to move your data over.
So fortunately we already have all of the elements that we require in this project; I have just not enabled versioning to happen automatically. So we have, like I said, a versioned Core Data model with both the old version, here's the person entity, and the new version, and we also have a mapping model, which I'll go into a lot more detail on in the second demo as to how to create one of these, but right now we have one where we have the mapping definitions that tell us how to go from person to person and from person to address. And the last thing I need to do: by default we bring up that error panel, unless you tell us to try to do automatic migration.
So this is the default code that you get from the document class and one of the methods is where you're configuring the persistent store coordinator and all I'm doing is, we have a new option, NSMigratePersistentStoresAutomaticallyOption, which I'm setting to YES...that's the only reason I'm overriding this method and we just call the super implementation, so all I'm saying with that option is "If I see this option, go out into your project, look for the corresponding models that we need and try to perform the migration." So let me launch the application again. Here's the old data...we open it...here...and boom; we have new data.
( Applause )
So we can go back to slides please. So that's just a quick overview so you know really how you'll be interacting with these tools; now we're ready to go into a lot more detail and you won't be so lost as to what I'm talking about.
Model versioning and mapping...let's step back a little bit. Let me introduce a few patterns that you might come across as you're evolving your schema of your Core Data application. Now these could be very simple things, you're simply adding or removing attributes or entities. You can have fairly intermediate kind of changes where you're actually transforming the data that you have in your store, up to nontrivial transformations; you have to deal with incomplete data, duplicate detection; I'd like to point out at this moment that even though Core Data helps you a lot with the problem of migration from a data base technology point of view migration is a very sophisticated issue.
You could really get into trouble if you're trying to deal with your data in a manner that you didn't intend to, so you have incomplete data, the transformation ends up being a fairly sophisticated thing; so that's independent of what Core Data does for you. So here's the first pattern; we have a very simple entity that has an age and a name and you might want to extend it: in version 2 of your application you now want to keep address information in your entities and now you have a state and a street.
You continue the evolution from version 2 to version 3 and you decide to pay attention to what Melissa told you in the first talk and extract the address fields into a whole separate entity; this also gives you the opportunity of having multiple addresses per person; you can also do this with our migration solution.
Going from version 3 to version 4 you decide that you now want to introduce inheritance into your model; you're interested in seeing person instances not only as generic persons but either as adults or children based on their age and you decide to also introduce additional information per subclass, per sub-entity, I'm sorry; company names for adults and school names for children. And the last change you might want to do, this is where you start cleaning up your data.
You notice that adults and children have company names and school names but of course multiple adults will be working at the same company and multiple children will be going to the same school so you want to do a uniquing pass and extract all the common values for company name and school name and have their own separate entities for them and create relationships. Okay, so this is just an example of how you can be evolving your model from version 1 all the way to version 5 and all of these things you can do with our versioning and mapping solutions.
So, how does Core Data know that you have made a change in your model that breaks our ability to read your data? If you change any of the following things that are up on the screen this breaks our ability to read your data from your store.
You're changing your entity name, you're changing anything having to do with inheritance, the properties you're actually writing out or you want us to write out to disk; at the property level, names, optionality, attribute types, obviously the definition of the relationship; we've already prebuilt the store for you and now you come by and change the structure of what you want to manage so that will break our ability to read your data.
There are certain things that don't break our ability to read your data such as the class name that you're using at runtime to instantiate your instances, transient properties because those are not actually persisted; user info, validation predicates, which really don't come into play until we save your data out and the same thing for properties.
So what do we do with these changes? We simply calculate a hash digest for each one of the entities that participates in your model; so you give us your model, we run it through a hashing algorithm, we come up with a number, that's the version ID; you give us a different version of the model, we pass it through a hashing algorithm, we come up with another version. So, I'm going to be using shapes from now on just so that it's easy for you guys to detect when there's a difference rather than trying to parse long strings of hex numbers.
So here's what I was talking about, that certain changes do affect the version hash and others don't. You have version 1-0 of your entity, we calculate its version hash, we come up with a particular value; you continue evolving your application; you might even change the name of your model but you only happen to touch validation rules, user info or the custom class that you are using so even though for your purposes this might be a different version we're still compatible with your stores; so our hashing algorithm will detect that you haven't changed anything of significance to us so we have the same number coming out the other side. Of course the moment you start adding properties to your entities, for example an address relationship, then we will detect that something is different.
I'd also like to point out that we don't have the concept of versioning even though I keep saying versioning for your models. We don't really manage the concept of versioning at the model level; a model is really just a collection of entities so when we talk about versioning, at least from Core Data's point of view, we're talking about the versions of the entities inside your model; we don't really tag your model with a particular version identifier.
So what do we do? You give us a model, you point us to a store and we create a Core Data stack for you; we calculate the version hashes for the model and when you ask us to save we save that versioning information as metadata in the store; so we simply have a new key in the store metadata, NSStoreModelVersionHashesKey; you can look at it and all you'll see will be a dictionary with the entity names and the hash digest for each of the entities.
Once we have this information now we can really determine that there's a version compatibility or incompatibility. We can't fully trust you guys to tell us when something broke because your idea of what a version is, is different from what our idea is. So, if you give us proper versions of the model and the store, how do we know they're correct? Because we have that information; we go to the store first, get its metadata, check against what we have from the model; this is fine; we build up the stack for you. With a version incompatibility we now have different numbers in the model and the store and this is how that panel came up; this is how we were able to know that you were giving us a wrong model.
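The idea of per-entity version hashes can be sketched in a few lines of Python. This is a toy model and not how Core Data computes its hashes: the entity description format, the choice of SHA-256, and the field names (`properties`, `class_name`, and so on) are assumptions made for illustration. The only points taken from the talk are that the hash covers what affects the persisted layout, that things like the runtime class name are left out, and that the store metadata is a dictionary from entity name to digest.

```python
import hashlib
import json

# Keys of the toy entity description that affect the stored layout. Anything
# else (class name, user info, ...) is left out of the hash.
# The field names are hypothetical, chosen for this sketch.
PERSISTED_KEYS = ("name", "parent", "properties")


def entity_version_hash(entity):
    relevant = {key: entity.get(key) for key in PERSISTED_KEYS}
    encoded = json.dumps(relevant, sort_keys=True).encode("utf-8")
    return hashlib.sha256(encoded).hexdigest()


def model_version_hashes(model):
    # Mirrors the metadata described in the talk: entity name -> hash digest.
    return {entity["name"]: entity_version_hash(entity) for entity in model}


def is_compatible(model, store_metadata):
    return model_version_hashes(model) == store_metadata


person_v1 = {
    "name": "Person",
    "parent": None,
    "properties": {"age": "int", "name": "string"},
    "class_name": "NSManagedObject",
}
# Same persisted shape, different runtime class: still compatible.
person_v1_custom_class = dict(person_v1, class_name="MyPerson")
# A new attribute changes what is written to disk: incompatible.
person_v2 = dict(
    person_v1,
    properties={"age": "int", "name": "string", "state": "string"},
)

# What would be saved alongside the store when it is first written.
store_metadata = model_version_hashes([person_v1])
```

Opening the store then amounts to comparing the dictionary computed from the current model with the one saved in the store; a mismatch is what produces the error panel, or triggers migration when it is enabled.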
Now of course you don't want to deploy error panels in your applications. You want to have functioning applications that open your preexisting data; so this is where mapping models come into play. So here we have version 4 and version 5 and we're going to need to create a mapping model that tells us how to go from one to the other.
I'm going to flatten out the models here a little bit; so I'm going to get rid of the inheritance lines and the relationships...so you can see on the left the entities that participate in the source, on the right the entities that participate in the destination. At a very high level what a mapping model has to tell us is the following, "You know what, Core Data, create adult instances from my person, create child instances from person too, and just migrate my address instances as they are because I didn't really change any of this in this version"; so this is what you're telling us, how to get from one to the other. Likewise at the attribute level you're telling us how to populate each thing in your instances as we're moving the data over.
So we have equivalent classes; you should know that we have...modeling classes in the current version of Core Data to...represent a Core Data managed object model; so we now have 3 new classes to represent a Core Data mapping model: NSMappingModel, NSEntityMapping and NSPropertyMapping. At runtime these are used by a new class which is the true workhorse of the migration logic and it's NSEntityMigrationPolicy; so the migration policy has all the default implementation for how to migrate your instances over and we use the definitions that you give us in your entity mappings and the whole process is coordinated by a migration manager; these are the 5 new classes that we're introducing for migration.
A quick refresher of the migration process: remember that we build up 2 Core Data stacks and we migrate in multiple stages: instance creation, relationship creation, and validation. So entity mappings: the 2 main things we want to know about an entity mapping are what the source entity is and what the destination entity is and you can also give it a mapping name; as your mapping models grow it's important that you have names on them so you can differentiate them.
A mapping order: it might be important for you to migrate certain instances before others and of course you can plug in your own custom policy class names so that if you don't like the default implementation this is where you tell us, "You know what, use this class which is a subclass of your default implementation and that's where I'll do my customized migration." At the property mapping level we have the name of the destination property and a value expression definition. So you have a...a value expression which we evaluate and whatever comes out of that is the value that we're going to stick in your destination's property.
As we're evaluating the value expression and as you're giving it to us you have access to a handful of special keys which are dollar sign something; so you can access the source instance, the destination instance, the manager, even the entity mapping if you happen to need that during the evaluation of the expression. Here's a very, very simple expression; you'll see a more sophisticated one later on, 2 or 3 slides down.
Let's say that you want us to fill in your name property by doing a simple copy of the name property from the source; so you would just do $source.name; we evaluate that, we get its name and that's what goes into the name in your new entity. Now let's go back to the pattern that I introduced in the earlier session; actually it was this session.
So we have person to person; you would have an entity mapping that we can name whatever we want, I'm going to call it person to person; the source is person, the destination is person, that's what you want us to create and the property mappings are the following. So you want us to get the age and the name from the source's age and name; pretty straightforward, first step. Second panel, we still need the mapping we just talked about because you still want us to create persons from your persons so you still have that mapping but in this case you need an additional mapping to create addresses.
So we have a new mapping, person to address; the source is person, that's obvious. The destination is now address and here are the mappings. You want us to create an instance of address for each person and you just want us to copy over the state and the street, so we're basically getting the information from the person instance and sticking it into the new address instances.
Now we get into the inheritance example; you want us to create adults and children, 2 mappings; the first one is person to adult; the source is a person but you can attach a filter predicate to the mapping; so you only want us to fetch persons whose age is over 18, the destination is adult and the property mappings you've seen before; likewise for the child, we have another mapping...the only thing that changes here is that the filter is now under 18. I don't know what happens with the animations there. The filter is now under 18 and the destination is child.
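The person-to-adult and person-to-child mappings can be mimicked with a small Python toy model. This is a sketch under assumptions, not the mapping model file format or any Core Data API: dicts stand in for entity mappings, lambdas stand in for the filter predicate and the value expressions (their argument plays the role of $source), and the mapping names and sample ages are made up. The cutoff follows the demo's "greater than or equal to 18" and "less than 18".

```python
# Toy entity mappings: each has a name, a source entity, a destination entity,
# a filter on source instances, and per-property value expressions.
# Lambdas stand in for predicates and value expressions; their argument plays
# the role of $source. None of this is Core Data API.

COPY_AGE_AND_NAME = {
    "age": lambda source: source["age"],
    "name": lambda source: source["name"],
}

ENTITY_MAPPINGS = [
    {
        "name": "PersonToAdult",
        "source": "Person",
        "destination": "Adult",
        "filter": lambda source: source["age"] >= 18,
        "properties": COPY_AGE_AND_NAME,
    },
    {
        "name": "PersonToChild",
        "source": "Person",
        "destination": "Child",
        "filter": lambda source: source["age"] < 18,
        "properties": COPY_AGE_AND_NAME,
    },
]


def apply_mapping(mapping, source_store):
    created = []
    for source in source_store[mapping["source"]]:
        if not mapping["filter"](source):
            continue  # this mapping does not fetch this source instance
        instance = {"entity": mapping["destination"]}
        for prop, expression in mapping["properties"].items():
            instance[prop] = expression(source)
        created.append(instance)
    return created


# Sample data; the ages of the adults are invented for the example.
source_store = {
    "Person": [
        {"name": "Miguel", "age": 40},
        {"name": "Ron", "age": 30},
        {"name": "Junior", "age": 2},
    ]
}
destination = [
    obj for mapping in ENTITY_MAPPINGS for obj in apply_mapping(mapping, source_store)
]
```

Because the two filters are complementary, every source person ends up as exactly one destination instance, either an adult or a child.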
You'll notice that I have glossed over; I haven't even mentioned relationship mappings. Those take a little bit; those take one additional step that is unlike the property mappings. You might think...so what we want to do here is hook up Ron to his corresponding migrated address instance; now you would think you would want to do something like, "You know, just populate his address with whatever the address was on the source." Why won't this work? As we evaluate $source we have an instance of source in the source context; we traverse the key path; we have an instance of address in the...source context but this is not the one we want; we want to find this one, right.
We can't just take an instance from a source context and put it into the destination; these are completely different objects; they have different object IDs; there could even be different address definitions for the entity; so what we really want to know is "what was the corresponding address that was created in the destination for this source?" Once we have that we're able to hook up Ron to his migrated address; so fortunately the migration manager keeps track of this as we're doing the migration. So as it's moving each instance it's keeping lookup tables so that you can ask these kinds of questions.
So what you have to do in your mapping model is, you can't just do the simple expression I had before; you still have those elements; you still have, "I want to walk $source.address" but you also need to tell us which entity mapping was used to create the instances that you're interested in getting here on the destination; so here we want to get the addresses that were created with the address to address entity mapping and all the other stuff, don't be overwhelmed; that's just because we have a value expression and this is how you have to do it in a value expression to execute a method call on the manager and we also have UI that makes defining this a lot easier; you don't have to type in this whole thing.
You'll also remember that in one of my patterns we didn't really have a whole separate address instance on the source but the mechanism that we follow is the same; so we evaluate the source; we don't have source dot address because, remember, the address information was embedded in the person in the source model, but we still want to answer the same question; we still want to answer, "What was the address instance that was created for Ron in the destination?" So the question is the same; the only thing that changes is that we don't use the full key path in the value expression; we only say, "Go to the source, get Ron, now look at another table, look at the person to address mapping; so now that you have Ron tell me which address was created for this person and whatever that is that's what I want to bind to my address relationship." So this is why you need an extra step when you're hooking up addresses. So let's show you this in a demo.
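The lookup tables described here can be sketched as a Python toy model. The class below is a hypothetical stand-in, not the real migration manager: its method names, the mapping names, and the sample address are all invented for illustration. It covers the embedded-address case, where one source person yields both a person and an address through two differently named mappings, so the lookup has to say which mapping's table to consult.

```python
from collections import defaultdict


class ToyMigrationManager:
    """Keeps one lookup table per entity mapping name (not Core Data API)."""

    def __init__(self):
        # mapping name -> {source id -> [destination instances]}
        self._tables = defaultdict(dict)

    def associate(self, mapping_name, source_id, destination):
        self._tables[mapping_name].setdefault(source_id, []).append(destination)

    def destination_instances(self, mapping_name, source_ids):
        found = []
        for source_id in source_ids:
            found.extend(self._tables[mapping_name].get(source_id, []))
        return found


manager = ToyMigrationManager()
# Sample source record; the address values are made up.
ron = {"id": 7, "name": "Ron", "state": "CA", "street": "1 Example Street"}

# Stage 1: one source person yields a Person and an Address via two mappings.
new_person = {"name": ron["name"], "address": None}
new_address = {"state": ron["state"], "street": ron["street"]}
manager.associate("PersonToPerson", ron["id"], new_person)
manager.associate("PersonToAddress", ron["id"], new_address)

# Stage 2: "given Ron, which address did PersonToAddress create for him?"
(new_person["address"],) = manager.destination_instances(
    "PersonToAddress", [ron["id"]]
)
```

Asking the same question with the other mapping name returns the migrated person instead of the address, which is why the mapping name is part of the lookup.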
This demo will show you how to create versioned models and...mapping models. So I have to bring up...I have to bring up this guy...so this is where we ended in the previous demo; now we want to get to this model. So in the previous demo we did not have inheritance; now we're going to introduce inheritance; so here's a new version of my application; I'm going to launch it; I'm going to...you see we have a new UI where we separated between adults and children; we're going to try to open the old version of the data.
Here we have demo and there's our error. Our error panel; you have to do something about this so let me...now I'm not going to cheat; I'm going to show you how to create things by hand here. Before I forget let me build and clean because I always forget to do this. So here we have...the new version of your model, right. It's got inheritance.
How do we create a versioned model inside of Xcode? We select the model; we go to the design menu, data model and now we have a new menu item here that says, "Data model version," so this becomes a group; it just made a copy of the model you had selected. So here's the old version, the new version; I want to rename them just so that they're easier to identify.
So this is the new...this is the old...but the old didn't really have inheritance so I'm going to modify it to actually be the old; so get rid of that, get rid of that; I also added an attribute in the new model for city; I'm going to get rid of that. Okay, so now you have a versioned model. You have your old version and your new.
We also want to make the new one the current version so that when the application launches it uses this one. So you'll see this little green icon here; that indicates what the current model version is. So I want to make new the current model version so you just go to the design menu, set current version and you see that the little icon changed.
Okay, so that's it; we have versioned models, now let's create the mapping model. Go to file menu, new file, the design category, mapping model...let's call this from old to new; it asks me for the source and destination models; that's my source; that's my destination; here we have a mapping model. The wizard tried to be smart and...parse your model and tried to create a partial mapping model for you to continue editing.
So it recognized...by the way, the editor has the entity mappings area, the property mappings area, the detail area, even a short diffing view so you can see the differences between your source entities and your destination entities. So it created a mapping for us to migrate persons to persons but we don't need that in this case because person is an abstract entity in our new model. It created a mapping to migrate addresses to addresses with the property level mapping, that's fine.
We don't really have city or residence in our new destination so we can just leave those out; actually we do have residence but I'm going to deal with that on the inverse relationship side. So if I leave a mapping out we just use the default value that's defined in your model. So here are the interesting ones. How do we create adults? We fetch them from person.
How do we filter them? We only fetch the ones whose age is greater than or equal to 18, right, and that's our mapping for creating adults and we're going to get rid of company name because we really don't have company names yet. The same for child, the source entity for a child mapping is person and the filter is age is less than 18 and we get rid of the school name mapping which the wizard tried to be smart about; so here we are.
We have 3 mappings: migrating addresses, creating children, creating adults; you'll note that we have the mapping for the relationship; here's that scary function which you can type if you're a masochist or use our fancy UI here where you give us a key path that you want to navigate and the name of the mapping that you want us to use to do the translation over to the destination. So we've got everything we want now; we have a mapping model, versioned models; we launch the application...so here's our old data...we open our store...and we have adults and children and that's migration.
( Applause )
( Pause )
Thanks Miguel. My name is Ron Lue-Sang and I'm an engineer on the Core Data team and I'll be your host for the rest of the hour. So let's see...so migration is a big topic and instead of going over every piece of API that we have for the migration process I'm just going to go over some of the big concepts to make sure that you guys feel comfortable working with the classes later on and navigating the documentation, asking questions. Let's start off with versioning; so we assume that you've figured out versioning; you've mastered creating versions of your managed object model and you have a store that you want to migrate forward and you've also figured out how to do mapping between the different stores; the different models.
So I'll focus mostly on how the mapping model drives everything that happens in the migration classes and I'll focus on 2 classes in particular; first the migration manager. So the NSMigrationManager is really the heart of the migration process and we saw earlier that we have all these other things around the migration manager.
It manages the source and destination stacks and all of the entity mappings in your mapping model and we touched on it as well that it keeps track of these per-entity-mapping association tables that let you go between the source and destination objects, figure out given some destination object what source object it was created from. Okay, so there's a lot of stuff that goes on around the migration manager and it's the heart of the migration process for that reason.
So the other class that I want to talk about: if the migration manager is the heart of migration then NSEntityMigrationPolicy is kind of the limbs of migration that actually do the hard work in each of the 3 stages. The entity mappings in your mapping model each have a migration policy object attached to them and in each of the 3 stages, instance creation, relationship creation, and validation, it's the policy objects on each of those mappings that do the real work.
I'm going to talk about each of these stages but first I want to make sure that you realize we're migrating your data in all of these stages; it's not necessarily the behavior in your custom managed object subclasses; so in the runtime we only use managed objects to hold all the data during the migration process.
This greatly simplifies everything since we don't have to keep multiple different versions of the same classes in the runtime at once but we do have a validation step so the custom logic that you would normally have called in your custom managed object subclasses does have a place to live. You can move that into a different area and we'll talk about that when I say, "Limbago." We'll see that in stage 3.
So, first we'll look at how the migration policy works in stage one, instance creation. So as you saw stage one is really about copying over the objects wholesale from the source of the destination and the basic implementation of the migration policy is to just expect one source instance to be handed to it and then it'll create a blank managed object in the destination context, will copy over all the attribute values and then the important bit is associating that destination object with the source object we were handed.
So one to one mapping from source to destination; that's great for a lot of common cases but you can imagine many other cases where you might want maybe more destination objects than you started out with for source objects; so imagine having one large record that had person information as well as multiple copies of different phone number records or something like that and you want to migrate it to a new model where you have different instances for each record that you might have had for a phone number.
So that's creating multiple destination instances for source instance; you might want to do the opposite which we also talked about for normalizing your data. Now imagine as we saw earlier you have child and adult and each one has an attribute for respectively school name and company name. Now if you had 100 child objects you'd have school name replicated a hundred times; big waste of space; so we move towards using a relationship between school and child and company and adult. During migration we want to make sure that we migrate over all of the 100 kids but we don't create 100 pretty much identical school objects for the migration.
So to put in logic like this we implement this method in our custom migration policy: createDestinationInstancesForSourceInstance:entityMapping:manager:error:, which is a long method name, but the important piece is that there are 3 arguments that you have to pay attention to. The source instance, the entity mapping, and the migration manager are there so that you can do your work in this method for your policy, and that work is simply to create as many destination instances as necessary, populate the attributes of each new instance and then, importantly, go to the migration manager and associate the destination instances you've created with the source instance you were handed.
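The one-source-to-many-destinations case can be sketched roughly as follows. This is a minimal sketch in Swift rather than the Objective-C used in the session, and everything about the schema is an assumption made for illustration: the source attribute names (homePhone, workPhone, mobilePhone), the destination attributes (label, number), and the class name PhoneSplittingPolicy are all hypothetical. Only the overridden method and the associate call are Core Data API.

```swift
import CoreData

/// Hypothetical policy for an entity mapping whose source is a wide Person
/// record and whose destination is a separate phone-number entity.
final class PhoneSplittingPolicy: NSEntityMigrationPolicy {

    // Assumed attribute names on the source entity (not from the session).
    static let sourcePhoneKeys = ["homePhone", "workPhone", "mobilePhone"]

    /// Pure helper: picks the non-empty phone slots, in a stable order.
    static func filledSlots(in values: [String: Any]) -> [(label: String, number: String)] {
        var result: [(label: String, number: String)] = []
        for key in sourcePhoneKeys {
            // Unset attributes arrive as NSNull, so the String cast skips them.
            if let number = values[key] as? String, !number.isEmpty {
                result.append((label: key, number: number))
            }
        }
        return result
    }

    override func createDestinationInstances(forSource sInstance: NSManagedObject,
                                             in mapping: NSEntityMapping,
                                             manager: NSMigrationManager) throws {
        guard let entityName = mapping.destinationEntityName else { return }
        let values = sInstance.dictionaryWithValues(forKeys: PhoneSplittingPolicy.sourcePhoneKeys)

        for slot in PhoneSplittingPolicy.filledSlots(in: values) {
            // One destination object per filled phone slot.
            let phone = NSEntityDescription.insertNewObject(
                forEntityName: entityName, into: manager.destinationContext)
            phone.setValue(slot.label, forKey: "label")    // assumed attribute
            phone.setValue(slot.number, forKey: "number")  // assumed attribute

            // The important step: record which source produced this object.
            manager.associate(sourceInstance: sInstance,
                              withDestinationInstance: phone,
                              for: mapping)
        }
    }
}
```

The slot-picking logic is kept in a separate static helper only so it can be exercised without running a migration; the policy class would be named in the entity mapping's custom policy field.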
( Pause )
So now we've gone through and we've created all of the objects in our destination, the migration manager has gone through every entity mapping in our mapping model, given each of the entity mappings policy objects a chance to create these destination instances. We have a big soup of objects with no relationships so now we start stage 2.
We saw by default what the migration policy tries to do; it goes through these association tables that the manager keeps for us and this is where I say limbago; this is the important piece; this is why during the instance creation phase it's important to associate our destination and source instances; otherwise we can't...we might have a hard time actually, not that we can't. This is the easiest way to figure out which destination object should be related together.
So you can do something other than going through the association tables; you could imagine having some custom logic that figures, "Well, there's an adult and there's a child and they show the same address, they might even show the same last name, let's assume that they're related," like Miguel is Ovie's papa, of course. That's not something you could get directly from the association tables, but you could put custom logic in your migration policy subclass in this method, createRelationshipsForDestinationInstance:entityMapping:manager:error:. Yet another long method that gives you 3 arguments.
The important thing is the destination instance for which you're supposed to create relationships, and then you can leverage the entity mapping and the migration manager to figure out, "Well, which instances were created from the source instances related to the source object that this destination instance came from?" So, again, you'll end up using the migration manager's association methods a lot here...and once you've figured out which objects need to be related in the destination context you relate them just using regular key value coding or setting the relationships as for regular managed objects, right.
Okay, so now we've gone through all of our entity mappings twice, giving each entity mapping's policy object a chance to create instances and now relate objects, and we save again. So now in our destination store we have all of our objects copied over; they're all related together; now we should run validation on it to make sure all the data is correct.
The base implementation for the migration policy does something very simple. It looks at the destination model and re-enables all of the validation rules that were set on all of your properties; so here you can see for person the name attribute is non-optional, it has a minimum and a maximum length; so we would use this to validate all of the person objects that were created in our destination context.
You can do something special and augment this behavior by implementing in your custom policy the method performCustomValidationForEntityMapping:manager:error:. Here you'll notice you're only handed 2 pieces of information, the entity mapping and the manager itself; it's up to you to fetch or find any objects in the destination context that you feel you need to validate; so this gets called only once for this stage on your custom policy...and in this case you're probably going to call super so that any rules from your object model are also used to validate your destination objects.
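A stage 2 override that leans on the association tables might look roughly like this. It is a sketch in Swift, not the session's code: the mapping names AdultToAdult and AdultToCompany, the relationship key company, and the class name are all assumptions made for illustration. The two lookup methods on NSMigrationManager are the association methods referred to above.

```swift
import CoreData

/// Hypothetical policy attached to the Adult-to-Adult entity mapping.
final class AdultRelationshipPolicy: NSEntityMigrationPolicy {

    // Assumed entity mapping names from a hypothetical mapping model.
    static let adultMappingName = "AdultToAdult"
    static let companyMappingName = "AdultToCompany"

    override func createRelationships(forDestination dInstance: NSManagedObject,
                                      in mapping: NSEntityMapping,
                                      manager: NSMigrationManager) throws {
        // Walk back from the destination adult to the source adult(s)
        // it was created from during stage 1.
        let sources = manager.sourceInstances(
            forEntityMappingName: AdultRelationshipPolicy.adultMappingName,
            destinationInstances: [dInstance])

        // Walk forward from those sources to the companies that the
        // other mapping created (or reused) for them.
        let companies = manager.destinationInstances(
            forEntityMappingName: AdultRelationshipPolicy.companyMappingName,
            sourceInstances: sources)

        // Relate the objects with plain key-value coding.
        if let company = companies.first {
            dInstance.setValue(company, forKey: "company") // assumed to-one relationship
        }
    }
}
```

If the model declares the inverse relationship, setting one side like this is enough; Core Data maintains the other side.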
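As a rough illustration of the stage 3 hook, the sketch below calls super first and then fetches the destination objects itself, since the method is handed only the mapping and the manager. It is written in Swift rather than the session's Objective-C, and the age attribute, the plausibility rule, the error domain, and the class name are assumptions made for the example.

```swift
import CoreData

/// Hypothetical policy that adds one extra check on top of model validation.
final class PersonValidationPolicy: NSEntityMigrationPolicy {

    /// Assumed business rule, kept pure so it can be checked in isolation.
    static func isPlausibleAge(_ age: Int) -> Bool {
        return age >= 0 && age < 150
    }

    override func performCustomValidation(forMapping mapping: NSEntityMapping,
                                          manager: NSMigrationManager) throws {
        // Let the base implementation do its part first.
        try super.performCustomValidation(forMapping: mapping, manager: manager)

        guard let entityName = mapping.destinationEntityName else { return }

        // Called once per mapping, so we fetch the objects to check ourselves.
        let request = NSFetchRequest<NSManagedObject>(entityName: entityName)
        let people = try manager.destinationContext.fetch(request)

        for person in people {
            let age = person.value(forKey: "age") as? Int ?? 0 // assumed attribute
            if !PersonValidationPolicy.isPlausibleAge(age) {
                throw NSError(domain: "ExampleMigrationValidation", // hypothetical domain
                              code: 1,
                              userInfo: [NSLocalizedDescriptionKey:
                                         "Implausible age \(age) after migration"])
            }
        }
    }
}
```

Throwing from this method reports the failure back to the migration manager, which is the place for logic that would otherwise have lived in a custom managed object subclass.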
And at this point we save, we have all of our objects copied over to the destination context, all of the relationships are set up, we've validated them and we have a new context, a new store that we can load and using the destination model we're ready to go.
So let's recap. Again, the heart of the migration process is the migration manager, and it's the one that drives everything through the 3 stages of migration, using the mapping model's entity mappings to configure each of the policies; the policy objects are the ones that do the work, and if you want to customize anything, the place where you're probably going to want to start is by subclassing the policy.
But before you can actually perform the migration you have to configure the migration; really configuring the migration manager. Now as we saw there's a lot of stuff that needs to go into the migration, the manager keeps track of all these different things; they come from somewhere and that's what configuration of the manager requires.
There are 2 ways to configure the migration manager; you can either let Core Data handle everything starting off and just letting Core Data figure it out or you as the developer can look through your application bundle, you can figure out whatever you need to, to configure the migration manager. The important point here is that in both cases whether you do it or you let Core Data figure things out the migration that happens will be the same, right.
The process that we just talked about, the 3 stages, everything about the policy, the migration manager, all of that is the same it's just a matter of who sets up the migration manager. So first let's take a look at what happens when we let Core Data figure things out.
If we let Core Data configure the manager, this is what we call automatic migration or automatic bootstrapping, and this happens when you load a store in your persistent store coordinator, and it happens because you asked for it. We saw this earlier, where you add a persistent store with a type and you add this option to say, "Yes, if there's any version skew, please try and migrate everything automatically." In this case you've handed the persistent store coordinator, you've handed Core Data, a managed object model and a store to try to load; those end up being the destination model we'll use when we create our migration manager and the source store that we're going to migrate from. So that's 2 pieces of the picture that we've already figured out for when we're configuring the migration manager.
From there Core Data can figure out, "Well, we'll look in the application's resources, try and find a model, an object model, whose version information matches the version information for our source store." Once we find one of those we can start looking for a mapping model in your application's resources that defines a migration between the source and the destination models that we've already found.
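Asking for automatic migration comes down to one option when adding the store. The sketch below is in Swift rather than the session's Objective-C; the function name and the choice of an SQLite store are assumptions for illustration, while NSMigratePersistentStoresAutomaticallyOption is the real option key.

```swift
import CoreData

/// Hypothetical helper: opens a store and lets Core Data migrate it if the
/// store's version does not match `model` (the destination model).
func openStoreMigratingIfNeeded(at storeURL: URL,
                                model: NSManagedObjectModel) throws -> NSPersistentStoreCoordinator {
    let coordinator = NSPersistentStoreCoordinator(managedObjectModel: model)

    // "Yes, if there's any version skew, please try to migrate automatically."
    let options: [AnyHashable: Any] = [NSMigratePersistentStoresAutomaticallyOption: true]

    _ = try coordinator.addPersistentStore(ofType: NSSQLiteStoreType, // assumed store type
                                           configurationName: nil,
                                           at: storeURL,
                                           options: options)
    return coordinator
}
```

For the automatic path to succeed, the matching source model and a mapping model still have to be discoverable in the application's resources.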
Once we've found that we figured out just about everything we need in order to start the migration; the last piece is where should we migrate the data to...and that's simply a matter of appending something to the URL for the source store that you've already given us. So there you can see that there's not a whole lot of magic here; it's just doing what would probably be done by all of you automatically but if you want to do things yourself, if you want to do something differently, you can use the same API that we're using to do the automatic migration.
You could read the models off of disk; you could read them off the internet; you could create the models at runtime, make them up, not read them from disk at all, and the only thing you have to do is manage the manager: you alloc/init the manager, set the source and destination models, and push the button that says migrate using this mapping model from this URL to this URL.
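Doing the same bootstrapping by hand could look roughly like this. It is a sketch in Swift, not the session's code: the function name, the error type, and the SQLite store type are assumptions. The steps mirror what was just described: read the store's metadata, find a source model that matches it, find a mapping model between source and destination, then ask the manager to migrate from one URL to another.

```swift
import CoreData

enum MigrationSetupError: Error {   // hypothetical error type
    case missingModelOrMapping
}

/// Hypothetical helper that configures and runs NSMigrationManager manually.
func migrateStoreManually(from sourceURL: URL,
                          to destinationURL: URL,
                          destinationModel: NSManagedObjectModel,
                          bundles: [Bundle] = [.main]) throws {
    // 1. The store's metadata carries its version information.
    let metadata = try NSPersistentStoreCoordinator.metadataForPersistentStore(
        ofType: NSSQLiteStoreType, at: sourceURL, options: nil)

    // 2. Find a source model matching that metadata, then a mapping model
    //    that goes from it to the destination model.
    guard let sourceModel = NSManagedObjectModel.mergedModel(
              from: bundles, forStoreMetadata: metadata),
          let mappingModel = NSMappingModel(
              from: bundles, forSourceModel: sourceModel,
              destinationModel: destinationModel)
    else {
        throw MigrationSetupError.missingModelOrMapping
    }

    // 3. Create the manager and push the button.
    let manager = NSMigrationManager(sourceModel: sourceModel,
                                     destinationModel: destinationModel)
    try manager.migrateStore(from: sourceURL,
                             sourceType: NSSQLiteStoreType,
                             options: nil,
                             with: mappingModel,
                             toDestinationURL: destinationURL,
                             destinationType: NSSQLiteStoreType,
                             destinationOptions: nil)
}
```

Nothing here requires the models to come from a bundle; any NSManagedObjectModel and NSMappingModel instances, however they were obtained, can be passed to the manager.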
It's not anything special that goes on there; it's all public API...and so from there let me show you a demo of what you can do with a custom policy.
( Pause )
We'll bring up version 5 of our application, starting first with version 4 of our model. You can see we already have a versioned model here...where we started off with the inheritance-based version of our model, right; adult and child, school name as an attribute, company name as an attribute. In version 5 of our model...we end up breaking out the company and the school into separate entities, fine.
Here we have our mapping model; so we're just moving over addresses wholesale; we're not doing anything special there. So state becomes state, the source street becomes the destination street, and for companies, though, we're going to break out the company name into a separate entity, so we need to read that information from somewhere.
We get that from the adult that we're migrating over...but here you'll see...we've set a custom policy, we named it my migration policy and yea, we'll see how that works in a second. And finally, when we migrate the adults over to the destination stack...we've set up relationship mappings here, so we just copy the address over wholesale from the adult in the source; the age is the same thing, copy over the age and the name...and then here for the company we look at the adult to company mapping; so we're expecting that companies will be created in this mapping, which is where we set our custom policy.
If we look at our document real quick you'll see we've already done the work to trigger the automatic migration; I've decided not to do any of the work myself. And here we've defined our custom migration policy; see, it's just a subclass of NSEntityMigrationPolicy...and we've overridden 2 methods.
For stage one of the migration, instance creation, we've overridden the create destination instances method and we're handed enough information that we can figure out, "Well, here's the destination context from the manager; we know which entity we're going to create, we get that from the mapping that we're handed; we can read the company name out of the instance that we're handed, the adult object that we're handed here," and we go off to the migration manager, which has a user info dictionary.
We'll lazily create this user info dictionary because we're going to stick in our own mapping; it's kind of a cache of which companies we've already created, by name. So once we find that user info, if it already exists, then we can look at another dictionary inside it and see, did we already create a company with this name; so that's what we do here.
If we haven't created a company with this name, or at least we didn't cache one, right, we go in and we create a new managed object, a blank, generic NSManagedObject, and we set the company name, setting all of the attributes that we care about for the migration. We can do something special here; we see, well, if the company name is Apple then we know what their business is; I might have missed that.
In version 5 of our model companies have a business as a string with a default value, all businesses make stuff, but we know in the case of Apple we make phones, right, and feel free to log anything you like. We autorelease the instance because we're about to set it in our...yes, in our association table too, but in our uniquing lookup map, right, that we keep in the user info on the manager...and then here's that last step for creating instances: we associate the source and destination instances using this mapping.
So you'll notice that we do this even if we already had a company with this name. We didn't create any new companies but we're still interested in making sure that we can get back from whatever company we've created to any source object that would make sense for that company that matches; so this is what we'll use for creating relationships later in this method.
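The demo's source is not reproduced in the transcript, so the following is only a reconstruction of the idea in Swift: a policy that creates at most one company per name, keeps a name-to-company cache in the manager's user info, and associates every source adult with its company whether or not the company was newly created. The attribute keys (companyName, name, business), the cache key, and the class name are assumptions.

```swift
import CoreData

/// Hypothetical reconstruction of the demo's uniquing policy.
final class CompanyUniquingPolicy: NSEntityMigrationPolicy {

    static let cacheKey = "companiesByName" // hypothetical user info key

    /// Special-case knowledge applied during migration; nil keeps the model default.
    static func business(forCompanyNamed name: String) -> String? {
        return name == "Apple" ? "makes phones" : nil
    }

    override func createDestinationInstances(forSource sInstance: NSManagedObject,
                                             in mapping: NSEntityMapping,
                                             manager: NSMigrationManager) throws {
        guard let entityName = mapping.destinationEntityName,
              let name = sInstance.value(forKey: "companyName") as? String // assumed key
        else { return }

        // Lazily create the user info and the cache stored inside it.
        var userInfo = manager.userInfo ?? [:]
        var cache = userInfo[CompanyUniquingPolicy.cacheKey] as? [String: NSManagedObject] ?? [:]

        let company: NSManagedObject
        if let existing = cache[name] {
            company = existing // already created for an earlier adult
        } else {
            company = NSEntityDescription.insertNewObject(
                forEntityName: entityName, into: manager.destinationContext)
            company.setValue(name, forKey: "name") // assumed attribute
            if let business = CompanyUniquingPolicy.business(forCompanyNamed: name) {
                company.setValue(business, forKey: "business") // assumed attribute
            }
            cache[name] = company
            userInfo[CompanyUniquingPolicy.cacheKey] = cache
            manager.userInfo = userInfo
        }

        // Associate even when the company was reused, so stage 2 can get
        // from every source adult to its company.
        manager.associate(sourceInstance: sInstance,
                          withDestinationInstance: company,
                          for: mapping)
    }
}
```

With 100 adults at the same company this creates one company object and 100 associations, which is what lets the relationship mapping connect each adult later.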
Well actually this method doesn't do anything because...we've already set up a relationship mapping in our entity mapping; so this is something you can take advantage of. Remember that we have inverse relationships modeled in our model between company and adult and we've set a relationship mapping so that every company knows its adults, which adults, all of its employees rather; so you don't have to do both sides of handling the relationship mapping; we've already dealt with this in our generic entity mapping here.
( Pause )
Here's our UI for Peoplizer 5; we'll open up a document...let's choose that one...good; so here we've created companies. You can see this is our list of companies; these are all the companies we have.
The Banana Inc. makes stuff; Apple makes phones; Ron's employed at Apple; Miguel is a Banana...no he works at Apple too; so the whole thing just works. One thing that I didn't show you;
( Pause )
if you take a look at the arguments that we've passed to the executable, you can actually log some information about what happens during the migration process.
We take a look at that...cool, okay. So we saw, we were looking for a mapping model with this hash information, these version hashes; okay, we found one; this is the one we're going to use and here are the source hashes...cool...and...and here's our log of what we were doing in our custom policy.
So it's a good way to keep track of what's going on during the migration process. We'll go back to slides now.
( Pause )
Actually I think that's it; we have only one more slide. Tips and tricks; so we touched on a bunch of these; one thing we didn't mention is that when you're doing the migration remember that if there's already a file there, the migration process appends to that file; so we don't just want to delete files out from under you; I hope that's okay; we just append to it...and as I mentioned try and take advantage of the fact that you've modeled your relationships with inverses.
I mean, you did model all of your relationships with inverses, right; we keep telling you that...and this is a good reason to do that, because this way you don't have to set up relationship mappings from both sides; you can let Core Data do the inverse relationship management for you and just choose whichever relationship mapping is easier to define.
( Pause )
And it says it here, for large object graphs try doing the migration in multiple passes; what I mean by that is imagine you have a bazillion, just to choose a number, person objects that you need to migrate; you could fetch all the people whose names start with A first, perform a migration, nuke the contexts and then progress to B, C, D; this way you're using, you're chewing up a lot less memory during the migration; it's much nicer to your user that way and you can even put up a progress bar or something like that.
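One way to structure a multi-pass migration is sketched below in Swift. It assumes, as an illustration only, that each pass is described by its own mapping model whose entity mappings are restricted to a slice of the source data (for example, names starting with A), and it relies on the behavior mentioned above that migrating into an existing destination file appends to it. The function name and the progress callback are hypothetical.

```swift
import CoreData

/// Hypothetical helper: runs one migration per mapping model, releasing the
/// manager (and its contexts) between passes to keep memory use down.
func migrateInPasses(from sourceURL: URL,
                     to destinationURL: URL,
                     sourceModel: NSManagedObjectModel,
                     destinationModel: NSManagedObjectModel,
                     passes: [NSMappingModel],
                     progress: (_ completed: Int, _ total: Int) -> Void) throws {
    for (index, mappingModel) in passes.enumerated() {
        try autoreleasepool {
            // A fresh manager per pass, so nothing from the previous
            // slice stays alive.
            let manager = NSMigrationManager(sourceModel: sourceModel,
                                             destinationModel: destinationModel)
            try manager.migrateStore(from: sourceURL,
                                     sourceType: NSSQLiteStoreType, // assumed store type
                                     options: nil,
                                     with: mappingModel,
                                     toDestinationURL: destinationURL,
                                     destinationType: NSSQLiteStoreType,
                                     destinationOptions: nil)
        }
        // Coarse progress: one tick per finished pass.
        progress(index + 1, passes.count)
    }
}
```

For finer-grained feedback within a pass, NSMigrationManager also exposes migrationProgress and currentEntityMapping, which could drive a progress bar.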
The migration manager gives you API for figuring out what stage of the migration you're in and the last tip, remember if you already have an application that's shipping on Tiger some of your users coming to version 2 or version 3 of your application will have store files that don't have any version information in them.
It's up to you at that point to look at the store file, check out the metadata, maybe map it to a specific file number and then...and then perform migration yourself there. Okay, that's all I have. For more information you can check with Deric; there's lots of sample code; you can come by and ask questions in the Core Data lab.
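Checking whether a store carries version information could be sketched like this, in Swift. The helper names are hypothetical; NSStoreModelVersionHashesKey is the metadata key under which Core Data records a store's entity version hashes, and a store written before versioning existed would be expected to lack it.

```swift
import CoreData

/// Returns true when the metadata contains a non-empty set of version hashes.
func storeHasVersionInfo(_ metadata: [String: Any]) -> Bool {
    guard let hashes = metadata[NSStoreModelVersionHashesKey] as? [String: Any] else {
        return false
    }
    return !hashes.isEmpty
}

/// Hypothetical helper: decides whether we must pick the source model
/// ourselves because the store cannot tell us its version.
func needsManualSourceModel(storeURL: URL,
                            storeType: String = NSSQLiteStoreType) throws -> Bool {
    let metadata = try NSPersistentStoreCoordinator.metadataForPersistentStore(
        ofType: storeType, at: storeURL, options: nil)
    return !storeHasVersionInfo(metadata)
}
```

When this reports that version information is missing, the application has to decide on its own which shipped model the file belongs to and then configure the migration manager with that model by hand.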