Core Data Model Versioning and Data Migration - WWDC 2006

Application Technologies • 1:11:10

As you change your application's underlying data model to accommodate new features, you still need to support different file formats and allow for backward compatibility. Learn how to support different managed object models in your Core Data application and how to migrate data from one model to another.

Speaker: Miguel Sanchez

Unlisted on Apple Developer site

Transcript

This transcript was generated using Whisper, it has known transcription errors. We are working on an improved version.

Welcome to session 118, Core Data Versioning and Migration. My name is Miguel Sanchez. I'm an engineer on the Core Data team. I've already been criticized for wearing the official speaker blue shirt, which apparently people don't like that much this year. So just call me Core Data Smurf. Thank you for coming out at 9:00 AM. Why would you ever want to get up at dawn and come listen to a talk, other than for the donuts, of which there are none this year, though that would have been a joke last year?

So we're all here to continue our education in the Core Data world, specifically issues relating to versioning and migration of Core Data apps. Let me get a quick show of hands. Who's got at least some basic exposure to the Core Data framework, including having attended the presentation last evening?

So I'm assuming almost everybody has some idea of Core Data. Don't worry if you haven't used Core Data but you're seriously thinking about using the framework. I'll try to give you tips and hints here and there. I'll take into account that some of you are not Core Data people, but you'll be okay.

So this week you're learning about a whole bunch of our new technologies, the OS, the APIs, the new tools. But the ultimate point of all this is for you to go back to your fancy offices next week and start implementing a new version of your application. You want to incorporate all of these technologies into a new version of your application.

So you'll take a current version of your app, your current great app, or if you haven't done a Core Data app yet, you eventually will whenever you have version two of your app. And you'll be adding a whole bunch of bells and whistles to it, and it'll be all great. And this particular developer went a little overboard with Core Animation or something like that. So what issues will you have to deal with when you're migrating your application?

Like I said, this presentation is focusing on Core Data applications. So a quick review: a Core Data application is made up of the Cocoa side of things, the UI, the application code. But the essence of a Core Data application is its Core Data model and the persistent store. That's what Core Data does for you.

If you give us a model, we manage the object graph for you and we persist it out. For purposes of this presentation, we're not talking about migration issues relating to your UI and we're not really talking about your specific application code that much. So we're talking about managed object models and their stores.

So migration in this context refers to you have a version of your Core Data application, your customer's data is tied to version 1.0 of your application. You've decided to improve your application, so you have a new structure for your model, yet your customer's application is still in version 1.0. So we have to figure out how to get that data over into the correct format that you're expecting in version 2.0 of your application.

So this is what I mean by migration in this context. I know that every time we invite you guys to this conference, we say migration and you guys freak out because we're moving architectures or we're changing the API under you. This is nothing, this is not that kind of talk. This is migration because you are evolving your application and you're naturally requiring to migrate your customer's data. So this is something that you're causing by improving your application.

You might be asking yourselves, especially if you're not that familiar with Core Data, what's the big deal? Why do we even need a whole session of migration? Again, the essence of Core Data is that we do a lot of data management for you if you give us a model.

So you give us a model, we manage the data for you, but we need to know what the model of that data is. Consequently, if you switch the model under us or you decide to modify it, we no longer know how to manage that old existing data that you had in version 1.0 of your application.

So this is why versioning and migration is required. Here's a graphical representation of the same concept. You have a store that is all set up to store circles, triangles, and squares. As long as you give us a blueprint of that model that says, you know, "Core Data framework, I'm going to be dealing with triangles, squares, and circles," we're fine. No problem. We know what you want us to do. So we start putting data into your store. That's fine.

Here's version 1.0 of your store. Then you go ahead and improve your application. You now have decided that you're also dealing with stars and pentagons and or hexagons. Sorry, it used to be a pentagon in version 1.0. So we switch that. Here's the store. Here's the model. Framework goes, what you talking about, Willis? We don't know, right? So we localized that to English saying, data store is incompatible with the version of the model you're giving us.

So you have changes in your model and they're affecting the store. What kind of changes can you have in your model? There are simple changes where you're adding new fields or dropping fields. You're not really transforming your data. But you can get into more complicated changes where you're really splitting up fields and refactoring your inheritance hierarchy and fairly sophisticated stuff.

So we do a lot of cool stuff for you, but please, migration as a problem in general, it's a non-trivial issue and some of you experienced software developers out there will know that you will deal with other issues independently of what we're providing you at the framework level.

So let's start by introducing the example I'll be using throughout the whole talk. It's a Core Recipes example. It's actually available ever since the Tiger timeframe. You can download Core Recipes. We use it to give you an example of how you do a Core Data application in Tiger. So we have a simple recipes model here.

We have chefs, ingredients. We improve our application. We now go to version 2 of our model. We have a whole bunch of changes, which I'm going to go into detail. So what changed between version 1.0 and 2.0 of our model? We, or I should say you, you're the developers, you decided to add a rating attribute to the recipes entity. A simple change, right? It wasn't there before. I want to add something new. You decided to drop a training attribute from chef. Nobody was using it. Why keep it there? Let's drop it.

You decided to fix a typo you had. Recipe used to be recipes with an S in version 1.0 of your model, but then you realized that it's really a to-one relationship to recipe. So typo, oops, let's fix that. Those are all pretty straightforward. Here are the more interesting ones. You decided to change a to-one relationship into a to-many relationship.

So at this point, you don't want to lose your pre-existing data. You just want to tell us that the structure of the store changed, but the data already existed in version 1.0 of your model. You decided to split a single name field that you had into first name and last name.

So now you're getting into data transformation, where I'm not just adding new fields. There's pre-existing data there that I want to do something with. And the most sophisticated change in this example is, if you're familiar with database terminology, you decided to normalize the cuisine field.

So in version 1.0, you had a field, a string field cuisine in each recipe instance. But if you have multiple Mexican recipes, you want to just have a lookup table of those in a separate cuisine entity. So that's the most sophisticated change you do here. Here's a quick review: version 1.0 of the model.

Version 2 of the model. So this is what we'll be referring to back and forth throughout the presentation. Once again, this is the issue we're dealing with. You've improved your application, you're using version 2.0 of your model. Yet, your customer's data is still bound to version 1.0 of the model. They're still in the old version of the store. They just bought your upgrade and they're putting it in their system.

So what has to happen? Something has to happen that moves that data over, without losing what they already had, into the new structure of the data so that they can use the new features in your application. That's why we're here. What's new in Core Data to help you do this?

We have an official versioning mechanism that allows you to tag models and stores with "this is version X of something." We introduce a new type of model, a mapping model, in which you tell us how to go from A to B. And more importantly, we do the work for you. So we introduce a whole bunch of new migration infrastructure there where we do the migration work for you. Of the Core Data developers out there, who has already come across a problem like this that would find something like this useful?

Great. So you'll be pretty happy, I think. We've increased the number of classes in the framework significantly. We have six new classes. We didn't have a lot. The first half of those classes relate to the mapping model and the bottom half are the ones that do all the migration for you.

There's two significant improvements within Xcode. One is that we now version your Core Data Models. We have this group and I'll be showing this in the demo. And two is that we have a new type of model file, like I said, the mapping model file. So we have its associated editor where you tell us how you want to go from A to B without writing much code.

That's what this session is about. So we're going to be focusing on each one of those three parts I outlined, the versioning, the mapping model, and the migration algorithm. So let's start with the versioning side of things. So what's a version? How do we tag or how do we identify that? We know we want to migrate from something to something, but how do we identify that those two things are different?

There's two ways of approaching this issue. One is the way you guys would think of a version, and then the other way is the way the framework needs to think of a version. For you, a version is usually an identifier or a number, most likely whatever you're naming your file.

The important thing about this identifier is that you're the ones giving meaning to that. So it means something to you. This is version X of my app, or this is what was working before I went on vacation, and now I don't know why it doesn't work. But it's a string somewhere, probably the name of the file. You have meaning to that. You have assigned meaning to that string.

The framework needs to have a little stricter definition of what a version is. Again, we rely heavily on knowing the exact structure of your store. So what happens if you change your model, yet you forget to change the name of the model or something? Whatever versioning mechanism you're using, we'll run into trouble because you didn't give us a hint that something changed. We need a much stricter definition. We need to add a versioning interpretation mechanism above whatever you guys are using. So this is versioning to us.

We care about whether we can read that old data for you so that we can help you perform the migration. So those are the kinds of changes that we're interested in your model. A model is made up of entities and properties. If any of those things on the top two lists have changed within your model, an entity's name or their parent or for the properties, whether they're optional or transient, if any of those things have changed in your model, those things affect our ability to read in your data.

So we flag those as a version having changed. For the experienced Core Data developers, things we ignore for reading purposes are the class name of the entity that you're using, user info changes, validation predicates. They're very important to you, but more at the writing phase of things. When we're reading stuff over to migrate, we're okay with you having changed those things. What do we do? How do we tag the version? We simply create a hash digest of your model elements. We pass them through a SHA hashing algorithm.

So you have... Here's version 1.0 of your model. You're calling it version 1.0. That's your string name. That's your way of identifying the version. Great. We don't really do much with that. We pass it through the hashing algorithm and we come up with a 32-byte number. Version 2 of the model, we do the same and we put it through the algorithm and we come up with a different number. Now, don't spend time trying to give meaning to those numbers. They're really just a way for us to quickly identify whether something is equivalent to another version of your entity.
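To make the idea concrete, here is a minimal Python sketch of structure-based version hashing. It is not Core Data's actual algorithm: the dictionary layout, the choice of SHA-256, and the `entity_version_hash` helper are all assumptions made for illustration. It hashes only the parts that affect reading a store (entity name, parent, property names and flags) and skips things like the class name.

```python
import hashlib
import json

# Hypothetical lists of the fields that matter for reading a store.
STRUCTURAL_ENTITY_KEYS = ("name", "parent")
STRUCTURAL_PROPERTY_KEYS = ("name", "type", "optional", "transient")

def entity_version_hash(entity):
    """Return a 32-byte SHA-256 digest of an entity's structural description."""
    structure = {key: entity.get(key) for key in STRUCTURAL_ENTITY_KEYS}
    # Sort properties by name so declaration order does not affect the hash.
    structure["properties"] = sorted(
        [[prop.get(key) for key in STRUCTURAL_PROPERTY_KEYS]
         for prop in entity["properties"]],
        key=lambda fields: fields[0],
    )
    canonical = json.dumps(structure, sort_keys=True).encode("utf-8")
    return hashlib.sha256(canonical).digest()

recipe_v1 = {
    "name": "Recipe", "parent": None, "class_name": "NSManagedObject",
    "properties": [
        {"name": "name", "type": "string", "optional": False, "transient": False},
        {"name": "directions", "type": "string", "optional": True, "transient": False},
    ],
}

# Same structure, different class name: ignored, so the hash is unchanged.
recipe_v1_custom_class = dict(recipe_v1, class_name="Recipe")

# A new attribute changes the structure, so the hash changes.
recipe_v2 = dict(recipe_v1, properties=recipe_v1["properties"] + [
    {"name": "rating", "type": "int", "optional": True, "transient": False},
])
```

In this sketch, `recipe_v1` and `recipe_v1_custom_class` hash to the same value, while `recipe_v2` gets a different one.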

So, for purposes of this presentation, I'll be switching and using graphical symbols so that you can quickly see that the recipes version, which used to be a triangle in 1.0, something has changed significantly and it's now a square in version 2.0. So, that's what I'll be using. I said, we don't really care about every single little change you did in your model and let me talk you through that. Here's version 1.0 of your recipe. You have precedence, directions, name. So, we pass it through the algorithm and we tag it as, oh, this recipe entity has version green triangle.

[Transcript missing]

So for this first demo, I'm not going to actually execute anything yet in the app. We're just versioning your models. Here's a Core Data application. We have the simple recipes model. It's still in the old format, .xcdatamodel. That's what you would get in Tiger.

If we open it, we see that we have version 1.0 of your recipes model. We only have three entities. It's fine. Let me compile this just so that our Core Data newbies can see what happens at the file system level. So if we look at the build directory and I... Open the package. This is what happens in Tiger.

An .xcdatamodel is compiled into a .mom file, which is what you have at runtime. That's what happens. So what's new? Now we say, "Okay, we want to improve our application. Let's start working on version 2.0." We can't just go in here and change this model because we would lose the original model and then we wouldn't know how to open the store. So in the past, you would have had to make copies of your model and manage them yourself. You no longer have to do that. You just select the model, you go up into the menu here, Data Model, Add Model Version.

We now convert your model into a group with a .d extension at the end. We now have your original model and we create a copy of that model and that's model version 2. You'll notice this green icon here. That indicates what your current model is when you actually launch your application. In this case, we want to make version 2 our current model. So we'll go up to the menu here and we once again see that we have a new menu item here where we can set the current model and the icon changes.

So this is how you create versioned models within the tool. So let's go in here and start doing changes. I'm not going to do everything here. I'm just going to do one of the simple changes. Remember the recipes one I mentioned. We had a typo so it's no longer recipes, it's recipe. We save.

Now let me build and see what happened within the file system. Oh wait, just to not confuse you, let me clean all so that the unversioned side of things is cleaned out, and then we build. So we go back to our output directory. Here we go, we compiled our output.

We look at the content. Resources. You'll now see that we no longer have a .mom, we have a .momd. Inside of it, you still have .moms, they're still the same format. We now have version one of your model, version two of your model, and now there's this Info.plist in there that gives us runtime information to properly process those models.

We can see that the Chef, we didn't touch Chef, so the hash is the same. It begins with an 8433. Here we have 8433. We touched the ingredient, so the hash is different. FA something here and here is now 3D something different. For the five of you that are still awake, you'll notice that recipe also changed. Why did recipe change? Why is this hash different from this hash? This is where the subtleties of versioning start to rear their ugly heads.

We changed recipe, right? But Core Data manages a complex object graph for you. I'm sorry, we changed the recipe field here, but recipe, the entity, is related to ingredient, and it also has an inverse relationship on the recipe entity. So even though the change was only here from your point of view, in the model this entity was also affected because its inverse relationship is now different. So that's why the hash changed in the recipe entity. So that's the end of the first demo. Could we go back to slides? So that's how you do versioning now at the tool level.
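The inverse-relationship effect from the demo can be sketched in a few lines of Python. This is an illustration only: the model layout, the relationship names on Chef, and the `entity_hashes` helper are assumptions, not Core Data's real hashing scheme. Each relationship contributes its own name, its destination, and the name of its inverse, so renaming one side changes the hash of both entities.

```python
import copy
import hashlib

def entity_hashes(model):
    """Hash each entity from its attributes and relationships."""
    hashes = {}
    for entity_name, entity in model.items():
        parts = [entity_name]
        parts += sorted("attr:" + attr for attr in entity["attributes"])
        for rel_name, (destination, inverse) in sorted(entity["relationships"].items()):
            # The inverse's name is part of the relationship's description.
            parts.append(f"rel:{rel_name}->{destination}:inverse={inverse}")
        digest = hashlib.sha256("|".join(parts).encode("utf-8")).hexdigest()
        hashes[entity_name] = digest
    return hashes

# Hypothetical version 1 model: relationship -> (destination entity, inverse name).
model_v1 = {
    "Chef": {"attributes": ["name", "training"],
             "relationships": {"recipes": ("Recipe", "chef")}},
    "Ingredient": {"attributes": ["amount", "name"],
                   "relationships": {"recipes": ("Recipe", "ingredients")}},
    "Recipe": {"attributes": ["cuisine", "directions", "name"],
               "relationships": {"chef": ("Chef", "recipes"),
                                 "ingredients": ("Ingredient", "recipes")}},
}

# Version 2: only Ingredient.recipes is renamed to Ingredient.recipe ...
model_v2 = copy.deepcopy(model_v1)
model_v2["Ingredient"]["relationships"] = {"recipe": ("Recipe", "ingredients")}
# ... which also changes the inverse recorded on Recipe.ingredients.
model_v2["Recipe"]["relationships"]["ingredients"] = ("Ingredient", "recipe")

hashes_v1 = entity_hashes(model_v1)
hashes_v2 = entity_hashes(model_v2)
changed = sorted(name for name in hashes_v1 if hashes_v1[name] != hashes_v2[name])
```

Here `changed` contains both Ingredient and Recipe, while Chef keeps its hash, which mirrors what the Info.plist in the demo showed.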

What do we do with this information at runtime? It's fairly straightforward. We have a model, we have a store, we now know that the model has versioning information. When we save out your instances to the store, we simply store the versioning information in the store's metadata. People not familiar with Core Data, each store, think of it as having a header. There's a little special area in the store where we can store additional information about your data. We now have a new key, NSStoreVersionHashKey. If you look at that key, you have a dictionary with all the versioning information of the entities that live in your store.

So what do we do with this? We detect version skew, version incompatibilities. It all happens in the addPersistentStoreWithType method. This hasn't changed. This is the persistent store coordinator. When you're setting up the Core Data stack, you tell us to add a persistent store. You give us a URL to the store. You give us a model. We check the versioning information. No problem. We can open it.

But, if you had given us a pointer to a 1.0 version of your store and at the same time a version 2.0 of your model, we detect that there's a version skew there because the version hashes don't match. So this is how we can now tell that there's a problem and that you need to do some sort of migration.

The default behavior at this point is to report an error. We don't just migrate stuff for you. We do if you tell us with a special store option, but by default, an error still comes up that says, "Look, there's an incompatibility here. You gave us a store. You gave us a model. We don't really know how to work with those two things." So that's the end of the first third of our talk where we're talking about the version, the official way of versioning models and stores.
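As a rough illustration of that check, the Python sketch below compares the hashes recorded in a store's metadata against the hashes of the model being used. Everything here is hypothetical (the `open_store` function, the `migrate_automatically` flag, the metadata layout, and the shortened hash strings); it only mirrors the behavior described: open when the hashes match, report an error by default when they don't.

```python
class IncompatibleStoreError(Exception):
    """Raised when a store's version hashes do not match the model's."""

def skewed_entities(store_hashes, model_hashes):
    """Return the names of entities whose hashes differ or exist on one side only."""
    names = set(store_hashes) | set(model_hashes)
    return sorted(n for n in names if store_hashes.get(n) != model_hashes.get(n))

def open_store(store_metadata, model_hashes, migrate_automatically=False):
    """Open the store if compatible; otherwise error out unless migration is requested."""
    skewed = skewed_entities(store_metadata["version_hashes"], model_hashes)
    if not skewed:
        return "opened"
    if migrate_automatically:
        return "migration required"
    raise IncompatibleStoreError(
        "store is incompatible with the model; changed entities: " + ", ".join(skewed)
    )

# Placeholder hash strings, shortened for readability.
store_v1_metadata = {"version_hashes": {"Chef": "8433", "Ingredient": "fa01", "Recipe": "aa10"}}
model_v1_hashes = {"Chef": "8433", "Ingredient": "fa01", "Recipe": "aa10"}
model_v2_hashes = {"Chef": "8433", "Ingredient": "3d02", "Recipe": "bb20"}
```

Opening the version 1 store with `model_v1_hashes` succeeds. With `model_v2_hashes` the sketch raises `IncompatibleStoreError` unless `migrate_automatically=True` is passed.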

Fine, we have version skew. What do we want to do now? We want to do migration. So we want to go from version 1 to version 2 of your data. So it's the data that needs to be migrated. You've modified your store. I'm sorry, your model is in version 2.0. Again, your customer's data or your own data is still stuck in version 1.0.

And we need to find a way to get that data from one to two. How would you have done this in the past? And the reason I'm talking about how you would have done this in the past is because we didn't really invent a new way of doing migration. There isn't some sophisticated magic going on.

We do what you would have done in the past, except that we do it for you. So let me review what you would have done in the past. What you would have done is set up two Core Data stacks, one for the source and one for the destination.

And you simply take your data from here, and then you write code to transform your data and put it in the new stack. So that's really all it is. It's just a bunch of loops: fetch all of these things and create everything on the destination.
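The "bunch of loops" approach looks roughly like the Python sketch below. It stands in for two Core Data stacks with plain dictionaries, so the store layout, the `id` and `chef_id` fields, the sample data, and the `migrate_by_hand` function are all assumptions for illustration. It fetches everything from the source, creates the equivalent objects in the destination, and rebuilds the relationship between them.

```python
# Hypothetical version 1 store: one list of records per entity.
source_store = {
    "Chef": [{"id": 1, "name": "Ana Lopez", "training": "Self-taught"}],
    "Recipe": [
        {"id": 10, "name": "Tacos", "chef_id": 1},
        {"id": 11, "name": "Flan", "chef_id": 1},
    ],
}

def migrate_by_hand(source):
    """Copy every object into a new store shaped like version 2 of the model."""
    destination = {"Chef": [], "Recipe": []}
    new_chef_for_old_id = {}

    # Loop 1: fetch all chefs, create destination chefs ('training' is dropped).
    for old_chef in source["Chef"]:
        new_chef = {"name": old_chef["name"]}
        destination["Chef"].append(new_chef)
        new_chef_for_old_id[old_chef["id"]] = new_chef

    # Loop 2: fetch all recipes, create destination recipes, relink to new chefs.
    for old_recipe in source["Recipe"]:
        new_recipe = {
            "name": old_recipe["name"],
            "rating": 0,  # new attribute, filled with an assumed default
            "chef": new_chef_for_old_id[old_recipe["chef_id"]],
        }
        destination["Recipe"].append(new_recipe)

    return destination

destination_store = migrate_by_hand(source_store)
```

Even for two entities, most of this code is bookkeeping rather than migration logic, which is the point the talk makes next.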

The downside is that this was hard for you guys because we didn't do any of it for you in Tiger. So you had to do all of this. You had to set up the stacks and fetch everything and do all of your migration logic. There's actually a Tiger based example.

If you look at the Core Recipes example, which you can download from ADC, one of the subfolders of the Core Recipes example in the Tiger timeframe shows how you would do the migration the hard way, in the Stone Age. So you would have something like this.

You have a little migration algorithm there, and this is all code you would have had to write. You have to specify methods: here's where I'm normalizing my cuisine, and here's where I'm migrating my recipes and my chefs. And you might say, well, Miguel, that's not so bad. Really?

That's not the Core Recipes application. That's the migration code. The Core Recipes application is a whole different folder, a whole different code base. That's just the code that you would have had to write to set up your stacks, fetch everything, invoke your migration logic, and basically, you do everything by hand.

If you had attempted to do this in the past, you would have scratched your head and said, why can't you guys make this easier? There are certain kinds of changes that should be no-brainers, right? I'm adding a field, I'm dropping a field, I'm making a to-one into a to-many, that's not a big deal. I'm fixing a typo here. So for these kinds of changes, why do I need to write all that code? From the point of view of my data, it's migrating as is.

I'm not doing any transformations, I'm just simply changing something, a simple change in my model. Why don't I just focus on my custom logic of when I want to do specific kinds of migration? For example, I'm splitting up fields. Okay, at that point I know that the Core Data Engineers won't really know how exactly I want to split up my fields, so let me write my custom code for that.

Or I'm normalizing my cuisine in a special way, so I'll do that. So why don't I just write this kind of code? For example, for the splitting up of the first and last name, instead of writing those hundreds of lines of code we just scrolled past, why can't you just have your specific method that splits up the first name and last name?
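That one custom method could be as small as the Python sketch below. The `split_name` function and its policy are assumptions: it treats the first word as the first name and everything after it as the last name, which is only one of several reasonable choices and is exactly the kind of decision the framework cannot make for you.

```python
def split_name(full_name):
    """Split a single name field into (first_name, last_name).

    Policy (an assumption): the first word is the first name,
    and the remainder, if any, is the last name.
    """
    first, _, last = full_name.strip().partition(" ")
    return first, last.strip()

def migrate_chef(old_chef):
    """Build a version 2 chef record from a version 1 record."""
    first_name, last_name = split_name(old_chef["name"])
    return {"first_name": first_name, "last_name": last_name}

new_chef = migrate_chef({"name": "Ana Lopez"})
```

A single-word name ends up with an empty last name under this policy, and a name with three words puts the last two in `last_name`.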

And as you've figured out by now, that's what we do in Leopard. That's the new hotness. So model driven migration. We do it for you with a model. There's a new plugin within Xcode where you can now specify a mapping model to take you from A to B.

And we take care of all that cumbersome work that you saw scrolled by in the last couple slides. We detect the version skew, that there's incompatibility. We manage the Core Data stacks for you. We create, we move stuff from the source to the destination. We recreate the relationships. That's the key part of the migration.

But of course you know that when we say we do everything for you with zero lines of code, it's really not 100% true. You'll always end up writing a little bit of code. So don't worry, this is an extensible framework where you plug in your own logic. So the mapping model. The mapping model's role in life is to indicate how you want to go from A to B without writing the code. It's much easier to go into a tool and say, this field goes here, this field goes here, this field goes here.

We introduce a new class, NSMappingModel. If you're familiar with Core Data, you know that our model hierarchy is something like NSManagedObjectModel, which is made up of NSEntityDescriptions and NSPropertyDescriptions. We have a parallel hierarchy now in the mapping world. We have NSMappingModel, which is made up of NSEntityMappings and NSPropertyMappings.

So the mapping model contains all of the entity mappings. So here's where the interesting things start to happen. Entity Mappings is where you tell us how you want to get from A to B. The two main elements in an entity mapping are the Source Entity and the Destination Entity. Where do you want us to fetch stuff from and what do you want us to create with the things you are fetching?

So let's look at the mapping model that we would create at the entity mapping level for the example we're following. So we would have the source and the destination for entity mappings. Each row showing up in that table represents an entity mapping. We want to map chefs to chefs. That didn't change. That's pretty straightforward. Ingredients to ingredients. Recipes to recipes. Three entity mappings. What happens with cuisines? Any guesses? Destination? Cuisine, right? What's the source?

Recipe. Because that's where my data lives. You're normalizing cuisines, so you're still asking us to fetch recipes. Once we fetch recipes, there's something going on with those recipes where you're now creating cuisines.
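The entity mapping table described here can be pictured with the Python sketch below. The `EntityMapping` tuple, the mapping names, and the `normalize_cuisines` helper are assumptions for illustration and are not the NSEntityMapping API. The interesting row is the last one: its source is Recipe, because that is where the cuisine strings live, and its destination is Cuisine.

```python
from collections import namedtuple

# One row per entity mapping: what to fetch and what to create from it.
EntityMapping = namedtuple("EntityMapping", ["name", "source", "destination"])

ENTITY_MAPPINGS = [
    EntityMapping("ChefToChef", "Chef", "Chef"),
    EntityMapping("IngredientToIngredient", "Ingredient", "Ingredient"),
    EntityMapping("RecipeToRecipe", "Recipe", "Recipe"),
    EntityMapping("RecipeToCuisine", "Recipe", "Cuisine"),
]

def normalize_cuisines(recipes):
    """Create one Cuisine record per distinct cuisine string found on recipes."""
    cuisines = {}
    for recipe in recipes:
        # setdefault keeps the first record created for each cuisine name.
        cuisines.setdefault(recipe["cuisine"], {"name": recipe["cuisine"]})
    return list(cuisines.values())

source_recipes = [
    {"name": "Tacos", "cuisine": "Mexican"},
    {"name": "Mole", "cuisine": "Mexican"},
    {"name": "Risotto", "cuisine": "Italian"},
]
cuisine_records = normalize_cuisines(source_recipes)
```

Three recipes produce two cuisine records in this sketch, one for each distinct cuisine string.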

[Transcript missing]

So you would say, I want you to fill the name field with this expression value. Since you already have the source object when you're migrating, I want you to just use key value coding and go to the source object and get its name. And that's what I want in my destination. That's the property mapping.

Rating. Rating is a new field, if you guys remember. So there's nothing really-- there's no data in your source to pull over. So you leave it blank. We just use the default value. This is the tricky one in property mapping relationships. The first half is very similar, very similar to the attribute mappings. There's a key path there, right? Go to the source and get its chef.
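For attributes, a property mapping boils down to "destination key, plus an expression evaluated against the source object." The Python sketch below imitates that with key paths on dictionaries. The mapping table, the default of 0 for rating, and both helper functions are assumptions for illustration rather than the NSPropertyMapping API.

```python
def value_for_key_path(obj, key_path):
    """Walk a dotted key path such as 'chef.name' through nested dictionaries."""
    for key in key_path.split("."):
        obj = obj[key]
    return obj

# Destination attribute -> key path on the source (None means "left blank").
RECIPE_TO_RECIPE_ATTRIBUTES = {
    "name": "name",
    "directions": "directions",
    "rating": None,  # new in version 2: nothing to pull from the source
}

# Hypothetical default values declared by the destination model.
RECIPE_DEFAULTS = {"rating": 0}

def apply_attribute_mappings(source, mappings, defaults):
    """Build a destination record by evaluating each mapping against the source."""
    destination = {}
    for destination_key, key_path in mappings.items():
        if key_path is None:
            destination[destination_key] = defaults.get(destination_key)
        else:
            destination[destination_key] = value_for_key_path(source, key_path)
    return destination

old_recipe = {"name": "Tacos", "directions": "Warm the tortillas.", "cuisine": "Mexican"}
new_recipe = apply_attribute_mappings(old_recipe, RECIPE_TO_RECIPE_ATTRIBUTES, RECIPE_DEFAULTS)
```

The resulting record carries over name and directions, gets the default rating, and has no cuisine string because no mapping asked for it.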

There's one other thing we need to do with relationships. And for here, let me have a little parenthesis here for those people that are not familiar with Core Data, why this is an issue. Core Data manages an object graph for you. If you were at last night's presentation, we managed that object graph by keeping all of your objects inside an instance of something which we're calling the managed object context. So that's where, as long as your instances are inside a managed object context, we know how they're all related to each other.

There's a little bit of a trick that you have to be aware when you're doing Core Data applications. You can't just take an object from a managed object context and put it in another context. You can't just say, oh, here's one. Let me insert it over here. Just by the actual instance, you can't just insert it like that. Because that object is related to a whole bunch of other things. It could have pending changes. And there's a lot of sophisticated stuff we're doing for you.

So you can't really just take it over. You can do it, but you have to do it in a different way. You have to recreate an instance of that object in the new context. Or you have to get the ID for the object and tell the context saying, go to the store and get the equivalent instance of that object in the store. So basically, just remember, you can't just take an instance from the source and stick it into the destination. There is a little bit of transformation involved.

So for recipes, we can't just take a key path that you give us. We also have to know, okay, you're telling us to migrate a chef and bind it in this particular property mapping. But when was this chef created? And what did I create when I was migrating chefs?

So we also need to know the name of the mapping that was used to create the chef so that we can migrate the chef. So we can take that chef instance and create the equivalent instance over in the destination. This is probably one of the toughest points in the migration architecture that we're presenting. We'll be going over this again throughout the example. So this is just the first time I go through this. NSPropertyMapping is the runtime class that does this. Let me show you a demo of how you would create a mapping model.
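The two-part relationship mapping (a key path plus the name of the mapping that created the related object) can be sketched in Python as below. The `MigrationContext` class and its methods are hypothetical and only illustrate the bookkeeping: remember which destination object each source object became under each named mapping, then use that to translate a source relationship into a destination one.

```python
class MigrationContext:
    """Remembers which destination object was created from which source object."""

    def __init__(self):
        self._created = {}

    def associate(self, mapping_name, source, destination):
        # Key on the mapping name and the identity of the source object.
        self._created[(mapping_name, id(source))] = destination

    def destination_for(self, mapping_name, source):
        return self._created[(mapping_name, id(source))]

def map_relationship(context, source, key, mapping_name):
    """Follow `key` on the source, then look up the migrated equivalent."""
    related_source = source[key]
    return context.destination_for(mapping_name, related_source)

# Source object graph (version 1).
source_chef = {"name": "Ana Lopez"}
source_recipe = {"name": "Tacos", "chef": source_chef}

context = MigrationContext()

# Pass 1: the ChefToChef mapping creates the destination chef.
destination_chef = {"first_name": "Ana", "last_name": "Lopez"}
context.associate("ChefToChef", source_chef, destination_chef)

# Pass 2: RecipeToRecipe needs the *destination* chef, not the source instance.
destination_recipe = {"name": source_recipe["name"]}
destination_recipe["chef"] = map_relationship(context, source_recipe, "chef", "ChefToChef")
```

The destination recipe ends up pointing at the chef that lives in the destination, never at the source instance, which is the constraint the talk describes for managed object contexts.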

So again, we're not yet ready to migrate our application. We're going into our second of three demos. This is where we create the mapping model. So let's review where we are. We now have our versioned Core Data model that we did in the first demo. You'll notice that I opened a different project, not the one I was originally working in during the first demo.

The only trick I'm doing there is that I already have all of the changes for the model. I didn't want to sit in here and type all of these things in front of you. So the only change in this project is that we have the completed version 2 of the model. Right. So we're ready to migrate, but we want to create a mapping model. If you go to New File, you will find a new file type under the Design group: Mapping Model.

Recipe, so let's call this Recipes from 1 to 2. This is probably where you'll have to think a little bit as how you want to name your mapping models. We have a new assistant where we allow you to look for your source and destination models. So here we are. Here's my source, right? Version 1.0 of my recipes model. And here's my destination. Click Finish. And we create the mapping model for you.

So you don't even have to do everything by hand. We know your source, we know your destination, we look at them and we do our best effort to create the mapping model that we think you need. So here are the source and the destination columns. Those are the tables I showed in my previous slide, right? We're going from chef to chef. We're going from recipe to recipe. We're going from ingredient to ingredient.

We don't know what you want to do with cuisine. So we just said we're going from nothing to cuisine. But you would come in here and change your mapping model and say, you know, I'm really going from recipe to cuisine. That's where you start modifying your mapping model.

You'll notice that we also pre-populated all of the things that we could figure out for you. For example, in the recipe to recipe mapping, by the way, we're naming the mappings. Not only do they have a source and a destination, we also give them a name. It's important to use their name when we're using them for relationship mappings. So the simple name is recipe to recipe. So here's the recipe to recipe mapping.

And we figured out that you had directions in 1.0, so we're assuming that you wanted to migrate it to directions in 2.0 because you have it. Name is there. Rating, we didn't find it in the source, so we leave it blank, but that's okay because we know that rating is a new field. Chef, you probably want to migrate chef. Ingredients, we're actually doing more with chef here. I'm sorry, it's in the other relationship. Ingredients, we're migrating your, we found amount and name in the source.

In the destination, we didn't find recipe. Do you guys remember why? Because recipe used to be recipes in version 1.0 of your model. So you would say, oh, in my ingredient to ingredient mapping, the key path that I want you to traverse is, I want you to go to the source and get the objects from recipes.

You as developers would know that that exists in version 1 of the model. So that's the key path that you want us to traverse. But again, this is a relationship mapping. This is where the tricky part needs to happen. You also need to tell us, okay, we're in the source, we find your recipes, what mapping was used to create those recipes? And you have to tell us, oh, the recipe to recipe mapping is the one that I would have used to create those recipes.

So as long as you navigate that key path and then you use that mapping to do the transformation, we'll be okay. Those are the kinds of things you would tell us in the model. So the point of this demo is that you just point the new assistant to the source and the destination, and we create a mapping model for you. So that's the end of the second demo. Can we go back to slides, please?
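
One way to picture what the assistant generates is as plain data. The sketch below is Python rather than Core Data API, and the class names, field names, and mapping names such as IngredientToIngredient are made up for illustration. Only the entities, the recipes key path, and the rule that a relationship mapping needs both a key path and a mapping name come from the demo.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class RelationshipMapping:
    key_path: str       # key path to traverse on the source instance
    mapping_name: str   # entity mapping that created the related objects


@dataclass
class EntityMapping:
    name: str
    source: Optional[str]   # None: no source entity could be inferred
    destination: str
    attributes: Dict[str, Optional[str]] = field(default_factory=dict)
    relationships: Dict[str, RelationshipMapping] = field(default_factory=dict)


# The inferred ingredient mapping: amount and name were found in the source,
# but the destination's "recipe" relationship had no match and was left blank.
ingredient_mapping = EntityMapping(
    name="IngredientToIngredient",
    source="Ingredient",
    destination="Ingredient",
    attributes={"amount": "amount", "name": "name"},
)

# The developer's fix: traverse the old "recipes" key path on the source and
# translate what is found there through the recipe to recipe mapping.
ingredient_mapping.relationships["recipe"] = RelationshipMapping(
    key_path="recipes",
    mapping_name="RecipeToRecipe",
)

# Cuisine had no counterpart in version 1.0, so its source starts out empty
# until the developer points it at Recipe.
cuisine_mapping = EntityMapping(name="RecipeToCuisine", source=None, destination="Cuisine")
cuisine_mapping.source = "Recipe"
```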

So we're done with all of the basics we need to do the migration. So are you still with me? Are you following this in terms of where we're coming from? We have all of the basics. Now I'm ready to tell you how we do the migration for you. But we needed to tell you how we do versioning first and how you're creating the mapping model. Does this still look pretty easy up to now? Yeah?

So let's put it all together. Let's bootstrap the migration. We're at the point where we're going to start the migration. Again, you gave us a model which is a new version of your schema, but your data is in the old version of the schema. So we have those two things. This is the third time I've said this.

We need to find the source model that was used to access your original store. We need to find that because we haven't changed that requirement. We still need to access a store with the right version of the model. We're not doing any magic there. We still need to have that model version that you used to create your store. So we need to find that.

You're giving us a mapping model. We need to find that somewhere in your application's resources. Oh, in the demo, I should have compiled that mapping model. I'll do it in demo three. When you compile the mapping model, the extension at development time is xcmapping, and at runtime it becomes cdm, Core Data Mapping model. So that's the runtime implementation. So we have the mapping model, and we also have the destination URL where you want us to migrate your data to. So this is the bootstrapping side of things. This is what we've talked about.

We take all of these elements and we initialize a migration manager. This is our class that's going to do the work for you. So the migration manager needs to know source model, source store, destination model, destination store, and the mapping model. And then we can do the magic. We can do the migration. We're finally ready to explain to you the migration details.
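
As a rough illustration of the bootstrapping step, here is a Python sketch (not Core Data API) that gathers the five inputs the migration manager needs. The function, the version strings, the dictionary-based lookups, and the file names are all assumptions made for this example.

```python
from dataclasses import dataclass


@dataclass
class MigrationManager:
    source_model: str
    source_store: str
    destination_model: str
    destination_store: str
    mapping_model: str


def bootstrap(store_url, store_version, destination_version,
              models, mappings, destination_url):
    """Collect the five inputs; the lookups are hypothetical dictionaries."""
    # We still need the exact model that was used to create the store.
    if store_version not in models:
        raise LookupError(f"no model for store version {store_version}")
    pair = (store_version, destination_version)
    if pair not in mappings:
        raise LookupError(f"no mapping model for {pair}")
    return MigrationManager(
        source_model=models[store_version],
        source_store=store_url,
        destination_model=models[destination_version],
        destination_store=destination_url,
        mapping_model=mappings[pair],
    )


manager = bootstrap(
    store_url="recipes.store",
    store_version="1.0",
    destination_version="2.0",
    models={"1.0": "Recipes 1.0 model", "2.0": "Recipes 2.0 model"},
    mappings={("1.0", "2.0"): "Recipes 1.0 to 2.0 mapping"},
    destination_url="recipes-migrated.store",
)
```

If either the source model or the mapping model cannot be found, the sketch stops before any data is touched, which mirrors the point that nothing can be read from the old store without the matching model.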

We do it in three passes. First pass, we go into your source and we create all of the destination instances and we fill in their attributes because we have their sources. Why can't we create the relationships yet? Because the instances aren't all there in the destination yet, right? We're still migrating your sources, so we can't create relationships yet because we haven't migrated all of your object graph.

So that's why we have a first pass. In the second pass, we again start at the beginning and now we recreate the relationships for you. And in the third pass, we do a little bit of cleanup and we just save the store and do the validation. Now, more detail on each one of those passes.

First pass. Before we start the first pass, we need to take care of a couple of housekeeping things. We need to disable validation rules because we're doing the migration in multiple passes. So there are going to be times during the migration where the object graph is not going to be consistent yet because we're not done. So within your model, we need to say, oh, don't worry about validation rules just yet because we're migrating.

The second thing we need to do is we need to disable your custom classes in your source model so that we don't have class conflicts. We now have two Core Data stacks, right, that are coming up. So you probably are interested in your destination level classes. So at the source model, we're not really executing those classes within your app. We're simply using them to load the original data and migrate it over. So we can disable the custom classes that you're using for your entities and use managed objects.

That's the housekeeping we need to do before we start a migration. For the migration itself, again, this is what you would have done in the past. We're not inventing a new migration algorithm. It's just that we now do the work for you. We iterate through each one of those mapping entities that you have in the model, in order. This is one area where ordering is important.

You can specify the order within the model editor. So we go down each one of those mapping entities that you gave us. You told us the source, so we go to the source and fetch all of those instances. You told us the destination, what you want us to create. So we create the corresponding destinations. And you told us how to populate the attributes. So we populate the attributes.

This is a graphical example. Left hand side is the source stack, right hand side is the destination stack. We start examining your mapping model. We find an entity mapping that says chefs to chefs. We fetch all of the chefs. We iterate through each one of those instances and create its corresponding destination instance in the destination stack.

That's it for the chef to chef entity mapping. We now go on to the next rule in your mapping model, the recipe mapping. We fetch all of the recipes, and we create the corresponding destination instances in the destination stack. Don't be concerned when I say we fetch all, because in the entity mapping, you can indicate a filter. So you can have multiple entity mappings for a particular source entity. This is not the example we're following, but say you're migrating a person class into employees and managers. So in version 1.0, you had only a person table, and now you're migrating to employees and managers.

So you would have two mapping entities, each one filtering your people on their title. You are able to filter the mapping entities on whatever predicate you want to indicate. I didn't show that in the model, but that's allowed. So that's the end of the first pass.
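
The first pass, including the filter idea from the person example, can be sketched in a few lines of Python. This is a conceptual model only, not the framework's implementation: the dictionary layout, the mapping names, and the sample people are assumptions for illustration.

```python
def first_pass(source_objects, entity_mappings):
    """Create destination instances, fill attributes, remember who came from whom."""
    destination_objects = []
    lookup = {}  # (mapping name, id of source instance) -> destination instance
    for mapping in entity_mappings:              # mappings are applied in order
        for src in source_objects:
            if src["entity"] != mapping["source"]:
                continue
            if not mapping["filter"](src):       # the optional predicate
                continue
            dest = {"entity": mapping["destination"]}
            for dest_key, src_key in mapping["attributes"].items():
                dest[dest_key] = src[src_key]
            destination_objects.append(dest)
            lookup[(mapping["name"], id(src))] = dest
    return destination_objects, lookup


people = [
    {"entity": "Person", "name": "Ana", "title": "Manager"},
    {"entity": "Person", "name": "Ben", "title": "Engineer"},
]
mappings = [
    {"name": "PersonToManager", "source": "Person", "destination": "Manager",
     "filter": lambda p: p["title"] == "Manager", "attributes": {"name": "name"}},
    {"name": "PersonToEmployee", "source": "Person", "destination": "Employee",
     "filter": lambda p: p["title"] != "Manager", "attributes": {"name": "name"}},
]
migrated, lookup = first_pass(people, mappings)
```

Note that no relationships are touched here. The lookup table is the only link between the two stacks, and it is what the second pass relies on.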

I know we have ingredients and cuisines, but this is what illustrates the first pass migration. We'll be talking about how we're migrating cuisines a little later on, but it's still part of this. We're done with the instances, but they're not related. We need to go back to the mapping model. Again, iterate in order through your mapping entities.

At this point, we're not looking at each source instance in your source stack. We've already created destination instances. So we're looking at each-- I guess I should step this way, because this is destination for you guys, right? So we're looking at each one of the destination instances and say, OK, so I have a destination instance, but it's unrelated. How do I relate it?

So we navigate the key path that you gave us, and we do the transformation of that object. So this is how we recreate the relationships. This is the end of our first pass. We have Chefs and Recipes in the source stack, Chefs and Recipes in the destination stack. This is the data you care about, the new version of your data.

But we know that that data is related. Chefs are related to recipes. So how would we relate this recipe, that first sandwich up there? We can see, you guys are all very smart, you know that it's related to that first chef that you see up there, but how does the framework figure this out at runtime?

We know what source recipe that one came from. So we say, we have this recipe. Let's go to the source. Here is where we navigate the key path that you gave us. Remember that you told us $source.chef? That's the key path navigation. Now we have an instance of a chef.

This is where the trickiness happens. I told you that you can't just take an instance of a chef and put it over in the destination. That doesn't work. We need to know what equivalent instance was created for that particular chef. That was the second part of data you gave us. What was the mapping you were using to create chefs? Chef to chef.

So, oh yeah, we're keeping track of these things. By the way, we're keeping a bunch of lookup tables for you when we're doing the first pass migration. Each time we're creating destination instances, we're storing them and relating them in this lookup table so that we can do this lookup during the second pass.

So at this point the framework knows, oh yeah, that instance of a recipe is related to that instance of a chef. So now we can create the relationship in your destination graph. So this is all we need to do for each one of the instances that you have.
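
The lookup described above can be modeled like this. It is a Python sketch of the idea, not the framework's code: the identifiers, the dictionary shapes, and the restriction to to-one relationships are simplifying assumptions.

```python
def second_pass(source_by_id, lookup, relationship_mappings):
    """Relate destination instances using the lookup table from the first pass."""
    for (mapping_name, source_id), dest in lookup.items():
        for rel in relationship_mappings.get(mapping_name, []):
            source = source_by_id[source_id]
            related_source_id = source[rel["key_path"]]   # like $source.chef
            # Never copy the source object across; find its counterpart instead.
            dest[rel["destination_key"]] = lookup[(rel["via_mapping"], related_source_id)]


# Source stack (version 1.0): one chef and one recipe that points at the chef.
source_by_id = {
    "chef-1": {"name": "Miguel Sanchez"},
    "recipe-1": {"name": "GeekMex", "chef": "chef-1"},
}

# What the first pass left behind: destination instances with no relationships.
new_chef = {"name": "Miguel Sanchez"}
new_recipe = {"name": "GeekMex"}
lookup = {
    ("ChefToChef", "chef-1"): new_chef,
    ("RecipeToRecipe", "recipe-1"): new_recipe,
}

relationship_mappings = {
    "RecipeToRecipe": [
        {"destination_key": "chef", "key_path": "chef", "via_mapping": "ChefToChef"},
    ],
}
second_pass(source_by_id, lookup, relationship_mappings)
```

The two pieces of information the developer supplies map directly onto the two dictionary keys: key_path says where to look in the source, and via_mapping says which first-pass mapping produced the destination counterpart.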

Does that make sense? So that's the hard-- that's kind of the tricky part of the migration. If you get this, you should be OK with everything else. Third pass is just sanity check. You might remember that we disabled the validation rules, but now everything's there, everything's related, so we can reintroduce them. And we can save your store and it's ready for you to use.
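
To see why validation has to stay off until the third pass, consider this small Python sketch. The rule that every recipe must have a chef is an assumption invented for the example, as are the function names.

```python
def validate(recipe):
    # Assumed rule for this sketch: every recipe must have a chef.
    if recipe.get("chef") is None:
        raise ValueError(f"{recipe['name']} has no chef")


def migrate(validation_enabled_during_passes):
    chef = {"firstName": "Miguel"}
    recipe = {"name": "GeekMex"}       # first pass: attributes only
    if validation_enabled_during_passes:
        validate(recipe)               # fails: the graph is not consistent yet
    recipe["chef"] = chef              # second pass: relationships
    validate(recipe)                   # third pass: rules are back on
    return recipe                      # a real migration would save the store here
```

Calling migrate(True) raises after the first pass even though the finished graph would be valid, which is exactly the situation that disabling the rules avoids.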

That's what we do. So this is the job security part of the talk. What do you guys do, right? We don't want to leave you guys with no jobs. There's always that moment where you're happy that we're doing this for you, but you wish we had done it a little differently. So this is where you would plug in your custom code.

Once you've become familiar with our migration way of doing things and you start to think, "Oh, God, I wish these guys had done these things differently," this is a roadmap of how you would want to approach your customization. Whether what you want to customize is in the bootstrapping part of the migration process or in the actual migration part of the migration process.

Do you want to customize where we're finding the models for you, or do you want to create those models at runtime? If you just don't want to use the default behavior there, that's the bootstrapping side of things. Or do you want to customize the actual migration steps of how you're splitting your fields and normalizing your cuisines? I can already tell you that I hope that most of you, when you're talking about customizing, you're talking about customizing the second part of things, not so much the bootstrapping. But it's there if that's what you want to do.

So for the bootstrapping, this is what we do for you. Remember, as long as you give us the right models, we look for things in your main bundle. That's why we now have versioned models within Xcode, because everything's there. It's compiled into the resources of your application. So we simply go in there, look for everything that's required, and we initialize a migration manager. And you notice that the arrow there hasn't gotten all the way to the migration side yet, because we're only ready to migrate.

So this is where you might say, I don't like how you did the bootstrapping. I want to change something. You guys are looking in my default bundle, but my models really live somewhere else. You guys are asking me for a mapping model, but I'm a super sophisticated developer, and I'm actually auto-generating the mapping model at runtime right as you tell me that there's a version skew. So if you want to do all those kinds of things, this is where you would hook in things. Remember that this all happens in the addPersistentStoreWithType method. So, for a persistent store coordinator, you tell it, "I want to work with this store and this model."

All we're doing in Leopard is we're introducing a new policy option, I'm sorry, a new option for the store. The key to access this option is NSStoreMigrationPolicyKey. And the value you would set here is an instance of NSStoreMigrationPolicy. So we already do the migration for you. If you give us an instance of our own class, we'll do the migration for you.

If you give us an instance of your subclass, then we'll do the bootstrapping based on whatever methods you overrode in that subclass. Remember that I said that the default behavior when we detect version skew is to bring up an error panel? This is how you change the default behavior. Even if you don't want to customize the bootstrapping process, you still have to give us this additional option in the addPersistentStoreWithType method. So that's all you do. You don't want that error. You want us to take care of the migration.

You add that option. And I'll show that in the following demo. So within your subclass, you're customizing the bootstrapping process. So once again, you're looking for models in your own weird locations, you're initializing your own migration manager, you're specifying a different destination URL. By default, the destination URL we use is the same one you had with .new appended at the end. So if you want to change that, you would go in here. NSStoreMigrationPolicy. That's the class you want to look at. That's the class you want to subclass if you're not happy with how we did things on the bootstrapping side.
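
The decision described here (error by default, migrate when a policy is supplied, customize by subclassing) can be modeled in Python as follows. This is a sketch, not Core Data code: only the option key name and the .new default come from the talk, and the classes, the function, and the paths are hypothetical stand-ins.

```python
class VersionSkewError(Exception):
    pass


class StoreMigrationPolicy:
    """Stand-in for the default bootstrapping behavior described in the talk."""

    def destination_url(self, source_url):
        return source_url + ".new"     # default: same URL with .new appended


class CustomLocationPolicy(StoreMigrationPolicy):
    """A subclass overrides only the step it wants to change."""

    def destination_url(self, source_url):
        return "/tmp/migrated/" + source_url.rsplit("/", 1)[-1]


def add_store(url, store_version, model_version, options=None):
    options = options or {}
    if store_version == model_version:
        return url                                   # nothing to migrate
    policy = options.get("NSStoreMigrationPolicyKey")
    if policy is None:
        raise VersionSkewError("store and model versions differ")
    return policy.destination_url(url)               # where migrated data goes


default_result = add_store(
    "/data/recipes.store", "1.0", "2.0",
    {"NSStoreMigrationPolicyKey": StoreMigrationPolicy()},
)
custom_result = add_store(
    "/data/recipes.store", "1.0", "2.0",
    {"NSStoreMigrationPolicyKey": CustomLocationPolicy()},
)
```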

Second thing, I want to customize the migration. This is where I would hope most of you would plug in your code. Again, we do a lot for you. We do everything you had to do in the past in the Tiger timeframe. We do the three-pass migration. We fetch all your sources. We create the destinations.

We go through them again and recreate the relationships. We make sure that validation is disabled and enabled at the right time. We do all this for you. But there are going to be certain migration steps that we can't figure out from the model. For example, the splitting up of name into first name and last name, and the specific way that you want to normalize your cuisines.

So at this point, you would subclass-- the class that does this for you is NSEntityMigrationPolicy. So if you subclass this class, this is where you would plug in the custom code. Remember that very first example where we saw that whole source code scroll up, and they said, wouldn't it be nice to just write a little method that splits our first name and last name? This is where you would put that method in.

When would you want to subclass NSEntityMigrationPolicy? When you're splitting up or combining attribute values. If all you're doing is navigating key paths, you can do that with the mapping tool. If you're doing fancy splitting up or combining of your models, you probably want that little function where you're doing-- we give you the source, we give you the destination, and you populate it however you want it.

When you're reducing the count of relationships. If you want to make a to-one into a to-many, that's pretty straightforward. The data's already there. We simply put it into a to-many relationship. If you want to make a to-many into a to-one, which ones do you want to keep around? So maybe that's where you want to write your code and say, well, I want to only keep these instances around for my destination.

Or if you want to do conditional creation of destination instances. For example, cuisines: remember that we're uniquing values. So we're not creating a cuisine instance for each recipe. We're only creating a cuisine instance for each unique cuisine value inside of my original instances of recipe. So that's custom code. That's where you will write the custom code. And we'll see that in the demo. NSEntityMigrationPolicy. That's the class. That's what you would subclass.

The two methods-- there's a couple other methods in this class, but the two methods that we're calling that have default implementations are the first pass method that we're calling is destination instances for source instance. That's when we say, here's the source. What should be created for the destination? The default implementation creates an instance of whatever you told us was the destination.

This is where you would plug in your code. You can still use our logic and call super and we create the instance for you. And after you call super, you simply say, oh, but I remember that you guys, Core Data, you still haven't populated first name and last name. So I've already called super. You've already created chefs for me. Now let me fill in the fields in the way that I know how to fill them in.

The second pass method that is called in this class is create relationship for destination instance. So during our second pass, this method has a default implementation where we do the hooking up that I showed you there with the recipes and the chefs. So if you want to do some other more sophisticated re-relating of your objects, this is where you would plug in your code. So we're finally ready to migrate everything over. Demo?

Simple Recipes, demo 3. Let's review. Where are we? We have our two versions of our model, right? Oh, wait, let me backtrack a little bit. You notice again that I opened a different version of the project than the one I was using in demo two.

Why is that? This is a special version of the project that has two targets because I want to compile both the 1.0 version of my application and the 2.0 version of my application. I want to show you what would happen when there's a migration error there. So that's why I'm using a different project here. But we have the source model right there. We have the destination. And we have the mapping model. So let's first launch version 1.0 of our application. No migration yet. This is just version 1.0 of our application. Um, build failed.

Oh no, it said succeeded, right? I don't know. You guys saw the build failed message down here, right? I don't know why that was there. So we have recipes. This is version 1.0 of the app. We have pre-existing data there. The only reason I'm bringing this up is so that I can add data here. Another one. Oh, I already have Miguel Sanchez. Let's use another.

[Transcript missing]

I'm a Mexican and I'm a geek, so let's call this GeekMex. A very easy recipe. All you need is one car. And then you drive... You drive and you eat. It's very, very easy. I recommend it. So we have the-- I'm only putting data in here so that you see that I'm actually migrating something in. OK? So we go back to our app. Let me launch version 2 of our app. We haven't done any migration yet. What happens?

So we're running. You can already see that, oh, we have a fancy new UI. Actually, I hope this-- yeah. So here's what you're talking about, Willis. Error message, right? Like, look, you're giving me a model. You're giving me a wrong version of the store. What's going on here? Error panel. I heard claps there. Thank you.

So this is the default behavior. We detect the version skew for you, right? We have all the versioning magic going on behind the scenes and all we do with that is, "Oh, can't do this. Fix it." Here's our fancy new UI, but there's no data because we haven't migrated yet. So let's go back to our app. We have a mapping model.

We have two additional changes. Remember that we had already done everything we needed to do for recipes and ingredients? For chefs, there's really no way of specifying here how you want to fill in first name and last name. Well, there is, but it's a little more advanced. So you'll notice that you can select the mapping type to be custom here. So all I did was select custom both for chef to chef and recipe to cuisine. So here's another custom, right?

All we do when you tell us custom here is that we enable this field right here, and this is where you tell us the name of your subclass of the entity migration policy. So that's what's going on there. Both of these require a little bit of custom logic, and here's where you tell us what class you want to use. Now, even though it says custom there, we don't leave you all by yourselves, right?

You can still use the tool for some parts for populating the fields that are easy that we're still supporting. So, for example, in the chef, we don't know how to migrate your first and last name for you. That will be custom code, but you can still use the mapping model to indicate how you want to repopulate your relationships, right? So the fact that you're saying that something is custom there doesn't mean that we're completely ignoring all of the mapping information that you specify.

So whatever you specify in here, as long as you call super in your subclass implementation, we'll still do as much as we can for you. So custom doesn't mean you do everything. You only do whatever you want to plug in. Let me see, before I show you the code, what else do I want to show you here in the model? So that was for a recipe.

Here's another example where, even though we're doing custom migration at the cuisine level, all our custom code is going to do is determine when to create a different cuisine instance, because we're doing the uniquing. So the custom code you guys are writing will have to deal with uniquing the values. But once we've done the uniquing, we actually can populate the fields using the model mapping rules. For example, the name of the cuisine is going to come from the source's name. Let's parse this. What is the source?

A source is an instance of the recipe. The recipe has a name field, recipe.name. I get that and I fill that in with the recipe's name. The trick is how do I determine that I only create a limited number of cuisine instances for all the duplicate names that I have. So that will be the custom code, but you can still use the mapping model to do the population. You can also populate the relationships in here, right? Here's a tricky one. The key path.

Where did the key path go? Why do I only have source here? Well, what I'm doing here is I'm hooking up. Here I'm in cuisine. That's the destination. Cuisine has a recipes relationship. And I want to populate that relationship with the sources. The sources were recipes, right? That's what I'm populating. I'm not really going source dot something. It's the recipes themselves that I want as the destinations in this mapping. So that's the mapping model. Now we look at the code. There isn't much.

Actually, let me start out with ChefMigrationPolicy, a subclass of NSEntityMigrationPolicy. You override the first pass method, destination instances for source instance. You call super because you still want to use all of the mappings that you gave us in the model. But once you've called super, you want to get a hold of that new chef we created for you right here. And here you have your custom logic, right?

This is where you say, oh, yeah, here's the old chef. It comes in as an argument in the method. And this is where you know how to split the data. And you split up that name field into first name and last name. So you're only writing this little bit of code.
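
The demo's source code is not part of this transcript. As a rough analogue of the chef policy just described, here is a Python sketch: the base class is a stand-in written for this example rather than the real NSEntityMigrationPolicy, the attribute names firstName and lastName are assumed, and splitting on the first space is a simplification.

```python
class EntityMigrationPolicy:
    """Stand-in for the default first-pass behavior (not the real class)."""

    def __init__(self, destination_entity, attribute_mappings):
        self.destination_entity = destination_entity
        self.attribute_mappings = attribute_mappings

    def destination_instances_for_source_instance(self, source):
        dest = {"entity": self.destination_entity}
        for dest_key, src_key in self.attribute_mappings.items():
            dest[dest_key] = source[src_key]
        return [dest]


class ChefMigrationPolicy(EntityMigrationPolicy):
    def destination_instances_for_source_instance(self, source):
        # Let the default logic create the chef and apply the model's mappings.
        chefs = super().destination_instances_for_source_instance(source)
        new_chef = chefs[0]
        # Custom part: split the old name. Splitting on the first space is an
        # assumption for this sketch; real names may need more care.
        first, _, last = source["name"].partition(" ")
        new_chef["firstName"] = first
        new_chef["lastName"] = last
        return chefs


policy = ChefMigrationPolicy("Chef", {})
old_chef = {"entity": "Chef", "name": "Miguel Sanchez"}
new_chef = policy.destination_instances_for_source_instance(old_chef)[0]
```

The design point is the call to super: the subclass only adds the part the mapping model cannot express and leaves instance creation to the default logic.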

Cuisines, again, a subclass of NSEntityMigrationPolicy. Here you have an init method where you're keeping a dictionary for uniquing purposes. You override the first pass migration method. And how do you determine whether you want to create a new instance of a cuisine or use a pre-existing instance of a cuisine?

Well, you know that your source entity was recipe, so you're getting recipes here as an argument. You look at the recipe's cuisine name and then you look for the cuisine in your lookup table. If it's already there, it means you've already created an instance for it. So you just return that. If it's not, then you use our default implementation. You call super.

That's what we do. That's where we create a cuisine for you. And then you populate the name. Actually, you don't need to populate the name anymore because we did that at the model level. This is old code. And then you store it in your lookup table, right?

So this is your normalization code. This is what you would write to say, oh, I only want to create destination instances in this particular situation. So this is where you would code that particular situation. You'll notice that we didn't even have to do anything with the second pass migration method because we were able to do that at the mapping model level.
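
The uniquing logic described for cuisines can be sketched the same way. Again this is Python standing in for the demo's code, which the transcript does not include. The base class is a stand-in, the recipe attribute named cuisine is an assumption, and the sample recipes other than GeekMex are invented.

```python
class EntityMigrationPolicy:
    """Stand-in for the default behavior: one new destination per call."""

    def destination_instances_for_source_instance(self, source):
        return [{"entity": "Cuisine", "name": source["cuisine"]}]


class CuisineMigrationPolicy(EntityMigrationPolicy):
    def __init__(self):
        self.cuisines_by_name = {}     # lookup table used for uniquing

    def destination_instances_for_source_instance(self, recipe):
        name = recipe["cuisine"]
        existing = self.cuisines_by_name.get(name)
        if existing is not None:
            return [existing]          # already created for an earlier recipe
        created = super().destination_instances_for_source_instance(recipe)
        self.cuisines_by_name[name] = created[0]
        return created


policy = CuisineMigrationPolicy()
recipes = [
    {"name": "GeekMex", "cuisine": "Mexican"},
    {"name": "Tacos", "cuisine": "Mexican"},
    {"name": "Pasta", "cuisine": "Italian"},
]
results = [policy.destination_instances_for_source_instance(r)[0] for r in recipes]
unique_cuisines = {id(c) for c in results}
```

Three recipes produce only two cuisine instances, because the second Mexican recipe gets back the instance created for the first.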

The third change: we said that by default when we detect version skew, we report an error. So we add a policy as an option here. Here's the addPersistentStoreWithType method. We create an instance of NSStoreMigrationPolicy because you're not doing any customization of the bootstrapping process, so you're simply creating an instance of our default implementation.

You're putting it into a dictionary with a new key, and you're setting it as one of the options, right? So very little code, right? You had a little bit of code here to tell us to do the migration, and you had your two basic subclasses with a few lines of code there. You compile and you launch. And there's your migration.

So could we go back to the slides? So I hope you're excited about all this if you struggled with this in previous versions of Core Data. We've been doing this for about three weeks now, so we have hours of wisdom to communicate back to you. So this is where I enlighten you.

This is also where Darth Vader would say to Luke, beware of the dark side of migration. So remember that I left out certain things in that first slide. We didn't talk about migrating the UI and your own code. So we do a lot of stuff for you within migration. But migration as a procedural problem is something that you have to take seriously.

You also have to migrate your UI. For that, you might want to look at some of the new refactoring functionality we have within Xcode. Where it gets tricky is when you're managing your migration paths in deployed applications. If you have one customer base and you're only migrating from one to two, that's pretty straightforward. But say you have a bunch of customers: this guy's on version one, others are on version two, version three, version four. Or even within your own development cycle, you're changing your model every single day of the week.

By the end of the week, you have five different upgrade paths and all of the middle paths. So this is where you could get in trouble. We do a lot of the migration for you, but you have to be careful to really know when you want to do the migration. So you have to manage migration well. You also have to consider performance. This is one of those areas in Core Data where we have to touch all of your data. We have to move it from the source to the destination.

So we're touching all of your object graph. We're not doing it all the time, only when you're doing the migration, but we have to do that. Please review your mapping model. We're doing a lot of key value coding to get values and to set values. So if when you were creating the mapping model, you put in the wrong key path, and that key path happens to be hit on the millionth instance you're migrating, and you've been waiting for, well, a million instances, what, 30 seconds? Core Data, we're very efficient, right?

So 30 seconds into your migration, you have a key value coding exception. So review your model, please. And remember, you can't just move stuff over from a source stack to a destination stack without some sort of transformation. So be careful when you're navigating key paths in your mapping model. Always think, what is this key path? Is this a simple value? Okay, no problem. Is this key path an object? If this key path is an object, I need to do that extra transformation that I did for relationships, so do be aware of that.
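
One way to act on the advice to review the mapping model is to check every key path against the source model before migrating anything. The Python sketch below illustrates the idea only. The data layout and function are hypothetical, and it checks just the first component of each key path.

```python
def missing_key_paths(mapping_model, source_model):
    """Return (mapping name, key path) pairs that the source model cannot satisfy."""
    problems = []
    for mapping in mapping_model:
        known = source_model.get(mapping["source"], set())
        for key_path in mapping["key_paths"]:
            first_key = key_path.split(".")[0]
            if first_key not in known:
                problems.append((mapping["name"], key_path))
    return problems


source_model = {
    "Recipe": {"name", "directions", "chef", "ingredients"},
    "Ingredient": {"amount", "name", "recipes"},
}
mapping_model = [
    {"name": "RecipeToRecipe", "source": "Recipe",
     "key_paths": ["name", "directions", "chef"]},
    {"name": "IngredientToIngredient", "source": "Ingredient",
     "key_paths": ["amount", "recipe"]},    # mistake: 1.0 calls it "recipes"
]
problems = missing_key_paths(mapping_model, source_model)
```

A check like this fails in milliseconds instead of partway through a long migration.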

[Transcript missing]