WWDC06 • Session 114

Core Data in a Nutshell

Application Technologies • 1:07:46

Core Data provides a framework for object graph management and object persistence, automatically taking care of tasks such as undo and redo and saving data to disk. In this session you will learn about Core Data's architecture, and how you can take advantage of the technology to create a more full-featured application with less code.

Speakers: Matt Firlik, Ben Trumbull

Unlisted on Apple Developer site

Transcript

This transcript was generated using Whisper, it has known transcription errors. We are working on an improved version.

What is this Core Data thing? So we tried to come up with a one-sentence description for what is Core Data. And it actually took my team a little bit of time to come up with something that's succinct, not really really long, end up with arguments about is persistency really a word? But this is what we ended up with: model-driven object graph management and persistency framework.

Exactly. We actually ended up, we asked a couple people this question, what is Core Data? And usually when we give a response, we get one of two reactions. We either got the okay nod, or we got the 90-minute whiteboard, scratch paper, textbook, all kinds of big long conversations version. That's not actually because Core Data is complicated. How many of you people have read the docs? Come on, lie to me, raise your hands.

Okay, if you've read the docs, you'll see that our API is relatively small. It's somewhat minimal for working with this problem set. So the conversations take a while, not because Core Data is complicated, but the problem space is complicated. Dealing with all the different issues of data management, persistency, undo, redo, and all sorts of stuff is pretty complicated.

And a lot of times you'll be working on your application and realize, I didn't really know this was going to be a problem. The fact is that Core Data provides solutions for a lot of these different problems. So usually when you're working with Core Data, it takes some time to explore the problem. But if you're working with Core Data, it's going to be a lot of fun.

So let's take a look at some applications that are in Mac OS X that use Core Data as kind of a sample of what's possible. So if you were to log into Mac OS X and start up an application, the first thing you might see is something called Aperture. Aperture is Apple's leading photographic management application. And this is a behemoth. This is a pro app. And much like a pro app, this is one that's adopted Core Data exclusively for all of its data management.

This is an application that deals not just with hundreds of thousands of photos, but with potentially hundreds of pieces of metadata for every single photo, hundreds of pieces of keywords, all sorts of kind of different data for this application. This application has high data, but also has low latency requirements.

And because it has lots of data and lots of complex data, they want complex queries. They want to be able to funnel it down to keep their working set very, very small. And they can do that with Core Data. They're actually able to use it as well for something we call data replication.

Which is actually saving off pieces of their data so that any time something happens, they can reconstruct their project. And Core Data makes that really easy for them to do. But it's not just about the big apps. Let's take a look at a couple of their small applications. AddressBook and iCal. And you might be thinking, I've seen these applications before. I didn't know they were using Core Data. In Leopard, they are. And they're not actually interesting for their application space, but because they're both frameworks. Both AddressBook and iCal have adopted Core Data for their persistency.

And they're able to use it in Leopard for framework implementation. And this is actually interesting for a number of reasons. The first is when you think about AddressBook, you think about an API that's been around for a while. So in order for them to adopt Core Data, they needed to find a way to fit it into existing API to make sure that customers didn't have to change. And they were able to do that very easily. They also had a legacy schema that they needed to pull over and to put into Core Data to make this work. And that's possible too.

iCal is in a little different situation. They have their calendar store framework now that they're publishing in Leopard. They don't actually have an existing API to worry about, but they do have file formats. If you look in the ICS file, there's the V-Cal standard in there that they need to maintain for compatibility. And Core Data makes that possible.

But what's most interesting about these two is not that they're frameworks, but also that they have multi-process access. They have high concurrency needs. You think about applications, just about every application uses one of these frameworks in one way or the other. Whether it's the new AddressBook or the new iCal functionality. So these people have really strong requirements for iCal. They have high requirements for lots of people accessing the same framework and dealing with multiple access. And Core Data provides a solution for them.

Talking about other apps that you might think that might not be able to use Core Data, you think Mail. Well, okay. Mail has a very long tradition of having a very, very good schema for their data management. And frankly, they don't really need Core Data. Core Data works, or their existing schema works great for what they have.

But they have some auxiliary data needs. Storing the POP message counts and tracking what you've downloaded. That's something that they had to do themselves and realized there's a solution. It's Core Data. So you don't need to use it for the entire breadth of your application. You can use it just for a small piece.

They're also something that has just a moderate data set. You don't need something large. It's just a small amount of data that you can still get a lot of benefit from Core Data. How many of you people in here have actually done AppleScript? Currently do AppleScript? Ah, quite a few.

How many of you have used System Events? Not many. There's actually a functionality that we produced in Tiger called System Events. And in there is something called Database Events. The AppleScript team put together a background process, no UI at all, to allow people to use databases in AppleScript.

And they were able to use Core Data as a way to develop a generic schema backend so that you could store lots of different data types and use it seamlessly with AppleScript. So it's an example of something that doesn't have a lot of application logic in it. It's just something that manages data. It has no UI for it. It's just something that's done by AppleScript. But that gets a lot of benefit from something like Core Data.

Then we talk about something like Xcode. And I put Xcode in a category of its own, and I'm going to call it the kitchen sink. And this is because they have a lot of different needs for data management, and they're all very, very different. For example, the documentation window is now using Core Data for its API and its full-text searches.

We're also using Core Data (we eat our own dog food) as our management for all of our design models. So the migration model is done using the persistent document implementation. And Xcode is also very unique in that it's using Core Data underlying for managing how their modules are related to one another and what kind of functionality they provide.

It has really nothing to do with the project file or the source code file or anything else, but it's a really good way to structure how the modules are related to one another and how they all fit in at runtime. So you don't necessarily need to have large data needs or a specific data set in order to use Core Data. You can use it in some very interesting ways. So after looking at all these different examples, you still might be thinking, okay, I'm still not sure. Is Core Data right for me or not? So let's take a look at that.

Let's go back to our original definition, this one that you all kind of laughed at, and let's break it down into some more fundamental pieces. So let's break it down into four or five different areas that we'll go into a little more detail. So the first one is modeling. It's defining a structure for your data.

And this might be something you're sitting there thinking, "You know, I don't really need this right now. I've got all my classes, I've got all my data management, this is not really important to me." This is actually something that you probably don't need, but it's probably something you want, whether you realize it or not.

This is something where defining the way your data is structured gives you the ability to do things like provide default values, provide validation and constraints to say, "My data has to exist this way." It may not seem like a lot, but it becomes very, very useful when dealing with user interface and user workflow kind of things. But what's further important is by defining a structure of your data, by modeling your data, you get really large benefits from like versioning and migration.

What does it mean to have this version of my application and this version? What changed? By having a structure of your data, you can model your data. By having a defined schema, you can actually see and the framework might actually help you deal with some of those things.

Object Graph Management is probably what you guys think of as the hallmark of a framework, I guess. This is something where this is really the nuts and bolts or the details of using a persistence framework. And there's a lot of things we do: object manipulation, more validation, all those kinds of things. But I would actually be willing to bet that for the second bullet up there, for undo and redo, Core Data is absolutely worth it, if for no other reason. We've all been there.

We've all had an application where you've started up and you did something and you went, "Ooh, that wasn't right." And we all do it. We all look down and see Command-Z and you're like, "I'm not really sure if that's going to work." But you tried anyway and you do and you're like, "Ah, that didn't happen."

That's just not right. Because undo should work. It's a fundamental thing, but we all know it's very, very difficult to do and to get it right. What does it mean to roll back changes? What about after saves? And to do all those things. But if every application had undo for free, I think we'd all be a little more whimsical with things that we do inside our apps.

But there's also things like dealing with references and inverse relationships and delete rules. What happens if I delete this other thing? Should that go away? These are all things that you shouldn't have to worry about. You should really be thinking about how your application is structured, how the data is related to one another, and getting on with the core of your application.

Structured access is something that I think is very important. We all have probably had large data files of some kind and realized, I need to find something, I need to get into something. You go to your top shelf and you pull off your dusty CS manual and you're flipping through to try to figure out what's the best way to sort this? What's the best algorithm to find this? You shouldn't have to do that. There should be a really nice structured way of defining your queries.

In fact, it would be even better if you could define queries in a templatized way that you could come back to later and say, this is my query, just apply it over and over again. Things like filtering and sorting. Again, you shouldn't have to worry about this. There should be a structured way to do this. And Core Data, along with the Foundation Framework, provides that for you.

Data persistency is probably the thing that you think of as one of the hallmarks, the foundations. I need to load my data, I need to save my data, validate that it looks right. These are all things that are very important. But what about formats? Do you have options to save your files out in different formats?

Could it be one? Could it move to another? Could it start from one and become another? What about different options like read-write capability? Can you mark things as read-only and not have to worry about it through any other piece of your application? These are little nuanced details that the framework also provides.

But I also think that one of the really important things are opt-in initiatives. You guys are here all week to find out about the brand new things that are in Leopard and all the other different frameworks. And these are things that are really fundamental to the way you do things or opportunities that you'll have later that, again, you guys really shouldn't have to worry about. Things like key-value coding and key-value observation, making sure that your objects play nicely with Cocoa bindings and interfaces. But what about the new things that we've heard about this week? The new 64-bit stuff, the garbage collection, property support.

These are all things that a framework really should provide for you. So again, you don't have to worry about these little details. You can just get on with the core of your application. And these are things that Core Data does for you as well.

So we've heard a lot about what Core Data is, and we have a moment of honesty here. Core Data is not for everyone. Sometimes, honestly, Core Data is not an option. And there are three specific instances that we should cover, just to be completely on the up and up.

The first is if you can't link the Foundation Framework. We do a lot with the Foundation Framework, both with key-value coding, key-value observing, all sorts of different stuff with Objective-C. If you're not able to link Foundation, unfortunately Core Data is not going to be right for you.

The second is if you require what we call an infinitely malleable schema. If you have an application where users need to be able to provide the most flexible schema possible, or they need to be able to make changes, add their own properties, change relationships at runtime, do all sorts of different things.

If you can't define that in a way that's kind of abstracted, Core Data is unfortunately not going to be right either. We like to have a very well defined schema, because we make a lot of our functionality based on the fact that that doesn't change. So if you need a malleable schema, unfortunately Core Data is not going to be right for you.

And the last one, and I know this might be near and dear to some of you, but if you need true client-server access at this point in time, Core Data is not going to be the solution for you. Now that's not to say that you can't provide an application that does some kind of distributed notification stuff and all access the same store, but if you need something that's truly client-server oriented, Core Data is not going to be the solution for you.

So we covered what Core Data is and some of the things it's not. For the rest of this presentation we're going to go through two of the more fundamental pieces: the modeling and persistence side, the object graph management side, and then for all of you who are familiar with Core Data, we're going to go through the new features that we have a lot of other sessions this week to cover in more detail.

So let's get on to modeling and persistence. Modeling is, as we said before, it's kind of the formalization of your application's data model. And it defines two major functional things. It defines the type and structure of your data, and it defines the interrelationships between pieces of data. Now, a model for us has a major component, which are entities. And entities, in turn, have properties, and properties can be either attributes or relationships.

So if you look at the diagram over here, you'll see that we have recipe and we have chef. Recipe and chef are both entities. They both have properties. The recipe has attributes like name, cuisine, and directions. So does the chef. Different kinds of attributes. And they also have relationships.

So you can see that recipe is related to chef. Recipe has a single chef, in this case. And a chef has many recipes. So we've created two entities, some properties, and some relationships between them. This defines the structure of our data. This defines how our data is going to look at runtime.

Now actually using that data at runtime, we're going to have objects. So managed objects are the data workhorse. These are the things that contain the values. They also contain the relationships to other objects. These are the things on which we've implemented key-value coding and key-value observing to do a lot of the things that you need to get your work done.

And this is going to be your main element of subclassing. When you want to go and add custom logic or add custom functionality, extend some attributes, do some interesting things with the relationship, this is going to be your unit of work. So this is kind of the noun for most of Core Data is the managed object.

Now it's really important to be clear, though, what's the difference between an entity versus objects? Because a lot of people use these interchangeably, and it can cause some confusion. So consider that entities are the metadata. They're the structure behind things and how things are laid out, while the managed objects are the actual data implementation and representation.

Managed objects refer to a single entity. So when you have a managed object, it is only one kind of entity. But to be a little bit more clear, I have a chart here that's very simple. An entity description is to a managed object as a class is to an NSObject. Everybody read that carefully. So an entity description, or actually an easier way to say it is a managed object is an instance of an entity description. Is everyone clear on that? Come on. Thank you. All right.

Once you have data, we have stores to put them in, you want to put them somewhere. And we have different kinds of stores that have different kind of characteristics. Different stores are better for different things. So you see we have four different store options. We have two atomic stores, our XML and our binary store.

By atomic store, we mean they need to be read in all at once and written out all at once. We also have our SQL store, which has different performance characteristics, because we don't need to read in all the data. So you need to pick and choose a store that's right for you, and we have lots of different options.
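The difference between the two store styles can be sketched outside Core Data entirely. The toy Python below is not Core Data API: the file names, the `photo` table, and all three helper functions are hypothetical. It stands a JSON file in for an atomic store (the whole file is parsed before anything can be queried) and a SQLite database in for the SQL store (only the matching rows are pulled into memory).

```python
import json
import os
import sqlite3
import tempfile

# Hypothetical sample data: 1000 photos with ratings cycling through 0..5.
PHOTOS = [{"name": f"photo-{i}", "rating": i % 6} for i in range(1000)]


def build_stores(directory):
    """Write the same sample data to both kinds of store."""
    json_path = os.path.join(directory, "photos.json")
    sql_path = os.path.join(directory, "photos.sqlite")
    with open(json_path, "w") as f:
        json.dump(PHOTOS, f)
    conn = sqlite3.connect(sql_path)
    conn.execute("CREATE TABLE photo (name TEXT, rating INTEGER)")
    conn.executemany(
        "INSERT INTO photo VALUES (?, ?)",
        [(p["name"], p["rating"]) for p in PHOTOS],
    )
    conn.commit()
    conn.close()
    return json_path, sql_path


def load_from_atomic_store(path, min_rating):
    """Atomic style: read the entire file, then filter in memory."""
    with open(path) as f:
        everything = json.load(f)  # every record is materialized first
    matches = [p for p in everything if p["rating"] >= min_rating]
    return matches, len(everything)


def load_from_sql_store(path, min_rating):
    """SQL style: let the database hand back only the matching rows."""
    conn = sqlite3.connect(path)
    try:
        rows = conn.execute(
            "SELECT name, rating FROM photo WHERE rating >= ?", (min_rating,)
        ).fetchall()
    finally:
        conn.close()
    matches = [{"name": n, "rating": r} for n, r in rows]
    return matches, len(rows)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as d:
        json_path, sql_path = build_stores(d)
        # The second value of each pair is the number of records loaded.
        print(load_from_atomic_store(json_path, 5)[1])
        print(load_from_sql_store(sql_path, 5)[1])
```

Both helpers return the same matches; what differs is how many records had to be loaded to find them, which is the trade-off the speaker describes.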

That's kind of the basic of the persistency of the models, stores, and stuff like that. For all of you who have really read the documentation, there's actually some hidden features in there that, well, they're not really hidden, but they're sometimes overlooked features that we really wanted to cover to get the most out of Core Data for all the times that you use it.

The first feature we want to cover is transient properties. How many of you have used transient properties? Not many. So the idea of a transient property is something that's managed just like everything else, but it's not persisted. And you might be thinking, "Well, why would I ever want to do that?" Consider something like your own custom class, or something like an NSColor.

We don't have a native data type for that. I can't just go into my model and say, "Oh, this is a color." Now, we do have a binary attribute that would allow you to serialize it out as an archive or something like that and read it back in. That's code that you have to write, but it's kind of a pain to think of, "Every time I deal with a color, it's not really a color. It's a data blob, and I need to convert it back and forth."

A transient property gives you the ability to define something that says, "It is this type. Add a little code behind it that does the transformation, and it looks just like any other attribute. It feels like any other attribute, but you don't have to worry about actually figuring out some way to save a color natively as its actual format."
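The pattern described here, a typed value backed by a persisted data blob, can be illustrated with a language-neutral sketch. The Python below is a toy model and not Core Data API: `Color`, `Swatch`, `color_data`, and `persisted_fields` are all hypothetical names, and the three-double encoding is an arbitrary choice made for the example.

```python
import struct
from typing import NamedTuple, Optional


class Color(NamedTuple):
    red: float
    green: float
    blue: float


class Swatch:
    """Stand-in for a managed object with one persisted and one transient value."""

    def __init__(self, color_data: Optional[bytes] = None):
        self.color_data = color_data  # persisted: the only field that gets saved
        self._color = None            # transient: cached, never written out

    @property
    def color(self) -> Optional[Color]:
        # Decode lazily the first time the transient value is requested.
        if self._color is None and self.color_data is not None:
            self._color = Color(*struct.unpack("<3d", self.color_data))
        return self._color

    @color.setter
    def color(self, value: Color) -> None:
        # Keep the persisted blob in step with the transient value.
        self._color = value
        self.color_data = struct.pack("<3d", *value)

    def persisted_fields(self) -> dict:
        """What a save would write: the blob only, not the Color object."""
        return {"color_data": self.color_data}
```

Callers read and write `swatch.color` as if it were an ordinary attribute, while only the bytes in `color_data` would ever reach a store.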

But while most people think of transients as properties, there's also the other side which is a transient relationship, which is also a little bit heady, but it's actually very, very interesting. A transient relationship is one that exists at runtime, but that's not persisted. Now, why would you want to do something like this?

Consider if you have two stores of data with two objects that are actually related. Well, they're in two separate stores. I can't actually bind them together in a way that's going to be meaningful if I just open one store. I can't create a hard relationship. But if I could create a transient relationship such that when both stores are opened and that relationship exists, so I could ask one object for the other or vice versa, and it works just like any other relationship, it provides a really powerful way to do what we call cross-store relationships, or bind two schemas, or take two graphs and connect them in a way that really doesn't exist in a persisted world, but really needs to exist in your runtime. So transient properties can be very, very useful.

But to be honest, you don't actually need to model all of your data. A lot of people have said, "Well, I've got all these instance variables and stuff like that. What do I do with them?" In some cases, you do nothing with them. If your properties are not modeled, if they don't exist in one of our data models, Core Data ignores them. They're not managed at all. Now, of course, you need to be careful when you do this, because when we say that they're ignored, we really mean it. There's no undo, there's no redo, there's no validation, there's nothing.

But it provides a very easy way for you to take a schema that already exists and start to move it to use managed objects, start to use properties and use the attributes and relationships over time, but not have to move it all at once. So you can actually have unmodeled data.

Another thing about the modeling tools it provides is the fetch request templates. This is kind of the templatized queries that I was mentioning before. And it provides a way to create pre-canned requests that you can refer to by name at runtime, and provide your own variables for replacement. This may not seem to be very useful, except that it includes a really easy to use visual development for your queries. It's all sorts of nesting, ands and ors and everything else like that.

And when you have a really complicated query, something that takes a lot of details to look at and figure out how to get it just right, this becomes a really useful tool for creating those. And again, you can just go ahead and use the dollar sign notation to provide variables, and then at runtime you say, "This is my query, these are my variables, go and perform this query." So it makes it really, really easy to create complex queries and refer to them later.
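The idea of a named query with dollar-sign variables filled in at runtime can be sketched as follows. This is a toy Python model, not the Core Data fetch request API: the template names, the clause format, the `rating` attribute, and the sample recipes are all assumptions made for illustration.

```python
import operator

OPS = {"==": operator.eq, ">=": operator.ge, "<=": operator.le}

# Named templates: each clause is (attribute, operator, value-or-$variable).
TEMPLATES = {
    "recipesByCuisine": [("cuisine", "==", "$CUISINE")],
    "topRatedByChef": [("chef", "==", "$CHEF"), ("rating", ">=", "$MIN_RATING")],
}

# Hypothetical sample data.
RECIPES = [
    {"name": "Pad Thai", "cuisine": "Thai", "chef": "Ann", "rating": 5},
    {"name": "Green Curry", "cuisine": "Thai", "chef": "Bo", "rating": 3},
    {"name": "Lasagna", "cuisine": "Italian", "chef": "Ann", "rating": 4},
    {"name": "Risotto", "cuisine": "Italian", "chef": "Ann", "rating": 2},
]


def request_from_template(name, variables):
    """Return a concrete list of clauses with every $variable replaced."""
    clauses = []
    for attribute, op, value in TEMPLATES[name]:
        if isinstance(value, str) and value.startswith("$"):
            value = variables[value[1:]]  # KeyError if a variable is missing
        clauses.append((attribute, op, value))
    return clauses


def execute(request, objects):
    """Keep the objects that satisfy every clause (clauses are ANDed)."""
    return [
        obj for obj in objects
        if all(OPS[op](obj[attribute], value) for attribute, op, value in request)
    ]


if __name__ == "__main__":
    request = request_from_template(
        "topRatedByChef", {"CHEF": "Ann", "MIN_RATING": 4}
    )
    for recipe in execute(request, RECIPES):
        print(recipe["name"])
```

The template is defined once and looked up by name, so the calling code supplies only the variables.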

When creating models, there's the whole concept of combining and targeting models. The Core Data runtime likes to have one model to work from. Now for us, models are just collections of entities. They're really no more sophisticated than that. So you can take two models, and with different pieces of API, you can merge them together.

It effectively just puts all the entities into one model, and the result looks just like any other, as if you had created it yourself. So this becomes a really easy way to take two separate and distinct things, and to put them all into one. But what's probably a hidden feature is something we call configurations, which is the ability to take a model and define collections of entities by name, and say, "This is a collection, and this is a collection," and then to use those collections to target specific stores.

Now, why you might want to do that is, consider you may have a store that says, "Well, my application data is over here, and my user data is over here, and I want to make sure that it's actually split up that way. I want to make sure that the data doesn't actually get mixed up, or go to the wrong place." Or you may have an application that says, "Well, I have all sorts of data, but depending on the type of user that logs in, I really don't want to give them access to a certain set of entities." So by using configurations, you can define specific collections of entities, either targeted to a specific store, or they're only accessible to the application. So it's a very handy feature. You can do this with the API, and you can also define them in the modeling tools.
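Merging models and targeting stores through configurations can be pictured with a small toy model. The Python below is not Core Data API: `merge_models`, `Store`, `store_for`, `insert`, the `Favorite` entity, and the store names are all hypothetical, and treating a duplicate entity name as an error is an assumption of this sketch.

```python
def merge_models(*models):
    """Combine models by pooling their entities; duplicate names are an error."""
    merged = {}
    for model in models:
        for entity_name, description in model.items():
            if entity_name in merged:
                raise ValueError(f"duplicate entity: {entity_name}")
            merged[entity_name] = description
    return merged


class Store:
    """A store that only accepts the entities named by its configuration."""

    def __init__(self, name, allowed_entities):
        self.name = name
        self.allowed_entities = set(allowed_entities)
        self.rows = []


def store_for(entity_name, stores):
    """Pick the first store whose configuration includes the entity."""
    for store in stores:
        if entity_name in store.allowed_entities:
            return store
    raise LookupError(f"no store accepts entity {entity_name}")


def insert(entity_name, values, stores):
    """Route a new object to the store that supports its entity."""
    store = store_for(entity_name, stores)
    store.rows.append((entity_name, values))
    return store


# Two separate models, merged into the single model the runtime works from.
app_model = {"Recipe": ["name", "cuisine"], "Chef": ["name"]}
user_model = {"Favorite": ["recipeName"]}
model = merge_models(app_model, user_model)

# Configurations: named collections of entities, each targeting one store.
configurations = {"AppData": {"Recipe", "Chef"}, "UserData": {"Favorite"}}
stores = [
    Store("app.store", configurations["AppData"]),
    Store("user.store", configurations["UserData"]),
]
```

With this setup, inserting a `Favorite` lands in `user.store` and inserting a `Recipe` lands in `app.store`, so application data and user data stay separate.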

Talking about picking specific stores and targeting specific stores, we listed the in-memory store as one of our options, and it's kind of the unsung hero. Having an in-memory store is an exceptionally useful thing to do when, well, you want to use managed objects, but you don't want to save them.

So reasons you might want to do this are, for example, you have your own file format, and you just want to read in, do some managed object stuff, but you don't want to actually save it out. But you still want to use managed objects to work with them.

As we talked about before, if you have multiple stores, you may actually want to use an in-memory store to represent things that go across them, or relationships that go across them, that again don't exist at runtime, or things that are calculated. It's a really easy way to get all the same benefits of Core Data, but not have to worry about persistence or where the information goes to.

But we've talked a lot about having lots of stores, and this is an important point to point out, that you can use as many as you want. By default, Core Data's API works across all the stores that are added to our coordinator. So all of your fetches work across all of your stores. When you create new objects, we find the right store for it, figuring out which stores can support which entities, or where it's related to and where it has to go.

And it makes it really easy to have lots of different pieces of data in your application with many different backings. Now it's kind of interesting to think about the fact that there's two ways you could really structure this. You could do something like iTunes, where every iTunes library has the same kind of information. We all have songs and playlists in the same manner. So every store has the same schema, and we just add multiple stores so you can look at it.

But again, you may want to structure it so that some data lives in one store, and some data lives in another. And it all just works with Core Data. You really don't have to think about how many stores you have, or where it needs to go, unless that's really important to you. So it provides a really good flexible way to structure where your data comes and goes to.

But of course you don't always have to pick one. As we said before, we have lots of different store options, and the framework is mostly store agnostic. When you start with one store and end with another, the framework just works. Your application logic just works, regardless of what kind of store you're saving to.

And it makes it really easy to, for example, start with the XML store at development time, so that you can open up your store and look at the content, figure out, did I really save that correctly? Is it the value I was really expecting? And then later convert it to the SQL store to get all the benefits of performance and only pulling in the data set that you need. And I'm not joking, it is really one piece of code, one line of code to migrate a store from one version to another. So just because you start with one doesn't mean you need to end with that one.

And finally, NSPersistentDocument. For all you people who've had to write document-based applications, there's the read from URL, write from URL. There's all the different pieces of API you need to implement in order to figure out where your data comes and goes to. We've provided a subclass, NSPersistentDocument, that handles that for you. It provides all of the basic Core Data stack, all the basic stuff that we need to go ahead and pull in your data and persist your data, and all you really need to do is provide a model file.

Just need to tell us what the structure of your data looks like and get on with developing your application. And it's a really, really easy way to get started with Core Data. So actually, let's go ahead and show you a demonstration of that and take a look at what that looks like.

I'm going to go into Xcode here, and I'm just going to go ahead and create a brand new project. And I'm going to go ahead and select the Core Data document-based application here, and give it a name. We'll call this Xcode. Let's call it Photos 2006. And I'm just going to go ahead and create my application.

And there are a couple pieces of interesting information in my project here. There's a model it created for me by default. It added some classes for me. There's my document class, and it added some resources for me. I can show you the code that it put in for your document class, and as you can see, it's minimal.

And that's mostly because of the fact that this is a persistent document subclass. And all the nitty-gritty details about being a persistent document and figuring out where the data comes from and goes to is all in the persistent document. So it's not code you have to worry about.

Let's go and take a look at the data model. This is the data modeling tool, where you're going to go ahead and create your data models. And I'll up the resolution here so you guys in the back can see it. And we just want to go ahead and create a model, create something to structure our data, and allow us to create something on top of it. So I'm going to click the little plus button here, and it's going to add an entity. And I'm going to call this photo.

We're going to replicate the Photo class that you probably saw yesterday in the keynote where we had the grid view and saw different information about Photo. So here's my Photo entity. I'm going to use a little keyboard shortcut here to quickly add three attributes. So you can see my attributes here, and I can go ahead and rename them. We're going to want a Data attribute to store the actual binary data for our Photo.

We're going to go ahead and add a rating so that we can rate how we like the Photos. And we'll give each Photo a name. So I've defined three properties and I've named them. But I need to go ahead and actually define the types behind these different attributes. So for the Data attribute, I'm going to go up on the pop-up here and select Binary Data and set that.

I'm going to go and select the name. I'm going to do the same thing. Select the string. And you'll note that the inspector on the right gives me some different options. I can provide a min and a max length for string. I can provide a regular expression for validation. But I can also define a default value. In this case, we'll define Untitled as the default value for all of our names for our Photos.

And I can also define a rating. And we'll make this an integer. And much in the same way, I can define some constraints. We'll say our ratings are from 0 to 5, and every Photo has a default value of 2. So I've just defined an entity with a couple attributes. Now, let's do something a little more interesting. We can add another entity here, and we'll call this Person. And we could go ahead and again add a couple attributes. Maybe this is First Name and Last Name.

The modeling tool is really good about providing you things like multiple selection, so I can select both of those and say they're both strings. So now I have person and photos, and I want to relate them in some way. I can actually pull down here at the bottom, and there's a little tool for creating relationships, and really all it's going to take is for me to drag from one to the other.

It's going to create a line, and when I let go, you'll see it added the relationship to the person entity. And I can come over here and call this relationship photos, so that I can ask every person object for its list of photos and get the actual photo. Of course, we note here that this little arrow has only got one little end to it. I really want to click the to-many relationship, because a person should really have many photos.

Well, let's actually create the inverse relationship. What about going from a photo to a person? I can just drag backwards, and you'll see that, again, it creates the relationship for me. And I could name this maybe "photographer," and you'll see that now a photo has a to-one relationship to a photographer. But there's also this nice little pop-up here called an inverse, and functionally these relationships are an inverse.

If I go from a person to a photo, going backwards from the photo to the person is really kind of the inverse. So by defining those as an inverse, you'll see that my model is now structured, that person and photo are related, but it collapses those relationships into one to help you think about how your data is structured. So there we go. Not too difficult to define a model. Let's do something interesting with it.
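
The same model can also be described in code instead of in the modeling tool. The following is a minimal sketch, assuming ARC and the entity and attribute names from the demo. The helper names `MakeAttribute` and `MakeDemoModel`, the 16-bit integer type for the rating, the explicit upper bound used to mark the to-many side, and the warning string are illustrative assumptions, not something shown in the session.

```objc
#import <CoreData/CoreData.h>

// Hypothetical helper: one attribute description with a name, a type, and a default.
static NSAttributeDescription *MakeAttribute(NSString *name, NSAttributeType type, id defaultValue) {
    NSAttributeDescription *attribute = [[NSAttributeDescription alloc] init];
    [attribute setName:name];
    [attribute setAttributeType:type];
    [attribute setDefaultValue:defaultValue];
    return attribute;
}

// Hypothetical helper: the demo's Photo and Person entities, built in code.
static NSManagedObjectModel *MakeDemoModel(void) {
    NSEntityDescription *photo = [[NSEntityDescription alloc] init];
    [photo setName:@"Photo"];
    NSEntityDescription *person = [[NSEntityDescription alloc] init];
    [person setName:@"Person"];

    // Rating defaults to 2; the 0-to-5 constraint becomes a validation predicate.
    NSAttributeDescription *rating =
        MakeAttribute(@"rating", NSInteger16AttributeType, [NSNumber numberWithInt:2]);
    [rating setValidationPredicates:
                [NSArray arrayWithObject:[NSPredicate predicateWithFormat:@"SELF >= 0 AND SELF <= 5"]]
             withValidationWarnings:
                [NSArray arrayWithObject:@"Rating must be between 0 and 5"]];

    // Person.photos is to-many, Photo.photographer is to-one, and each is the other's inverse.
    NSRelationshipDescription *photos = [[NSRelationshipDescription alloc] init];
    [photos setName:@"photos"];
    [photos setDestinationEntity:photo];
    [photos setMaxCount:NSIntegerMax];   // assumption: a maximum above 1 marks the to-many side
    NSRelationshipDescription *photographer = [[NSRelationshipDescription alloc] init];
    [photographer setName:@"photographer"];
    [photographer setDestinationEntity:person];
    [photographer setMaxCount:1];
    [photos setInverseRelationship:photographer];
    [photographer setInverseRelationship:photos];

    [photo setProperties:[NSArray arrayWithObjects:
        MakeAttribute(@"data", NSBinaryDataAttributeType, nil),
        MakeAttribute(@"name", NSStringAttributeType, @"Untitled"),
        rating, photographer, nil]];
    [person setProperties:[NSArray arrayWithObjects:
        MakeAttribute(@"firstName", NSStringAttributeType, nil),
        MakeAttribute(@"lastName", NSStringAttributeType, nil),
        photos, nil]];

    NSManagedObjectModel *model = [[NSManagedObjectModel alloc] init];
    [model setEntities:[NSArray arrayWithObjects:photo, person, nil]];
    return model;
}
```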

So in as much as we want to talk about how you can use Core Data in your applications, I think it's also interesting to talk about how Core Data is integrated into the development tools. We take this very seriously and provide the fact that because Core Data is so easy to use, by making it more integrated, by making it a better part of the development experience, you guys can get a lot more work done.

So I'm going to go into my document class here, and you'll see it was brought up in Interface Builder. And I want to add something interesting here. So let's go ahead and just delete this item here. And how many of you have seen the Interface Builder demo so far, seen the new library? You like it?

Come on, do you like it? Thank you. All right, so in the library now you'll find a new Core Data item. So you see this little item here, I'm going to drag it into my window and drop it. And what this is going to do is it's going to go and ask Xcode, "Okay, what's the project for this nib?" and show me all its data models.

So there are the two entities I just created, photo and person. So I'm going to go ahead and select photo, and it's going to bring up this nice little window just to show me the kinds of interfaces it can help me create. Now this is just going to go ahead and lay some widgets out for me and set up some bindings.

It's not going to do anything magical, it's just going to take care of a little bit of minutiae details that it would have taken me a while to do on my own. So I can go ahead and pick different checkboxes here to see a live preview of what it's going to create, and I want it to create the search fields and the table views and everything else.

And I can actually pick and choose which pieces of information I can put into my interface. And at this point in time I really don't want to deal with the photographer information, I just want to deal with the photos. So we'll click finish, and there's my interface. And you'll note that if I look over here in my document window, you'll see that it added an array controller for me.

And you'll note that it set up the binding to the managed object context. And the context is on the File's Owner, which is the persistent document. So you see it's just binding up things that have existed for me, things that it's taken advantage of that exist in the classes like the persistent document.

So really at this point there's nothing I need to do. I'm just going to go ahead and save my nib. There's a great new feature of Interface Builder, build and go in Xcode. So I'm not going to leave Interface Builder, I'm just going to click the button. You can see it over there, it's going to precompile and go ahead and build it. So here's my application.

Let's go ahead and do something with this. I have a nice little folder of photos here that I just want to go ahead and now that I've defined a model, let's put stuff into the store for this. So we can go ahead and just add a new record, and you see that there are my default values, the rating of 2 and the untitled. So I can drag in the beach, and we'll put that in. We'll call it beach. And I'm a city guy, so we'll make that a 4 out of 5. I'm more of a... Anyone from Chicago here?

Excellent. So I'm more of a city guy, so we'll put in city here. I know it's not a picture of Chicago, but we'll give that a five. And I could just go ahead and add more pictures. I'm just dragging the image into the views that Interface Builder created for me.

We'll call this fall. We'll give it a rating of four, because I like the colors. And we'll drag in the tower. And I'm scared of heights, so we're going to leave that as two. So all I've done is used all the bindings that were set up for me in the persistent document, and it's now put this into my graph.

And all I'm going to do is hit Save, and it's going to give me a list of places I can save that. I'll pick my desktop. And there are my format options. These are the store options that were provided for you by the persistent document set up in the project template. And so I'm just going to pick the binary store, and we'll just call this demo.

And if we look at the desktop, there's the store file. Just to prove to you that it's real, I'm going to close that. We'll go and reopen it. And there it is. There's my store file. Again, I can go through and things like multiple selection, I can go ahead and change the ratings on things like that. I could go ahead and change them to something different.

Actually, I can go ahead and save it. Now watch, I'll go ahead and change all the way back to two, maybe change this one back to one, and realize, you know, this is not really what I want. Let's just hit Command-Z and undo all the way past the save.

And it all just works. So oh, wait. Oh, oh. But wait, there's more. So that's interesting. That's getting data into your application. Let's do something a little bit more, well, Apple-like. So let's go back into the interface here. Actually, let's quit the app first. No, really, I said quit.

And go back into the app here, delete the interface, and let's actually just run that little assistant again, and we're going to pick the photos. And this time I'm actually going to do the grid view, that nice little thing you saw yesterday. There's a little preview of the grid view, and we'll add a little search field at the bottom. And in this case, all I really want is the data and the rating for the photo. We won't worry about the other details for right now.

And we'll say Finish. And now here's the grid view interface. In this case, it doesn't have the black background it had yesterday. It's just transparent. But you can see that it also added this other little view here. And the view is the one that's going to be replicated over and over again inside the grid.

And just to show you that it's real, I can click on this here, and we'll see that it set up the binding for me to representedObject.data. So that's going to be the managed object for the photo, and it's going to use the data attribute.

I can actually go ahead and come down here to the tool tip, and we'll make the tool tip for this the name. And we'll say, OK, this is the name of the photo. So as you go over, you can see the name. We can go ahead and change different attributes of IB.

This is all just real stuff. Now, rating-- text is boring. So we can go and see that this is still bound to, again, representedObject.rating. So let's just go ahead and delete this over here from the view. And let's go and do something a little more interesting.

Let's take a-- oh, let's take a level indicator. Doesn't make any sense. But it actually does when you note that it has a rating property. So I can go ahead and change the max-- set the min and max value to the same thing that we had in our model.

Set the default value right now to 5 so that I can easily size it and get it fitting right here. We need to make sure that it's structured correctly so that it fits at the bottom of my view. So we'll set our springs and struts correctly. And we also need to make sure that it's bound correctly.

And I need to remember that it's bound to the layout item and the rating. So there you go. Some simple changes. IB is going to make sure that I wanted to save this. So we'll go ahead and save that. And now let's just go ahead and run the application again. Don't panic. It's just the default document that comes up. We'll go back and open the demo. And there are my images now with nice little ratings and nice little pictures.

Not a single line of code. Now, that's not to say that any of you will write a completely codeless application and ship it. If you do, call me first. But it shows you that it's really, really easy to go ahead and create a data model, create an application, or create an interface to go ahead and see what your data looks like, try out some different things, and wait till later to figure out, all right, now I've got to figure out, how do I sort this? How do I figure out how to rearrange the objects? I want the different sizes. I want different widgets. It just provides you a really easy way to not even think about the fact that you have data and data management needs until it's really important to you.

So a really, really easy example of using Core Data to write applications. So now what I'd like to do is turn the presentation over to Ben Trumbull, senior engineer and architect on Core Data, to talk more about object graph management and some of the new features in Core Data.

Good evening everyone, I'm Ben Trumbull, and thank you Matt. Alright, so I'm going to first talk about object graph management, the next section that Matt outlined for you at the beginning. And what exactly is an object graph? It's an excellent question, I get it all the time, I really don't know other than what someone before me told me. It's a bunch of objects all related to each other, possibly in separate clusters, and they're basically all edited at the same time, so it's a single scope in which you're working with your data.

Each graph that we manage is controlled by a single managed object context, so separate graphs would have different contexts, one for each graph. And the context performs all the change tracking, the undo, the fetching, a lot of the verbs. And as I mentioned, the relationships between your data are what's really important for creating this graph. So as Matt showed you in the modeling tool, we have inverses between objects, objects have delete rules that get applied when you delete them, and changes cascade throughout the graph.

So the context is providing all this change tracking for you. We're built on top of key-value observing and NSNotificationCenter, so that's how we're providing a lot of these features to you. The context provides a list for you of inserted, updated, and deleted objects, and when it saves, we flush out those changes and we maintain those inverses.

Matt gave you a little demo of the undo/redo support. So we provide complete support based on the model. Transient properties and all the persistent properties get undo and redo for free. We're built on top of Foundation's NSUndoManager, so there isn't really any super magic in there.

If you want, you can go to a context and you can get its undo manager, and you can call all of the Foundation methods on it, so you can set a limit to the number of items to undo, or you can turn off grouping by events, or you can set the undo manager to nil to turn off undo management entirely. Matt kind of stole my thunder there with undo/redo across saves.
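
Those three knobs might look roughly like this in code. This is a sketch assuming ARC; the function name `ConfigureUndo`, the `wantsUndo` flag, and the limit of 10 are made up for illustration. The calls themselves are the ordinary NSUndoManager and NSManagedObjectContext methods.

```objc
#import <CoreData/CoreData.h>

// Hypothetical helper: tune or disable undo for one context.
static void ConfigureUndo(NSManagedObjectContext *context, BOOL wantsUndo) {
    if (!wantsUndo) {
        [context setUndoManager:nil];      // no undo registration at all for this context
        return;
    }
    NSUndoManager *undo = [context undoManager];
    if (undo == nil) {                     // some configurations start without an undo manager
        undo = [[NSUndoManager alloc] init];
        [context setUndoManager:undo];
    }
    [undo setLevelsOfUndo:10];             // keep at most 10 undo groups
    [undo setGroupsByEvent:NO];            // changes are no longer grouped per run-loop event
}
```

With grouping by event turned off, the caller is responsible for bracketing related changes with `beginUndoGrouping` and `endUndoGrouping`.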

So, in addition, at the same level here at the object graph management, we provide validation. So how to validate is in the model, and in the modeling tool you can specify rules like whether or not a relationship is allowed to have a certain number of items, if there's a mandatory minimum or a limit, whether or not a property is optional. So if it's optional, it might be set to nil. If it's not optional, then you must give it a value. It could be a default value. But when to validate is done by the context, and this is done at save time.

So we'll basically validate all the objects before we push the save to the database itself. And there are a bunch of hooks there for custom validation that you can apply: validate for insert, validate for save. And key-value coding has its own set of callbacks that you can also provide, so validateName:error:, or validatePerson:error:, or rather validatePhotographer:error: in the previous example.
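
As an illustration of those two kinds of hooks, here is a sketch of a custom Photo class, assuming ARC. The rule that a name must be non-empty, the error domain string, and the error code are invented for the example; `validateForInsert:` and the `validate<Key>:error:` pattern are the standard hooks.

```objc
#import <CoreData/CoreData.h>

@interface Photo : NSManagedObject
@end

@implementation Photo

// Key-value validation hook for the "name" attribute. It is reached through
// validateValue:forKey:error:, which Core Data also uses when it validates at save time.
- (BOOL)validateName:(id *)ioValue error:(NSError **)outError {
    NSString *name = *ioValue;
    if ([name length] > 0) {
        return YES;
    }
    if (outError != NULL) {
        *outError = [NSError errorWithDomain:@"DemoErrorDomain"   // hypothetical domain
                                        code:1
                                    userInfo:[NSDictionary dictionaryWithObject:@"A photo needs a name."
                                                                         forKey:NSLocalizedDescriptionKey]];
    }
    return NO;
}

// Whole-object hook, invoked for newly inserted objects when the context saves.
- (BOOL)validateForInsert:(NSError **)error {
    if (![super validateForInsert:error]) {   // super runs the model rules and per-key hooks
        return NO;
    }
    // Checks that span several properties would go here.
    return YES;
}

@end
```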

I keep coming back to this Managed Object Context. It's your primary access point in Core Data. It's the place where you're going to find most of the verbs, most of the actions that you apply to Managed Objects will be methods on the Managed Object Context. And it's the central observer, the spider in the web, that is basically reacting to all the changes that you make to your objects.

And each one is its own context, its own isolated scope, so it's a scratch pad. So you can have multiple of them, you can have different changes. So you might have one context for, say, an inspector, and another context for the main view, and they can track changes separately. And that lets you save some, maybe you undo others, but you don't have to commingle all those edits.

The notifications are a great way for you to add custom code in reaction to what Core Data is doing if you don't find that there's a delegate method that does what you want. A lot of people have done some really interesting things, building on top of the NSNotifications that we post. The context is posting these notifications about all the things that happen at the end of every user event, as well as after a save.

One cool thing you can do is use the distributed notification center to repost those to other apps. The managed objects themselves provide KVO notifications at a fine-grained level for every single property change. Another thing to note is that Cocoa Bindings, the integration with Core Data, a lot of what Matt demoed to you, is all built on top of this same mechanism. The communication between Core Data and Cocoa Bindings uses the same set of mechanisms that are available to you. So it's built on top of the notification center and key-value observing.
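
A small observer for the two context notifications might look like the sketch below, assuming ARC. The class name `ChangeLogger` and its `lastInsertedCount` property are hypothetical; the notification names and the user info keys are the ones Core Data posts.

```objc
#import <CoreData/CoreData.h>

// Hypothetical observer that listens to one context's change and save notifications.
@interface ChangeLogger : NSObject
@property (nonatomic, assign) NSUInteger lastInsertedCount;
- (void)startObservingContext:(NSManagedObjectContext *)context;
@end

@implementation ChangeLogger

- (void)startObservingContext:(NSManagedObjectContext *)context {
    NSNotificationCenter *center = [NSNotificationCenter defaultCenter];
    [center addObserver:self
               selector:@selector(objectsDidChange:)
                   name:NSManagedObjectContextObjectsDidChangeNotification
                 object:context];
    [center addObserver:self
               selector:@selector(contextDidSave:)
                   name:NSManagedObjectContextDidSaveNotification
                 object:context];
}

// Posted when the context processes pending changes, typically at the end of a user event.
- (void)objectsDidChange:(NSNotification *)note {
    NSDictionary *info = [note userInfo];
    self.lastInsertedCount = [[info objectForKey:NSInsertedObjectsKey] count];
    NSLog(@"inserted %lu, updated %lu, deleted %lu",
          (unsigned long)[[info objectForKey:NSInsertedObjectsKey] count],
          (unsigned long)[[info objectForKey:NSUpdatedObjectsKey] count],
          (unsigned long)[[info objectForKey:NSDeletedObjectsKey] count]);
}

// Posted after a successful save.
- (void)contextDidSave:(NSNotification *)note {
    NSLog(@"context %@ saved", [note object]);
}

- (void)dealloc {
    [[NSNotificationCenter defaultCenter] removeObserver:self];
}

@end
```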

So sort of a reprise of the overlooked features here, now at the object graph layer. The first is batch faulting. Basically, one thing that comes up a lot on some of the mailing lists, a lot of questions people have, is we tell them that the SQLite store is the most scalable store. And so they decide that their app has a lot of data, for whatever that means, and they switch over, and then it runs slower than it did with, say, the XML store. Never mind that when you write out XML it's huge compared to a nice little database.

And part of what's going on is Core Data is trying to be lazy to keep memory pressure down, and we have this faulting mechanism that only pulls in the objects that you're actually using. So if you ask for an object one at a time, we're going to have to go and fulfill all those values one at a time.

And performing a fetch from the database is a lot like unbuffered I/O. You want to have a nice big chunk, not so big that you waste memory, but not so small that you keep getting hit by the overhead. So it's really easy to do this. You can set up a predicate that's SELF IN, and you give it a var arg, and you bind that up to an NSArray of object IDs or managed objects, and we'll grab all of those objects. They don't have to have any particular relationship to each other. You just identify them that way. And that's a good way to get much better performance.
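
A batch fetch of that shape could be written as follows. This is a sketch assuming ARC; the function name `FetchObjectsInOneBatch` and its parameters are hypothetical, and because a fetch request targets one entity, the objects or IDs passed in are assumed to belong to the named entity.

```objc
#import <CoreData/CoreData.h>

// Hypothetical helper: pull a known set of objects in one round trip
// instead of firing their faults one at a time.
static NSArray *FetchObjectsInOneBatch(NSManagedObjectContext *context,
                                       NSString *entityName,
                                       NSArray *objectsOrIDs,
                                       NSError **error) {
    NSFetchRequest *request = [[NSFetchRequest alloc] init];
    [request setEntity:[NSEntityDescription entityForName:entityName
                                   inManagedObjectContext:context]];
    // %@ is bound to an array of NSManagedObjectIDs or NSManagedObjects.
    [request setPredicate:[NSPredicate predicateWithFormat:@"SELF IN %@", objectsOrIDs]];
    return [context executeFetchRequest:request error:error];
}
```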

Something else is all the object IDs we provide are sort of opaque identifiers for every object in the persistent stores, and these objects provide a URI reference. So you can ask them for a textual representation, which is an NSURL, and you can pass that out to another application. You can put it on the pasteboard.

There are lots of things you can do once you have a nice URL format for your identifiers. Now, they're only useful after the object's been saved, because until it's been saved, it's a temporary reference. But once you have saved it, whenever you get that textual URI representation back, you can give it to the persistent store coordinator and get back your original object ID.
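
The round trip from object to URL and back might be sketched like this, assuming ARC. The two function names are hypothetical; `URIRepresentation`, `isTemporaryID`, `managedObjectIDForURIRepresentation:`, and `objectWithID:` are the framework calls being illustrated.

```objc
#import <CoreData/CoreData.h>

// Hypothetical helper: a URL that can be handed to the pasteboard or another app.
// Returns nil while the object still has a temporary ID (it has not been saved).
static NSURL *ShareableURIForObject(NSManagedObject *object) {
    NSManagedObjectID *objectID = [object objectID];
    if ([objectID isTemporaryID]) {
        return nil;
    }
    return [objectID URIRepresentation];
}

// Hypothetical helper: turn such a URL back into an object in the given context.
static NSManagedObject *ObjectForURI(NSURL *uri, NSManagedObjectContext *context) {
    NSManagedObjectID *objectID =
        [[context persistentStoreCoordinator] managedObjectIDForURIRepresentation:uri];
    if (objectID == nil) {
        return nil;                          // the URL does not match any loaded store
    }
    return [context objectWithID:objectID];  // may be a fault until it is first used
}
```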

So another thing that Address Book and iCal are now making use of in Leopard is the multi-process access. So the SQLite store supports multi-process access on file systems that correctly implement file locking. So you'll have to check what version of NFS your file server is running. But on HFS Plus and AFP you can get multiple readers and multiple writers, and conflicts are handled at this layer by the managed object context when it does the saving. It basically realizes that it's not going to be able to save the data because someone's edited things out from underneath you.

And as part of that, we have merge policies, which you can set per context, which tell Core Data what you want to do after it detects a conflict. So we're tracking revisions in the database to all the objects. And at that point, we can tell whether or not a different process has changed the object, or even different threads. Or as I mentioned, you can have multiple contexts, so an inspector window and a main window, and they can have different changes. We can tell even if it's a single thread that, say, the inspector window has changed something that the main window hasn't saved.

So we provide for you four automated recovery options, so you don't even have to necessarily think about it if you decide that you want last writer to always win, or the first writer to win and the last writer to be forced to roll back, or whatever. And you can use the default policy, which will give you a user info dictionary on the NSError that has all the information you need to do whatever you would like to resolve that conflict. And then you just save again.
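
Choosing a policy is a one-line setting on the context. The sketch below assumes ARC and uses hypothetical function names; it shows one automated policy, where the in-memory changes win property by property, alongside the default error policy, where the conflict details come back in the error's user info.

```objc
#import <CoreData/CoreData.h>

// Hypothetical helper: let this context's changes win over the stored values on conflict.
static BOOL SaveLettingInMemoryChangesWin(NSManagedObjectContext *context, NSError **error) {
    [context setMergePolicy:NSMergeByPropertyObjectTrumpMergePolicy];
    return [context save:error];
}

// Hypothetical helper: keep the default policy and inspect the conflict report yourself.
static BOOL SaveReportingConflicts(NSManagedObjectContext *context) {
    [context setMergePolicy:NSErrorMergePolicy];
    NSError *error = nil;
    if ([context save:&error]) {
        return YES;
    }
    // The user info dictionary describes what conflicted; resolve it, then save again.
    NSLog(@"save failed: %@", [error userInfo]);
    return NO;
}
```

The other automated policies are set the same way with `NSMergeByPropertyStoreTrumpMergePolicy`, `NSOverwriteMergePolicy`, or `NSRollbackMergePolicy`.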

So some new features in Core Data I'm very excited about. I hope many of you, since a lot of you have started playing with Core Data, will be interested in these. And the first is the schema evolution, so schema migration. This is the single most requested feature of Core Data after Tiger shipped. You've shipped a version of your app, or even just a beta of your app, and you want to change your model, add a new attribute to an entity, something like that, really simple stuff.

And at that point, Core Data wouldn't open up older database files. So that's somewhat inconvenient. A lot of people have worked around this in various ways, and we're providing a full tool for this that allows versioning of the different models and the stores, so we can detect which version of the model was used to write out a store, mapping rules between the versions that describe how you would like to upgrade between different versions, and a migration manager to help run some of that process.

So the entities all now have a version hash code, which is basically sort of a cryptographic hash. And we record all the version hashes used to write out a store in the store's metadata. And when you open up the store, we'll make sure that everything is in sync.
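
To make the version hash idea concrete, here is a sketch that compares two models entity by entity, assuming ARC. The function name `EntityNamesWithChangedHashes` is hypothetical; `entityVersionHashesByName` is the model method that exposes the per-entity hashes.

```objc
#import <CoreData/CoreData.h>

// Hypothetical helper: names of entities in newModel whose version hash differs from
// oldModel, or that do not exist in oldModel at all.
static NSSet *EntityNamesWithChangedHashes(NSManagedObjectModel *oldModel,
                                           NSManagedObjectModel *newModel) {
    NSDictionary *oldHashes = [oldModel entityVersionHashesByName];
    NSDictionary *newHashes = [newModel entityVersionHashesByName];
    NSMutableSet *changed = [NSMutableSet set];
    for (NSString *name in newHashes) {
        NSData *oldHash = [oldHashes objectForKey:name];
        NSData *newHash = [newHashes objectForKey:name];
        if (oldHash == nil || ![oldHash isEqualToData:newHash]) {
            [changed addObject:name];
        }
    }
    return changed;
}
```

An empty result suggests that a store written with the old model can be opened with the new one as far as these entities are concerned; a non-empty result is where mapping rules would be needed.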

The mappings are pretty simple rules to describe how to transform one version of a store into another version. So if you add an attribute, do you want that new attribute to have the default value? Do you want the value to be derived from existing data? Stuff like that.

And there are a bunch of simple transformation rules that we provide. And then the Migration Manager is basically a helper class to help run the migration for you. So it basically finds the new model for a store and gives you some customized callbacks and sort of gets rid of some of the tedium here.

And Miguel Sanchez will be giving a session on the schema evolution feature and will go into a great deal of detail about how to upgrade your stores and make use of the versioning. Another new feature is the Atomic Store API, and this allows you to write your own custom backend.

Core Data will interact with your store, and then we do the same stuff we do on the frontend with the managed objects and managed object context. This allows you to bring in legacy data or write out data in a standardized file format that Core Data couldn't annotate. And tomorrow at 2:00 PM in the Optimizing Core Data application session, Melissa Turner will talk more about writing your own custom store.

So in the managed object context, we've added some new API. These are some fairly basic enhancements. The first is a fast count-for-fetch-request method, where you don't really actually want any of the data, but you'd like to know how many objects would match that query. So if you're doing a summation field or something, you don't need to fetch all the data just to figure out how many match that query. And we've also tried to address part of the problem and some of the tediousness of working with temporary object IDs.

So if you have an object that hasn't been saved yet, but you still want that URL reference to put it on the pasteboard, we now have a method on the context to obtain a permanent ID before you actually save the object. So that way when you do save the object, its ID won't change, you've sort of pre-allocated a permanent ID for it. And this really makes it a lot easier to do cross-store relationships.
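
Both context additions can be sketched in a few lines, assuming ARC. The function names `CountMatching` and `PermanentURIBeforeSave` are hypothetical; `countForFetchRequest:error:` and `obtainPermanentIDsForObjects:error:` are the methods being described.

```objc
#import <CoreData/CoreData.h>

// Hypothetical helper: how many objects match, without fetching any of their data.
// Returns NSNotFound if the count could not be computed.
static NSUInteger CountMatching(NSManagedObjectContext *context,
                                NSString *entityName,
                                NSPredicate *predicate,
                                NSError **error) {
    NSFetchRequest *request = [[NSFetchRequest alloc] init];
    [request setEntity:[NSEntityDescription entityForName:entityName
                                   inManagedObjectContext:context]];
    [request setPredicate:predicate];
    return [context countForFetchRequest:request error:error];
}

// Hypothetical helper: a stable URL for an object that has not been saved yet.
static NSURL *PermanentURIBeforeSave(NSManagedObject *object, NSError **error) {
    NSManagedObjectContext *context = [object managedObjectContext];
    if ([[object objectID] isTemporaryID]) {
        // Swap the temporary ID for a permanent one ahead of the save.
        if (![context obtainPermanentIDsForObjects:[NSArray arrayWithObject:object]
                                             error:error]) {
            return nil;
        }
    }
    return [[object objectID] URIRepresentation];
}
```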

And for the Fetch Request API, we've added a bunch of API to help tune performance. In particular, you can decide now if you just want object IDs, you don't actually want to fetch all the data, you just want the identities that match that query. So you can do this to build up some sets, and then you could intersect them or do some other kind of manipulation.

You might pull them aside and then fetch the data that match those object IDs at a later time. And then another important optimization is prefetching. So prefetching is the processing of relationship key paths. Like I said, you don't want to do lots of small fetches if you don't have to.

So if you're pulling up a window and you know that you have a master detail view, you might want to fetch the detail information. So that'll be off of the master object's relationships. And you can grab all of that with one fetch by specifying the key paths that you'd like to bring back at the same time. So that's a great way for improving the responsiveness of the UI, get exactly what the UI is showing.
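
Both tuning options are properties of the fetch request. The sketch below assumes ARC, and the function names are hypothetical. The relationship key paths passed to the second function are assumed to exist on the entity, for example `photos` on the demo's Person entity.

```objc
#import <CoreData/CoreData.h>

// Hypothetical helper: a request that returns NSManagedObjectIDs instead of objects,
// useful for building sets of identities and fetching the data later.
static NSFetchRequest *IDOnlyRequest(NSEntityDescription *entity, NSPredicate *predicate) {
    NSFetchRequest *request = [[NSFetchRequest alloc] init];
    [request setEntity:entity];
    [request setPredicate:predicate];
    [request setResultType:NSManagedObjectIDResultType];
    return request;
}

// Hypothetical helper: a master request that also brings back the named
// relationships, so the detail view does not trigger a fetch per row.
static NSFetchRequest *PrefetchingRequest(NSEntityDescription *entity, NSArray *relationshipKeyPaths) {
    NSFetchRequest *request = [[NSFetchRequest alloc] init];
    [request setEntity:entity];
    [request setRelationshipKeyPathsForPrefetching:relationshipKeyPaths];
    return request;
}
```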

And there's some sundry additional API on the fetch request. There's a bunch of stuff that Malcolm Crawford has very kindly documented for us. Prepopulating managed objects, if you know that you don't really want the faulting to go on, and you can make the queries a little more exclusive and not match subentities.
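
Those two options might be set like this. It is a sketch assuming ARC, with a hypothetical function name; the two setters are the fetch request methods that correspond to prepopulating objects and excluding subentities.

```objc
#import <CoreData/CoreData.h>

// Hypothetical helper: fetch fully populated objects of exactly one entity.
static NSFetchRequest *EagerExactEntityRequest(NSEntityDescription *entity) {
    NSFetchRequest *request = [[NSFetchRequest alloc] init];
    [request setEntity:entity];
    [request setReturnsObjectsAsFaults:NO];   // fill in property values during the fetch
    [request setIncludesSubentities:NO];      // do not match subentities of this entity
    return request;
}
```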

And we have some additional notification keys on the context to tell you what objects have been refreshed in the past event, if you call refresh object on it or the context does, as well as what objects have been invalidated if you call reset. Then the managed objects themselves have a few more callbacks. The persistent store coordinator has a set URL method that you can use if you decide to move a persistent store file that's loaded in a coordinator. You move it on the file system out from underneath Core Data, and you can tell us where you've moved it to.

And also we're shipping now as part of all the installs a debug version of the framework. So you can use the DYLD_IMAGE_SUFFIX export, and you can get assertions in your code as you're debugging. And you can also turn on multi-threading assertions, where Core Data will check to make sure that the state that you've put the context and all the managed objects in is what we think it ought to be.

And these are obviously disabled in the shipping version of the framework because it's a little expensive to do. But there's a whole separate framework there for you. And some related enhancements in Foundation instead of Core Data are the predicates and expressions that Core Data is built on top of.

There's filtering on NSSet now that matches filteredArrayUsingPredicate: on NSArray. And we also have a subquery expression for you. There are custom function expressions where you can say you want to set up a predicate that's using a function that you provide. We have some set operations, so you have predicates that can do unions and intersects as part of a fetch. And there's a fetch request expression, which Miguel will show a little more about tomorrow in the modeling session, that's basically a predicate that performs a fetch as part of the qualification.
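
Two of these Foundation additions can be tried without any Core Data stack. The sketch below assumes ARC; the function names, the `photos` and `rating` keys, and the threshold of 4 are hypothetical, borrowed from the earlier demo model for illustration.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: filter an NSSet of strings the same way you would filter an NSArray.
static NSSet *WordsAtLeast(NSSet *words, NSUInteger minLength) {
    NSPredicate *predicate =
        [NSPredicate predicateWithFormat:@"SELF.length >= %@",
                                         [NSNumber numberWithUnsignedInteger:minLength]];
    return [words filteredSetUsingPredicate:predicate];
}

// Hypothetical helper: a subquery predicate that matches any person who has
// at least one photo rated 4 or higher.
static NSPredicate *HasWellRatedPhotoPredicate(void) {
    return [NSPredicate predicateWithFormat:
        @"SUBQUERY(photos, $p, $p.rating >= 4).@count > 0"];
}
```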

So one of the things that I'm most excited about are all of the platform initiatives that Mac OS X is doing in Leopard now. I was very sad yesterday when it took over 90 minutes to get to the store to configure a new machine. So part of that here, and Core Data has participated in some way and form in pretty much all of the Leopard platform initiatives.

The first is, of course, the universal transition to Intel, and the transition to 64-bit on both PowerPC and Intel. So Core Data now is just built four ways, and it's pretty easy for you to set up your applications using Core Data to ship on any of these architectures.

And going on right now as we speak is a session on Objective-C 2.0. They have a bunch of new language features that are actually pretty cool. So they're adding garbage collection to Cocoa, and that's opt-in so you don't have to use it. And then there's a fast enumeration protocol, so it lets you use a for-in syntax that I'll show a little bit later.
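
For reference, the for-in syntax looks like this on any collection that adopts fast enumeration. The function name and the idea of summing word lengths are just an illustrative assumption.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: total number of characters across an array of strings.
static NSUInteger TotalLength(NSArray *words) {
    NSUInteger total = 0;
    for (NSString *word in words) {   // fast enumeration replaces the NSEnumerator loop
        total += [word length];
    }
    return total;
}
```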

And then Objective-C supporting properties, and Core Data, I'm afraid, didn't quite get to that in the Leopard Seed. I will demo some of that in a bit. But properties are some things you see in some other languages, and they're really a great way to get type safe key paths in your code, have the compiler check as opposed to just the arbitrary strings.

And we spent a lot of time working on performance and scalability enhancements in Leopard. On my trusty dual G5, I was getting about a little over 20,000 rows per second on Tiger. And on Leopard, now in the seed that you have, we're getting about 125,000 rows per second. So we've made some small adjustments.

Thank you. This is really only the beginning. There are more performance enhancements that have been throughout the Leopard development cycle than I can actually put on the slide here, but there's tons of stuff. Odds and ends, query short-circuiting, improved caching, improved memory management, faster saving, and we have a lot of additional things that are right around the corner in a future Leopard seed. Now, you may ask why we consider this so important, and that's because the first question that 92% of you ask is-- and the other 8% of you ask me if I have time for a question.

So-- So these reflect some legitimate concerns. I tend not to hear them from people who have used Core Data, but they are legitimate concerns as people need to decide whether or not they want to invest the effort to adopt a new technology. So I appreciate that. Then there's this, which is an interesting question. It's perhaps better phrased or rephrased as the, well, Core Data is a new technology. So I'm going to start with Core Data.

So Core Data is a new technology. We came out in Tiger, and all of you pretty much had data before Tiger. I mean, Panther was not some great dark age. You've all used persistence solutions before. Some of you have rolled your own. Some of them are really fast. And others of you have reused other technologies, maybe even something as simple as NSCoding. But there are a lot of different persistence solutions out there, and adopting Core Data is a transition for you. Now, this is my favorite.

So this one I have a little less sympathy for, because I just don't see how hand-coding vtables with structs of function pointers is fun. Objective-C is a tool. It can be used well, it can be used poorly, it can be used for the wrong problem.

So now I'm going to provide you a demonstration. So we've run a little bit longer than I expected, but hopefully none of you have to go anywhere too quickly. So basically, I'm going to show three different projects that are all doing exactly the same thing. They're fetching 500,000 words from a dictionary of words.

And the words are actually very trivial objects, but they should help demonstrate how well or not well Core Data manages all the overhead that its features provide. So the first project is something that I see fairly indicative of people who've decided that Core Data and Objective-C are too slow.

Basically, as you can see here, it's a .c file. We've dispensed with Objective-C entirely. And the key point here is, basically, we have a generic dictionary as our model object. We don't have a formal schema. We're just sort of throwing things into it as key-value pairs. And we're using Core Foundation collections to handle the data. And this is all, you know, I mean, it's not necessarily pretty, but it's all pretty straightforward. We're using SQLite directly. Nothing much going on there.

And sometimes you see pretty much the same thing in C++, but with STL classes. So I'm going to hint this a bit. And we're going to have a reproducible experiment. So I'm going to run it a couple times to warm everything up so the future test apps run and don't get an unfair advantage. So basically we're fetching about 150,000 rows per second here with this pretty basic SQLite code using Core Foundation. And if we pop up terminal, that was about 96 lines of code.

If you spent a great deal of time in Shark and you spent a lot of bribes on the Cocoa team and the kernel team and all the other teams at Apple to find out how to resolve every single problem that you found in Shark, you'd move along in the direction that this app is going in. So we come back to Objective-C because we can get some things for free in here. That's very nice.

And so the key point is we've replaced the dictionary, which is a very convenient model object but not a really high-performing model object, with a real class that has scalars. And we've done some serious hacking on retain and release. We've got these static functions, which are basically Objective-C's equivalent of a non-virtual method. And otherwise, it's pretty much the same. We've got some custom callbacks now in the CF collection classes that we're using.
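One way to picture the move from a dictionary to "a real class that has scalars" with statically bound functions is the plain C sketch below. It is only an analogy under stated assumptions: the demo's class was written in Objective-C, and the `Word` struct, its field names, `word_make`, and `word_length` are all hypothetical.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical model object: fixed fields stored as scalars instead of
   boxed values looked up by key in a dictionary. */
typedef struct {
    int32_t length;     /* number of characters in the word */
    char first_letter;  /* first character, stored inline */
    const char *text;   /* the word itself (not owned by the struct) */
} Word;

/* Build a Word from a C string; length and first letter are computed once. */
static inline Word word_make(const char *text) {
    assert(text != NULL);
    Word w;
    w.length = (int32_t)strlen(text);
    w.first_letter = text[0];
    w.text = text;
    return w;
}

/* Statically bound accessor: the call target is fixed at compile time,
   so there is no dynamic dispatch and the compiler is free to inline it. */
static inline int32_t word_length(const Word *w) {
    return w->length;
}
```

Reading a field is now a fixed-offset load instead of a key lookup, which is the kind of saving the speaker attributes to dropping the dictionary.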

Now we're running at over 200,000 rows per second. It's about 40% faster than the more obvious way of doing it, and it's about a third more code, so not so bad. And then here for the Core Data example, first I'll show you a model. Now all the apps are using exactly the same schema. So we've got a very trivial word here. Word's got a length, it's got a first letter, and it's got some word text. One of the nice things about Word, we have a Word class. Now here we're showing some of the new Objective-C properties.

So for Objective-C 2.0, the properties allow a way for Core Data to really tell the compiler about the same kind of things that the entity description is telling Core Data about. Properties are much more generic than Core Data. Objective-C is using them for all kinds of things, and they provide a really nice interface to the access that the word object has, or your object has.

And basically what ends up happening is the compiler then infers the existence of accessor methods. So you don't have to declare all the additional accessor methods. You could if you wanted to, but there's a text and a setText: method and a length and a setLength:. And that's about it. Now for Core Data objects... Yeah, that's about all you need to do to implement all those properties.

So one of the great things about this is working with the Objective-C team, we're now generating all the accessor methods dynamically for you, including the primitive accessor methods. So you don't have to write any of the calls to primitive key-value coding that you might have before. If you want to write a custom accessor method, then you can call, in this case, you could call primitiveLength, and Core Data will infer the existence of that method and will specialize them at runtime to be exactly what the entity needs.

We'll optimize them. They're great, and they're actually quite a bit faster than doing the key-value coding through the original Tiger Core Data templates. So, take a quick look at the main body of this sample. It's pretty straightforward. We're creating a Core Data stack, the same thing you can get from any of the template projects here. We create a fetch request. So here's some of our new API to turn off some of the faulting, because we know we're going to iterate through every element of the result set.

And we set a fetch limit, which is the same that the SQL was written for in the other examples. And we execute the fetch. And then here, we see the new fast enumeration syntax in Objective-C, which is quite a bit faster for iterating through any of the Cocoa collection classes. Something I neglected to show you is the advanced SQLite project that I just showed you is also using that. So you can use them bridged. Pretty nice. And here is the new property syntax.

So like I mentioned, these are type-safe, compiler-checked key paths for you. A lot of people have asked for this feature for a long time. It's really nice. If words were more complicated, you could follow relationships further. Now, if you decide that you don't like this syntax, because you're more of an Objective-C traditionalist, you can do this. You don't have to change any other part of the project. These accessors will be generated for you. So that all works.

[Transcript missing]

We're getting close to 500,000 rows per second. If this were built a little bit differently, it might actually break that, but pretty close. And we'll throw in the header file too.

So if we can cut back to the slides. Great. Yes, no whammy. All right. So. Over 800,000 lines of, sorry, 800,000 rows per second on the new hotness. The exact, exact same code. So, it's been fun. So now one of the things about this is I don't mean to say that Core Data is some magic silver bullet that you should use for all your problems. Rather, this is a demonstration that we're really serious about delivering high-performance solutions to you.

In forthcoming and future Leopard seeds, you'll be able to get a hold of a version of Core Data that supports the Objective-C @property syntax, has dynamic method generation, and is scalable across the architectures. As you might have imagined, there's quite a bit going on there behind the scenes to get that 2x speed improvement.

So one of the things that Core Data is providing is this scalability, where regardless of what machines your customer is using, if they're on a mini with a Core Solo, a dual-core MacBook, or one of the new hotnesses, or they're like me and they're on a little older PowerPC, Core Data is going to be managing the data as efficiently as possible on that hardware.

So, just to wrap up, I hope that you learned that most of the applications on the system can benefit from Core Data. Even if you're something like Mail and you already have your own core persistency solution, you might do some new features, some preferences, something like that. It's model-driven, really easy to use, has powerful object graph features, and we have lots of new stuff in Leopard. We've shown you some. The custom stores we'll talk about more tomorrow. The schema migration is tomorrow morning at 9 a.m. And Core Data is participating in all of these Leopard platform initiatives that Core Data apps can pretty much get for free.

A parting request from us to you: help us help you, please. We love great feedback. Please read the documentation. Malcolm Crawford has done a fabulous job. I personally think it's the best documentation for Cocoa. There, you know, throw down the gauntlet. Please use bugreport.apple.com. It really helps us a lot to know what parts of the framework are most important to you, where we need to add a little more polish, what performance improvements we should do first, stuff like that. If you have a performance issue, please include a Shark trace. Shark's really easy to use and provides us some really valuable feedback.

If you have a bug and you include a very small sample application that we can click run and it shows us our bug, that's great. Let me tell you that I fix those bugs first. If someone sends me a sample app, they have my attention. So for more information, Deric Horn is the Application Framework Evangelist, and you can find all the good stuff in the usual places.