What's New in Core Data - WWDC 2017

App Frameworks • iOS, macOS • 35:57

Join the Core Data engineering team and learn about the new features in Core Data. See how you can easily and automatically include your data in Spotlight to allow users to find content even if it's stored in Core Data. Learn about new options for indexing your data, and hear the details on a new feature for tracking changes over time.

Speakers: Melissa Turner, Rishi Verma

Unlisted on Apple Developer site

Downloads from Apple

HD Video (1.11 GB)
SD Video (287.7 MB)
PDF Slides (1.3 MB)

Check out Bezel, our iPhone mirroring app →

Transcript

This transcript has potential transcription errors. We are working on an improved version.

Hello, welcome to Session 2010, What's New in Core Data. I'm Melissa. I'm one of the Core Data engineers. And I will be joined on stage later by Rishi, one of my co-workers. And we're here to talk about all of the stuff that we've been working on for the last year because it's been fun for us. And we're hoping that you'll find it fun and useful to work with in your applications. And we will hope all of the technical problems for this presentation have already happened.

So what have we been up to? We've been doing some Core Spotlight Integration. We've added a bunch of new indexing APIs. And we've added a big new feature called Persistent history tracking. That is what Rishi's going to talk about. And it's really cool and I'm really excited to see it presented here. But first Alphonse says "Hi". He's my mascot. I couldn't do this without him.

So Core Spotlight. What is Core Spotlight? Before I talk about Core Spotlight I want to talk a little bit about Spotlight. Spotlight was the original search API, and search technology on Mac OS 10. It was very document centric, which meant that the search results that you would get back from a customer doing a search in the Spotlight menu would identify the document that had been retrieved by Spotlight, but wouldn't actually tell you where in the document that hit came from. And it was not really ideal for databases, which as we know can have lots and lots of records in them.

Which brings us to Core Spotlight. Core Spotlight was the original search technology on iOS. And it was much more database-friendly. It would return not only the application and the document that the user had selected, but it would also tell you where in that document. It allowed for deep linking into your application so you could bring up exactly the results that your user was interested in seeing. Now it ships on Mac OS, as well as on iOS. And we thought this is a great time to integrate with it in Core Data.

So how does Core Spotlight work? Well it's a lot like Spotlight. You've got an application. It has a bunch of data in it. The user goes into the Spotlight menu and looks for something that's going to be in your application. For example, you know, photo from India. And Spotlight launch services launches your application and you're given a user activity that you can use to retrieve the information about what the user was searching for.

Popup, you know, that specific view so your user could see it. As I said, it's now shipping on Mac OS as well as iOS. And at this point the Core Spotlight team wants me to tell you that you can also use the Core Spotlight API's to access the Spotlight search indexes from your application, which is cool. And it does all of this without scattering zero-length files all over the file system. And I'm really hoping that at least some of you are mentally cheering about this because I sure was.

So how does it work? There's three basic components to the Core Spotlight integration. First there's -- we've repurposed the index by Spotlight property on NS Property Description to indicate that a property should be pushed into Core Spotlight. But we need a way to specify what should be showing up in the Spotlight results menu. And that is NS Entity Description Core Spotlight Display Name Expression. And of course there's an Exporter. That whenever you do save in your application pushes whatever change the user has made into the Core Spotlight Index.

Core Spotlight Display Name Expression is kind of the trickiest piece of this to understand. It's actually an instance of NS Expression that you set as a property on NS Entity Description. And it's evaluated during the Core Spotlight index update. With the object -- the managed object that's being updated as the evaluated object parameter. And this allows you to return anything that can be pushed in to Core Spotlight. Even something that's not one of the base Core Data types. Like for example an instance of CS Localized String.

By and large, these expressions are going to evaluate Keypaths but they could be, you know, much more interesting. They could be built in functions, lowercase, uppercase, stuff like that. Or if you're somebody like me they could be completely random functions that go often dig around in your runtime to find instances of classes and call just random methods on them to get their stuff back. I think that's cool too.

And of course, there's the NS Core Data Spotlight Delegate, which is like the big and complicated piece that we wrote that does all the work. It implements CS Searchable Index Delegate. It provides default implementation of all methods, so you can use it straight out of the box. It uses separate store on a background thread so you don't need to worry about any of the work that's being done by the Spotlight Exporter blocking your main view context from getting its work done.

You initialize it with a NS Persistent Store Description and a model since it needs to know what it's working with. And basically you tell a Persistent Store that it should be exporting its data to Core Spotlight by setting the exporter as an option when in when you're calling NS Persistent Store with type on the Persistent Store Coordinator.

The override point on the delegate if you don't like the default behavior is NS Data Core Spotlight delegate attribute set for object. And an override might look something like this. Here we've got a basic method that goes through and gets the attribute set return by the super class.

It looks at the object, realizes that this is an instance of a photo because I've got like the photo app and those photos have tags. And when somebody searches for something-- when the user searches for something, that returns a hit on one of those tags, I actually want to return the photo that has that tag because the tag itself is not terribly interesting.

So I go and collapse all of the tag information on to the photo objects itself so that the photo will be returned when there's a match in Spotlight. And then I just return the attribute set. And that's as much code as we expect most of you to ever have to write for this. So now I'm going to do a demo.

And I've got a little application here. Please ignore the x Code in the background, we'll get there. It displays photos. If you scroll around, look at stuff, click on things to see, you know? Photos of people enjoying themselves with a little bit of meta data information. But here's the interesting part. I can now come up to Spotlight and type in for example, something that I know is in Spotlight and it automatically activates my application and calls the application continue user activity method with an NS user activity.

And here I've broken in that method so let's see what's going on. Well, somebody opened something and I'm going to go dig around in that user activities user info because that's where all the interesting stuff is and dig out the identifier. And if I print that identifier we can see that well, the identifier and the user info is something that looks an awful lot like a managed object ID string representation. So I'm going to turn that into a URI and ask the persistent store coordinator to turn that into a managed object ID. And then I'm just going to continue and hey wow, I've opened the photo that my user requested. Lovely, relaxing view.

And that sort of how the Core Spotlight Integration works. We do most of the work, you need to implement a method that does stuff. So that's the Core Spotlight Integration. Some totally non-sharp corners, stuff we've made easy for you. If you decide that you want to enable the Core Spotlight Integration and pass an exporter to as one of the options when you add your store to the persistent store coordinator.

We will automatically start exporting to Core Spotlight on your application's first launch. And we will continue exporting until we have exported all of the data even if your application is exited, if the user exits your application in the meantime. We use the Persistent History Tracking feature to ensure that all of the data is pushed.

Even if your application crashes, exits, whatever that doesn't quite work in the seed but it will eventually. Some slightly sharper corners, the Core Spotlight and Spotlight integration aren't mutually exclusive, you can't do both. If you try setting both entities to Core Spotlight, Display Name Expression, and the property is stored in External Record, the model compiler is going to complain at you. And we don't currently track the results of batch operations.

Some slightly sharp corners. This first one is something I discovered while debugging because I built all of my unit tests in a separate framework, which is run by XCTest, which doesn't have this entitlement, which made it very confusing as to why I couldn't talk to Spotlight. Whatever application is trying to talk to Spotlight, needs to have the com.apple.application-identifier entitlement.

Most of you are going to get this by default, but if you're writing your unit tests, which are run by XCTtest be aware of this. And of course you're going to need to handle the search result. You need to implement Application, Continue User Activity on your application delegate and you know, put whatever you'd like in there. We saw this on screen or the demo, but I'm just walk through it here.

This is the method you'd be implementing. Pretty simple, grab the User Activity user info. The identifier manage object ID is stashed in there under KCS Searchable Item Activity Identifier. Just turn that into a managed object ID and then you can do whatever you'd like. Watch a new window, zoom to the correct view, whatever.

There are however some really sharp corners involved. Which is that if you were using the old Spotlight integration, it's going to be up to you to delete your old reference files if you had them on disc, because we can't know if there was semantic meaning to those. And the Core Spotlight API's are x86 only, which means that this feature is x86 only. So it's time to upgrade your applications people.

And that's Core Spotlight. We've also added a New Indexing API. So what is indexing? Indexing is a way that allows you to configure your database to do faster searching. You can have multiple indexes, each of which can have multiple columns, and a column can actually belong to multiple indexes. So this allows you to set up a lot of indexes that the database can use to choose the most efficient index for whatever query it's trying to execute.

It will basically look at the aware clause and say, oh I'm trying to use these columns I'll use this index. And now I'm going to do a little bit of a digression on indexing for those of you, you know, aren't database heads. The basic form of indexing that supported everywhere by everything is a Binary Index. But these are very strict comparison only. They basically look at the bits and say are these bits the same? And if not, you know, it's not an index head.

But over time more types of indexes have become popular and something that's really especially nowadays is super useful is an RTree Index, which is a type of index that's optimized for doing range base searching. So a binary index sort of looks like this. This is the kind of data you might have, a bunch of -- you know, unit dimensional stuff. And if you build a BTree on it you end up with something that looks a lot like this.

You know for every node that all of the nodes -- child nodes to the left are smaller valued than the parent node and all the nodes to the right are larger valued and you just navigate down very, very simply. But RTree Data often looks a lot more like this. You end up having lots of, you know, odd shapes, multidimensional things. They're often overlapping in weird and odd ways. And there's no really great way to organize those in a unit directional linear Binary tree.

And RTree Data often is stored more like this. You have a range zero, which is the minimum bounding blocks around all of its subcomponents or its set elements. For example range element fully contains the ranges R6 and R10. And R6 in turn has a single child R7, which has more children. This is actually a slightly unbalanced RTree but oh well, we'll live with it.

The current indexing APIs, well we have two of them. You can create a single property index by calling an NS Attribute Description is Indexed. Set that property and we'll create an index on it. If you want to do a multi column index you'll call NS Entity Description compound Indexes. And that allows you to create an index that has, you know, three, four, five columns in it. But both of these API's automatically and only create a Binary Index in ascending order.

And we kind of wanted to evolve the Core Data APIs, make them ore powerful, a lot of them specifying new and interesting things, more configurable. So we've added a bunch of new model classes. First is NS Fetch Index Description, NS Fetch Element Description, and we've added a new method on NS Entity Description indexes which takes a Fetch Index Description as it's -- an array of Fetch index descriptions as its parameter.

And we've added some functions to control whether the database should or should not use a specific index when its evaluating a given query. And we've added index by, which allows you to say, hey really use this index. And no index has existed for awhile and it's a way of saying I know better than you and I know better than your query optimizer, don't use an index for this specific component of the where clause.

NS Fetch Index Element Description is sort of the basic component. It allows you to specify an Index Element, basically a single property. You can specify the type of index that should be created on that property, whether it's going to be Binary or an RTree. And if it's Binary index it also has you specify a direction, ascending or descending.

And this can be important if you're doing certain types of searches. For example if you're doing a male search you want to start with the most recent data first, instead of the oldest data first. So direction changes the order in which the database will look at the contents of the index.

A Fetch Index Description allows you to combine a bunch of Fetch Index Elements into a single index, which you'll give a name and optionally you can set a predicate that says only put rows matching this predicate into the index. It also does some validation of the index composition to make sure the index you've created is reasonable.

Some Sharp Corners here, you can't mix and match element types right now. It's either going to be an RTree or a Binary index. Range indexes can only be created on numeric properties less than or equal to 32 bit. That's a restriction of the underlying SQLite database. If you're creating your own incremental store and that's a restriction -- that's a problem for you, let us know and we'll see what we can do about it. But Range -- properties that are being put in a Ranged Index also can't be optional. Range of null is a little bit undefined right now.

And if you're building your model in code and we know some of you do it, and I've helped some of you at previous WWDC's. If you change your entities structurally after you've set indexes, then we're just going to drop the index. And that actually applies for the entire entity hierarchy. If you add or remove an entity in the hierarchy or add or remove properties on any of those entities, we can no longer be sure the index is valid so we're just going to drop it. Do it last.

As you may know from previous versions, changing the indexes on an entity doesn't actually change the entities version Hash, which means that there's not going to be migration when the user first launches your new version of your application with the new indexes. So there's not going to be a performance head.

On the other side it's not going to cause a migration so they're not going to get updated indexes. If you want to cause the index of migration to happen and for the indexes to be updated, you'll need to call the -- Update the model's version Hash Identifier and that will trigger a migration. The SQL store is going to ignore any indexes it doesn't understand. So what that means is that if you open your database after you think you've added indexes and don't see them, go look at the application console, it probably told you why you're not seeing them.

And I'm going to do another demo here. I have a model here. And I'm actually not going to work with my live model because this is a demo and I will inevitably typo something, but adding a new index in the new modeling tools is actually pretty simple. You select the entity that you want to add an index to and just come down here to add entity and you can now see that it has an Add Fetch Index option. And you just add one, call it by Location, and I can add a couple of properties to it here -- the latitude and longitude properties. And if come over here and select the photo, you'd come in here and change the Version Hash Modifier.

And now the next time a user launches an application that uses this version of the model, they're going to be using the new RTree index. Of course I am going to just going to run my demo application which already had that built in. And -- -- I'm going to come down here and grab the Predicate that I have carefully set aside -- oh I forgot one thing.

I'm going to come up here and edit the scheme, because we want to see what Core Data is sending to the database. And I've got the Com Apple SQL Debug Argument Set. And you'll notice -- for those of you who are familiar with SQL Debug as a feature in Core Data, you're probably most familiar with 1 and 2 as the numbers -- the levels you're going to be setting.

We've added a new level 4, which will actually print the query plan that SQLite is using as well as all of the statements that Core Data is sending to the database. So I'm going to run my application with that version set to 4. Come back, reset that predicate, and run the search. And now we can come and we see the last search.

It's actually kind of more interesting and I'm going to come over here and I've actually copied and pasted an earlier version into Text Edit, because I could enlarge the text there. But you can see I'm now doing a sub Select against a separate table, Z Photo by Location. And if I look down in the query plan, you can see that this is a virtual table index. So that's basically routing the location query, the latitude query down onto a separate RTree table.

And well, we can see up here that it says it took 0.056 seconds for 16 rows. And if I go compare this to the version without the RTree, well there's the query we were originally doing, Z Longitude between this and this. And we can see that it took 0.062 seconds for 16 rows.

So you got to do about a 10 percent performance improvement on a laptop with an SSD Fast Cash, all that kind of stuff. So it's about a 10 percent improvement your devices are actually going to -- and it's also a very small data set. Your devices with a larger data set are really going to appreciate having RTree's to search against instead of regular Binary Indexes.

And -- I think I'm done here. So that is RTree searching. I have one more slide before I hand you over to Rishi. We've also added some miscellaneous new stuff. Basically new attribute types. We've added NSUUIDA Attribute Type and a NSURL Attribute type backed by the UUID and URL value classes respectively.

[ Applause ]

You appreciate the work we do. That makes us feel good. And now I'm going to hand you over --

Thank you.

[ Applause ]

Hi everybody I'm Rishi Verma and I'll be showing you Persistent History Tracking. So let's talk a little bit about your app. So you've developed these amazing apps and hopefully you've adopted the NS Persistent Container we introduced last release. So you've got a View Context set up and a Background Context. And you've got a Shared Container where you've set up your Persistent Store. Now your view context is one connection and your background context is another. And maybe you've got your background context downloading contend from the remote site that you have set up.

Now you're a very diligent developer and you've gone out and created a Document Extension, a Shared Extension, and a Photo Editing Extension. Now all of these connections to your persistent store could be running in any process since they're extensions, or they could be coming from our app from various context. And you may not know exactly where this data is coming from. Well, let's break it down a little bit and see what happens.

So sorry, so here we are in your app and the user is looking at your view context and in the meantime your background context is downloading info from some remote source. Unfortunately a common thing that we've seen is that your Background Context may be ingesting and your View Context may not know about these objects, or it may not even care about these objects.

But as your Save Notifications come in, you're ingesting them anyways. And so you're going to be missing some changes, if you missed out new Background Context, because they're really easy to generate at this time but you may miss out on Batch Updates, Batch Deletes, which don't generate a Save notification.

Now the user's gone into Safari and decided to share something. This writes to the database and when you're developer comes back to the app -- or when you're user comes back to the app, you don't know what's changed. You know something's changed, but what exactly you don't know. And so you have to rescan the entire database and see what has changed since you last were there.

And unfortunately sometimes our apps crash. It may be due to the user quitting your app while a Background Contact was going. And when your app comes back up, your Background Context doesn't know necessarily where it left off, scanning your local database. So it'll have to check with remote content and rescan the entire local database to see where it needs to be picked up again.

Let's break this example down just a little bit more. So we have an app, and it decides to load some data. So the Background Context loads some data and the View Context gets updated. Now when the user suspends the app, and the document extension comes up, we'll see that data inserts two new objects. And then my Share Extension comes along, updates an object and inserts another one. And so does my Photos Extension. Once all of these are done I'll come back to my app and notice that some of the data is stale.

So our app here only knows about these original six objects. The rest of the database is actually modified or it actually doesn't know about these objects. So it has to go through and scan everything and see what's relevant to that View Context or that Background Context. Now we've introduced NS Persistent History. It's rather simple. Grab your Container, grab your Store Description and voila. Set the option NS Persistent History Tracking to true and we will now track all of your changes to your database. So let's go back to that previous-- thank you.

[ Applause ]

So we'll go back to that previous example. And now in my View Context and Background Context load data, we'll see that it happens within a transaction. And this transaction's written to my database. And I can see that it was done by the Background Context and in my app. Now when I hop out of my app and the user goes into a Document Extension -- We'll see that this also generates a transaction and so on with my share extension.

And also the Share Extension will note that there was a modification and not an insert here. And now with a Photos Extension we'll also see the fourth transaction come in and we'll see that it was done in photos. So now when my app is being brought back to the foreground, I do still have still data, but now I know where I am in the history table and can simply go from transaction 1 to 4. And voila I've updated.

So we've introduced a new Store Request for you to fetch your history. It's called NS Persistent History Change Request. You can simply start it up with either a date, a token, or a transaction. And you can simply set the result type as well. As simple as a status, the count of change is available. Or you can get nitty-gritty and get all the transactions and all the changes.

Also with a change request we allow you to purge your history. So we want you to start this off and you can delete the history prior to date, a token, or a transaction. Now your Persistent History is linear so this way it will keep it short. One of the caveats we want you to keep in mind when pruning your history is that you should have a gatekeeper.

Single place where this should be done so that it knows that you've consumed the history you need, your clients have consumed it, and you no longer are removing anything that is relevant to someone else in your app, all right? And then now we have NS Persistent History Transaction.

This is a large data blob that covers who did the transaction to your database. So as you can see we have a Store ID, a Bundle ID, a Process ID, but we also have a Token. Since your history is linear, we provide you a Token that you can encode to disc, so that when your app exits you can keep track of where your app was in the history. Simply re-launch your app, pull your token, and fetch your history based on that and you're good to go.

Now, also in the transaction we allow you -- the author, we want you to go onto your Manage Object Context and tell us who is the author of this context and this Save Batch Update or Batch Delete. And so we also allow you to go onto the Manage Object Context and set the Transaction Author. Do this before your save or executing a request and we'll take this string, put in a transaction, so you can identify who did what to your database at a later time.

And as part of every transaction there is a set of changes. A change represents a single thing that was done to one object. So they'll be many changes for every transaction and here we'll have inserts, and updates, and deletes. In the case of an update, we'll tell you which properties were updated -- either attributes or relationships, and then in the case of a delete we'll tell you what the tombstone is.

It's a set of attributes that you can set in the model editor that allow you to identify the object post mortem. It's no longer in your database so the object ID isn't as relevant anymore, but with a tombstone you can still identify that object and be able to process it later. Now here we are in the model editor.

And I have a unique identifier for this entity. And it would be great if I could tombstone it. So we simply go in and mark it Preserve After Deletion. And it's that simple and it will show up in my tombstone. And so with that I'd like to show you a quick Demo.

So I have a simple app here. It takes a post I have and puts it on my social media feeds. I've set it up with a few buttons here. A download button that will go on a background context and ingest from the Cloud. And a background task that will go out and post all of my stuff. And I've also got a tab where I can go and see the history. Currently in my code I have Use Persistent History set to False, but I've enabled it for my store so that we can see what's changed.

So here I am and I'll have a user post. And this user said, "This demo is amazing!" Awesome. And I can also go and download one from the background. And we can see that I've labeled where the post came from. The first one comes from the user, the next one comes from the background. But now let's hop out and go to the Share Extension. And let's share about Apple here real quick.

So I can post about this and see in my app. And now when we get back to my app -- oh wait, I lost some data, this is really odd. It must have something wrong. But if I go back to the history I can see that I had a post within the app and I had remote content and I even can see the Share Extension. However I'm not seeing that here, like the user I start adding data and trying to figure out what's wrong with this UI.

And my Persistent History once again is showing everything. Oh wait, that's weird. All of a sudden my app has hit some refresh that I've downloaded into it based on some view where I finally go and hit the database and re-fetch everything. Because that's the only way I can solve this problem sometimes and that's frustrating to our users.

But what happens if we turn on Persistent History and no longer have to track things like that? So I'll simply turn this on. So as you can see here real quick, I have my Background Context set up and my View Context set up and I've also got them listening to each other. But I also have some Transaction Author set. And I just want to show you how simple that is so that we can see this going forward. OK, so let's stop the app real quick and re-launch.

And here we are with our app using History at this time. And let's go to that troublesome case of sharing a post. Well, High Sierra looks pretty cool, so let's share about that real quick. All right, shared about that. Real quick let's just share about one more thing. The cool new watchOS.

And now when we get back to the app, voila. There they are. All I've done is absorb the history, when my view reloaded, and we can jump to that code real quick. So previously if I was using Fetch Hit -- not using Persistent History, I was fetching everything, but now that I am all I'm doing is fetching the history and absorbing the merged context from the History. And it gives me a much smaller subset and I no longer have to scan the entire database to see what was different.

[ Applause ]

And so to recap it's rather easy. Simple to enable Persistent History. It's just one option on your store. It really uncovers all those missing changes. And you may have been generating background context, you may just not realize that you're missing a notification, you've harnessed Batch Updates, Batch Deletes, which don't generate notifications. So they uncover all of that. And you'll never be in the dark again about changes to your database.

Two caveats to mention about using persistent history -- the first one is about Migration. We will preserve as much of your history as possible on migration except for two cases. One is if you remove entities from your model and we already have history on these entities, we'll have to remove that history for those entities and that creates a gap.

The other one is for tombstones. For tombstones you don't want to have certain data there after awhile to identify the object. Maybe it's become a security concern for you. So you've called the tombstone. So we'll do the same and we'll remove that tombstone value out from the model. But if that removes anything that will leave a gap in your history, we'll also then only preserve your history to the last complete transaction. This way you can go and follow the history without falling into a trap where I am missing changes once again because of a gap.

As far as performance there's a slight impact on every save and every batch-operation you do. According to the size of your save and your batch operation we'll do a slight calculation of what changed and write that to the database. However, once your operation is complete all of that memory that we used is returned back to you. There's also small storage overhead, but your history can be purged, so this makes it really easy for you to reclaim that space once you've consumed the history. And that's Persistent History Tracking.

[ Applause ]

Now just to summarize what we've gone over today. Melissa went over Core Spotlight. It's really easy to adopt, easy to share your data with Spotlight, and it gives you deep linking. So your users can find their data in your app anywhere on the phone. And the New Indexing API makes your searches faster and just makes your app that much faster. And lastly, Persistent History allows you to know who's always done it, never be lost, always know who's done what to your database.

One PSA before we go. Please file bugs. Let us know if you run into an issue. Let us know if you want something. Go to bugreport.apple.com. We are listening. So please go and visit. For more information check out the www.DC website. And for some related sessions we had "What's New in Cocoa" earlier today, so go check out that video. We also have "What's New in Foundation" coming up next. And we have some cool stuff on "What's New in Core Spotlight" later on Thursday. And check out the "Cocoa Development Tips" early on Friday morning. Thank you very much.

[ Applause ]