WWDC18 • Session 712

A Guide to Turi Create

Frameworks • iOS, macOS, tvOS, watchOS • 36:17

Turi Create is an open source toolset for creating Core ML models, for tasks such as image classification, object detection, style transfers, recommendations, and more. Learn how you can use Turi Create to build models for your apps.

Speakers: Aaron Franklin, Zach Nation

Unlisted on Apple Developer site

Transcript

This transcript has potential transcription errors. We are working on an improved version.

[Applause] Hey, everyone. [Applause] Thank you. Good afternoon, and welcome to our session on Turi Create. This session builds upon yesterday's session on Create ML. That session covered many of the foundations of machine learning in Swift. So if you didn't catch that session, I do recommend you add it to your watch list. The goal of Turi Create is to help you add intelligent user experiences to your apps.

For example, you might want to be able to take a picture of your breakfast. And tap on the different food items to see how many calories you're consuming. Or maybe you want to control a lightbulb using your iPhone and simple gestures. Maybe you want to track an object in real time like a dog or a picture of a dog if you don't have dogs in the office.

[ Laughter ]

Maybe you have custom avatars in your game, and you want to provide personalized recommendations like hairstyles based on the beard that the user has selected. Or maybe you want to let your users apply artistic styles or filters to their own photos. These are very different user experiences. But they have several things in common. First and foremost, they use machine learning. These experiences all require very little data to create.

All these models were made with Turi Create and deployed with Core ML. They all follow the same five-step recipe that we'll go over today. And all these demo apps will also be available in our labs. So please join us today or Friday for the ML labs to get hands-on experience.

Turi Create is a Python package that helps you create Core ML models. It's easy to use, so you don't need to be an ML expert or even have a background in machine learning to create these exciting user experiences. We make it easy to use by focusing on tasks first and foremost. What we do is we abstract away the complicated machine-learning algorithms so you can just focus on the user experience that you're trying to create.

Turi Create is cross-platform, so you can use it both on Mac and Linux. And it's also open source. We have a repo on GitHub and we hope you'll visit. There are a lot of great resources to get started. And we look forward to working with you and the rest of the developer community to make Turi Create even better over time.

Today we're excited to announce that the beta release of Turi Create 5.0 is now available. It has some powerful new features, like GPU acceleration, and we'll go into these features in detail later on today. The main focus today is going to be the five-step recipe for creating Core ML models. I'm going to start by going over these steps at a high level. And then we'll dive into some demos and code.

So the first step you need is to understand the task you're trying to accomplish. And how we refer to that task in machine learning terms. Second, you need to understand the type of data you need for this task. And how much of it. Third, you need to create your model.

Fourth, you need to evaluate that model. That means understanding the quality of the model and whether it's ready for production. And finally, when your model's ready to deploy, it's really easy using Core ML. Let's dig a bit deeper into each of these steps. Turi Create lets you accomplish a wide variety of common machine learning tasks. And you can work with many different types of data. For example, if you have images, you might be interested in image classification. Or object detection.

You might want to provide personalized recommendations for your users. You might want to be able to detect automatically activities like walking or jumping jacks. You might want to understand user sentiment given a block of text. Or you might be interested in more traditional machine learning algorithms like classification and regression.

Now we know this can get really confusing to those of you who are new to machine learning. So we've taken a stab at making this really easy for you in our documentation. We begin by helping you understand the types of tasks that are possible and then how we reference them in machine learning terms. So what we can do now is revisit those intelligent experiences that we walked through at the beginning of this presentation and assign them to these machine learning tasks. For example, if you want to recognize different types of flowers in photos we call that image classification.

If you want to take pictures of your breakfast and understand the different food objects within them, we call that object detection. If you want to apply artistic styles to your own photos, we call that style transfer. And if you want to recognize gestures or motions or different activities using sensors of different devices, we call that activity classification. Finally, if you want to make personalized recommendations to your users, we call this task recommender systems.

Now the great thing is that the same five-step recipe we just walked through also applies to your code. We begin by importing Turi Create. We proceed to load our data into a data structure called an SFrame. And we'll go into a bit more detail on the SFrame data structure shortly. We'll proceed to create our model with a simple function, .create. This function abstracts away the complicated machine learning behind the scenes.

We proceed to evaluate our model with the simple function .evaluate. And finally we can export the resulting model to Core ML's .mlmodel format to be easily dragged and dropped into Xcode. Now I mentioned that this same five-step template applies to all the different tasks within Turi Create. So whether you're working on object detection, image classification or activity classification, the same code template applies.
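As a rough sketch of that template (the file paths and the choice of object detection here are just placeholders for whichever task you're working on), it might look like this:

```python
import turicreate as tc                        # 1. Import Turi Create

# 2. Load data into an SFrame (path is illustrative).
data = tc.SFrame('annotated_breakfast.sframe')
train, test = data.random_split(0.8)

# 3. Create the model with the task-specific API.
model = tc.object_detector.create(train)

# 4. Evaluate the model on held-out data.
metrics = model.evaluate(test)

# 5. Export for deployment with Core ML.
model.export_coreml('MyModel.mlmodel')
```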

For our first demo today, we're going to walk through a calorie-counting app that uses an object detection model. So we'll want to recognize different foods within an image. And we'll need to know where in the image those foods are, so that we can tap on them to see the different calorie counts.

So let's take a look at the type of data that we'll need to create this machine learning model. Of course we need images. And if we were just building a simple image classifier model, we would just need our set of images and labels that describe the images overall. But because we're performing object detection, we need a bit more information. We need to understand not just what's in the image. But where those objects are.

Now if we zoom in a bit closer to one example, we see a red box around a cup of coffee. And a green box around a croissant. We call those boxes bounding boxes and we represent those in JSON format. With a label, and then with coordinates x, y, width and height. Where x and y refer to the center of that bounding box.

It's also worth noting that with object detection, you can detect multiple objects within each image. So I mentioned that we'd be loading our data into this tabular data structure called an SFrame. And in this example, we will end up with two columns. The first column will contain your images. The second column will contain your annotations in JSON format.
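As a rough illustration (the labels and pixel values are made up for this example), one image's annotation cell might look like this:

```python
# One row's annotation cell: a list of bounding boxes, one per object in the image.
# 'x' and 'y' are the pixel coordinates of the center of each box.
annotation = [
    {'label': 'coffee',
     'coordinates': {'x': 362, 'y': 214, 'width': 190, 'height': 230}},
    {'label': 'croissant',
     'coordinates': {'x': 618, 'y': 310, 'width': 270, 'height': 205}},
]
```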

By now you're probably wondering, what is an SFrame? So let's take a step back and learn more about it. SFrame is a scalable, out-of-core tabular data structure. And what this means is you can create machine learning models on your laptop. Even if you have enormous amounts of data.

SFrame allows you to perform common data manipulation tasks. Like joining two SFrames or filtering to specific rows or columns of data. SFrames let you work with all different data types. And once you have your data loaded into an SFrame, it's easy to visually explore and inspect your data.

Let's zoom in a bit more on what's possible with SFrame. With our object detector example. So after we import Turi Create, in the case of our object detector, we're actually going to load two different SFrames. The first containing our annotations, and the second containing our images. We have a simple function .explore that will allow you to visually inspect the data you've imported. We can do things like access specific rows. Or columns of our data. And of course we can do common operations like joining our two SFrames into one. And saving the resulting SFrame for later use or to share with a colleague.
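Put together as a sketch (the file and folder names are illustrative), those SFrame operations look roughly like this:

```python
import turicreate as tc

# Load annotations (from a CSV file) and images (from a folder) into two SFrames.
annotations = tc.SFrame('annotations.csv')
images = tc.load_images('data/')

# Visually inspect the loaded data.
images.explore()

# Access specific rows or columns.
first_row = images[0]
image_column = images['image']

# Join the two SFrames on their shared 'path' column, and save the result.
data = images.join(annotations)
data.save('breakfast.sframe')
```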

Next we create our model. So I mentioned we have this simple function .create that does all the heavy lifting for the creation of the actual model. And what we do behind the scenes is we ensure that the model we create for you is customized to the task and that it's state of the art. Meaning it's as high quality and high accuracy as we can get.

And we're able to do this whether you have large amounts of data or small amounts of data. It's very important for us that all of our tasks work even if you have only a small amount of data. In the case of object detection, that's as little as about 40 images per class of object you're trying to detect.

Let's move on to evaluation. So I mentioned we have a simple function .evaluate that will give you an idea of the quality of your model. In the case of object detectors, we have two factors to consider. First, we want to know did we get the label right? But we also have to know if we got that bounding box right around the object.

So we can establish a simple metric with these two factors and go through a test data set, scoring predictions against what we call ground-truth data. And so we want to make sure that we have correct labels. And then a standard metric is at least 50% overlap in the predicted bounding box when compared to the ground-truth bounding box. Let's look at a few examples.

In this prediction we see that the model got the label right with a cup of coffee. But that bounding box is not really covering the whole cup of coffee. It's only about ten percent overlapping the ground truth. So we're going to consider that a bad prediction. Here we see a highly accurate bounding box, but we got the label wrong. That's not a banana. So let's not consider that a successful prediction either.

Now this middle example's what we want to see. We have 70% overlap of our bounding box, and the correct label, coffee. So what we can do is systematically go through all of our predictions with a test data set. And get an overall accuracy score for a new model.
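One common way to make that 50%-overlap criterion concrete is intersection over union (IoU). Here is a rough sketch of that computation, using the same center-based x, y, width, height coordinates as the annotations; this is an illustration, not Turi Create's internal code:

```python
def iou(box_a, box_b):
    """Intersection over union of two boxes given as
    {'x': center_x, 'y': center_y, 'width': w, 'height': h}."""
    def corners(b):
        return (b['x'] - b['width'] / 2, b['y'] - b['height'] / 2,
                b['x'] + b['width'] / 2, b['y'] + b['height'] / 2)
    ax0, ay0, ax1, ay1 = corners(box_a)
    bx0, by0, bx1, by1 = corners(box_b)
    inter_w = max(0.0, min(ax1, bx1) - max(ax0, bx0))
    inter_h = max(0.0, min(ay1, by1) - max(ay0, by0))
    intersection = inter_w * inter_h
    union = (box_a['width'] * box_a['height'] +
             box_b['width'] * box_b['height'] - intersection)
    return intersection / union if union > 0 else 0.0

# A prediction counts as correct if the label matches and iou(...) >= 0.5.
```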

And finally, we move to deployment. We have an export to Core ML function that saves your model to Core ML's .mlmodel format. So you can then drag and drop that model into Xcode. This week we actually have some exciting new features related to object detection specifically. So I encourage you to attend tomorrow's Vision with Core ML session to learn more. In that session, the speaker will actually take the object detection model that we're building today and go into more detail about deployment options. And there you have it! The five-step recipe for Turi Create.

[ Applause ]

Thank you. So with that, I'm going to hand off to my colleague Zach Nation, for a demo.

[ Applause ]

Thanks, Aaron. I think let's just jump straight into code. Who wants to write some code live today? We're going to go ahead and build an object detector model right now. So I'm going to start out in Finder. Here I've got a folder of images that I want to use to train a model. We can see this folder's named Data, and it's full of images of breakfast foods. I've got a croissant, some eggs, and so on. This is a good data set for breakfast foods. I think let's go ahead and write some code with it.

I'm going to switch over to an environment called Jupyter notebook. This is an interactive Python environment where you can run snippets of Python code and immediately see the output. So this is a great way to interactively work with a model. And it's very similar in concept to Xcode Playgrounds. The first thing we're going to do is import Turi Create as tc. And that way we can refer to it as tc throughout the rest of the script. Now, the first task we want to do is load data.

And we're going to load it up into SFrame format. So first we can say images = tc.load_images, and we're going to give it that folder, Data, that I just showed in Finder. And Turi Create provides utilities to interactively explore and visualize our data. So let's make sure those images loaded correctly and we got the resulting SFrame that we expected.

I'm just going to call .explore, and this is going to open up a visualization window where we can see that we have two columns in our SFrame. The first is called path, and it's the relative path to that image on disk. And the second column is called image. And that's actually the contents of the image itself. And we can see our breakfast foods right here.

It looks like these loaded correctly, so I'm going to proceed. Back in Jupyter notebook. Now I'm also going to load up a second SFrame called annotations. And for this I'm just going to call the SFrame constructor and provide the filename annotations.csv. This is a CSV file containing the annotations that correspond to those images.

And let's take a look at that. Right in Jupyter notebook, we can see that this SFrame contains a path column, again pointing to that relative path on disk of the image. And an annotation column containing a JSON object describing the bounding box and labels associated with that image.

But now we have two different data sources and we need to provide one data source to train our model. Let's join them together. In Turi Create, this is as easy as calling the join method. I'm going to say data = images.join(annotations), and now we can see we have a single SFrame with three columns.

It joined on that path column. So for each image with a path, it combined the annotations for that path. And so now for each image, we have annotations available. Now we're ready to train a model. So I'm going to create a new section here called train a model.

And that's just one line of code here. I'm going to say model = tc.object_detector.create. And this is our simple task-focused API for object detection that expects data in this format. I'm going to pass in that data SFrame that I just created. And for the purposes of today's demo, I'm going to pass another parameter called max_iterations. Normally you wouldn't need to pass this parameter, because Turi Create will pick the correct number of iterations for you based on the data that you provide. In this case, I'm going to say max_iterations=1 just to give an example of what training would look like.

And the reason this is going to take a minute is it actually goes through and resizes all of those images in order to get them ready to run through the neural network that is under the hood of this object detector. And then it will perform just one iteration on this Mac's GPU.
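Put together, the data-loading and training steps of this demo look roughly like the following sketch (folder and file names are the ones from the demo):

```python
import turicreate as tc

# Load the images and their annotations, then join them on the shared 'path' column.
images = tc.load_images('data/')
annotations = tc.SFrame('annotations.csv')
data = images.join(annotations)

# Train a quick model with a single iteration, just to see what training looks like.
# Normally max_iterations is omitted and chosen automatically from the data.
model = tc.object_detector.create(data, max_iterations=1)
```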

But this is probably not the best model we could get because I just wanted to train it in a couple of seconds. So I'm going to go ahead and switch over to like cooking show mode. And I'm going to take one out of the oven that we've had in there for an hour [laughter].

[Applause] So I'm going to say tc.load_model, and it's called breakfastmodel.model. And this is one that I've had an opportunity to train for a bit longer. So let's inspect that right here in the notebook. And we can see that it's an object detector model. It's been trained on six classes, and we trained it for 55 minutes. This means you can train a useful object-detector model in under an hour on your Mac. Next, let's test the predictions of this model and see if it's any good.

So I'm going to make a new section here called inspect predictions. And we're going to go ahead and load up a test data set. And here I've already prepared one in SFrame format. So I'm just going to load it, and I called it testbreakfastdata.sframe. There are two important properties of this test SFrame. One is that it contains the same types of images that the model would have trained on.

But the second important property is the model has never seen these images before. So this is a good test for whether that model can generalize to users' real data. I'm going to make predictions from that whole test set by calling model.predict and providing that test SFrame. And we'll get a batch prediction for the whole SFrame.

And that'll just take a few seconds. And then we're going to inspect. I'm just going to pick a random prediction here. Let's say index two. Here we can see the JSON object that was predicted in just the same format that the training data is provided in. So here we have coordinates, height, width, x and y. And a label, banana.

And we get a confidence score from the model. In this case about .87. This is a little bit hard for me as a human to interpret though. I can't really tell if this image is really supposed to be a banana or whether these coordinates are where the banana would appear in that image.

Turi Create provides a utility function to take the predicted bounding boxes, or the ground-truth bounding boxes, and draw them right onto the images. So let's go ahead and do that. I'm going to create a new column in my test SFrame called predicted image. And I'm going to assign it the output of the object detector utility called draw bounding boxes.

And I'm going to pass into draw bounding boxes that test image column. So that's the image itself, and then I'm also going to pass the predictions that I just got from the model. That's going to draw those predicted bounding boxes onto each image. Now let's take a look at that number two prediction again, this time in image form. So I can say test['predicted_image'][2].show(), and it will render right here in the notebook.
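Collected into one sketch (file names are the ones from the demo; the index 2 is just the example picked on stage), the prediction and visualization steps look roughly like this:

```python
# Load the fully trained model from earlier, plus a held-out test set
# the model has never seen.
model = tc.load_model('breakfastmodel.model')
test = tc.SFrame('testbreakfastdata.sframe')

# Batch predictions: one list of predicted bounding boxes per test image.
test['predictions'] = model.predict(test)

# Draw the predicted boxes onto the images so we can inspect them visually.
test['predicted_image'] = tc.object_detector.util.draw_bounding_boxes(
    test['image'], test['predictions'])

# Spot-check a single prediction.
test['predicted_image'][2].show()
```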

[ Applause ]

And this is great as a spot check because at least for one picture, we know that the model's working. But this doesn't tell us if it'll work for say the next 50,000 images that we pass in. So for that, we're going to evaluate the model quantitatively. And I'm going to start a new section here in the notebook called evaluate the model. And what we're going to do is call model.evaluate and once again, I'm going to pass in just that whole test data set.

Here the evaluation function is going to run the metric that Aaron described, testing whether the bounding boxes are overlapping at least 50% and have a correct label. And it's going to give us that result across each of the six classes that we trained on. So here we can see that our overlapping bounding boxes with the correct label are happening about 80% of the time for bagel. About 67% of the time for banana. And so on.

That's pretty good, so I think let's see if this model's actually going to work in a real app. I'm going to go ahead and call export_coreml to create a Core ML model from the model we just trained. And I'm going to call it breakfastmodel.mlmodel. And then as soon as that's done exporting, I'm going to go ahead and open it in Finder.
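As a sketch, the evaluation and export calls from this demo look like this (file names follow the demo; the exact contents of the returned metrics dictionary depend on the Turi Create version):

```python
# Quantitative evaluation across the whole test set: for each class, how often
# a predicted box has the correct label and at least 50% overlap with ground truth.
metrics = model.evaluate(test)
print(metrics)

# Export to Core ML format for drag-and-drop into Xcode.
model.export_coreml('breakfastmodel.mlmodel')
```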

So here in Finder, I've got my breakfastmodel.mlmodel. And when I open it in Xcode, I can see that it looks just like any Core ML model. It takes an input image, and as output we get confidence and coordinates. And that's going to tell us the predicted bounding box and label for the image that we have. Now let's switch over to the iPhone app where we're going to consume this model.

So here on my iPhone, I've got an app called Food Predictor. And this is going to use the model that we just trained. Here I'm going to choose from photos. And I've got a picture of this morning's breakfast. This is a pretty typical breakfast for me: coffee and a banana.

Well, often I skip the banana. But suppose I ate a banana this morning. We can just tap right on the image. And because we know the bounding box, we can identify the object within that bounding box. And here we see the model tells us this is a banana, and this is a cup of coffee.

[ Applause ]

So let's recap what we just saw. First, we loaded images and annotations into SFrame format and joined them together with a simple function call. We interactively explored that data using the explore method. We created a model just with a simple high-level API, passing in that data object containing both the images and the bounding boxes and labels.

We then evaluated that model both qualitatively, spot-checking the output as a human would. And quantitatively, asking for a specific metric that applies to the task that we're doing. Then we exported that model to Core ML format for use in an app. Next, I'd like to switch gears and talk about some exciting new features in Turi Create 5.0.

Turi Create 5.0 has a new task called style transfer. We have major performance improvements from native GPU acceleration on your Mac. And we have new deployment options, including recommender models for personalization and Vision feature print-powered models, so that you can reduce the size of your app by taking advantage of models that are already in the operating system.

Let's talk a little bit more about that style transfer task. Imagine we've got some style images, and these are really cool looking, recognizable stylistic images. Here we've got sort of a light honeycomb pattern and a very colorful flower pattern. And we want to apply those as filters to our own images that we take with a camera.

We've got a dog photo here and what it would look like to apply those styles to that dog is something like that. And with a style transfer model, we can take the same styles and apply them to more photos. Let's say a cat and another dog. And that's the sort of effect we would get. Here's an example of an app that uses style transfer for filters on photos that a user would take.

The code to create the style transfer model follows the same five-step recipe as any other high-level task in Turi Create. So you can start by importing Turi Create, loading data into the SFrame format, creating the model with a simple high-level API. Then we make predictions, in this case with a function called stylize to take an image and apply that style filter. Finally, we can export it for deployment into Core ML format, just like any other model in Turi Create. So let's take a look at another demo. This time, we're going to build a style-transfer model.
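Spelled out as a rough sketch (the folder names here are illustrative), that style transfer recipe looks like this:

```python
import turicreate as tc

# Style images are the filters you want to learn; content images are
# representative photos to learn to apply those filters to.
styles = tc.load_images('styles/')
content = tc.load_images('content/')

# Create the style transfer model.
model = tc.style_transfer.create(styles, content)

# Apply one of the learned styles to new images.
test_images = tc.load_images('test/')
stylized = model.stylize(test_images, style=3)

# Export to Core ML format for deployment.
model.export_coreml('StyleTransfer.mlmodel')
```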

So switching back over to that Jupyter notebook environment, I'm going to start once again by importing Turi Create as tc. Then, I'm going to load up two SFrames, each containing images. One is the style images; I'm going to call tc.load_images, and I'm going to give it a directory name, styles. And then the other is content images.

And the way that this works is the style images are the styles that you want to turn into filters you can apply. And the content images can be any images that are representative of the types of photographs that you would want to apply those filters to. So in this case, it's just a variety of photographs. We're going to load a folder called content into an SFrame for that one.

And then we're ready to go ahead and train a model. So I'm going to say model = tc.style_transfer.create. And I'm going to pass in styles and content, and that's all we need. But that's going to take a bit too long to train for today's demo. So once again, I'm going to do it like a cooking show, and we're going to load up one that we've had in the oven already. I'm going to say model = tc.load_model, and I'm going to load up my already trained style transfer model.

Let's take a look at some of these style images to see what we should expect this model to produce. We have an image column in our style SFrame. And let's take a look at just style number three and see what that looks like. It looks sort of like a pile of firewood. And this is pretty stylistic. I think this would be recognizable if we were to apply it as a filter to another image.

Now let's take a look at some content images. I'm going to load up a test data set. And once again this is a data set that is representative of the types of images that users will have at runtime in your app. And the important thing is that the model never saw these at training time. So by evaluating with the test images, we'll know whether the model can generalize to users' data. I'm going to load up a test data set now.

With the tc.load_images function once again. And we're going to call that folder test. And I'm going to pull out one image from the test data set, called sample image. And I'm just going to take the first image there. And I'm going to call .show, so that we can see what that image looks like without any filters applied. That's my cat, Seven of Nine.

She always looks like that. And we're going to go ahead and stylize that image using the model that we just trained. So I'm going to say stylized_image = model.stylize. And in this case, the function is called stylize because the model is specific to the task of style transfer.

And we're going to pass in that sample image. And I'm going to say style equals three because that's the style that we picked earlier that looks like firewood here. So let's see what that stylized image looks like. I can call .show on that, and here is my cat looking like a pile of firewood.

[ Applause ]

Let's make sure this works on other styles, too. I'm going to go ahead and make a stylized image out of that sample image. And I'm going to specify a style equals seven. And then let's see what that looks like. That looks pretty good. I wonder what the style was that we just applied to that. Let's take a look at style images.

Style image seven. And once again we can just call .show to see what that style image looks like, and yeah, that looks like the filter that we just applied to my cat. Now that we've got a good style transfer model, we can just call model.export_coreml exactly the same as any other model, and save it into Core ML format.
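Pulling the inspection steps from this part of the demo together into one sketch (the style indices 3 and 7 are just the ones picked on stage):

```python
# 'styles' and 'model' come from the earlier cells in this notebook.

# Look at one of the style images the model was trained on.
styles['image'][3].show()

# Take a sample content image from the held-out test set and stylize it.
test = tc.load_images('test/')
sample_image = test['image'][0]
sample_image.show()

stylized_image = model.stylize(sample_image, style=3)
stylized_image.show()

# The same model holds every style; pick a different one by index.
model.stylize(sample_image, style=7).show()
```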

Now, let's switch over to the iPhone where we have a style transfer app ready to apply the filters in this model. So here I've got my iPhone once again. And I have an app called style transfer. I'm going to choose a photo from my photo library to apply these styles to. These are my dogs.

This is Ryker, and we're going to see what styles we have available in this app. We can scroll through all of the styles here, and what's important to note is that a single style transfer model was trained on all of these styles. And one model can include any number of styles. So to have multiple filters, you don't need to greatly increase the size of your app. Let's see what those styles look like applied to Ryker. Pretty cool. So--

[ Applause ]

So to recap what we just saw, we loaded images into SFrame format. This time style and content images into two SFrames. We created a model using a high-level API for style transfer that operates directly on a set of style images and a set of content images. We then stylize images to check whether the model is performing well. We visualized those predictions in Turi Create.

And finally we exported the model in Core ML format for use in our app. Switching gears a bit. I want to talk about some other features in Turi Create 5.0. We now have Mac GPU acceleration offering up to a 12x performance increase in image classification. And 9x in object detection, and that's on an iMac Pro.
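If you want to control explicitly how many GPUs Turi Create uses for training, it exposes a configuration call; a minimal sketch (the default behavior may vary by version):

```python
import turicreate as tc

# Use all available GPUs for training; 0 means CPU only, -1 means all GPUs.
tc.config.set_num_gpus(-1)
```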

[ Applause ]

We have a new task available for export into Core ML format: personalization. The task here is to recommend items for users based on users' historical preferences. This type of model is deployed using Core ML's new custom model support that's available on macOS Mojave and on iOS 12. This has been a top community feature request since we open sourced Turi Create. So I'm really excited to bring it to you today.

[ Applause ]

The recommender model in Core ML looks just like any other Core ML model. But what's worth noting is there's a section at the bottom here called Dependencies. And in this section, you can see that this model uses a custom model and that model is called TC recommender. And this is just Turi Create providing support for recommenders in Core ML through that custom model API.

Using that model in Core ML looks very similar to using any other Core ML model as well. You can just instantiate the model and create your input. So in this case, we've got that avatar creation app. And a user might have picked a brown beard and a brown handlebar moustache and brown long hair for their avatar. And we can make predictions from the model by providing those interactions as input. And where we say k=10, that means we'll get the top ten predictions given those inputs.
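On the Turi Create side, here is a rough sketch of how such a recommender might be created and exported; the column names and interaction data are made up for illustration, and the Core ML export shown is the item similarity recommender support added in Turi Create 5.0:

```python
import turicreate as tc

# Historical interactions: which avatar items each user has picked.
# Users, items, and column names are illustrative.
interactions = tc.SFrame({
    'user_id': ['u1', 'u1', 'u2', 'u2', 'u3'],
    'item_id': ['brown_beard', 'brown_long_hair', 'red_beard',
                'red_short_hair', 'brown_handlebar_moustache'],
})

# Create an item similarity recommender from the interaction data.
model = tc.item_similarity_recommender.create(
    interactions, user_id='user_id', item_id='item_id')

# Top-10 recommendations for a user, given their past picks.
recommendations = model.recommend(users=['u1'], k=10)

# Export using Core ML's custom model support.
model.export_coreml('Recommender.mlmodel')
```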

So to recap what we've learned today. Turi Create allows you to create Core ML models to power intelligent features in your apps. It uses a simple five-step recipe starting with identifying the task that you're doing. And mapping it to a machine learning task. Gathering and annotating data for use in training that model. Training the model itself using a simple, high-level API specific to the task that you're doing. Evaluating that model in Turi Create both qualitatively and quantitatively. And finally, deploying in Core ML format.

That five-step recipe maps to code starting with import Turi Create. You can load data into the SFrame format. Create a model using that task-specific API. Evaluate the model with an evaluate function that's once again specific to the task that you're doing. And export for deployment, calling the export Core ML function.

Turi Create supports a broad variety of machine learning tasks. Ranging from high-level tasks like image classification and text classification, all the way to low-level machine learning essentials. Like regression and classification on any type of data. And using the resulting models, you can power intelligent features in your apps like object detection or style transfer for use as a filter.

For more information, please see the Developer.Apple.com session URL. And please come to our labs. We've got a lab this afternoon and Friday afternoon. And we welcome your feedback. We're happy to answer any questions you have. And we'll have all of the demos we showed today available to explore. Thank you.

[ Applause ]