Using Spotlight's Query APIs - WWDC 2005

Application Technologies • 58:29

Modernize the way you search for files in your application by leveraging the Spotlight Query APIs in Mac OS X Tiger. Watch and code along as we write a sample application, integrating Spotlight search capabilities and enabling complex queries on the Spotlight data store.

Speakers: Xavier Legros, Vince DeMarco

Unlisted on Apple Developer site

Transcript

This transcript was generated using Whisper, it has known transcription errors. We are working on an improved version.

Welcome to session 1.04. My name is Xavier Legros. And I'll be talking to you in the next hour about using the Spotlight Query APIs. So what are we going to do? So what we're going to try to do first is kind of give you an overview of architecture, kind of the technology behind Spotlight.

And this is going to be pretty much a follow-up from this morning's talk, when Dominic went through a couple of details about how to write a Spotlight plugin. What we're going to do is pretty much implement the second part of the architecture graphic, if you were at the session this morning.

In this case, that means getting metadata out of a file. So we're going to teach you how you, as a developer, with a very simple set of APIs, can query the data store to retrieve metadata attributes on files. And then we'll finish with Vince, who's going to be coming up here.

He will show you another part of querying the data store. And in this case, it's pretty much the contrary of querying a file: we have a set of search parameters, and we want to get back all the files that satisfy this search.

All right, let's get started. So understanding the technology. What's very, very important for you as a developer, because there are many solutions out there that try to mimic what Spotlight is doing, what's very, very important for you as a developer to understand is that Spotlight is not bolted onto the system. Spotlight is fully integrated inside Mac OS X Tiger. So what does that mean?

That means that first, at the heart of the Spotlight technology is actually a server. And that server is pretty much the central piece that you, as a developer, will be interfacing with. So let's see the two different sets of APIs that we have and the two different features that we have for Spotlight.

On your right, the first thing you're going to see is that we have a bunch of documents on our hard drive. It could be on the desktop, it could be in the Documents folder. And the first part of the Spotlight technology enables you as a developer to extract the metadata from the files that you know about and hand it back to the Spotlight server.

What the Spotlight server is going to do when your plugin sends back that metadata is a couple of things. First, we're going to store the metadata attributes inside the metadata store. And in that data store, these are going to be attributes such as kMDItemTitle or kMDItemAuthors. Dominic went into detail this morning about how to write that first piece of software, the Spotlight plugin, the Spotlight importer. And then we have another store, the content store, where what we're going to store is more like the text content.

And the main idea here, by the way, does not matter too much for you as a developer. It doesn't impact how you use the technology. But what's important for you to understand is how things fit into the system. And here the content store will store pretty much a binary representation of the text that you're going to pass through kMDItemTextContent.

[Transcript missing]

All right, so what do we have? As a developer, you have three different sets of APIs to help you interface with Spotlight. The plugin APIs first, and once again, we went through these in great detail this morning. The key point here is for the case where, as a developer, your application or your tool or whatever your business is on Mac OS X generates a custom file.

What I mean by that is a file that only your application knows how to read. In that case you have to write a Spotlight plugin to interface with our system, okay? And once again, this is very straightforward. If you didn't go this morning, come to the lab and we'll show you.

The second set of APIs, which is kind of the first part of querying the Spotlight server, is what I call the get-metadata APIs. Here the case is that you have a file, and what you want to do is find out what metadata is associated with that file. So you have an FSSpec, you have a CFURL, whatever you want. And you want to find out maybe the creation date, maybe who the author is, what the keywords are. So that's going to be the first thing we're going to do today.

And the last part, which I think is a very powerful and very simple set of APIs, is for when you want to query the database and get back results as a set of files. So you want to say, show me all the files on my system that are JPEGs, that have the flash turned on, and that have a certain resolution. And then we send you back a set of files. A little bit like what the Finder is doing, obviously.

As a developer, what's important for you to understand is that in your project, Carbon or Cocoa, it doesn't really matter. In Xcode, what you have to do is link against CoreServices. The Metadata framework is part of the CoreServices framework. And I suppose most of you are already linking against that framework.

All right, so I gave you a quick overview of the architecture and the different sets of APIs you can use depending on what you're trying to achieve. Remember, if you have a custom file format, write a Spotlight plugin. Then after that, you want to do queries. We have two different types of queries. The first type of query is that you have a file and you want to get its metadata. And that's what we're going to do in this part of the presentation.

All right, so first, as a developer, why would you want to query the Spotlight server to get back metadata? Well, a couple of cases. Case number one, you're an application. It could be you're doing 2D content or 3D content, but your application generates files that make use of other file formats. What I mean by that is, take for instance a 2D drawing application. The user goes and selects a JPEG file and drags and drops it inside your application.

Maybe you want to keep a reference of that file. And maybe you want actually to present the metadata associated with that file. So this is the type of API you will be using. Another idea would be, obviously, if you wanted to display the properties for your own document or if you're in the business of tracking documents, for instance. Maybe your application manages different revisions of files across the product cycle or a project cycle.

And you want to query: well, I'm working on advertising for the Apple computer right now on the T1, and I want to get all my PDF files, all my JPEG files, and all my PostScript files. Maybe you want to do a query and get back specific attributes from these files inside your project.

This, once again, would be the type of API you would use to implement that feature. A couple of examples that we have. Remember this morning, I don't know if you were here, in Dominic's talk he went to the Finder. And in the Finder, when you press Command-I to get the information for a file, you get pretty much the metadata associated with the file. That would be the type of feature you could implement in your application, a Get Info panel. All right.

Before we dig a little bit more into the APIs and how you're going to architect your application to achieve these features, I need to talk to you a little bit about the MDItemRef. The MDItemRef, think of it as this nice little box that the Spotlight server manages.

Each file on the system is represented in the Spotlight data store by the Spotlight server as an MDItemRef. So think of it as the Spotlight server's representation of the files on the disk. And each MDItemRef can contain several metadata attributes.

It could have kMDItemAuthors, it could have the pixel resolution, the width, the height, but, as we saw this morning, it can contain file system attributes as well. Here in this case I don't show any, but you have to understand that inside the data store we also have the file system creation date, the modification date, the file size. Okay? So all these attributes are stored in the MDItemRef and are available through that same simple set of APIs. So this is pretty much the key object you're going to be manipulating in order to query the Spotlight server.

[Transcript missing]

All right, so what does it look like as code? Once again, very, very straightforward. Step number one, remember, MDItemCreate to get an MDItemRef representing a file. Here in this case, we hard-coded the path with CFSTR, you know, the macro to create a CFString. But in your case, if you have an FSSpec, you go to a CFURL, and from the CFURL you get a CFString, and you're in business. So you pass to MDItemCreate the default allocator, we don't care, and the path.

And you get back this neat little object, an MDItemRef. Great. So that's step number one. At that point, what do we have? Very simple, just the link between an MDItemRef, which lives on the Spotlight server side, and our file on disk. Okay?

All right, step number two. Remember, now we're going to go and query the data store. We're going to query the server to say, OK, tell me all the different keys that that file contains. Tell me all the metadata keys, I should say, that that file contains. Very straightforward. We're going to define a CFArray to hold all the names of the keys.

And remember, when I say the names of the keys here, I'm talking about the kMDItem-something constants, or whatever keys you will be using. And what we're going to do is call MDItemCopyAttributeNames.

And we're going to pass the item ref. Very simple. That sends us back an array, and inside this array we have all the keys that are present. We have the keys at that point in time, but we don't have the associated values. So from there you can do one of two things. You could go and call MDItemCopyAttributes with the whole array, and we send you back all the keys with their values. But maybe in your case you want to be more clever, I would say, and just request a couple of keys.

Maybe some files will have 40 attributes, and you don't want all of them. So between MDItemCopyAttributeNames and MDItemCopyAttributes, you could just go and walk the names array and remove some of the keys in there. Very straightforward. So at that point, what do we have?

Very simple. We get back results, which is pretty much a collection of keys and data. Here in this case, you can see that Vince has been doing some trading and has $10 billion probably in his account balance, which is pretty good, we think, for an Apple engineer. And we have a couple of keys associated with all that data.

So that's pretty nice, but there are two things that are weird here. The first thing is, how did Vince get $10 billion from day trading? But that's just a side question. The real question for this session is, wouldn't it be better if we could have real text, right?

What you want to see is what that key really means, so that when you send the result back to the user, you can display real text. Obviously, you don't want to show the raw key to the user. So, good text: absolutely, we have good text. And the main idea, what we're going to achieve now, is to get the text that is associated with the key.

So what you can do with a very simple API is get back the localized names of attributes. And the key point here is that instead of getting back the kMDItemAuthors key, what you get back is "Authors". And obviously, if it's "Authors" in English and you're running in French, what you would like is "Auteurs". It makes a difference in French.

And what is very cool is that if you were at this morning's talk, you know that people writing Spotlight plugins for their file format can define their own keys. And when you define your own key in the schema file, you're going to be able to define a description as well. So you could say that this key is "author" and give a description: this is the author of the document, blah, blah, blah.

All right, so how are we going to do that? How do we obtain localized names for our attributes? First thing, you're going to call MDSchemaCopyDisplayNameForAttribute, and I think it's probably easier for you to just read it on the screen. And you're going to pass the key that you got back. In this case I just hard-coded it, but what you will probably do is pass an entry from the CFArray that we got back previously, okay?

Very easy. And then you get back a CFStringRef, which is pretty much the full name. Excellent. So, for instance, if I were to pass kMDItemVideoBitRate, that will send us back "Débit vidéo" if I was running in French, and "Video bit rate" in English. Sorry, my English is a little bit rusty.

What's very, very important is that if you're in the case of writing a Spotlight plugin for your file format, it's very, very important as well that you offer translations for all of these keys. I know that some of you may think that there is no French person that would use our software, but trust me, we do have a bunch of US software and it's great when it's localized in different countries. So if you're writing your Spotlight plugin, remember you can localize all these keys. So please pay attention to that.

All right. So now remember, what we've done here is get back, not the description, but pretty much what the key is about. So in this case, author. What is nice is that you may also want to get the full description of what that key means. And here, once again, a very simple API will send you back the localized version of the description: MDSchemaCopyDisplayDescriptionForAttribute.

We're going to try to make longer API names next year. And you pass the key. Once again, same thing as before: what you would probably do is pass an entry from the CFArray that was sent back by the previous API. So what did we achieve here? We get back, for instance, in this case, something like the name of the media file, or whatever full description you would have in your schema file.

So in this first part, what we did is get an MDItemRef. Remember, from there we queried and got back the list of all the keys that are stored in the data store. Then after that, I queried the Spotlight server to send me back only the data that I wanted for those keys. Once you have that, I showed you very simple APIs that enable you to get the localized name of a key and its localized description.

To show you that in action, I can invite on stage Vince, Monsieur Vince, as we call him. Hello. And Vince is going to walk you back through the code and show you the second part of his presentation. Thanks. Oh, can you go back to the slide again? So hi, I'm Vince DeMarco.

Can you go back to the presentation? So I'm Vince DeMarco, and I'm a member of the Spotlight engineering team. So what I'm going to do today is show you what Xavier just showed you a little while ago. So I'm going to make a graphical version of mdls. mdls is a command-line tool that lets you see all the metadata associated with a file.

Except in this version, what's going to happen is you're going to take a file from the Finder and drag and drop it on the window. So all you see in the top half is a little text field with the path of the file, and then a table view showing all the data. Can you switch to demo one?
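For readers following along without the demo, the mdls tool mentioned above can be scripted to do the same kind of lookup. The sketch below is an illustration in Python, not code from the session. It assumes the `-name` and `-raw` options described in the mdls man page, and the helper names (`mdls_command`, `parse_raw`, `read_attributes`) and the sample path are made up for this example. On a machine without mdls it simply returns None.

```python
import shutil
import subprocess


def mdls_command(path, names):
    """Build an mdls command line that asks only for the given attributes."""
    cmd = ["mdls", "-raw"]        # -raw: values only, NUL-separated (per man page)
    for name in names:
        cmd += ["-name", name]    # -name: restrict output to this attribute
    cmd.append(path)
    return cmd


def parse_raw(output, names):
    """Pair the requested names with the NUL-separated values, in request order."""
    return dict(zip(names, output.split("\0")))


def read_attributes(path, names):
    """Return {name: raw value text}, or None if mdls is missing or fails."""
    if shutil.which("mdls") is None:   # not on macOS, or tool not installed
        return None
    result = subprocess.run(mdls_command(path, names),
                            capture_output=True, text=True)
    if result.returncode != 0:
        return None
    return parse_raw(result.stdout, names)


if __name__ == "__main__":
    wanted = ["kMDItemContentType", "kMDItemKeywords"]
    print(mdls_command("/tmp/photo.jpg", wanted))     # hypothetical path
    print(read_attributes("/tmp/photo.jpg", wanted))
```

Asking for a short list of names mirrors the earlier advice to request only the keys you need rather than every attribute on the file. Values come back as text exactly as mdls prints them, so arrays such as keywords are not parsed further here.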

So I'm not going to describe all the code that's going on here, but as soon as the file is dropped from the Finder onto this window, this method, file dropped, is going to get called. Within the notification that gets passed, in the user info, is a list of all the paths that the person selected. I don't really care about all the paths, so I'm just going to grab the first one.

So the first thing I'm going to do is just set the text field at the top of the window to the path the user selected. Then the first real step is to get the MDItemRef for the file that the person just dragged and dropped there. So I get that.

Before I go any further, I really need to check that the ref that I got back is valid. If any error occurs, we end up returning NULL. A possible error is that the person does not have permission to read the file, or the file may have been deleted in the process of doing the drag.

So as the drag was released, the file went away. So now that we know that we have a valid MDItemRef, we can go ahead.

So the first step is to get all the interesting attribute names. So Xavier showed that. All this window is going to do is show all the possible attributes, not any select set. So this will return an array of all the attribute names. The next step, I want to get a list of all the attribute values.

So I'm going to get a CFDictionary back with everything that I want. So now that I've got all the data, I'm basically going to tell the table view to update.

So a table view in Cocoa basically shows rows and columns of data. In this case, I've got two columns and n rows, and the number of rows is equal to the number of attributes in the array. So I'm just going to return that as the count. So the number of attribute names, if that has been set, is just equivalent.

So the number of items in the array is equivalent to the number of rows in the table view. So the last step here is to actually just try to display all the data in the two columns. So the first column is the name of the property that I'm interested in.

Actually, the first thing I want to do is I want to get the attribute name for that particular row. So TableView, as it's loading its data, calls you and tells you that it wants the data for a particular row. So in this case, it's like row n. So the first thing I want to do is I'm going to grab the attribute out of the array that I'm interested in. So I'm going to grab attribute name out of the inspected ref attribute names. So next, I want to return this property back to the TableView to display it.

So I'll just simply return that. And then the last thing I want to do in the second column is I actually want to display the individual pieces of data. So that entails just looking it up in the dictionary. So I'm going to grab the value out of the inspected attribute, inspected ref attribute values, which is the dictionary of the keys and values, and then return it.

This code down here is just basically trying to reformat the data to make it look a little nicer. Here, so I'll compile this and we can go. So here's the window that I just ran from the program. So we'll go into Finder and I'll drag and drop a file onto it.

So here's the file that I just selected, and here are all the attributes associated with it. So the things that are interesting: this file's got two keywords, Lotus and Elise. And this file was taken with a Minolta camera. It's a JPEG image. So you can see all the content types.

So in the first column, the name of the property is not in plain English, even though we have API for that. So the next step would be to localize these names so they're displayed a little nicer. This is really easy. So right here, instead of grabbing the original value, the attribute name... Oops. Oh, jeez.

So instead of returning the attribute name like I was doing before, I simply call MDSchemaCopyDisplayNameForAttribute for the attribute that I was going to display. The thing to note here with MDSchemaCopyDisplayNameForAttribute is that right here I'm checking whether it actually returned a value. If it returns a value, then it's okay to display that to the user. If it doesn't return anything, the intent is that it shouldn't be a user-visible value. It's not interesting to the user to see it. And I will show you this.

So if we run the same program again, one more time with the same file, you see all the interesting user-visible things. So instead of saying kMDItemAcquisitionMake, now it says Device Make here. And the keywords, instead of saying kMDItemKeywords, it says Keywords in English. There's other things like exposure time, and so on. The one thing I didn't do in this program, which you may have noticed, is that I'm leaking all the MDItemRefs. So if you drop a second file in, you'll leak the first one. So to be nice to the system, I should probably clean this up, and I'll just do this right now.

So all you have to do here is if we have an old inspected ref,

[Transcript missing]

And just to be safe, I'll set them to NULL. OK, that's it for that. I won't bother running it because I just did. So can we switch back to the slides now?

So that's basically how you get all the data for individual items. So the next step would be to perform a query to find files that are interesting to you. So why would you even want to search for files with Spotlight within your application? Well, the first thing that comes to mind is it's a cool feature and you should probably just do this in your application. But a more legitimate reason would be to enable search within your application, so it gives the user a new way to find interesting files.

One example that comes to mind is if you're writing a 3D modeling program and the person wants to find a skin texture to apply to one of their objects, but they have thousands of textures. They really don't want to get a preview for all 1,000 files to find the one they're actually interested in. So it would be nice if somewhere in your UI they could type "skin", narrow the list down to 10, and then only have to preview 10 individual items instead of 1,000.

Another thing is to find related documents. So now more and more people are working on more and more documents to get their individual work done. An example of this would be making a magazine. If you make an article in a magazine, you need the desktop publishing program to put all the files together, you need the text for the article, and any associated pictures. So it would be nice if within the desktop publishing program they could say, "Tiger article Spotlight by me, Vince DeMarco," and then find all the related files, pull them all together, and make the finished presentation.

So how do you actually search within Spotlight? It's as simple as what Xavier just showed you. There's just a few key objects you need to understand. You need to understand how the query language operates. You need to understand the different modes of the queries. And then we're going to bring this all together and implement an example to show you how to do this.

So searching with Spotlight requires two key objects. The first one is MDItemRef, which represents a file on disk. I'm not going to go into it any further because Xavier explained that earlier in the presentation. The next item of interest is the MDQueryRef. It's an encapsulation of the query results and the query string itself.

So the query language within Spotlight is a simple C-like expression. You have an attribute on one side that is equal to a particular value. So in this case, I'm searching for kMDItemContentType, which is the type of the file, equal to public.rtf. This will basically return all RTF files stored on the system.

So for all the different types of attributes, you can have numbers, strings, and dates, and we have all the standard comparison operators that you have in C. The only other one of note is that you can do ranges of numbers. You can group the expressions with ANDs, ORs, and parentheses, and you can do a logical NOT of a big group of expressions.
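The grammar described above can be sketched as a small string builder. This is an illustration in Python, not code from the session. The helper names (`literal`, `compare`, `in_range`, `all_of`, `any_of`) are invented here, and the `InRange(attribute,min,max)` spelling is taken from Apple's query expression syntax documentation rather than from the talk. Dates and logical NOT are left out because the talk does not show their spelling.

```python
COMPARISONS = ("==", "!=", "<", ">", "<=", ">=")


def literal(value):
    """Render a Python value as a query literal: numbers bare, strings quoted."""
    if isinstance(value, bool):          # check bool first: bool is an int subclass
        return "1" if value else "0"
    if isinstance(value, (int, float)):
        return str(value)
    text = str(value)
    if '"' in text or "\\" in text:      # escaping is out of scope for this sketch
        raise ValueError("quotes and backslashes are not handled here")
    return f'"{text}"'


def compare(attribute, operator, value):
    """One comparison, e.g. kMDItemContentType == "public.rtf"."""
    if operator not in COMPARISONS:
        raise ValueError(f"unsupported operator: {operator}")
    return f"{attribute} {operator} {literal(value)}"


def in_range(attribute, low, high):
    """Numeric range test (InRange spelling assumed from Apple's documentation)."""
    return f"InRange({attribute},{literal(low)},{literal(high)})"


def all_of(*clauses):
    """Group clauses with a logical AND."""
    return "(" + " && ".join(clauses) + ")"


def any_of(*clauses):
    """Group clauses with a logical OR."""
    return "(" + " || ".join(clauses) + ")"


if __name__ == "__main__":
    rtf = compare("kMDItemContentType", "==", "public.rtf")
    plain = compare("kMDItemContentType", "==", "public.plain-text")
    sized = in_range("kMDItemFSSize", 1000, 2000)
    print(all_of(any_of(rtf, plain), sized))
```

Building the string from small pieces keeps the parentheses balanced as a query grows, which is the main thing that goes wrong when such expressions are typed by hand.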

So some examples of queries. Once again, if I want to find all the plain text documents on my system, I have kMDItemContentType, that's the attribute name, and the value I'm searching for is public.plain-text. The second example is if I want to find all the documents on my system that contain the word WWDC.

So I'm searching for my presentation. So I'm searching for kMDItemTextContent, the attribute, is equal to, or in this case contains, the value WWDC. The third example is searching within an array. In Dominic's talk this morning, he described that keywords are actually an array of strings. So a document can have many keywords associated with it, not just one.

So in this case, I want to find all the documents that have the keyword "important" associated with them, and it would return a whole bunch of them. In the last example, I'm searching on the display name attribute, and I want to find all documents whose name starts with an uppercase A followed by any number of characters, of any length.

So in all these examples I'm showing, I'm finding exact matches. Even in the last one I'm finding any document that starts with an uppercase A, of any length. It would be interesting if I could narrow the search down without having to be really explicit in my query.

So to do this we have string match modifiers. There are three basic string match modifiers. At the end of the value in a query you can specify a c, which means that the comparison is case-insensitive. So in this case "spotlight" in all lowercase is equivalent to "SPOTLIGHT" in all uppercase.

The second string match modifier is diacritic insensitivity. So if I have a bunch of English documents and a bunch of French documents on my system, since I'm a native English speaker, I'm not going to spell "élégant" correctly in French with the accents on both the Es. I'm probably going to spell it in the English form. Same if I'm searching for the document. If I'm a French user, I'm going to put the accents. If I'm an English user, I'm not really going to bother. So this is so you can say "elegant" is equivalent to "élégant" with the accents.

The last modifier is the word-based modifier. It detects transitions in case. Lots of times people end up camel-casing words. So you write SpotLight with a big S and a big L, and they're actually using that to signify that those are two separate words, but they're written together. Same thing with TextEdit or InterfaceBuilder.

So in this case, "light" matches "SpotLight", because that word exists in that string. But in the second example, "Paris" does not match "comparison". It would be a match if "comparison" were written with an uppercase C, an uppercase P in the "Paris", and an uppercase O in the "on". Then it would match. So if we put this all together, I want to find all the files on my system whose kMDItemTitle attribute contains the word "light", and I'm going to search word-based, diacritic-insensitive, and case-insensitive.
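To make the three modifiers concrete, here is a small emulation in Python. It is only a model of the behavior described in the talk, not Spotlight's actual matching code. The function names are invented, whole-word equality is an assumption about what "word-based" means, and the word breaks at spaces, underscores, dashes, and dots follow a remark the speaker makes just after this point.

```python
import unicodedata


def strip_diacritics(text):
    """Drop combining accents: 'élégant' becomes 'elegant'."""
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def split_words(text):
    """Split at separators and at lowercase-to-uppercase transitions."""
    pieces, current = [], ""
    for ch in text:
        if ch in " _-.":
            if current:
                pieces.append(current)
            current = ""
        elif current and current[-1].islower() and ch.isupper():
            pieces.append(current)       # camel-case boundary, e.g. Spot|Light
            current = ch
        else:
            current += ch
    if current:
        pieces.append(current)
    return pieces


def normalize(text, modifiers):
    """Apply the d (diacritic) and c (case) modifiers to one string."""
    if "d" in modifiers:
        text = strip_diacritics(text)
    if "c" in modifiers:
        text = text.casefold()
    return text


def matches(value, term, modifiers=""):
    """Emulate value == "term" followed by any mix of the c, d, w modifiers."""
    # Words are split before case folding so camel-case boundaries survive.
    candidates = split_words(value) if "w" in modifiers else [value]
    target = normalize(term, modifiers)
    return any(normalize(c, modifiers) == target for c in candidates)


if __name__ == "__main__":
    print(matches("SpotLight", "light", "wc"))    # True
    print(matches("comparison", "Paris", "wc"))   # False
    print(matches("ComParisOn", "Paris", "wc"))   # True
    print(matches("élégant", "elegant", "d"))     # True
```

The order of operations matters in this model: lowering the case first would erase the very transitions that the word-based modifier relies on.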

Oh, and the word match also applies to spaces, underscores, dashes, and dots. So, moving on: all the type information in Spotlight is based off of the UTI hierarchy. And in the UTI hierarchy, everything is inherited. So at the top of the hierarchy, we have public.data or public.content. And below that, somewhere, we have public.image. And all the image file formats inherit from public.image.

So this allows you to find, for example, all the audio content or all the image content. So instead of having to list public.jpeg, public.gif, TIFF, etc., all the hundreds of different file formats, you can say it in one query. Well, they edited that wrong. So I can find kMDItemContentTypeTree equals public.image.

Ignore the equals equals WWDC at the end. That's wrong. So this would find all the image files known to the system with one query, without having to specify each individual type. But you could do that if you needed to and you only wanted TIFF or JPEG images.
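The inheritance idea can be modeled in a few lines. The sketch below, in Python, uses a tiny hand-written slice of the UTI hierarchy. The real hierarchy is much larger and is maintained by the system, and the table and function names here are invented for illustration. It shows why a single test against the type tree covers every image format at once.

```python
# Simplified, hand-written slice of the UTI hierarchy: child -> parents.
PARENTS = {
    "public.jpeg": ["public.image"],
    "public.tiff": ["public.image"],
    "public.image": ["public.data", "public.content"],
    "public.rtf": ["public.text"],
    "public.plain-text": ["public.text"],
    "public.text": ["public.data", "public.content"],
}


def content_type_tree(uti):
    """Return the type plus all of its ancestors, nearest first."""
    tree, pending = [], [uti]
    while pending:
        current = pending.pop(0)
        if current not in tree:
            tree.append(current)
            pending.extend(PARENTS.get(current, []))
    return tree


def matches_tree(uti, wanted):
    """Model of: kMDItemContentTypeTree == wanted."""
    return wanted in content_type_tree(uti)


if __name__ == "__main__":
    print(content_type_tree("public.jpeg"))
    for uti in ("public.jpeg", "public.tiff", "public.rtf"):
        print(uti, matches_tree(uti, "public.image"))
```

Comparing against kMDItemContentType instead would only match the exact type, which is the option the speaker mentions for wanting just TIFF or JPEG images.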

So you can put these all together to make a more complicated query. So in this case, I want to find all the images on my system that have an alpha channel and have a height DPI greater than or equal to 300. So it's kMDItemContentTypeTree is equal to public.image, that's all the image files, and kMDItemResolutionHeightDPI is greater than or equal to 300, and kMDItemHasAlphaChannel is equal to 1, so that attribute is set. So we'll do a little demo using mdfind. Can you switch to the demo machine?

So mdfind is a little command-line tool that lets you type in queries and get back results. So what I'm going to do as a query is find all the application files on the system. Remember, a couple of slides ago I described that the UTI hierarchy is inherited. So we have different kinds of applications on the system. You can have a packaged application or a non-packaged application, but they all inherit from com.apple.application. So we can find all application files on the system.

So that'll be all the applications on this machine. And there's quite a few of them. I'm not going to read each individual one out to you. That'll take too long. So we've got lots of files there. So I want to narrow down the query to find only applications that have an A in their display name.

So I'm just doing *a* and then making it word-based, diacritic-insensitive, and case-insensitive. So if we run this query again, the list is a little bit smaller. So in the last step, I want to try to narrow this down some more. I want to find all applications that are copyrighted by Apple. So we'll just keep extending this query.

So I'm just going to find the word Apple within the copyright string. And I'm going to do this word-based, diacritic-insensitive, and case-insensitive too. So let me clear this screen. So now we have a much smaller list again. So let me just run this query again, with only applications that have to start with A instead.
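The narrowing steps in this demo can be reproduced from a script. The Python sketch below is a plausible reconstruction, not the text typed on stage: the exact query strings are not in the transcript, so the three clauses are assumptions built from the attributes and modifiers the speaker names. The helper names are invented, and the `-onlyin` option is the one documented in the mdfind man page. Without mdfind on the PATH, `run_query` returns None.

```python
import shutil
import subprocess

# Assumed reconstructions of the three demo steps.
APPLICATIONS = 'kMDItemContentTypeTree == "com.apple.application"'
NAME_HAS_A = 'kMDItemDisplayName == "*a*"wcd'
COPYRIGHT_APPLE = 'kMDItemCopyright == "*Apple*"wcd'


def narrow(*clauses):
    """AND several clauses together to shrink the result set."""
    return " && ".join(clauses)


def mdfind_command(query, only_in=None):
    """Build the mdfind command line, optionally limited to one folder."""
    cmd = ["mdfind"]
    if only_in is not None:
        cmd += ["-onlyin", only_in]
    cmd.append(query)
    return cmd


def run_query(query, only_in=None):
    """Return matching paths, or None if mdfind is unavailable or fails."""
    if shutil.which("mdfind") is None:
        return None
    result = subprocess.run(mdfind_command(query, only_in),
                            capture_output=True, text=True)
    if result.returncode != 0:
        return None
    return result.stdout.splitlines()


if __name__ == "__main__":
    steps = [
        APPLICATIONS,
        narrow(APPLICATIONS, NAME_HAS_A),
        narrow(APPLICATIONS, NAME_HAS_A, COPYRIGHT_APPLE),
    ]
    for query in steps:
        paths = run_query(query)
        count = "mdfind unavailable" if paths is None else f"{len(paths)} results"
        print(f"{query}\n  -> {count}")
```

Printing the result count after each step gives a quick check that every added clause really does shrink the list.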

So that's the little demo, and you have a lot fewer left. Can you switch back to the slides? The only real thing of note to take from that demo is that while you're working out your query, it's best if you just try it out in mdfind.

You don't have to write any code. You should be trying to narrow your query down to a small set of files, so you can try it in mdfind and see whether it narrows down to the files you're actually interested in. So now that you know how to write some simple queries, the next step is to implement search within your application.

There are basically two ways you can implement search within your application. You can open the Spotlight search window. That's related to the little menu in the top right-hand corner: the person types in a string, gets the results, and can select Show All, which brings up the Spotlight results window. The second way is simply to perform a query within your application. So opening the Spotlight window is really, really simple.

And it provides a really easy way for you to integrate Spotlight within your application with a minimal amount of work. All you need to do in this case is call HISearchWindowShow with the string you want to search for. This is the same string that the user would have typed into the field in the Spotlight window.

This is already integrated in a couple of applications. For example, in Address Book there's a little action menu at the top of the window that says Search in Spotlight. It's basically doing this. And in TextEdit, if you select a word within your document, you can choose Spotlight and find all occurrences of that word everywhere on your hard disk. This is a great way to present related items to the user, but you as an application developer cannot interact with this window in any way. You can simply present it and that's it.

So if you actually want to do anything further, you need to write a query within your application. Executing a query in your application is three simple steps again. The first step is to create an MDQueryRef. The second step is to register the callback so you get notified of updates and changes to the query. And the third is to execute the query.

So step one is to create the MDQueryRef. All I'm doing here is I have a query string. In this query I'm searching for kMDItemContentType equal to public.plain-text. So once again I'm searching for all plain text documents on my system. And you just call MDQueryCreate, where the second parameter is the query string. This creates a standard CF type object, which you can retain and release and put in all the CF collection classes.

So the second step, after you've created your query, is to register for the callbacks. The reason this is the second step and not the first is that in CFNotificationCenterAddObserver you're telling it which query you want to observe. It's the second-to-last parameter. It says query: I want to be notified of any changes on this query. You can also pass NULL and get notified about any query.

The other thing that's really interesting to note here is that when I called CFNotificationCenter, I got the local center, not the distributed center. This just got added in Tiger recently. You would do the same thing in Foundation: you just register with NSNotificationCenter, the default center. In my example, I'm only listening for the finished notification, but you'd typically also listen for the progress and the update notifications.

So of the three notifications we send, the first is the progress notification. As your query is running and the server is gathering results and sending them back to your client application, you get the progress notification so you can do something in your UI. You could update a table or increase the size of a menu and show the person the results as they're being fetched.

The second notification we send is the finished notification. When the server has completed your query and gotten all the results it can at this point, it sends you this notification saying it's all done. So maybe you could stop the little spinning progress indicator or notify the user that the search is done.

The last notification that we send is the update notification. After the query is finished, it goes into the update phase. As the user creates or destroys files that match or don't match your query, the results will change as this is happening. The only thing to note here is that this only gets sent if the query is live, which I'll explain in just a second.

So, step two and a half: you have to implement the notification callback. This is the standard thing you would do in CF. There's really nothing special here. You could do the same thing in Foundation. It would just be Objective-C classes instead.

So the thing that's interesting here, and the same thing applies in Foundation, is that the object that sent the notification is your query. So you can use it to look up whatever information you might need to get from the query. So the last step is that we need to actually execute the query. This is simple. We simply call MDQueryExecute.

The first parameter is the query and the second one is flags. By default, you can pass zero, and I'll explain what that means. The flags can be one of two things, and you OR them together. You can say kMDQuerySynchronous, which means MDQueryExecute will not return until it has fetched all the results that are available. Basically, until you would have gotten that finished notification.

The second flag is kMDQueryWantsUpdates. This is saying: after the query is finished, if the user deletes or creates any new files, let me know about it. So by default, if you set the flags to zero, your query runs asynchronously and you will not get updates.
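The three steps, plus the callback from step two and a half, can be sketched in C roughly as follows. This is an illustration under stated assumptions, not the code shown on the slides: the names `kPlainTextQuery`, `query_finished`, and `run_plain_text_query` are hypothetical, the query is used as its own observer token for convenience, and the Spotlight part is guarded so it only compiles on Mac OS X, where it needs to be linked against the CoreServices framework.

```c
#include <stdio.h>

/* Query string from the talk: every plain-text document. */
const char kPlainTextQuery[] = "kMDItemContentType == \"public.plain-text\"";

#ifdef __APPLE__
#include <CoreServices/CoreServices.h>

/* Step 2.5: the notification callback. `object` is the query that posted. */
static void query_finished(CFNotificationCenterRef center, void *observer,
                           CFStringRef name, const void *object,
                           CFDictionaryRef userInfo)
{
    (void)center; (void)observer; (void)name; (void)userInfo;
    MDQueryRef query = (MDQueryRef)object;
    printf("finished with %ld results\n", (long)MDQueryGetResultCount(query));
    CFRunLoopStop(CFRunLoopGetCurrent());   /* lets the caller's run loop return */
}

/* Returns 0 if the query ran to completion, -1 if it could not be
 * created (for example a malformed query string) or started. */
int run_plain_text_query(void)
{
    CFStringRef text = CFStringCreateWithCString(kCFAllocatorDefault,
                                                 kPlainTextQuery,
                                                 kCFStringEncodingUTF8);
    if (text == NULL) return -1;

    /* Step 1: create the query; no value-list or sorting attributes yet. */
    MDQueryRef query = MDQueryCreate(kCFAllocatorDefault, text, NULL, NULL);
    CFRelease(text);
    if (query == NULL) return -1;

    /* Step 2: observe this one query on the local notification center. */
    CFNotificationCenterRef center = CFNotificationCenterGetLocalCenter();
    CFNotificationCenterAddObserver(center, query, query_finished,
        kMDQueryDidFinishNotification, query,
        CFNotificationSuspensionBehaviorDeliverImmediately);

    /* Step 3: flags of 0 mean asynchronous, with no live updates. */
    int status = -1;
    if (MDQueryExecute(query, 0)) {
        CFRunLoopRun();                     /* returns once the callback stops it */
        status = 0;
    }

    CFNotificationCenterRemoveObserver(center, query,
        kMDQueryDidFinishNotification, query);
    CFRelease(query);
    return status;
}
#endif /* __APPLE__ */
```

A command-line tool would call `run_plain_text_query()` from `main`. In an application that already has a run loop, the `CFRunLoopRun` and `CFRunLoopStop` calls would be dropped, and passing `kMDQueryWantsUpdates` instead of 0 would keep the query live after the finished notification.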

So the next step, after you've executed the query and gotten the notification, is to retrieve the individual results. Retrieving results basically involves two calls. It's only two calls because conceptually you can think of an MDQuery as an array. And a read-only array really has only two appropriate calls.

You can ask it how many things are in the array, and you can get an individual item anywhere within the array. Hence we have these two calls: MDQueryGetResultCount and MDQueryGetResultAtIndex. MDQueryGetResultCount simply returns the number of items in the result set.

And there's really not much to that. You just get the count and you can iterate over the results. As you're iterating over them, you can get the item at a particular index. You simply call MDQueryGetResultAtIndex and get the MDItemRef at that index.

So the thing to note with retrieving results is that if you've started your query in live mode, the result set is constantly changing as the query executes. So if at one point you ask it for the item at index 4, by the time your code comes back to index 4, things may have moved.

So if you stashed away the index you're interested in, and later in your program, after some period of time, you go to index 4, the item might not even be there. The index might be out of bounds now. To get around this problem, while you're iterating you can disable updates on the query and then enable them again. Basically everything is kind of freeze-dried for that period of time so you can look at it.

You don't always need to do this. If you're only going to be looking at the result set within the notification callbacks, you don't really have to enable and disable the query. That's done for you. And enabling and disabling a query is stacked. So if you've done four enables, you need to do four disables. Or, I'm sorry, backwards.

If you do four disables, you need to do four enables for it to get turned back on again. If they don't match, it'll stay in the last state it was in. So the biggest thing to take away from this is that the results are live. Everything can come and go as you're working on them, so be aware of that.
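The pattern for reading a live query outside of a notification callback might look like the sketch below, using MDQueryDisableUpdates and MDQueryEnableUpdates around the two retrieval calls. The function names and the row limit are hypothetical additions for illustration; the Spotlight part only compiles on Mac OS X.

```c
/* Hypothetical helper: cap how many rows get printed. */
long rows_to_show(long available, long limit)
{
    return available < limit ? available : limit;
}

#ifdef __APPLE__
#include <CoreServices/CoreServices.h>

/* Prints the path of up to `limit` current results of a live query. */
void print_result_paths(MDQueryRef query, long limit)
{
    MDQueryDisableUpdates(query);           /* freeze the result set */
    long rows = rows_to_show((long)MDQueryGetResultCount(query), limit);
    for (long i = 0; i < rows; i++) {
        MDItemRef item = (MDItemRef)MDQueryGetResultAtIndex(query, (CFIndex)i);
        CFTypeRef path = MDItemCopyAttribute(item, kMDItemPath);
        if (path != NULL) {                 /* the attribute may be missing */
            CFShow(path);                   /* debug print to stderr */
            CFRelease(path);                /* Copy rule: we own the value */
        }
    }
    MDQueryEnableUpdates(query);            /* exactly one enable per disable */
}
#endif /* __APPLE__ */
```

Keeping the disable and enable calls in the same function makes it easy to see that they stay balanced, which matters because the calls are stacked.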

So we'll do a little demo of a query. This is going to be an application which is a simplified version of the search window. Basically, in the top search field you type some string that you're looking for, and then you get the results below. So can we switch to demo one please?

Let me just get a drink of water. So as the user is typing into the top search field, the start search method is going to get called. As they type in a character, and after the appropriate delay, however the Cocoa search field operates, we get the string that the user entered. So the first thing we want to do is create a query string from what the user just typed in.

So we're basically going to do a very simple query. We're going to find where any metadata attribute, star, is equal to what the user typed in, and we're going to be word insensitive, case insensitive, and diacritic insensitive. And we're also going to search where the text content equals what the user typed in, and that's case and diacritic insensitive.

The thing to note here is that I'm not checking whether they've typed any special characters like quotes, so if they type a quote, the query is probably going to be malformed. But for this demo, it'll be just fine. So now that we have the query string, the next step is to create the MDQueryRef.
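The demo skips escaping, but a production version would want it. The following C sketch shows one way to do it, assuming that a backslash is the escape character for quotes and backslashes inside a quoted query value (worth verifying against the query expression syntax documentation). Both function names are hypothetical, and the query shape is reconstructed from the spoken description rather than copied from the demo source.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: copies text into out, putting a backslash in front
 * of every double quote and backslash. Returns 0 on success, -1 if out is
 * too small. */
int escape_query_value(const char *text, char *out, size_t size)
{
    size_t j = 0;
    for (size_t i = 0; text[i] != '\0'; i++) {
        if (text[i] == '"' || text[i] == '\\') {
            if (j + 1 >= size) return -1;   /* no room for char + terminator */
            out[j++] = '\\';
        }
        if (j + 1 >= size) return -1;
        out[j++] = text[i];
    }
    if (size == 0) return -1;
    out[j] = '\0';
    return 0;
}

/* Hypothetical helper: builds the demo's two-clause query from raw input.
 * Returns 0 on success, -1 if the input or the result is too long. */
int build_search_query(const char *user_text, char *buf, size_t size)
{
    char escaped[256];
    if (escape_query_value(user_text, escaped, sizeof escaped) != 0) return -1;
    int n = snprintf(buf, size,
        "* == \"%s\"wcd || kMDItemTextContent == \"%s\"cd", escaped, escaped);
    return (n < 0 || (size_t)n >= size) ? -1 : 0;
}
```

With this in place, input such as `hello"` produces a query whose quotes still balance. Checking the result of MDQueryCreate for NULL remains a sensible second line of defense.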

So given the query string right here, I'm going to create the MDQueryRef with the default allocator. And once again, I want to make sure that the person hasn't just typed in some garbage. They could type hello followed by a quote, which in this case wouldn't parse because the quotes wouldn't balance.

You really would have to escape it, but I'm not doing any of that. So what we're going to do is check whether the query is okay. If the query is okay, the next step is to register for all the notifications that I'm interested in. In this case, I'm going to do this in Objective-C.

I'm going to register with the default notification center. So I'm registering for the finished notification, the progress notification, and the update notification. And I'm going to have them call my update data method. And I'm only interested in the current query that I'm executing. So now that I've registered for the notifications, the next step is to actually execute the query.

So I'm going to execute the query, and I'm going to tell the system that I want to be notified of updates. I'm also running it asynchronously, because I haven't passed the synchronous flag. So now the query is going to start running. The next thing I want to do is update the UI.

So as the query is executing, my updateData: method is going to get called. In this case, I'm going to reload the table view with this call, and then I'm going to set the title of the window. For the title, I'm just going to show the query that the person entered and how many matches they currently have at this point.

And for the number of matches at this point, I'm just going to call MDQueryGetResultCount, which will say that I have 10 or 14 or however many I have at that point. So the final step is that I need to update the table view with the query results.

So the number of rows in the table view is equal to the number of items in the query. If the person did enter a query and we have one, I can ask the query for its result count. If we don't have a query, then I'll simply return zero.

So then the last step: the table view wants to get the data that it's going to display. It's going to pass me the column that it's interested in and the row that it's interested in. So the first thing I'm going to do is, given the row, get the query result at that particular index. So I'll get the MDItemRef. You'll notice that in this code I really should be enabling and disabling the query. I'm not doing that here, and technically that's incorrect too, but it'll work for this demo.

So if I'm updating the first column, all I'm going to do is get the path of the individual item by calling MDItemCopyAttribute with the MDItemRef that I'm interested in, which I've just gotten above right here, and asking for the path attribute. Then I'm going to take the path and call NSWorkspace to get the icon for that particular path.

And then the last step here: I'm going to have two columns in the table view. The first one is going to be the icon, and the second column will be the display name of the file. The display name is equivalent to what you would see in the Finder if you were looking at the file there.

So I'm just going to do the same thing again. I call MDItemCopyAttribute with the MDItemRef of interest, which I got above again, and I get the display name. And I set that as the object value and then release it. So if we build this one... whoops. Oh, so I have a syntax error here. There we go. This is just to show you that it's all live, and I'm not faking any of this.

So here's the query. If I type in "Lotus," I end up getting 18 matches, and here are all the files. In this case, they're all pictures. I can go a little further and type "address book," and I get some cards from Address Book, applications, and then any source code that happens to be on the system. Okay, can we switch back to the slides please?

So now you know how to do all the simple kinds of queries. Now we need to take this a step further and do something more interesting. So this is where I'm going to talk about some more advanced topics. The first thing you may have noticed in my little demo is that I didn't do any sorting.

The results just came back in the array in the order that they happened to come back from the server. The metadata library provides some simple sorting you can do. The only sorting that it actually does is sort in ascending order. So in this case, the last parameter of MDQueryCreate is a CFArray containing the names of the attributes you want to sort by in ascending order.

You can do further sorting, but you have to provide your own callback function. For example, the search menu sorts the dates in descending order and the names of the files in ascending order. You can do that kind of thing, but you have to provide the callback on your own.

The other thing you'd like to do is scope the query to a particular directory. So maybe you only want to search the person's home directory or only a particular volume or only a particular hard drive. Finder lets you do this when you set the search scope in the little search slices.

So we'll go back to the demo machine again and I'll add sorting and scoping to the query. The thing to know about sorting and scoping is that you need to set them up before you execute the query. After the query is executed, you cannot change any of these values. So if we go back here, now that we've created the query, the first thing I want to do... oh, sorry. Oops.

So I'm going to create the query now. For the last parameter, I'm going to pass a CFArray, or an NSArray in this case, and I want to sort by display name. It's just an array containing the attribute name strings. I could easily have added more and more attributes.

So that's that. Now the results will be sorted. The next step is that I want to limit the scope of the query to only search in the person's home directory. So in this case, I have the query again. I call MDQuerySetSearchScope. The first parameter is the query.

The second parameter is an array of CFStrings containing the paths, or CFURLs pointing to the paths, where you want to search for your files. There's also a bunch of known constants you can pass here. In this case I'm passing kMDQueryScopeHome to limit the search to the person's home directory. There are others to limit it to just the network or to the entire computer.
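Both settings have to be in place before MDQueryExecute is called, which suggests doing them together when the query is created. The C sketch below does that; the names `kDemoQuery` and `create_sorted_home_query` are hypothetical, the query string only approximates what the demo builds, and the Spotlight part compiles only on Mac OS X.

```c
#include <stddef.h>

/* Example input, roughly what the demo would build for "Lotus". */
const char kDemoQuery[] = "* == \"Lotus\"wcd";

#ifdef __APPLE__
#include <CoreServices/CoreServices.h>

/* Creates, but does not execute, a query sorted by display name (ascending)
 * and scoped to the user's home directory. The caller releases the result.
 * Returns NULL if the query string does not parse. */
MDQueryRef create_sorted_home_query(const char *query_utf8)
{
    CFStringRef text = CFStringCreateWithCString(kCFAllocatorDefault,
                                                 query_utf8,
                                                 kCFStringEncodingUTF8);
    if (text == NULL) return NULL;

    /* Sorting attributes are the last parameter of MDQueryCreate. */
    const void *sortKeys[] = { kMDItemDisplayName };
    CFArrayRef sorting = CFArrayCreate(kCFAllocatorDefault, sortKeys, 1,
                                       &kCFTypeArrayCallBacks);
    MDQueryRef query = MDQueryCreate(kCFAllocatorDefault, text, NULL, sorting);
    CFRelease(sorting);
    CFRelease(text);
    if (query == NULL) return NULL;

    /* Scope: a predefined constant here; paths or CFURLs also work. */
    const void *scopes[] = { kMDQueryScopeHome };
    CFArrayRef scope = CFArrayCreate(kCFAllocatorDefault, scopes, 1,
                                     &kCFTypeArrayCallBacks);
    MDQuerySetSearchScope(query, scope, 0);
    CFRelease(scope);
    return query;
}
#endif /* __APPLE__ */
```

The returned query is then registered for notifications and executed exactly as before. Adding more attribute names to `sortKeys` gives secondary sort keys, still in ascending order.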

So if we do the same query again and I type Lotus, I get nine matches, where the previous time I think I got 18 because it found the rest of the files on the rest of the hard drive. So these are just the pictures in my home directory, and not the picture of a lotus flower that was somewhere else. Okay, can we switch back to the slides again, please?

So the last thing I'm going to talk about is fetching the query attributes. Basically, as your query is executing, the results are actually sorted in the client library. To make this more efficient, when we send the request to the server to get the query results, we also send it another message saying, please send us this list of attributes along with each result. So conceptually, you can think of the query as storing the array of results and, alongside it, another array.

So the first array is the array of MDItemRefs, and alongside it is another array of the attribute values. You might be sorting by display name and author as an example, though in this case I'm doing it by content type. So when I sort by these values, I've also got them locally in the client library, and it makes sense for you to have access to them. This is all basically a bulk call: as I get the results, send me back these values, because I'm going to need them immediately.

The big reason you want to do this is the same reason we do it for sorting. To sort the results we need the value, and to get the value we would otherwise have to make a round trip to the server, ask it for the value, and get the value back. If we were sorting 100,000 items, we would be sending 100,000-plus messages as we're doing the sort.

So to get these values, there are only two basic calls. Once again, conceptually think of the query as having an array of the MDItemRefs that we're interested in, and off to the side another array of the attribute values. So it makes sense that we only need two calls.

So we call MDQueryGetIndexOfResult, and given the query and the MDItemRef that we're interested in, it gives you the index into that second array of the value you want. And then to look up values within that sort of sub-array, we call MDQueryGetAttributeValueOfResultAtIndex. The first parameter is the query, the second parameter is the attribute that I want, content type, and the last is the index.
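Put together, reading a cached attribute value could look like the C sketch below. It assumes, as the talk describes, that attributes named in the sorting array are delivered with the results; the names `kImagesQuery` and `print_content_types` are hypothetical, and the query is run synchronously only to keep the example short. The Spotlight part compiles only on Mac OS X.

```c
#include <stddef.h>

/* Example input: every kind of image file. */
const char kImagesQuery[] = "kMDItemContentTypeTree == \"public.image\"";

#ifdef __APPLE__
#include <CoreServices/CoreServices.h>

/* Runs query_utf8 synchronously, sorted by content type, and prints the
 * cached content type of every result. Returns the result count, or -1
 * if the query could not be created or executed. */
long print_content_types(const char *query_utf8)
{
    CFStringRef text = CFStringCreateWithCString(kCFAllocatorDefault,
                                                 query_utf8,
                                                 kCFStringEncodingUTF8);
    if (text == NULL) return -1;

    /* Sorting by content type asks for that value with each result. */
    const void *attrs[] = { kMDItemContentType };
    CFArrayRef sorting = CFArrayCreate(kCFAllocatorDefault, attrs, 1,
                                       &kCFTypeArrayCallBacks);
    MDQueryRef query = MDQueryCreate(kCFAllocatorDefault, text, NULL, sorting);
    CFRelease(sorting);
    CFRelease(text);
    if (query == NULL) return -1;

    if (!MDQueryExecute(query, kMDQuerySynchronous)) {
        CFRelease(query);
        return -1;
    }

    CFIndex count = MDQueryGetResultCount(query);
    for (CFIndex i = 0; i < count; i++) {
        const void *item = MDQueryGetResultAtIndex(query, i);
        /* Map the item to its slot in the side array, then read the value. */
        CFIndex idx = MDQueryGetIndexOfResult(query, item);
        CFTypeRef type = MDQueryGetAttributeValueOfResultAtIndex(
            query, kMDItemContentType, idx);
        if (type != NULL) CFShow(type);     /* Get rule: do not release */
    }
    CFRelease(query);
    return (long)count;
}
#endif /* __APPLE__ */
```

Compared with calling MDItemCopyAttribute on each item, this reads values that are already held in the client library, so no per-item round trip to the server is needed.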

If the value is not there or it's empty, you'll get NULL back. And that was the talk, so I'll invite Xavier back. As you noticed, it's not that much code in your program to do a query. It's very, very simple. So you should all try to integrate it quickly within your applications.

Hello, hello. OK, so to summarize, I think today, between Dominic's talk this morning about plugins and our talk on MD queries, I hope we gave you a good overview of how you, as a developer, can integrate with this great technology. I mean, you've all seen all the marketing we've been pushing behind Tiger.

Spotlight is obviously number one. I think there are really a lot of things that you guys can do on your side of the fence, in your application, to take advantage of that, to distinguish yourself in the marketplace, and also to bring tremendous innovation to the platform with your application.

So remember, there are a couple of things I want you to remember from today's talk. The first one is that Spotlight is totally integrated inside Mac OS X Tiger. So for you as a developer, number one, if you have your own file type, if you have your custom file format that your application generates, please do write a Spotlight importer. OK, very, very important.

Then after that, I hope today we gave you a quick overview, and that we really convinced you that adding the query APIs inside your application could make a lot of sense. And hopefully, we showed you the different ways you can integrate that in your application.

For more information, we have a couple more sessions. This afternoon there's the UTI session. As we said this morning, UTIs, the Uniform Type Identifiers, are very important across Mac OS X now. Chris will be talking later on at five about UTIs and how you can declare them in your plugins and inside your application, and the different things you should look for. And there's the lab, which starts, if I'm not mistaken, not at noon, but at five, I think. Is that correct?

3:30, so very soon in any case. Check your little agendas. That's where you'll have the Spotlight team there to answer any questions you may have. It could be on plugins or on the MDQuery APIs, it doesn't matter. Just come by and talk to us. I know this morning a couple of folks had questions, and this is a great way to get your questions answered.

And if you want to learn more about the file system, we have a session as well on Thursday, Tuesday, Thursday, today at 5:00. And Dominic will be part of that if you want to learn more about the file system. Correct. And thank you for reminding me that there is another lab tomorrow morning starting at 9:00. We hope to see you there. And thank you.

One thing that is very important: whatever you're doing, whether you're going to use the MDQuery APIs or write a Spotlight plugin, please send me an email. We want to track who's doing what, and we hope we can help you promote the integration of Tiger technologies inside your application.