WWDC06 • Session 202

PDF Kit

Graphics and Media • 1:06:54

Learn how to enhance your Cocoa application with the robust graphics and final form document capabilities of the PDF format with PDF Kit. Don't miss this opportunity to learn how to harness the power of PDF Kit in your application.

Speaker: John Calhoun

Unlisted on Apple Developer site

Transcript

This transcript was generated using Whisper; it may have transcription errors.

Hi, welcome to session 202, PDFKit. I'm John Calhoun and I'm the engineer for PDFKit. What I'm going to talk to you about today I've broken up into four parts. In the first part, I'll tell you what PDFKit is, what constitutes it, and how all the pieces fit together. For the last three parts, I'm going to give you some practical examples of how you might use PDFKit. In the first of those, part two, I'll show you what most developers want to do, which is display a PDF in their application, and I'll probably spend the most time on that section. But I don't want people to be left with the impression that all you can do in PDFKit is display PDF. So in parts three and four, I'll spend a little time on what are maybe more creative uses of PDFKit.

But first, what is PDFKit? Well, it's a suite of Cocoa classes. It's built on top of Quartz 2D. So in Core Graphics, you already have the C API for handling PDFs. And what PDFKit is, is our Cocoa classes that take advantage of that C API and give you a Cocoa object model for dealing with PDFs.

It's been used in Preview. If you've ever opened a PDF in Preview on Tiger or Leopard, you've been using PDFKit. It's been used for the default PDF plugin in Safari as well. And on Leopard, the new preview services framework is using it, so Spotlight's using PDFKit when you search for PDFs.

So where is it? It's in Quartz.framework. If you go to /System/Library/Frameworks, there's a Quartz framework, and that's the umbrella framework. Inside that are PDFKit, the new ImageKit, Quartz Composer. Oh, I should have-- let me back up. In the new version of Interface Builder that you have on your Leopard seed, they've integrated the PDFKit classes. But I don't think it's going to stay that way. I think when we actually ship, we'll pull the PDFKit classes out into a separate palette. And if you're using Tiger and the old Interface Builder, you do have to add the PDFKit palette into Interface Builder to get the PDFKit classes. Similarly, if you write an application that uses PDFKit, then of course you'll have to link against the Quartz framework in order to get the PDFKit calls. So it came out in Tiger, and it's been there and used in Preview and Safari, as I said. But the focus in Tiger was really just to display PDFs and handle interactivity. If you think of Preview and Safari, in particular Safari, where you want to display a PDF and the user wants to select text and click on links, and the links take you to the respective page, that's the kind of focus that we had for PDFKit in Tiger. What we tried to do in Leopard is enhance and expand PDFKit's role. So there's a lot more in the PDF creation and editing arena. And in parts three and four, I'll show you some examples of that.

But a good place to start is to give you a context for understanding all the pieces of PDFKit. I'll start with the notion of the PDF file itself. I'll show you some elements you're familiar with if you've ever opened a PDF in Adobe Reader or in Preview, and then I'll explain how those map to classes inside PDFKit. First is the PDF file. PDF, of course, is the Portable Document Format, so it's supposed to be a digital representation of a physical file. And like a physical file, it's got pages. The screenshot here is Preview on Leopard. On the left is a large page showing you the content of that page, and on the right is a thumbnail view, where you see all the various thumbnails. I show that just to illustrate that PDFs are made up of any number of pages. Pages can have annotations on them. Probably the most common annotation would be a link annotation.

You click on the link on the page, and it typically takes you to another page in the document. There are other types of annotations, and in the last part I'll tell you a lot more about those. Some PDFs have an outline associated with them. In Adobe Reader, they call these bookmarks. I've heard it called a table of contents or chapter sections. If the PDF has an outline, then in Preview you would see it over in the split view. In Adobe Reader, it's on the left under a tab.

Here's a text selection. I'll kind of tell you why that's significant in a minute, but I kind of -- I chose this screenshot intentionally to kind of show you that there's a text selection that spans two pages, and that's a little interesting. I'll tell you about that in a second.

And finally, and maybe I should have started with this, there's the viewer itself. Whether it's Adobe Reader or Preview or any number of Safari or Internet plugins, there's the application that you view the PDF within. So here's how all these elements map to PDFKit. The file is represented in PDFKit by the PDFDocument class. A PDFDocument will map to some file on disk or some data or something like that. So file-level things like the number of pages you'll find by asking the PDFDocument. Pages are represented by PDFPage objects: for each page in a PDF, there will be a PDFPage object that represents that page. If there are annotations on the page, then from the PDFPage object you get back an array of them, and they're returned as PDFAnnotation objects. The outline, if the PDF has one, is represented by a PDFOutline, or actually several PDFOutline objects that form a tree structure. If the user does things like select text, the way we represent that is with a PDFSelection object. If you think of an NSTextView, where you just have contiguous text, you might be able to represent a selection with just a range. But because of the nature of PDF, where selections can span multiple pages and there isn't really contiguous text that runs from one end of the PDF to the other, we had to create this PDFSelection object.

So from the PDFSelection, you can find out the pages that are covered by the selection, what text is covered by the selection, that sort of thing. And then finally, for the viewer, PDFKit has the PDFView and the PDFThumbnailView. The thumbnail view is new for Leopard. I'll show you that in a second.

Here's the class hierarchy as of Tiger. At the top I've got PDFView, and since it's a subclass of NSView and NSResponder, you get all those view things and responder things: the PDF view has a bounds, has a frame, has a draw method. It can take mouse events, key down events, that sort of thing. The other classes are just subclasses of NSObject, so they're more basic classes. PDFBorder and PDFDestination are kind of esoteric, and I probably won't go into those at all this session, but there's the document, outline, page, and selection.

There's PDFAnnotation, and I didn't have room on the slide, but there's maybe a dozen PDFAnnotation subclasses, where each one maps to a particular type of annotation. For a link annotation, for example, there's a PDFAnnotationLink class to represent that. What we did for Leopard is we added another view class, the PDFThumbnailView, and we added a whole new base class, PDFAction, which has a number of subclasses, similar to the annotation. And we also added a few more annotation subclasses as well.

So one way to think of all these PDFKit classes is in two camps: the view classes and everything else. The view classes, the thumbnail view and the PDF view, are the high-level classes. And as you might imagine, they're using all these other classes.

The PDF view will be associated with a PDF document. From the document, it can find out the pages and display the pages and handle annotations and that sort of thing. The rest of the classes I call utility or base classes. PDF view and PDF thumbnail view use these classes, and if you're just writing an app that displays a PDF, you might only need the higher-level view classes, the thumbnail view and the PDF view.

But that doesn't mean you can't use these utility classes and take advantage of them, and I'll show you in the last two parts of this session how you would do that. But first let me talk about what most developers want to do with PDFKit, and what seems to be used most right now, and that's displaying a PDF. So I'll switch over to the demo machine. And I guess if you saw Peter Graffagnino's State of the Union -- it's not up?

If you saw his demo, I guess it was Monday, I'm going to show you that demo again, but I'll try to explain a little bit more about what's going on. So I'm going into the Developer folder. This is a standard Leopard install, so if you've installed your Leopard, you could do this as well. Go into Applications and open the new Interface Builder.

I'll go ahead and create a Cocoa Window template, and it's given me an empty window here. If I bring up the library, here are all the objects. And like I said, for this version of Leopard, the PDF view is right there. It looks like a little PDF icon. So I can just drop that into my view. And we're kind of cheating here: in PDFKit, if I see I'm inside Interface Builder, I go ahead and give you some sample content.

I say cheating because you'll have to provide your own content in your application. So here's a PDF view, here's a thumbnail view. I'll control-drag from the thumbnail view to the PDF view, and you see that one of its outlets is a PDF view. By clicking that and establishing this relationship, the thumbnail view can always query the view, sort of parasitically, I guess, to find out the document that the view is showing, and whatever display mode the PDF view is in, the thumbnail view can get that from it as well. So it's important to tell the thumbnail view which PDF view it has. Before I group these together, I'll show you a few other things that you can find right here in the inspector. If I bring up the inspector in Interface Builder for the PDF view, you can see a few settings. I can change whether or not it auto scales. I can say display two-up. I can turn page breaks on and off.

There's other ways you can configure the PDF view as well, but you have to do that programmatically. These are the only ones that we expose through the inspector. But I'll go ahead and go back to the single page continuous. And while I'm here, I'll check the springs. OK, I'll set the springs on the thumbnail view. And the thumbnail view has a few settings as well. We can set the maximum size for the thumbnails, the maximum number of columns, a few other things. So let me go ahead and group these into a split view and set the springs on that.

So the point of this demo is to show you how much you get for free from PDFKit, from PDFView and PDFThumbnailView, without writing any code. Again, like I said, I'm kind of cheating. I'm giving you a sample document here, but you see I can select text. I can scroll. I can even drag this text to the desktop and get a text clipping out of it. If there are links in the document, you see the cursor changes to a hand, and I get a tool tip telling me the destination for that link. I click, and it takes me there. And all the while I'm doing this, the thumbnail view is following along. It's getting notifications from the PDF view that the page has changed, and so over here it's updating and selecting the current page. And in reverse, I can select pages in the thumbnail view, and it tells its PDF view, go to this page, go to this page, go to this page. And something that we added because we could is that in the thumbnail view you can reorder the pages. So I can drag page two to page one, and you'll see that sure enough the title page has moved to the second place and the Apple legalese has moved to the top of the document.

So I can reorder pages however I want. If I had two thumbnail views up, I could even drag a page from one PDF into another PDF. In fact, if you've got Leopard running, you can run Preview on Leopard and try that: open two PDFs and drag pages from one to the other. Because, as I say, Preview is built on top of PDFKit. Since we had drag support, we went ahead and added support for dragging in files. So if I grab a couple of images here and drag them into the thumbnail view, what it's doing is generating new PDF pages and inserting them into the current PDF document. And sure enough, they behave like full-fledged PDF document citizens. If I were to save this PDF out or print it, the pages would be there. I'll show you something else that Peter didn't show you. I can even drag PDFs in. Let's see. I'm going to take a small one. Here's a two-page PDF. I'll insert this two-page PDF there. So I just inserted another PDF into this PDF. So anyway, this is some of the stuff that you get for free from PDFKit. Thanks. I can go back to the slides now.

But I think that's a great demo, but in a way it's kind of misleading, not only because I create kind of fake content for you, but I think it's a little misleading because I think it can give the impression that that's all PDFKit is, is that it just displays PDF content and, you know, it's kind of like a PDF viewer in a box. And it kind of is, but I like to think there's more, so I'll focus on that on the last two sections. But let me go ahead and continue with sort of like how you would write an application to display PDF content. I cheated.

You do have to associate a PDF document somehow with that view, and the way you do that is with the PDFDocument class. This is probably the only utility class you would really have to use if all you wanted to do is display PDF. You have to create a PDFDocument object. In this particular application, I have, say, a file on disk. So I'm going to call PDFDocument alloc, initWithURL, and just pass it a URL that represents that file. If I had data instead, there's a similar call to create a PDFDocument from NSData. As long as everything goes OK, as long as I passed it a real PDF, I'll get back a PDFDocument object. Then on my view, let's say it's called myPDFView, I just call setDocument. As soon as I do that, the PDF view retains the document, so first of all, I can release it. But just as there's a setDocument method on the view, there's also a document method, so I can always ask the view for its document again in case I want to know the number of pages or something like that. So as far as I'm concerned, I can just forget about the document at this point. It's really just one or two lines of code, and you can forget all about PDFDocument. Oh, and I should say too-- well, let me go to the next slide because it kind of ties in. As soon as you associate a PDF document with the view, there's all kinds of things that the PDF view now has access to.

Here are some of the attributes and methods on a PDF document that you can call, and these are the things that the PDF view is using. You can find out the number of pages. By extension, you can imagine that the thumbnail view is also finding out the number of pages so it can determine how many thumbnails to draw. For each one of the pages in the PDF, you can get that page object, so there are methods on the PDF document to say, give me the page at index i. I should maybe mention that PDFKit is zero-based, so if it's a five-page PDF and you want the last page, you'd get page four, and page zero represents the first page. There are methods on the document for adding, removing, and reordering pages, and again, you can imagine that the thumbnail view is basically calling those methods, but there's no reason you can't call those methods programmatically as well if you want to insert pages into a PDF document.

Things like subject, title, author, keywords, that kind of document information, you get from the PDFDocument class. When Preview brings up the info for the PDF, it's getting that from the PDF document. If you want to support searching in your application, you would do it with the PDF document, and I'll show you a quick example of that in a minute. If the PDF has an outline associated with it, it's from the document that you would get the root outline object, and I'll talk a little bit more about that in a second. And finally, if the user has edited the PDF, if they've inserted pages, reordered pages, whatever, then you can save the document from the PDFDocument class. There are write to file, write to URL, and data representation methods that will give you back that modified PDF as a PDF. So here's some stuff we added for Leopard, in the spirit of the editing and PDF creation features. There's a set method for the outline root. So if you have a PDF that doesn't have an outline, for example if you're in Safari and you print a few pages and save them as a PDF, you can programmatically create that outline. You can create the whole outline that indicates, who knows, chapter one, chapter two, and then you can associate it with that document with the set outline root call. Something else we did that's more of a minor enhancement was the ability to search for multiple strings. So I'm going to switch over to Preview, just because I think Preview illustrates a lot of these PDF view aspects.

I'll open up the PDFKit guide. So here's Preview on Leopard, if you haven't seen it already. We changed the UI a little bit. So PDF view on the left. I've got a toolbar, and there's a previous/next button. You'll see that the previous button is disabled. I click on Next, it goes to the next page. Now the previous button is enabled. Next, next, next, et cetera. If I customize the toolbar, there's a few more tools I can bring in here. Here's a back/forward button. Here's a page text field.

So if I click on a link here, like where does this go? To page 10. I'm on page 5. I click on the link. If I hit the Back button now, I go from page 10 back to page 5. So PDF View is maintaining a whole history, kind of like a navigation history for you. And I'm able to just wire my back and forward buttons, or menu items, up to the PDF View's back and forward methods. I can enter in a specific page number here. Let me type in page 20. Hit enter. It goes to page 20.
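The back and forward behavior described here can be pictured as a pair of stacks. The following Python sketch is a toy model only, not PDFKit code: the class `PageHistory` and its method names are hypothetical and simply mirror the behavior shown in the demo, using zero-based page indexes.

```python
class PageHistory:
    """Toy model of a viewer's back/forward navigation history."""

    def __init__(self, start_page=0):
        self.current = start_page
        self._back = []     # pages reachable with go_back()
        self._forward = []  # pages reachable with go_forward()

    def go_to(self, page):
        """Jump to a page (for example by following a link)."""
        if page == self.current:
            return
        self._back.append(self.current)
        self._forward.clear()  # a fresh jump discards the forward list
        self.current = page

    def can_go_back(self):
        return bool(self._back)

    def can_go_forward(self):
        return bool(self._forward)

    def go_back(self):
        if self.can_go_back():
            self._forward.append(self.current)
            self.current = self._back.pop()

    def go_forward(self):
        if self.can_go_forward():
            self._back.append(self.current)
            self.current = self._forward.pop()


history = PageHistory(start_page=4)  # page 5 on screen, index 4
history.go_to(9)                     # follow a link to page 10
history.go_back()                    # back on page 5
print(history.current + 1)           # display as a 1-based page number
```

The `can_go_back` and `can_go_forward` checks are the kind of Boolean a toolbar button could use to decide whether it should be enabled.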

And you'll notice, too, that as I've been navigating the document, that page number has been updating as well. I'll show you how that's done. Here are some more display options. I can zoom in, zoom out. You saw in Interface Builder how I was able to change the layout, one page, two page. There are some others.

Let me go to page one. There are some esoteric things in here too that you can find buried in Preview's preferences. You can turn anti-aliasing on and off, for example. It's probably not obvious up here; it's pretty subtle for large text like that. Greeking: everyone asks me what text greeking is. If I change it to 50, you see how the text turned into those gray rectangles?

It's a performance enhancement. It says if the font is less than the greeking threshold, don't even bother to draw the text, just represent it as gray boxes. I think the default greeking threshold is probably three, not three points, three pixels, because, let's do that again, if I were to zoom in here, for example, you'll see that as soon as the text goes past that threshold, it's not greeked. So it's not a function of point size, it's a function of how large the text gets drawn. So I'm glad I explained that. I get that a lot. But to give you a practical example of it, in the thumbnail view here, with the greeking threshold set to three, obviously there's no point in rendering all the text in these thumbnails because you're not going to be able to read it. So the PDF page can draw its content a lot quicker if the greeking threshold is set to some reasonable value.
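The point that greeking depends on drawn size rather than point size can be shown with a little arithmetic. This Python sketch is an illustration under stated assumptions, not how PDFKit implements it: the function name `is_greeked` is hypothetical, and it assumes the drawn size is simply the point size multiplied by the current zoom factor.

```python
def is_greeked(point_size, scale_factor, threshold_px=3.0):
    """Return True when text would be drawn as a gray box instead of glyphs.

    The decision uses the size the text is drawn at on screen
    (point size times zoom), not the point size alone.
    """
    rendered_px = point_size * scale_factor
    return rendered_px < threshold_px


# 12-point body text in a small thumbnail versus at actual size
print(is_greeked(12, 0.1))  # about 1.2 px tall
print(is_greeked(12, 1.0))  # 12 px tall
```

With the threshold raised to 50, as in the demo, the same 12-point text at actual size falls under the threshold, and zooming in far enough takes it back over.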

Let me just show you the outline again real quick. So this PDF has an outline. If I collapse it, you'll see there's something called PDF Kit Programming Guide. And it's taking me to the first page. It's got a twist down disclosure triangle, which tells me that it has children.

And in fact, it has six children. Its first children are Contents, Tables, Introduction. And then its fourth child has children as well, so I can twist that down and you see that it has two children. PDF Basics has more children. And then these are, I guess you'd think of them as leaf nodes. So the outline is a tree structure. And actually, it doesn't start here at the PDF Kit Programming Guide. It starts at the root, and the root is a special instance of the PDF outline. It's never drawn. It's never displayed. It acts simply as a container. In fact, the root for this PDF has only one child, the PDF Kit Programming Guide. But that is a PDF outline as well, which has-- let me twist this up again-- six children, et cetera, all the way down. So let me go back to the slides, and I'll show you how you-- where did I leave the clicker? I'll show you how you do that in code. Well, it's pretty straightforward, the navigation part. For my next button, previous button, or menu items, I just wire those up to go to next page, go to previous page. There's a go back and a go forward, and again, PDF view is maintaining the history for you. If you want to enable and disable your controls, there are some convenience methods, can go back and can go forward, that just return a Boolean, so you can use those to enable your controls. Similarly, there's can go to next page and can go to previous page as well. And for the example with the text field where I typed in page 20, that requires two lines of code.

You have to take the 20 that the user entered and subtract one, because we're zero-based. Then there's a method on the document, and here's an example again where, since the PDF view holds the document, I can ask my PDF view to give me the PDF document. The document has the page at index method, so I say to the document, give me the page at index 19, and I get back a PDF page object. On the PDF view, there's a method called go to page. It takes that object, and bam, you go to page 20. So pretty straightforward.
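The subtract-one step is easy to get wrong at the edges, so here is a small Python sketch of it. The function `page_index_from_input` is a hypothetical helper, not part of PDFKit; it only shows the conversion from what the user types to a zero-based index, plus the range check an application would probably want.

```python
def page_index_from_input(text, page_count):
    """Convert a 1-based page number typed by the user into a 0-based index.

    Returns None for non-numeric or out-of-range input.
    """
    try:
        number = int(text.strip())
    except ValueError:
        return None
    index = number - 1
    if 0 <= index < page_count:
        return index
    return None


print(page_index_from_input("20", page_count=40))  # index for page 20
print(page_index_from_input("5", page_count=5))    # last page of a five-page PDF
```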

The reverse, where that text field was updating as I navigated through the document, is handled with a notification. There is a PDF view page changed notification; in fact, that's what the thumbnail view listens to as well. So if your application listens for the page changed notification, you can find out the current page, get the page's label, which is like the page number, and display that in your text field. So that's also trivial to do.
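The listen-and-update pattern can be sketched without any framework. In this Python toy model, `PageChangeBroadcaster` is a hypothetical stand-in for a view that announces page changes, and the page labels are made-up sample data; it is not the Cocoa notification API.

```python
class PageChangeBroadcaster:
    """Toy stand-in for a view that announces page changes to listeners."""

    def __init__(self, page_labels):
        self.page_labels = page_labels
        self.current_index = 0
        self._listeners = []

    def add_listener(self, callback):
        self._listeners.append(callback)

    def go_to(self, index):
        if not 0 <= index < len(self.page_labels):
            raise IndexError("page index out of range")
        self.current_index = index
        # Tell every listener the label of the page now showing.
        for callback in self._listeners:
            callback(self.page_labels[index])


text_field = []  # stands in for the page-number text field
view = PageChangeBroadcaster(["i", "ii", "1", "2", "3"])
view.add_listener(text_field.append)
view.go_to(2)
print(text_field[-1])
```

Because the listener receives the label rather than the index, a document with roman-numeral front matter would still display the right text.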

So for some of the other ways that you can handle the PDF view display options: I showed you in Interface Builder how you can change the layout, and if you do that programmatically, there's a call named set display mode, and we've defined some constants for the layouts that PDF view supports. You can change the display box. You can turn page breaks on and off; I showed you that. Scaling: I showed you zooming in and zooming out. And anti-aliasing. I'm kind of hesitant to talk about the display boxes. It's kind of a strange PDF thing that-- gosh, now I guess I have to.

Suffice it to say that PDF pages can have, not always, but they can have different sort of bounds, basically. What you're supposed to typically use, like in a display application, like in Adobe Reader or Preview, is to show the crop bounds, the crop box for the pages. But a page could have a media box that's larger, that might have extra white space that's good for printing, but not necessarily-- it might take up too much screen real estate if you were to display it using the media box. There's bleed boxes, art box, trim box, and PDFKit supports all those things. And you can tell the PDF view to show the media box or show the crop box.

So the thumbnail view, I don't think there's much I really have to say about that. I pretty much showed you all the aspects of it. It's got an outlet for the PDF view. There are a few display options, like the maximum number of columns you want it to display. It gets the page changed notification, and it handles drag and drop and delete. I guess the one thing I should mention, though, is that if the user reorders the pages, or drags in images, or deletes a page, or something like that, the PDF thumbnail view will send your application, or whoever's listening for it, a notification that the PDF was edited. That way your application can set the dirty flag or whatever you want to do. You can also turn off all that page dragging stuff. So here's how you get the outline from the PDF document. There's a method called outline root. Again, I'm asking the PDF view for its document, so I get back this PDF document object, and I ask for its outline root. If the PDF has an outline, I get back a PDF outline object. If it doesn't have one, I get back nil. So that's the root PDF outline. From that, I can ask for its children and its children's children, et cetera, and that's how I can build up that whole outline view, which, by the way, is just an NSOutlineView. There's no special PDF outline view in PDFKit. So here are some other attributes of the outline. And I should mention, too, that there's some sample code. If you go to Developer, Examples, I think it's under Quartz, PDFKit, there's a PDFKit viewer application, and it shows you how to display the outline.

It's probably four or five lines of code. You put an NSOutlineView in your window, you supply a couple of the delegate methods, and basically the outline view just runs with it, and you're able to display the whole tree of the outline. Some of the outline attributes are the number of children, which is how many children the outline item has. You can get each of those children, and they are themselves PDF outline objects that may or may not have children as well. There is a label. That's the string that gets displayed. The root, again, is special: it won't have a label. And there's a destination. That's where, if the user clicks on this item, what should I do? What action should I perform? Generally, it's a destination somewhere within the document: go to page 14 or whatever. Again, the root outline is special. It won't have a destination.

So what we did in Leopard-- and there are a lot of editing methods here. First of all, there's an init method, so you can create a PDF outline. If there's no outline associated with a document, your application might, for example, create an outline object and say, OK, this is going to be the root. Then you can create other outline objects. Let's say the document has five chapters, so you might create five more PDF outline objects. For these, you would set the label, chapter one, chapter two, chapter three, on each of those. So there's a set label method. Once you have those five outline items, you'll want to make them children of the root. So there's a set child, or add child, method, so that you can take the five, add all of those to the root, and now you've essentially got your tree. For each of those children, then, you can set an action. You can say chapter one goes to page four, and chapter two goes to page 14 or whatever. And finally, as I mentioned on the PDF document slide, you can say set outline root, and then the document's going to retain the whole tree. You can release it. And when you save that document out in Leopard, the whole outline gets preserved. In fact, if you want to try that out, go into Preview on your Leopard install, open a PDF that has an outline, and save it, and you'll see that the outline is preserved. And I should say, too, I don't think I'll have time to demo this, and I don't know that this is going to stay in when we ship, but right now there's a way that you can go in and create and edit and change the outline of the PDF inside Preview on the Leopard install. Okay, so let me go once more to Preview just to show you searching.
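The five-chapter example can be sketched as a small tree. This Python model is illustrative only: `OutlineItem`, `add_child`, and `walk` are hypothetical names, not the PDFKit classes or methods, and the page numbers after chapter two are made up. It shows the shape described above: an unlabeled root that acts purely as a container, with labeled children that each point at a page.

```python
class OutlineItem:
    """Toy outline node: a label, a destination page, and child items."""

    def __init__(self, label=None, page_index=None):
        self.label = label            # None for the root
        self.page_index = page_index  # zero-based; None for the root
        self.children = []

    def add_child(self, child):
        self.children.append(child)
        return child


def walk(item, depth=0):
    """Yield (depth, label, page_index) for every item below the given one."""
    for child in item.children:
        yield depth, child.label, child.page_index
        yield from walk(child, depth + 1)


root = OutlineItem()  # container only: no label, no destination

# Chapter one goes to page 4 and chapter two to page 14, as in the talk;
# the remaining page numbers are invented for the example.
for number, page_index in enumerate([3, 13, 27, 40, 58], start=1):
    root.add_child(OutlineItem(f"Chapter {number}", page_index))

for depth, label, page_index in walk(root):
    print("  " * depth + f"{label} -> page {page_index + 1}")
```

The same `walk` function handles nested sections, since each child may carry children of its own.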

Okay, I'm going to type in "outline". Okay, so what Preview did, or rather, I should say PDFKit did, is it went out and found 123 occurrences of the word "outline" in this document. And the first one is selected here. You see that it was found on page three. Section. Great. There's another name. So think outline. Think bookmark.

The first instance was found in the Contents outline, or section. If I switch back to the outline, you'll see that, yeah, right here, one of the items in the outline is called Contents. So the search result is telling you that this instance of the word "outline" was found in the Contents. And on the right, let me make this bigger so that you can see, there's a whole bunch of sample text, contextual text that shows you the text that was found around that word. In the middle is the word "outline", and you see that "pages 10, outlines 11" is some of the text that was found around it. The next example was also found on page 3. And as I'm selecting these over in the table view, you'll see that they're being selected over in the PDF view as well. Then there were four of them found on page 5 in the Tables, Figures, and Listings section. One was found in the Introduction. But one thing that's nice is that as soon as you're able, for any given instance of the word, to find out the page, the outline section, and the sample text, you can do things like this. You see the check box at the top: above the search results, we've got Group by Section. If I check that box, we've switched in a new table view that shows you a kind of relevancy ranking. So you see that 88 instances of the word "outline" were found in the section called Creating Outlines. So maybe that allows a user-- let me turn on the scrolling here.
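Grouping search results by section amounts to counting matches per section and sorting by count. The Python sketch below shows one way to do that; `rank_sections` is a hypothetical helper, and the match list is invented sample data rather than the real results from the demo.

```python
from collections import Counter


def rank_sections(matches):
    """Given (page_index, section_label) pairs, return (section, count)
    pairs sorted from most matches to fewest."""
    counts = Counter(section for _page, section in matches)
    return counts.most_common()


# Invented sample data: each entry is one match and the section it fell in.
matches = [
    (2, "Contents"), (2, "Contents"),
    (4, "Tables, Figures, and Listings"),
    (6, "Introduction"),
    (30, "Creating Outlines"), (30, "Creating Outlines"), (31, "Creating Outlines"),
]

for section, count in rank_sections(matches):
    print(f"{count:3d}  {section}")
```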

So now that tells the user that, you know, maybe this is the section of the document that you're looking for. And you see that every instance, in this case, of the word outline is highlighted. So that's that. I'll go back to the slides, and I'll show you how you can do things like that or even more in your application. So there's kind of two ways of searching that PDFKit supports.

One method I call search method one, or the TextEdit style of searching, where maybe a dialog box comes up, the user types in a word like outline, and there's a Next and a Previous button. They click on Next, and it finds the first instance, and they click on Next again, and it finds the next instance, et cetera. That's supported in PDFKit, but I'm not going to go into that. I'll instead show you what Preview's doing, and I call that the sort of Google search method, which is search method two: basically, go out and just find every instance of the word outline. And the way you do that is at the document level. On the PDF document, you say begin find string, and you pass in the string outline and options. So these are AppKit search options, so I can say search backwards, case insensitive search, or literal search. And the PDF document is just going to go off. It's an asynchronous call. You make the call, and you're done. The user can continue to use your application, resize the window, click on links, et cetera. The PDF document is off, racing along, trying to find every instance of the word outline. It's going to start on page one and search for the word outline, go to page two, page three, and work its way through the whole document. And the way your application will find out about the progress is with notifications.

Or with the delegate method. So I give an example here that if you have an object in your application that's a delegate for the PDF view, or for the PDF document, it'll look for a method called did match string and if you've implemented this your did match string will get called every time the word outline was found. And so your did match string might look something like this. I'm passed in a PDF selection and I'll tell you a little bit more about that in a second.

So all my application does here is take that instance and add it to an array. Presumably I have some kind of a table view that's showing the search results, so I need to tell that table view to reload its data, because another instance just came in.

So why is a PDF selection passed in for a search method? It's kind of strange. I guess one way to think of it is to imagine that the user literally went through and started selecting each instance of the word outline. I mean, it's kind of a strange analogy, but by passing your application this PDF selection object, because of all the attributes that you can query about the PDF selection, it gives you the ability to do those things we did in Preview. So for example, there's a page or pages attribute on a PDF selection.

So when your delegate method gets called, the word outline was found, you can find out which page that is just by asking the selection for the page that the selection is on. From the page, I can get the page label, page three, for example, and that's how I populate the first column of my table view in Preview. You can get the bounds for the selection for a given page. You can find the nearest PDF outline item. So this is the way I was able to find out, for example, that the first instance was found in the Contents section. So again, once I have the PDF outline item for that selection, I can get its label, Contents, and I can put that in the second column. Finally, you can add selections together.

It's not as interesting for searching; this is more in the text selection domain. If you wanted to allow a user to select some text, hold down some modifier key, select some additional text, and add those selections, you can do that. It's kind of a Boolean add. And you can get the string for a selection. Basically, you're asking the selection, "What text do you cover?" And obviously, when you're doing a search for the word "outline" and your did-match-string method gets called, the selection string is going to be the word "outline," so it's probably not that interesting. But this next method, which allows you to extend the selection, lets you arbitrarily grow the tail of the selection by so many characters, let's say 20 characters, and grow the head of the selection by 20 characters, so that you've got 40-plus characters selected. Now if you call the string method on this selection, you get that third column in Preview, which is that contextual text. In fact, you can grow a selection all the way to the start and end of the document if you want.
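The "find every instance, get called back per match, then grow the match to get context" flow can be illustrated with a small model. This is a sketch in plain Python over a list of page strings, not PDFKit code: `Match`, `find_all`, and `context` are hypothetical names, the search here runs synchronously rather than asynchronously, and the sample page text is invented.

```python
import re
from typing import Callable, List, NamedTuple


class Match(NamedTuple):
    page_index: int  # zero-based index into the list of page texts
    start: int       # offset of the first matched character
    end: int         # offset just past the last matched character


def find_all(pages: List[str], word: str,
             on_match: Callable[[Match], None]) -> None:
    """Search page by page, case-insensitively, reporting every hit."""
    pattern = re.compile(re.escape(word), re.IGNORECASE)
    for page_index, text in enumerate(pages):
        for hit in pattern.finditer(text):
            on_match(Match(page_index, hit.start(), hit.end()))


def context(pages: List[str], match: Match, grow: int = 20) -> str:
    """Grow the match by `grow` characters on each side and return the text."""
    text = pages[match.page_index]
    return text[max(0, match.start - grow):match.end + grow]


pages = [
    "Contents: pages 10, outlines 11, links 12",
    "An Outline is a tree. Each outline item has a label.",
]
results: List[Match] = []
find_all(pages, "outline", results.append)  # the callback just collects hits

for m in results:
    print(f"page {m.page_index + 1}: ...{context(pages, m)}...")
```

The callback plays the role of the delegate method: each hit arrives on its own, and the receiver decides what to keep (here, everything) and what to show (page number plus surrounding text).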

So, wow, okay, so that was how you use PDFKit to display PDF in your app. And there's plenty of sample code. In addition to the one I mentioned, there's other sample code that you can download from DTS that shows you how to do the searching and that sort of thing. But let me see, are we ready for a demo? Yeah, okay. So I guess I'm moving on now to more of the fun things you can do with PDFKit. So I wrote this little application called PDF Calendar. And the point of it is to show you how to create pages, because I've had developers ask me about that. And they were trying to do some kind of strange things. Like, in AppKit, you can get a PDF from a selection or from a rectangle, from a view, and then you get data from that, and it's basically a PDF document. So they create a PDF document in AppKit and then get the first page from that and try to add it. I thought, you know, that's not really what I'd intended for users to do. I thought that what you would do if you wanted to create PDF content would just be to subclass PDF page: create your own object, and then just supply the draw method. And I'll show you what I'm doing here. This is kind of a lame interface. I'm not going to sit here and drag 12 images in. Four will suffice. But, okay, so it takes 12 images. I click the Make Calendar button. And here's the thumbnail view again. I just put it into a drawer for old time's sake. That's basically what my app just did, and the source code to this is available.

But basically what I did is when the user clicks on Make Calendar, I created a new PDF document object. Now, if I don't initialize one with a file or with data, if I just call init and create a PDF document object, I get an empty object back. It has no pages. It's a document, but it has no pages associated with it. So since it's going to be a calendar, I want 12 pages. So I created 12 PDF page objects. But like I say, I used a subclass of PDF page. So let's say I called it My PDF Page. So I allocated one of my PDF page objects, or 12 of them, in fact. And for each one of the 12, I gave them an image. I told them what month they are. And then in the draw method for that page, I just simply draw the image. I draw the month. I draw a grid.

And if I had more time, I suppose I could draw the dates and sync up with iCal or something like that. But anyway, this gives you an idea of the kind of thing that you could do. Oh, and I should finish: once I created these 12 page objects, I add them to the document, and then I tell the document to associate itself with this PDF view. I call set document on the view, and so it just behaves as though it's a regular PDF document, except that it only exists in memory right now. Again, if I save, if I print, the document's methods for saving get called. What happens internally is that the document creates one PDF context, and then for each page of the document, in this case a 12-page document, it says begin page for page one and then calls the first page and says draw. And basically the page will draw as though it were going into a PDF view or anywhere else, except that in this scenario, it's being captured and recorded into a PDF context. So I go end page, begin page again. I go to page two. I do this for all 12 pages. And then the PDF context gets flushed to disk. And that's how you get your PDF out of that.

So let's see. I'll go back to the slides. So that's PDF Calendar. I'll show you in a little more detail what I did, since I've talked a lot about PDF page here but haven't really told you what it is. Here are some of the attributes of PDF page. If it's been added to a document, then I can ask the page for its document. So there's a way from just the page level to walk up the chain, so to speak, and find out which document that page is associated with. A page has a bounds associated with it. In fact, it can have multiple bounds. It can have different crop boxes and media boxes and that sort of thing, as I explained. Pages can have a rotation associated with them. It's usually zero, but it can be any multiple of 90. Pages can have annotations. The page is where the text in the PDF lies, unfortunately.

So it means that in, you know, a 100-page PDF, each page contains its little piece of the text from that PDF document. So it's at the page level that you would get the text from a PDF if you want. And then finally, the PDF page has a draw method, obviously. So the PDF view is calling the PDF page's draw method. That's how the PDF view is able to display the page contents. And in my application that I showed you, I just overrode that draw method and drew my own contents.

Some people would like a kind of a convenience method. So for Leopard, we added this init with image. So there's an init method on a PDF page where you just pass in an NS image and you get back a page. And in fact, again, this is what the PDF thumbnail view is calling when you drag an image in. And so finally, here's what my code did.

So I had to override bounds for box. There are really only two methods that you have to supply if you subclass PDF page. With bounds for box, you have to tell it how big you are. It doesn't know otherwise. All coordinates in PDFKit are in points. That's what Adobe defines for PDF. So that's 72 points per inch. So if I want my page to be 8 1⁄2 by 11 inches, I have to return that in points. So that's the 612 by 792. That's 8 1⁄2 by 11 in points. So then the other method: now the document, or by extension the view, knows how big I am. Now it's going to say, well, draw yourself. So here's my draw method. I don't care about crop box or media box, so I'm pretty much ignoring that box parameter that was passed in. And this is an oversimplification of the code, but I'm going to draw an image, I'm going to draw some text, and I'm going to draw a grid. Any kind of drawing that you want to do, including drawing of other PDFs, you just do within your draw method, and that will all get captured and recorded into a PDF context for saving. Okay, so now we're at the last part, annotations.
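The two responsibilities of a custom page (report its size in points, and draw itself) together with the save loop (begin page, draw, end page, once per page) can be modeled in a few lines. This is a sketch in plain Python, not PDFKit code: `RecordingContext`, `CalendarPage`, `save`, and the image file names are all hypothetical, and "drawing" here just appends strings to a log.

```python
POINTS_PER_INCH = 72.0


def inches_to_points(inches: float) -> float:
    return inches * POINTS_PER_INCH


class RecordingContext:
    """Hypothetical stand-in for a PDF context: it only logs commands."""

    def __init__(self) -> None:
        self.commands = []

    def record(self, command: str) -> None:
        self.commands.append(command)


class CalendarPage:
    """Hypothetical page object that supplies its bounds and its drawing."""

    def __init__(self, month: str, image_name: str) -> None:
        self.month = month
        self.image_name = image_name

    def bounds(self):
        # 8.5 x 11 inches at 72 points per inch is 612 x 792 points.
        return (inches_to_points(8.5), inches_to_points(11))

    def draw(self, context: RecordingContext) -> None:
        context.record(f"image {self.image_name}")
        context.record(f"text {self.month}")
        context.record("grid")


def save(pages) -> list:
    """Use one context for the document; bracket each page's drawing."""
    context = RecordingContext()
    for page in pages:
        width, height = page.bounds()
        context.record(f"begin page {width:g}x{height:g}")
        page.draw(context)
        context.record("end page")
    return context.commands


months = ["January", "February"]
log = save([CalendarPage(m, f"{m.lower()}.jpg") for m in months])
print("\n".join(log))
```

The page never knows whether it is drawing to the screen or to a file; only the caller decides where the recorded commands end up.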

Okay, I don't expect everybody to know what all the various annotations in a PDF are. If you go to Adobe's site, you can download the PDF spec. It's almost 2,000 pages, I think, and it goes into all the flavors of annotations. I'll show you a few of them in a minute. I've got a demo that illustrates a lot of these. But here are some of the names of the classes that were in Tiger for the subclasses of the annotation. Here's button widget, circle. You can imagine what some of these are. Link, and you can imagine line is a line annotation, markup, square, text. For Leopard we added three more. We added another kind of widget annotation, the choice widget. We added a popup annotation and the stamp annotation.

All annotations have a few attributes in common. They all have a type, so you can ask an annotation for its type, and the line annotation will tell you that it's of type line. Link will tell you that it's of type link. Then if an annotation has been added to a page, you can ask the annotation for the page that owns it.

So again, you can go all the way up the chain. You can ask the annotation for its page, the page for its document, and go all the way to the top if for some reason your annotation needs to know about the document. The PDF annotation has a bounds associated with it. That's where it lays on the page in the page's coordinate space. And again, it's in points. And the annotation, like the page, has a draw method as well. So that it knows how to draw its annotation.

So then all these sort of various subclasses, like the link, the line, the circle, they're gonna have kind of like subclass specific attributes and calls. So you imagine the link annotation will have a destination or a method on it to tell you where this link is going to take you if the user clicks on it. The circle annotation is going to have a color and, you know, the line width for the circle, that sort of thing.

So what we did, in addition to the three new annotations that we added for Leopard, is we added, as I mentioned, this whole new class, this PDF action class. And the way that fits in is there are setters and getters for actions now on the PDF annotations. So you could either ask for the link's destination or you could ask for its action. If you ask for the action for a link, you'll get back an appropriate go-to action, which basically is an action that means go to this destination. There are setters and getters, and in fact there's a method on the PDF view, perform action. So you could get the action from an annotation and then tell the PDF view to perform this action. And if it's a go-to page or something, the PDF view will just go to that page. And then all the annotations have all kinds of setters. So all the attributes that you can read from an annotation you can set as well.

So here's the PDF action class, and really the only thing that all the actions share is the type method. So any given action you can ask for its type, and it'll tell you if it's a go-to or a named action. It's the subclasses that have the subclass-specific attributes. So a go-to action indicates a destination. So there's going to be a method on the go-to action for getting and setting the destination. A URL action, which is an action that would launch your web browser or your mail application, is going to have a getter and setter for the URL. So I'm actually going to spend the rest of this session in a demo. Let's see how many minutes I have left. Because I could actually demo for hours if I wanted to on this. I won't. I'll try to keep it to 15 minutes or less.

You know what, if I have time, maybe I'll show you the outline editing in Preview. So here's an application called PDF Annotation Editor. And I should point out that this, as well as the calendar application, are downloadable from the developer. I'm not sure how it works. Somehow, session 202, there's files associated with it. You'll find all these demos that I'm showing you today and ones that were done last year for Tiger downloadable associated with session 202. So let me just open up an annotation here. But this one only runs on Tiger. So the calendar-- I guess the calendar demo only runs on Tiger as well, or only runs on Leopard as well. Did I say this one runs only on Tiger? I meant Leopard. Okay. The demos I'm showing you today only run on Leopard. So I've just opened a PDF here, and I went ahead and chose one. We don't need the thumbnail view in this case.

I picked a PDF here that has a lot of annotations on it, partly to kind of illustrate to you what annotations are. And something I'm doing here in this application, and you can look at the source, I'm subclassing PDF view. And the PDF view has a draw page method so that every time the PDF view is about to draw a page, it calls its draw page method. So all you have to do, or all I'm doing in this application is I subclass PDF view and I implement draw page myself in my subclass. Now if I don't really care about doing anything in particular, I can just call super and inside my draw page method and let the PDF view handle drawing, for example, you know, this page of the PDF. But then what I can do and what I'm doing here is after the PDF view has drawn the PDF, I'm going to go in and I'm going to loop over, well, I'm going to find out if there are any annotations on the page and then I'm going to, for each annotation, I'm going to loop over and get its bounds and draw a gray rectangle.

So that's why you see what might look like odd rectangles around some of the annotations. And in fact, I'm doing hit testing in my subclass, so I can determine that an annotation was clicked on, and I can make that annotation, I've got a notion of like an active annotation, so I can draw it differently.

Here I'm keeping track of it and drawing it with a red boundary around it, so I can click on various annotations and you'll see the red boundary. That's all handled with the PDF view subclass. And you probably noticed over here on this little panel, that's what most of this code actually entailed was writing this panel. You'll see that as I click on an annotation like this circle annotation here, you'll see this panel is reflecting all number of attributes. These are some of the things, the calls, the attributes that you can get from the PDF annotation classes. So I see that this is a subtype circle. That's by calling the type method. I see there's contents, it's got some text, some content attribute. It's got a border color, red. There's no interior color, that box is not checked.

I see that its flags indicate that it's to be displayed, which it is, and printed as well. If I uncheck this, it wouldn't print. You'd see it when you opened it in your viewer, but when you printed it out, that annotation wouldn't be there. I can also uncheck this one, which would be really mystifying to users.

Although I suppose you could put an annotation that says, you weren't supposed to print this or something. And there's no action, although here are the actions that PDFKit supports. It does have a border, and it's got a thickness of 8 pixels, and it's not dashed. So I can try some of these other annotations and see that, okay, here's another one with a different border color, no contents this time. Here's one with obviously a dashed border. This one does have an interior color.

Here's a markup annotation. This one is a type highlight. There's strikeout and underline. Here's one of the strikeout annotations. Here's an underline annotation. This one's known as a free text annotation, as opposed to a text annotation, which is an annotation that has a popup associated with it. In fact, if I check the is open-- oh, it's over here. If I check the is open attribute, you can see what it looks like when its pop-up is open, see?

No action again. So here's some line annotations with different line start and end styles. These are interesting. These are interesting because they're stamp annotations. They're interesting because they don't have, if you looked at all the rest of these annotations, like the circle, for example, there are parameters that describe them. You know, red border, eight pixels thick, no dash, here's its bounds. I can display that just with those attributes alone. So there's sort of a parameterized, I guess you'd say, version of this annotation. But the stamps don't really have a parameterized version. I mean, I see that the contents are big stamp and that the name for that annotation, for that stamp annotation is pound D confidential. But, you know, that doesn't really tell me. There's no sort of set defined stamps that you can just, given these meager attributes, draw the stamp. So what Adobe did is they have this notion of an appearance stream. So every annotation, in addition to sort of a parameterized variant, there can also be an appearance stream.

And when an annotation has an appearance stream, we honor the appearance stream over the parameters. And in fact, they don't even have to match. I mean, if there's an appearance stream, we just draw that appearance stream. And the appearance stream, it's really like a little bit of PostScript. It's the same PDF drawing model. It's a stream of drawing commands. So for this circle here, the appearance stream would set up the bounds for this path and stroke it with this line thickness and this color. So unfortunately, what that means is I can go in here and check the dashed attribute or change the line thickness to 60, and you don't see any change in the annotation. And the problem is the annotation still has that appearance stream, and we're still drawing that appearance stream, and we're ignoring the parameters. One characteristic of an annotation with an appearance stream besides that is that when you resize it, you can see very clearly that it's not respecting its parameters. Otherwise, it wouldn't get so thick. It would stay at 8 pixels thick. In fact, let me take this annotation here, this circle that has the dashed border on it. And here's the fun part of this application. There's this whole annotation menu. So I'm going to say new circle. So this is me calling PDFKit-- there it is-- and creating a circle annotation, and then associating it with the current page. But I get kind of a vanilla circle annotation here. But this one, because there's no appearance stream yet, I can change all of its attributes. So I can bring up the color picker, and I'll go ahead and try to match that one. I'll pick a kind of a pale yellow, and I'll make that-- I should probably check that first. I'll give it a yellow interior color and kind of a reddish orange border and set it to-- what was that?

Four pixels? Four pixels dashed. And if I match the size, you'll see that-- just the parameterized variant looks pretty close. The only difference is when I resize this, it maintains its sort of, you know, it's being drawn solely by its parameters, whereas this one with an appearance stream just scales the appearance. Well, when... The way that these annotations are drawn is, again, through their draw method, and it turns out that in PDFKit, you can subclass a PDF annotation and supply its draw method, and you can draw any kind of an annotation you want. So that's how you would, in fact, do a stamp annotation, and that's why the stamp annotations are kind of interesting.
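The difference between an annotation drawn from its parameters and one drawn from an appearance stream can be captured in a tiny model. This is a sketch in plain Python, not PDFKit code: `AppearanceStream`, `CircleAnnotation`, and `drawn_line_width` are hypothetical names, and scaling is reduced to a single uniform factor on the width, which is a deliberate simplification.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class AppearanceStream:
    """Drawing recorded at a fixed size (hypothetical, heavily simplified)."""
    recorded_width: float  # width of the bounds when it was recorded
    line_width: float      # stroke thickness baked into the recording


@dataclass
class CircleAnnotation:
    width: float           # current width of the bounds
    line_width: float      # the editable parameter
    appearance: Optional[AppearanceStream] = None

    def drawn_line_width(self) -> float:
        """Thickness that ends up on screen."""
        if self.appearance is not None:
            # The recording wins and is simply scaled to the current bounds;
            # the line_width parameter is ignored.
            scale = self.width / self.appearance.recorded_width
            return self.appearance.line_width * scale
        return self.line_width


# An annotation loaded with an appearance stream: edits to the parameter
# have no visible effect, and resizing thickens the stroke.
saved = CircleAnnotation(width=100, line_width=8,
                         appearance=AppearanceStream(100, 8))
saved.line_width = 60
saved.width = 200

# A freshly created annotation: drawn purely from its parameters.
fresh = CircleAnnotation(width=100, line_width=4)
fresh.width = 200

print(saved.drawn_line_width(), fresh.drawn_line_width())
```

Stripping the appearance stream (setting `appearance` to `None` in this model) is what makes the parameters take effect again.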

Because they don't have parameters, the only way to draw a stamp annotation is to supply an appearance stream. And the way your application, in creating a stamp annotation, the way you would do that, it would be to supply your own draw method. And so I've got one here. If I say new stamp, you'll see that I created-- I don't know, it's called my PDF annotation stamp or something.

And I've supplied a draw method for it that just draws the text Apple. So now I've basically got no parameters. I mean, I can give it one. I can tell it its name is Apple. But this is how this all works: when I save this PDF out to disk, the annotation's draw method is called, and that draw method is recorded into an appearance stream. And all the parameters and the appearance stream are packaged up into a dictionary for that annotation, and all the annotation dictionaries for that page are associated with that page and written out to disk, so that when you open up this PDF inside any PDF viewer like Adobe Reader, all the annotations are there and this Apple stamp annotation will look just like that. This one here that didn't have any appearance stream will have an appearance stream, which unfortunately means that as soon as I save this PDF and then reopen it, I'll no longer be able to actually edit it unless I strip off its appearance stream. I don't know, I could show you that. Just go ahead and save it. So I just wrote that file out, and if I switch here to the desktop, let me bring up Adobe Reader.

And here's our PDF. Well, I can see right off the bat, there's that red circle that I'd stretched. And scrolling on down, there's my stamp annotation with the word Apple, and here's the one I created a parameterized version of. So that just works. So the last thing I'll show you in this annotation editor is, let's open up the small PDF because it's not as busy as the other one. And I'll show you this, it's kind of interesting. I can create a button widget, which is a kind of annotation that the user interacts with. And button widgets in general have field names associated with them because internally there is an AcroForm dictionary, basically a database. It's kind of a document database. And for each one of these widgets, the field name represents a key in that database, and then there's a value associated with it. A simple push button doesn't have a value that's associated with it, but if I make another button and make it a checkbox here, this one's going to have a value. It's either on or off, or in this case, yeah, it's either yes or off. And I could change this to single, married, that sort of thing. Let me turn off the background color. And let me change this field name. Let me change the field name to married.

So what I'm saying is that this checkbox now represents a way that the user can interact with the value for the field named married in the AcroForm dictionary. And to show you another kind of widget, here's a text widget. Text widgets don't have real interesting draw methods. But I'll just put it next to the checkbox here. And if I switch over, do you see this little button in the bottom here? I put in this little test and edit button. Basically, when I'm in edit mode, I'm handling all the mouse downs inside my PDF view subclass. And I'm drawing the gray rectangles. But when I'm in test mode here, it's as though I just opened this PDF. And I'm going to just call super for all my PDF view methods. So if I click on the checkbox, sure enough, I can check and uncheck that. I can type my name or hello or something in the text field. Oh, you know what I forgot to show you though? This button here, let's give it an action. And there's an action called Reset Form. And I can specify here which field names, basically, to reset whenever the user clicks on this button. Or I can say Reset All Items Except Those That Are Listed. And if I leave it blank, it says, in effect, reset all of them, reset everything. So let me go back and try that out. So now when I click on this button, sure enough, it unchecks the check box and it clears the text field. I can redo those and try it again and it works. Again, if I just save that out to the desktop and open that up in Adobe Reader, I just want to show you that there are no tricks up my sleeve. This is a proper PDF. There's the button and there's our check box and I can type in text. I'm not doing very good today. And the button, oh, auto complete, no. And the button works.
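The Reset Form behavior described above has three cases: reset only the listed fields, reset everything except the listed fields, or, with an empty list, reset everything. A small model makes the rule explicit. This is a sketch in plain Python that treats the form as a dictionary of field names to values; `reset_form`, `listed`, and `exclude_listed` are hypothetical names, and the default values ("Off" for the checkbox, an empty string for the text field) are assumptions based on what the demo shows.

```python
from typing import Dict, Iterable


def reset_form(values: Dict[str, str], defaults: Dict[str, str],
               listed: Iterable[str] = (),
               exclude_listed: bool = False) -> Dict[str, str]:
    """Return a copy of `values` with the targeted fields set to defaults."""
    names = set(listed)
    if not names:
        targets = set(values)          # blank list: reset everything
    elif exclude_listed:
        targets = set(values) - names  # reset all except those listed
    else:
        targets = set(values) & names  # reset only those listed
    return {name: (defaults[name] if name in targets else value)
            for name, value in values.items()}


defaults = {"married": "Off", "name": ""}
filled = {"married": "Yes", "name": "hello"}

print(reset_form(filled, defaults))                       # everything reset
print(reset_form(filled, defaults, listed=["married"]))   # only the checkbox
print(reset_form(filled, defaults, listed=["married"],
                 exclude_listed=True))                    # only the text field
```

Returning a new dictionary rather than mutating the input keeps the three cases easy to compare side by side.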

So I guess that's it. I've got one more slide, but I think I'll tell you what, I've got one minute, so I'll do this. I'll open up a PDF that doesn't have-- well, this-- that's not interesting. OK, the small one with two pages. Here's something you can do in Preview. And I show you this because it illustrates some of the functionality in PDFKit I've been telling you about.

Over here in the Inspector, there's a Sections tab. And I can just add a new section. And what it's done is-- I should have shown you this beforehand. There weren't any sections in this document. In fact, if I delete it, you'll see there. It says no sections. But I say add a section, and it goes ahead and creates a section for the current page. And it's labeled page one, but I can call it title or something like that. And similarly, I can go down to this page, add another section. And so I'm basically creating those PDF outline objects. And if I select one and say make child, I'm telling that to make a child of the one above it, and so now I get this whole twist down sort of behavior with sort of subsections. And again, if I save this out and to the desktop, that whole tree gets flattened out. And then when we open it up, I see it's still there. And uh-oh, some of the fonts went wiggy. And open that up in-- In Adobe Reader, I hope I didn't quit Adobe Reader. Oh, there it is.

You'll see that it's got it. Oh, I did. Okay, well, it'll take just a minute. There. Bookmarks, and there's the one. I twist it down, and there's the other, and they work. So there you go. Let me just go right back to the slides for one slide more. There, more information. So if you go to the developer help on Apple's site, there's a couple of pieces of PDFKit documentation, a couple of PDFs, in fact. And you go to Adobe, of course, and they've got the PDF reference, and that'll tell you everything you would ever want to know about annotations and everything else in the PDF spec. PDF Kit Viewer I told you about. It's in Developer Examples. You can download PDF Kit Linker, Link Snoop. Those run on Tiger. And then the ones I just showed you today, PDF Calendar and Annotation Editor, those only work on Leopard, but you can download those as well. So that's it.