Graphics and Imaging • 1:14:58
The powerful graphics technologies in Mac OS X play a critical role in the success of applications and provide a rich platform for developer innovation - enabling the delivery of seamlessly composited 2D and 3D graphics. This session features the latest developments in Mac OS X graphics technologies, including Quartz Extreme, Quartz 2D, and OpenGL, and provides a framework for the other sessions in the Graphics and Imaging Track.
Speakers: Travis Brown, Peter Graffagnino
Unlisted on Apple Developer site
Transcript
This transcript was generated using Whisper; it has known transcription errors. We are working on an improved version.
Thank you very much. It's my pleasure to welcome you to session 200, which is a Graphics and Imaging Overview. What we typically do at the sort of overview session for the Graphics Track at WWDC is follow a formula, and that formula is essentially to tell you about the technology announcements for the upcoming version of Mac OS X, and then point you to the various sessions that are going to talk about those particular technologies.
And we're actually going to change things up a little bit this year. What we're going to do is recast the session as a Graphics and Imaging Directions session. Because there's a very important thing happening at Apple: our rate of innovation in the graphics technologies that we build into Mac OS X is incredible.
We have fantastic technologies such as Quartz Extreme, a compositing windowing system that gives us the ability to create dramatic visual effects such as Exposé, and also the fast user switching that everyone seemed to like yesterday.
And with your release cycles being anywhere from 12 to 24 months, it's often difficult for you guys to see where the trend is, to see which technologies you really need to invest in building your products on to make sure you make maximum use of the platform and create really compelling applications for our mutual users. So what we're going to do with this session is focus on two main themes that run through our technology innovation in graphics at Apple.
That's going to be PDF, and also the ascendancy of the GPU as a way to do a lot of even 2D graphics work. So what I'd like to do is invite Peter Graffagnino, the director of graphics and imaging software, to the stage, and he's going to take you through the session. Thank you. PETER GRAFFAGNINO: Thanks, Travis.
Thank you very much, Travis. Good to be here at WWDC. Good to see all you guys here. I think we've got a great week for you this week at WWDC. And today in particular, we're going to take you through the overview of the graphics and imaging sessions, which are always fun sessions at WWDC.
The first thing I'm going to do is, for those of you who are new to Mac OS X or new to WWDC, is do a brief overview of the graphics architecture of Mac OS X. Then I'm going to go into, as Travis said, two kind of central themes, underlying themes that we see in the industry and that we're taking advantage of in Mac OS X that I think it's really important for you guys to understand. The first one is about PDF, which we call Mac OS X's digital paper. The next is the GPU computing revolution, which I think is a real significant development. And finally, we'll go and do the tour of technologies and show you what's new in Panther.
So first of all, the block diagram review to all of you. We've got our Core OS, Darwin, at the bottom, based on FreeBSD and Mach. Graphics layers on top of that. And then the frameworks, Cocoa, Carbon, and on top of that the user interface, Aqua. In the graphics areas, we've got all of our technologies: Quartz 2D for 2D graphics, OpenGL for 3D graphics, QuickTime for multimedia and video, and our compositing windowing system called Quartz Compositor.
QuickTime you heard about if you were in the session before. It was a general overview of QuickTime directions. They've combined the QuickTime live conference with WWDC this year, so there's a lot of great content on QuickTime, but I'm not really going to talk about it right here. Quartz 2D, I'll give you a brief overview of what's in Quartz 2D. Basically, it's our 2D imaging model based on the industry standard PostScript and PDF model.
This imaging model's been around for a number of years, almost 20 years now. And it's really kind of a credit to the original designers that it's able to describe basically every page that's ever been printed. In fact, Adobe is celebrating the 20th anniversary of PostScript, and they published a nice book that was kind of an interesting read if you're into the history of this stuff.
But anyway, our Quartz 2D imaging model is based on the industry standard PostScript, and it's really just a lightweight C library that implements the PostScript imaging model primitives. And so there's no display PDF, as some people have asked about, or display PostScript or anything like that. It's really just a lightweight C library.
We can read and write PDF with that library, so you can think of PDF as kind of the meta-file format for the graphics library. In fact, we knew we wanted PDF to be the meta-file format, so we kind of worked backward from that to figure out what the good API would be for that.
We've got really fast anti-aliasing. As you see in the Aqua interface, the quality of the presentation of the line art and graphics and icons is really important, so we spent a lot of time making things look good and run real fast. It has destination alpha, which allows us to record coverage information as we draw. This allows us to do composited icons and sprites that can carry along their anti-aliasing information as they animate or move around.
One of the great things about PostScript, it really revolutionized digital typography, invented outline fonts, device-independent fonts, and over the years there's of course been Type 1, the original Adobe digital font format. There's been TrueType that Apple invented, and OpenType, and others have come along. And Apple has a great technology called Apple Type Services that Quartz 2D leverages to bring you all the typefaces and just handle them seamlessly for you, including a full Type 1 scaler.
ColorSync is also built into Quartz 2D. It's our implementation of the ICC standard for color-managed workflow and allows us to manage color end-to-end in the Quartz 2D library. And obviously one of the big things about a 2D library is how it relates to printing. Basically the whole WYSIWYG notion of being able to have the same imaging model for screen and printing. And so Quartz and PDF play very much a central role in the printing architecture as well.
The spool file we create when you go to draw into your printing context is just a PDF file. And then that PDF file is rasterized for inkjet printers using the same high-quality Quartz rasterization, or converted to PostScript for a PostScript printer. And again, we can use the end-to-end color management we have in the system to manage color through the whole process.
And the infrastructure that we build all the Core Graphics conversion around as part of printing - and it's just as important - is the spooling and sequencing architecture and the networking architecture around printing. And for that we're using an open source technology called CUPS, which stands for Common Unix Printing System, which is an implementation of an open standard called IPP. And it's kind of an upgrade to the LPR/LPD suite that comes along with Unix, if you're familiar with that. There are still command line utilities that do the same LPD/LPR functions, but CUPS provides much more modern infrastructure for that.
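None of this was shown on stage, but for a sense of scale, submitting a job through CUPS from code is a couple of calls. Here's a minimal sketch using the libcups C API - the queue name and file path are hypothetical:

```c
/* Minimal sketch: hand a PDF to a CUPS queue via the libcups C API.
   Queue name and file path are hypothetical. Build: gcc job.c -lcups */
#include <cups/cups.h>
#include <cups/ipp.h>
#include <stdio.h>

int main(void)
{
    /* The scheduler picks a filter chain for the file type:
       PDF -> raster for an inkjet, or PDF -> PostScript. */
    int job_id = cupsPrintFile("LaserJet", "/tmp/report.pdf",
                               "report.pdf", 0, NULL);
    if (job_id == 0)
        fprintf(stderr, "submit failed: %s\n",
                ippErrorString(cupsLastError()));
    else
        printf("queued as job %d\n", job_id);
    return 0;
}
```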
So that's Quartz 2D, to give you a brief framework in which to think about our 2D graphics. For 3D graphics, it's OpenGL. Again, industry standard technology, been around for maybe 15 years since the original IRIS graphics library on the Silicon Graphics machines. And Apple's implementation is really a state-of-the-art implementation. We took our job very seriously in terms of building an OS infrastructure around driving graphics cards and virtualizing resources like video memory usage. And we have a lot of data flow optimizations to make sure things like multimedia and video can flow through the system very quickly.
[Transcript missing]
So that's OpenGL. The last kind of review bullet here is about the Quartz Compositor. The Quartz Compositor is our implementation of the windowing system. It basically composites all the layers together on the screen. It's based, again, on some pretty tried and true techniques in the industry - a seminal 1984 SIGGRAPH paper about the compositing algebra by Porter and Duff.
They basically introduced the notion of the alpha channel, RGBA compositing, and a whole algebra for doing that. And in those days, they were doing things like they had one program that could render fractals and one program that could, say, render spheres. And you didn't want to run them both every frame if one was static. And so you needed a way to combine the results after the rendering with high quality anti-aliasing.
And that's where the compositing algebra came into play. And it's been used pretty much ever since as a real fundamental primitive of graphics. And what we're doing is just using that in real time on the display. And we're using that to composite the output of all the applications to make up the graphical user interface.
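To make the algebra concrete: the core Porter-Duff "over" operator on premultiplied RGBA is just a couple of multiply-adds per pixel. A sketch - illustrative code, not Apple's implementation:

```c
/* Porter-Duff "over": composite premultiplied src over dst.
   result = src + (1 - src_alpha) * dst, per channel. */
typedef struct { float r, g, b, a; } RGBA;   /* premultiplied */

RGBA over(RGBA src, RGBA dst)
{
    float k = 1.0f - src.a;   /* how much of dst shows through */
    RGBA out = {
        src.r + k * dst.r,
        src.g + k * dst.g,
        src.b + k * dst.b,
        src.a + k * dst.a
    };
    return out;
}
```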
So here's a block diagram of Quartz Extreme, which we announced last year, which is the implementation of the Quartz compositor on top of OpenGL. So the application can be drawing in whatever graphics library it wants, whether it's Quartz 2D, OpenGL, QuickTime. And basically, the compositor will blend those together using GL into the frame buffer.
Now, Panther is kind of our third generation of this sort of desktop compositing engine. In the first release of Mac OS X, it was an all-software solution with some hardware acceleration for things like moving opaque windows. In Jaguar, we put everything on top of GL. In Panther, we've been refining it further to make it suitable for things like the Exposé feature that you saw. And it's really the natural evolution of windowing systems. I really think people just expect windows to composite, and I think everyone's going to be doing this eventually.
And since we've been running the desktop compositor on top of GL ever since Jaguar, I think it's obvious that Apple's OpenGL is robust enough for 24/7 operation. And the other thing that's key to understand about the Quartz Extreme implementation is that once it gets the textured polygons it needs to draw to make up the GUI, there are really no special OpenGL calls it makes.
All of the acceleration in terms of minimal CPU copies and getting data to the frame buffer and the blending modes and all of those things are accessible to OpenGL developers. So we did a lot of tuning on the data paths to make sure we did minimal copies because when you're throwing around, you know, megabytes per window, you really can't copy any data.
But all of those advances are usable by you. And in fact, the Keynote guys take advantage of a lot of those, because they have pretty big textures when they're compositing and doing transitions as well. So the big new thing that we did with Quartz Extreme this year, one of the main things, was Exposé.
I'm not going to give you a demo because you saw it in the keynote, but basically it's the idea of animating all your windows so you can see everything - a nice ability to pick up an icon off the desktop and, without having to drop it, reveal the windows and be able to drag it into something else. So that's basically the Quartz Compositor.
So there you have all the graphics technologies in Mac OS X in sort of a brief review. Let me go into now one of my first kind of extended themes that I'm going to talk about today, and that's PDF, what I call Mac OS X's digital paper. And this is not really a general kind of PDF in the industry discussion, but kind of how we view PDF as an OS implementer and provider, and what we're using it for in terms of the OS and the kind of opportunities we're trying to make available for you guys as developers.
So first off, this digital paper notion. PDF, the best way I can always think of to describe to someone what PDF is, is just to say digital paper. Because it really is a digital representation of a printed page. So in other words, it's got pagination built in, and it can basically represent the output from any application, so it's application independent. It can present output to any number of devices. It can handle any color spaces or whatever, so it's device independent. And it's extremely high fidelity. I mean, basically, the imaging model behind PDF is the same as the imaging model behind PostScript.
And as I said before, it's been able to describe basically every page ever printed in the last 20 years or so. So that's pretty high fidelity. And there are 500 million viewers distributed, so it's obviously universal. You can send someone a PDF file and be pretty well guaranteed that no matter what platform they're on, they'll be able to look at it.
So the way I think about it is, PDF is really a universal view-level abstraction, in the MVC paradigm sense. And when I put these slides together, I didn't know MVC was going to be such a theme at WWDC this year. So I have a brief review of MVC from a graphics guy's standpoint.
So MVC is a standard way object-oriented programmers have thought about factoring code into a model, a view, and a controller. And in my example, the model is the application data: the data structures, the variables, the algorithms, the data behind the program. The view is a visual representation.
So it could be a pie chart or a bar graph, whatever you want. The idea is you can have multiple views of the same model data. And the controller might be a user interface area where you let, you know, the user type into a field to change the data to manipulate the model.
And so basically, the value of MVC is the fact that you've factored your code so you have a model that can have multiple views. So if you have a pie chart or a bar graph, you don't have to change your implementation of how to recalculate the spreadsheet. Models can have multiple controllers, so you can have a simple user interface, an advanced user interface, or even a scripting interface, and have your model code be the same.
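As a toy illustration of that factoring - all the names here are hypothetical - one model can feed two different views, while a controller mutates only the model:

```c
/* Toy MVC factoring: one model, two views, a trivial controller. */
#include <stdio.h>

typedef struct { const char *label[3]; float value[3]; } Model;

/* Two independent views projecting the same model data. */
void bar_view(const Model *m)
{
    for (int i = 0; i < 3; i++) {
        printf("%-8s ", m->label[i]);
        for (int j = 0; j < (int)m->value[i]; j++) putchar('#');
        putchar('\n');
    }
}

void table_view(const Model *m)
{
    for (int i = 0; i < 3; i++)
        printf("%-8s %6.1f\n", m->label[i], m->value[i]);
}

/* The "controller" touches only the model; every view stays valid. */
void set_value(Model *m, int i, float v) { m->value[i] = v; }

int main(void)
{
    Model m = { { "Q1", "Q2", "Q3" }, { 4, 7, 2 } };
    bar_view(&m);
    set_value(&m, 2, 9);   /* user edit via the controller */
    table_view(&m);
    return 0;
}
```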
And kind of from a graphics standpoint, the interesting thing is that any model, whether it's spreadsheet, database, audio file, or whatever, can project themselves into a 2D representation to show to the user. Otherwise, the user wouldn't be able to run the app because they couldn't see the data.
And so views of vastly different models can share a common visual language. And this is how graphical user interfaces work, when you think about it. You basically have similar-looking windows presenting vastly different models, depending upon which application is being run. The same thing with compound documents, where you may have part of a document coming from a spreadsheet, part of a document coming out of a database, part of a document from a text flow. And because you can project all those things down to a view-level format - which is, say, a PDF file - you can have them all share the common visual language, even though the actual data backing them is pretty different. So let's move on here.
I might have to use the old fashioned way here. OK. So document-based applications are really MVC in their nature. You can think of the model as kind of the application file. The data structures of the application - in this case, the Excel file - you can really think of as the model. The view would be the document window that comes up, where you get the row and column representation of the model. The controller would be the user interface of the application, all the menus and panels that you interact with.
I'll try this one more time. There we go. So the interesting thing to realize from a graphic standpoint is when you create a PDF from your spreadsheet, it's no longer a spreadsheet. You've projected that model into a view, and you really just have a representation of a spreadsheet. And I think it's really important to keep the difference between models and views and look at it this way and how PDF serves.
So, model versus view. You can imagine, if I'm interacting with someone, I have a choice of whether to send them the Excel file or send them the PDF that represents the data in the Excel file. And really, both are valid choices. On the left-hand side, with the model, I get a very high fidelity model representation. I can exchange data, and they can modify the model if they want, or change the formulas. It requires that they have that application, and also any fonts or plug-ins that I may have used in creating that model.
On the other hand, on the view exchange side, if I send them a PDF, I can guarantee that they're going to be able to see it because there's 500 million viewers out there. They're going to see the right fonts. They're going to see everything. They're not going to be able to interact with the data or change the basic calculations that go on. In fact, in many cases, you don't want that. But they are going to get a high fidelity view representation.
So one of the reasons why this is, I think, important to lay out is that if you look at the typical File menu of an application, sometimes we get asked why Save As PDF isn't just in there. And I think the reason has to do with this model versus view idea.
Normally in the File menu, when you're talking about Open, Close, Save, or even Export, you're talking about exporting or saving the actual model data of the application, which could be anything. But when you talk about saving as PDF, what you're really talking about is essentially going through the print process - the mapping of the model data to a paginated representation - but just not sending it to any device, just keeping it around as a digital file.
And so saving as PDF, I conceptually think of as just printing onto digital paper. And that's really the notion, I think, that carries us the furthest in terms of how to think about this. And so what really hasn't been said before, but I think is a really important point, is that Mac OS X is the first commercial OS to have a system-wide standard for digital paper that's universal. And PDF truly is universal in terms of being able to view it.
So PDF is not a great model-level construct. There are some cases where it's possible to encode sort of flowable text in PDF, and this can be interesting in closed-loop situations, but basically the thing to remember here is that you are starting to impose model-level constructs on what is essentially a view-level idea. And if you think about a text-flow model, actually, they're quite complicated. A precise specification would include your styling model, your obstacle-avoidance model, how your containers connect and flow, a pagination model, whether you allow widows and orphans and those kinds of things, a justification model, and a hyphenation model, which is then going to require a whole-language dictionary so you know how to break up words. So a complete specification of an actual text flow is fairly complicated, and I think kind of beyond the scope of PDF, and it's certainly not a universally agreed-to notion of how to flow text.
And the main reason I think that's true is that it kind of sacrifices the universality. I mean, the amazing thing about PDF is that everyone agrees that it's a universally agreed-to abstraction for marks on a page. I don't think everyone necessarily agrees that there is one universal abstraction for global text documents or spreadsheets, or even that there should be one. So our basic idea is that models are precisely where applications innovate and differentiate, and we want apps to innovate and differentiate, but just always be able to print down to a PDF, because then we can take that through the PDF workflow.
So again, our strategy is basically to use PDF as digital paper, use it as a final format of paginated documents and also vector artwork. Use other formats such as XML to encode model data. That's kind of our advice. I mean, that's what we do with Keynote and other applications.
And really build a rich framework for processing this digital paper in the operating system. And encourage applications to differentiate themselves by developing innovative models. So long live MVC separation. So that's my little talk on MVC. And let's see how that, once we get all the applications projecting their models into PDF, how that really benefits the graphic arts workflow.
So here I've drawn a sample graphic arts workflow. You can think of it starting with application-dependent files, removing the application dependence to get to some kind of digital marks-on-a-page representation, and then finally going device-dependent, rasterizing to, say, a set of TIFFs for CMYK separations. Basically going from device independence to device dependence, and application dependence to application independence.
So one of the traditional ways this is done, and this is really kind of the key that made PDF possible, is that all applications, whether it's Word, Illustrator, Quark, whatever, can make PostScript. And what Adobe realized is that if they can make PostScript, we can sort of rebind the PostScript language to this format called PDF that's easier to read and have a reader application for.
And so this is kind of the workflow that was popular when PDF initially came out. Basically everything projected into PostScript, then you'd run a process, a distiller process, and create your PDF, and you could maybe tune it for the web and downsample the image or tune it for print optimization or pre-press.
Now that's great, but there's really a better way to do things, I think. I talk about creating a purposed PDF for the web or for print, but isn't PDF itself device-independent? Suppose I create my web-optimized PDF and later I realize I'd rather have it print-optimized. Well, then do I keep that PostScript file around so that I can go back to it and repurpose it? Or do I keep the application file around? Then I need the application. So do we really want PS to be the application-independent digital master? I think it's pretty clear the answer is no. If you think about taking this picture and just replacing the hub with PDF, that's really the architecture that we've been going for in Mac OS X.
And the nice thing about putting PDF in the middle there is it's really a better suited kind of digital master format than PostScript. Again, it's viewable on all these copies of Acrobat out there, and it really can losslessly encode the application intent since it's exactly the same imaging model as PostScript.
That digital master PDF that sits in the middle doesn't have to pass judgment on what the application is trying to draw. If the application is trying to draw high-definition images or high-resolution images or weird color spaces, or true type fonts or open type fonts, I don't really have to care because I just want to record what the application is telling me to draw. And then later on, I can purpose it out if I need to.
And so the nice thing about adding PDF as that center of the hub of just recording what the application drew is not only can I now later make a late-binding decision whether I want to go web or print, but I can also develop all these device-independent PDF processing tools instead of this opaque PostScript file in the middle that I need a language interpreter, and I can't even tell you how many pages are in it until I execute the document. I now have a much better device-independent representation there, and I can write little tools that do cover page or imposition or whatever on top of that. So that's really why I think it's key to have PDF in the center of that.
And PDF processing on Mac OS X is a pretty popular thing. There's over 50 little PDF applications, some not so little, on Mac OS X, some from big names, some from small names, and Panther is going to bring new opportunities. And this is kind of one of the thrusts with Panther on the 2D side is really getting to leverage the possibilities with PDF. So we add a bunch of new PDF workflow tools in Panther, addressing PostScript file handling, user-level scripting. We have something called Quartz PDF filters, printing to a PDF workflow, and PDF introspection APIs. And let me go through each of those briefly.
PostScript to PDF conversion basically allows us to deal with the PostScript legacy files, not necessarily put PostScript in the middle, but provide a graphically lossless transformation from PostScript to PDF. It works with EPS files as well. It's based on a real PostScript interpreter, and it's not a replacement for Distiller.
It doesn't do a lot of the finishing options you have in Distiller. But it basically allows us to graphically transform PostScript to PDF. The other nice thing on the printing side is it allows us to accept PostScript jobs to any printer from, say, a Windows client or a Mac OS 9 client.
So again, back to the picture, now PostScript can feed that hub, so we can take PostScript files, convert them to PDF, and then have them in that hub and subject to all the transformations that all the great PDF tools you guys are going to write can do. Another aspect of the PostScript legacy that was kind of lost along the way was the ability to have some level of user-level programmability. PostScript was actually a programming language, and some people used it to dynamically generate graphics. So the PostScript program would calculate the picture rather than just being the output from a driver.
PDF can't really do this. But the reverse Polish notation of PostScript and its abilities as a language also kind of made this a bear, and it always turned out to be more difficult than you'd like it to be. Nonetheless, some people found this incredibly useful - the ability to have simple scripting down at the 2D graphics level.
And although I never used PostScript as a pickup line, I found this on the web, which I thought was pretty amusing. So what we've actually done is take a real scripting language, Python, and create Python bindings for the Quartz 2D API. It's basically the C language API entry points that you can see in the Quartz header files, just bound into a Python interpreter with a module. And this allows simple PDF processing from scripts.
We've added some convenience functions for dealing with QuickTime, for getting images in and out, for dealing with Cocoa in terms of drawing HTML, RTF, and Unicode, and also for dealing with our PostScript to PDF converter so you can read PostScript and EPS files. So basically, I think this is going to allow a new generation of script writers to really write simple tools to be able to manipulate PDF.
And it's really handy for small one-off processing scripts. One of the reasons we did this ourselves is we had a bunch of places in the printing path where we just needed to make a simple change to the document, and rather than write code, it's just easier to do it this way. So you can imagine some examples like rasterizing an EPS file to a bitmap and exporting it via QuickTime, or doing some advanced imposition algorithms for booklet printing or something like that.
Just concatenating two PDF files together is a few lines of code. Adding a cover page or watermarks to a PDF is pretty easy. Dynamically generating graphics on a server is something I think is going to be very interesting, in terms of being able to have all this power in a server-based application. Unlike PostScript, you can't override the marking operators - you can't redefine show and showpage - but for now we consider that to be a feature, because it's a little bit unmaintainable.
So as an example, I actually did this, went off and looked at the old PostScript Blue book. I don't know if any of you guys remember that book, but there's a simple example in there called Wedge, and it draws this little starburst here. And that's the PostScript code on the right and the Python code on the left. Python code's a little bit longer, but probably more readable, certainly these days.
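The slide's code wasn't captured in the transcript, but since the Python module binds the same C entry points, the flavor of the wedge example is easy to suggest. Here's a rough sketch in the Quartz 2D C API - the geometry is illustrative, not the Blue Book's exact figure, and it assumes a valid CGContextRef obtained elsewhere:

```c
/* Rough sketch of a Blue Book-style "wedge" starburst, using the same
   Quartz 2D C entry points the Python module binds. Assumes ctx is a
   valid CGContextRef obtained elsewhere; geometry is illustrative. */
#include <ApplicationServices/ApplicationServices.h>
#include <math.h>

static void wedge(CGContextRef ctx)
{
    /* one thin pie slice of unit radius, centered on the x axis */
    CGContextBeginPath(ctx);
    CGContextMoveToPoint(ctx, 0, 0);
    CGContextAddArc(ctx, 0, 0, 1.0, -M_PI / 24, M_PI / 24, 0);
    CGContextClosePath(ctx);
    CGContextFillPath(ctx);
}

static void starburst(CGContextRef ctx)
{
    CGContextTranslateCTM(ctx, 306, 396);   /* center of a letter page */
    CGContextScaleCTM(ctx, 250, 250);       /* unit radius -> 250 points */
    for (int i = 0; i < 12; i++) {
        CGContextSetGrayFillColor(ctx, i / 12.0, 1.0);
        wedge(ctx);
        CGContextRotateCTM(ctx, 2 * M_PI / 12);   /* next spoke */
    }
}
```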
So. Next thing I'm going to mention briefly is Quartz PDF filters. Quartz PDF filters allow us to do some transformations on PDF, mostly dealing with color space transformations. And you create these recipes in the ColorSync utility. And at the ColorSync session, they're going to demo this and talk about it.
And it can be used for color conversion effects, or we even use it in the printing path when we're going to a black and white printer over a slow link: we want to get rid of all the color and just bring it down to black and white before we send it over the wire. It does have some image resampling and compression options built in as well.
Print-to-PDF workflow - this is something we announced back in one of the Jaguar updates. It allows you to extend the print panel, again leveraging print as the hub of where PDF workflow takes off. And basically that "Save as PDF" button in the print panel can grow to any number of options that the user wants in terms of applications that can open and deal with PDF.
For example, you could send the PDF off to Illustrator and work on it, or send it to Mail to bring it up in a compose window, or encrypt it, or whatever. So this is a great place to hook in if you're writing a little PDF processing tool - and some of you guys have already done this - a great way to leverage into the system.
And lastly, our PDF introspection APIs, which are new APIs in Panther that allow you to have complete access to the PDF document structure as kind of a tree of objects. It basically models the dictionaries, streams, strings, and arrays that are in the PDF file itself. It doesn't go and model the internals of the graphics streams, but it is useful for extracting things like links, annotations, and metadata and stuff like that. So enough talking for now. Let's bring Ralph up to the stage to give you a demo of some of the stuff we have as far as PDF processing in Preview in Panther. Welcome, Ralph. RALPH: Hello.
So what I'm going to show you first is Preview in Panther, which makes use of these new PDF introspection APIs that Peter was mentioning. So I'm opening a PDF document here. It's one of my favorite documents. And it has table of contents information embedded in the file.
And in Panther, if that's the case, then the drawer to the right pops open and shows you the table of contents. You can navigate through chapters and see, for example, all the instructions that the Velocity Engine has. Click on one of them. And, well, there's the back floor. But similarly, we also added search for PDFs. So I can actually look for a string... oops.
[Transcript missing]
Well, one thing that Peter was mentioning is the PostScript to PDF conversion. What I'm going to do is essentially just double-click a PostScript document in the Finder, and the PostScript to PDF conversion kicks in, and it will open in Preview shortly. Yes, here it is.
So what you see here - this is a paper I got from the internet. It's in PostScript. It just got converted. Text is there, line art is there, all the mathematical formulas are there - so pretty much what you would expect. And because it's now PDF, all those tools we have to work with PDFs now work on the converted PostScript file as well. So, for example, I can go and search for the word "image". Let me zoom in here. And it actually highlights it for you.
And one of the coolest things, I think, is you can even go and copy-paste out of the document. Like, I select a paragraph here. Copy. And I get a text representation of the part I just copied out. Okay. The next thing I would like to show you is Quartz scripting.
What I have here is a Python script that - well, let's just go through it. It opens a PDF file, an existing PDF file, and it creates a new PDF file. And then it enumerates all the pages in the PDF file, gets the size of the particular page, creates a new page in the output file, and just draws the content of the original page into the output file.
So, so far we didn't really do anything exciting - it's pretty much a copy operation. And then we add some custom drawing. In this case, we add red text to the margin of the page. Once we're done with it, we tell the system to open that file with the default PDF viewer of the system.
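The script itself isn't reproduced in the transcript. As a sketch, the same flow expressed through the C entry points the Python module wraps looks roughly like this - the paths and the stamp text are hypothetical:

```c
/* Rough C equivalent of the demo's watermarking flow: copy each page
   of an input PDF and stamp red text in the margin. Paths and the
   stamp string are hypothetical. */
#include <ApplicationServices/ApplicationServices.h>

int main(void)
{
    CFURLRef in  = CFURLCreateWithFileSystemPath(NULL,
                     CFSTR("/tmp/in.pdf"),  kCFURLPOSIXPathStyle, false);
    CFURLRef out = CFURLCreateWithFileSystemPath(NULL,
                     CFSTR("/tmp/out.pdf"), kCFURLPOSIXPathStyle, false);
    CGPDFDocumentRef doc = CGPDFDocumentCreateWithURL(in);
    CGRect box = CGPDFDocumentGetMediaBox(doc, 1);
    CGContextRef pdf = CGPDFContextCreateWithURL(out, &box, NULL);

    int n = CGPDFDocumentGetNumberOfPages(doc);
    for (int i = 1; i <= n; i++) {
        box = CGPDFDocumentGetMediaBox(doc, i);
        CGContextBeginPage(pdf, &box);
        CGContextDrawPDFDocument(pdf, box, doc, i);   /* copy the page */

        /* stamp red text in the margin */
        CGContextSetRGBFillColor(pdf, 1, 0, 0, 1);
        CGContextSelectFont(pdf, "Helvetica", 24, kCGEncodingMacRoman);
        CGContextShowTextAtPoint(pdf, 36, 36, "CONFIDENTIAL", 12);

        CGContextEndPage(pdf);
    }
    CGContextRelease(pdf);
    CGPDFDocumentRelease(doc);
    CFRelease(in); CFRelease(out);
    return 0;
}
```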
So what I'm going to do now is - well, let me tell you first. If you put that script into the PDF Services folder, then it will appear in the print panel as Peter was showing. So if I go to the print panel - where is it? It takes a second. So I'm going to print this script, and I'm going to print it through the script. So it's kind of an Escher-esque thing to do. So let's see. And there it is. There's the printout of the script, and it added that confidential mark on the side.
So just to make that point perfectly clear, this works with everything. So I can take my PostScript file I just had before, print it through the script.
[Transcript missing]
So for all of those of you who have your theses locked away in .ps files that you can't read anymore, you can search them there. I know a few people like that.
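The introspection calls behind those Preview features - the table of contents drawer, for instance - are plain C API. A minimal sketch with the new Panther calls; the file path is hypothetical:

```c
/* Sketch of the Panther PDF introspection API: open a document and
   peek at its catalog dictionary. The path is hypothetical. */
#include <ApplicationServices/ApplicationServices.h>
#include <stdio.h>

int main(void)
{
    CFURLRef url = CFURLCreateWithFileSystemPath(NULL,
                       CFSTR("/tmp/manual.pdf"), kCFURLPOSIXPathStyle, false);
    CGPDFDocumentRef doc = CGPDFDocumentCreateWithURL(url);
    CFRelease(url);
    if (!doc) return 1;

    /* The catalog is the root of the PDF object tree. */
    CGPDFDictionaryRef catalog = CGPDFDocumentGetCatalog(doc);

    /* e.g. look for an outline (table of contents) dictionary */
    CGPDFDictionaryRef outlines;
    if (CGPDFDictionaryGetDictionary(catalog, "Outlines", &outlines))
        printf("document has an outline\n");

    CGPDFDocumentRelease(doc);
    return 0;
}
```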
Okay, so Quartz and PDF summary. PDF really provides Mac OS X with a universal representation of digital paper. We plan to leverage that a lot more in the operating system and hopefully build lots of opportunities for you guys as well to process PDF. And our strategy is really to continue to build on it as a view-level abstraction, final form presentation metaphor for PDF.
And I hope we've convinced you that in Panther we're adding a lot of really significant tools to the PDF toolset. And as my last comment, I'd say if you're still thinking Pixmaps and GWorlds, come join the Quartz 2D party. Go to the sessions, learn about it. It's pretty fun stuff.
That's it for the PDF and Quartz 2D section. I'm going to change gears a little bit now and talk about another development in the industry - this one's much bigger than Apple - that's going on right now. I'm going to call it the GPU computing revolution. There's really something going on now in terms of the ability of graphics processors to compute graphics at much higher rates than CPUs can.
[Transcript missing]
Then the fragments are processed: we apply texture mapping, we calculate what the final color and Z for that particular pixel are going to be, and then the last step is fragment rendering, where we do the Z check against the frame buffer, we do alpha blending, compositing, fog, those kinds of things. And so that's the whole graphics pipeline, and this is sort of the way it's been for a while.
And what's happening is the GPU vendors are opening up these two parts of the graphics processor as being programmable. So you can think of that little stream execution unit kind of getting plugged into the vertex processing and fragment processing areas. So when you hear those terms, that's really what's going on. You have a little, a small little data flow engine that can do a small calculation per vertex or per fragment.
When I talk about fragments, I also sometimes talk about pixels - I think it's sometimes easier for people who are new to 3D to think about pixels, and that's okay with me. So I'm going to talk about pixel programming. Pixel programming uses a data model similar to AltiVec.
The programmable units operate on 128-bit four vectors of floats. Obviously, your texture lookups may come from 8-bit data, but it's all expanded for you into the floating point. And you basically get to write a small program that's executed per pixel to calculate the result of the output pixel. So you don't get control over the blending and the Z test and all that sort of stuff, but you can get total control over the source color of the pixel.
You get access to the iterated values - those vertex attributes at the corners of the triangle - and what that value is for the particular pixel that you're drawing. You can look at some global data. You can do memory reads off of textures. You can't look at the destination pixel; you can't know what you're going to draw over.
You just leave your output in a special result register. And there are a lot of powerful ALU instructions - power, reciprocal square root, cross product, all these kinds of things - and a bunch of swizzling instructions for changing the order of the data in the vectors. So this is pseudo code.
This is not in any particular language, just to try to communicate the simplicity and the job of a pixel program or a fragment program. Basically, they all have the same function signature. Calculate pixel, returns a result, gets the iterated values, and it can do whatever computation it wants.
It has access to some global constants, read-only, and a global set of textures that it can go look up values in. And that's really all it does. Now I'm going to explain to you why I think that once you constrain the problem like that, you can make it go real fast, and why GPU designers kind of have this advantage with the parallelism. So this isn't how any particular chip works, but just how, if you've constrained the calculation like that, you might be able to design hardware to make it go fast.
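The pseudo code itself wasn't captured in the transcript. As a C-flavored sketch of the shape of such a function - all types and helper names here are illustrative, not a real shading API:

```c
/* C-flavored sketch of the per-fragment model described above: every
   fragment runs the same small function, reading iterated values,
   constants, and textures, and writing one result color. All names
   and types are illustrative, not a real shading API. */
typedef struct { float x, y, z, w; } float4;

float4 texture_lookup(int unit, float4 texcoord);   /* memory read */
extern const float4 constants[];                    /* global, read-only */

float4 calculate_pixel(float4 iterated[])           /* runs per fragment */
{
    float4 c = texture_lookup(0, iterated[0]);      /* sample texture 0 */
    /* arbitrary per-pixel math, e.g. modulate by an iterated color */
    c.x *= iterated[1].x;
    c.y *= iterated[1].y;
    c.z *= iterated[1].z;
    return c;   /* left in the special result register */
}
```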
So first let's consider just a four instruction-long fragment program, and I have one vector unit right now, so I'm going to just do sequential processing through it. I'm going to send my fragment values into the vector unit, and then basically clock the instructions through. So, I'm basically doing one operation per clock.
I'm able to keep the processor busy every cycle, and I get one result every four clocks. So, no big news there. That's just sequential processing. Now there's a lot of data level parallelism in this problem. Pixel calculations, because of the way fragment programs are constructed, are independent. So the result pixel at a particular location can't depend on the result pixel at another location. The pixel calculations can basically execute in parallel. So call this parallelism in space, because it's kind of in the plane of the triangle.
What I can do here is just replicate the fragment execution unit in width and just do the same instruction on multiple vectors at one time. So here's how I might have that laid out. Now I have eight vector units. I can feed in simultaneously eight fragment values into the pipeline.
and then basically sequence through my instructions one at a time and at the end get my eight results out. So that gives us eight vector operations per clock. We're able to keep all eight of those busy every clock. We get eight results every four clocks because it's four instructions long. So on average, throughput of two pixels per clock.
But wait, there's more. There's instruction-level parallelism as well, because again, we're constrained. We can't really write to any memory. The machine state is only going to differ between one instruction and the next by the register that was written in the previous instruction. So you can imagine compilation techniques, or maybe even some hardware register renaming techniques, coming into play to build a little pipeline - an assembly line - out of a sequence of instructions.
So let's look at how that might work. Now we're going to line up the vector units in time, and basically I'm going to teach each vector unit about an instruction. So vector unit one gets instruction one, vector unit two gets instruction two, and so on. And now I'm going to feed my fragment value in the top, operate on it with instruction one in the first vector unit, move it on to instruction unit two.
Meanwhile I feed the next fragment value into vector unit one and kind of keep the pipeline full. And once the pipeline is full, I'm really just processing - again, keeping all the units busy at the same time. And so now I'm fully utilizing all my four units, four operations per clock, and I'm getting one result every clock once the pipeline is full.
So obviously the next thing, no surprise: I can put both of these things together and exploit the time and the space dimensions at the same time. So I just get a bigger chip, lay out more vector units, and make sort of a 32-element fabric, where I've got them four by eight just for illustration purposes. And now I can basically teach each row about a single instruction, feed the fragments in, fill up the pipeline, and then get eight results per clock while doing 32 operations per clock. So, eight results per clock.
So I think you can see that as time goes on - maybe today the number is 32, tomorrow the number is 64 - you're not going to run out of gas until you begin to approach the average size of a triangle. And for image processing things where you're rendering big triangles, there are lots of parallel fragments. And in the other dimension, this is only a four-instruction program; you can obviously imagine much longer programs. So there's a lot of headroom just in terms of parallelism.
So basically my argument is that it will sustain. I think the chip capabilities certainly haven't peaked. I think the parallel computing possibilities haven't even come close to peaking yet - we've got sort of 8x4 as I drew here, and I think there's still lots of headroom beyond that. And I think the entertainment industry is going strong and has not peaked. And as operating systems and applications get into the game, I think we're just going to add fuel to this fire, and really have an interesting world where people are doing massively parallel computations on the GPU to free up their CPU to do other stuff as well.
So you might ask yourself, "That sounds great. How do I do it?" And the answer is you use OpenGL. In Panther, we have standard cross-vendor programming languages at both the vertex level and the fragment level. Those begin with the ARB prefix - which is the Architecture Review Board for OpenGL - so you don't have to learn vendor-specific extensions. There are higher-level languages being worked on, and those will come out as well. But the assembly-level language will always be available, and I think these architectures are so new - we found you get rid of one temporary, and the thing runs twice as fast.
So people are probably going to be tweaking assembly on these things for a while. Programs are relatively small, so maybe that's not a big deal. But the higher-level languages are coming as well. So, without further ado, let's bring Ralph back up and show you some Fragment programs. We've got a Radeon 9700 plugged into this machine over here, which is running our Fragment program.
Go ahead, Ralph. Okay. So the first thing I'm going to show you is the OpenGL Shader Builder application that is in Panther. And well, you see it here. On the left side, you have your little fragment program, and on the right side, you have a very complex OpenGL scene, which consists of a single rectangle.
So that rectangle has a texture on it, and there's actually a fragment program running right now that does that texturing. So if you look on the left side, what this fragment program does, it goes to the texture zero, uses the current texture coordinate to look up the color at that point, and then copies that color to the result.
So you get a textured quad. Well, this isn't terribly exciting because non-programmable hardware does exactly that for you, so there's no point to actually write this program. But we can go and modify it a bit. So for example, instead of copying the color back to the destination pixel unmodified, I can say, well, only copy the green and the alpha channel. And, well, the red and the blue channels are lost, so you see the result here. It is this rather unhealthy-looking cat.
Okay, so let's add an instruction in the middle here. After we do the texture lookup, we take the red component of the color and square it, and then copy that to green and blue to produce some kind of a sepia tone effect. But it doesn't look right. So let's tweak the exponent a bit.
Let's say we take it to the one point five - yeah, one point two, something like this - until you have the look you're going for. Now, what you noticed is whenever I type, in the background the program is compiled and run and shown to you right away. So you get immediate feedback on what your program does, which is very nice for experimenting and getting into things.
Okay, that's it for Shader Builder. Now that I've shown you what you can do with three instructions, let me show you what you can do if you put a bit more effort into things. Okay, so by the way, this picture has been called Demo Monkey, and so have I.
So the first thing I'm going to show you is a motion blur effect implemented as a fragment program. The interface is essentially I click somewhere and drag the mouse in some direction, and I get a motion blur in that direction. So the first thing you notice, it's pretty smooth frame rate.
And really the GL operations, the GL commands that are going on here: there are four vertices, the four corners of that big rectangle, and then it says draw. So how you run your fragment program is you draw a rectangle, essentially. And then for every pixel, the fragment program gets executed to do that effect. So another effect we tried is an axial blur, like this. And you can set the focus point to wherever you want it to be, like this.
Because what the CPU does here is really negligible. I mean, it says, you know, here are four vertices, go. And from then on it just waits until the result is done. So this is a very, very nice effect that is fairly expensive to compute, but it has pretty much zero CPU cost. Let me show you a different one, a glass distortion effect. You can make the bumps bigger and smaller and actually move the glass around. Like this.
The last effect I would like to show is like an emboss effect. So what this does is it takes the picture we had before and interprets the brightness values as hills and valleys in, you know, like some kind of relief thing, and then puts a spotlight on it.
So you can actually go and drag the spotlight around. Like this. Make the beam wider, narrower, things like that. And again, you have to let that sink in for a while to realize you don't use any CPU at all. The only thing the CPU does is update the sliders. Okay, I think that's it for the fragment programs.
Thanks, Ralph. So in summary, for this section of the GPU talk, I really hope I've convinced you that GPUs, for certain classes of data-parallel workloads and algorithms, really have an advantage over CPUs. And I think that this advantage is going to be sustainable for at least the foreseeable future. And you can access this power via OpenGL. My advice to you is: learn how to do this stuff before your competitors do. Because there are some pretty cool things that are going to be happening.
So the last section I'm going to talk about today is another effort we've been working on, which is Quartz 2D on OpenGL. Quartz 2D on OpenGL basically accelerates Quartz 2D by turning it into GL calls, and it's really the logical next step after Quartz Extreme. It ties together the two key thrusts we've talked about today: programming the graphics processor, and the PDF/2D implementation. And in Panther, we're going to have an initial implementation of this that focuses mostly on GL developers who want to get high-quality text and line art into their applications, which has always been a really difficult thing to do with GL.
So the way you use this is you take your GL context and pass it to a function called CGGLContextCreate. You give the size of the CG context you want and the color space you want to render into, and then you just make CG calls on that CGContextRef. It's high-quality 2D rendering, virtually identical to software quality. We do use the alpha blending in the hardware, so it's not pixel-exact - the values aren't going to be exactly the same - but it basically looks indistinguishable from software.
And it's anywhere from 2 to 10 times faster than Quartz 2D's software rendering. The way it gets to be on the order of 10 times faster is when we can actually cache things in the graphics unit. So we'll cache font glyphs as textures. And if you hold on to your CGImageRefs or CGPatternRefs and reuse them, the implementation will cache those in video memory as well. So you can draw very quickly with that.
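In outline, the setup is just a few lines. A sketch, assuming a valid OpenGL context obtained and made current elsewhere; treat the details as illustrative rather than exact:

```c
/* Sketch of Quartz 2D on OpenGL: wrap an existing GL context in a
   CGContextRef and draw with ordinary Quartz calls. Assumes glCtx is
   a valid, current OpenGL context obtained elsewhere. */
#include <ApplicationServices/ApplicationServices.h>

void draw_label(void *glCtx, float width, float height)
{
    CGColorSpaceRef cs = CGColorSpaceCreateDeviceRGB();
    CGContextRef cg = CGGLContextCreate(glCtx,
                                        CGSizeMake(width, height), cs);

    /* high-quality anti-aliased text, straight into the GL drawable */
    CGContextSetRGBFillColor(cg, 1, 1, 1, 1);
    CGContextSelectFont(cg, "Helvetica", 18, kCGEncodingMacRoman);
    CGContextShowTextAtPoint(cg, 20, 20, "Hello, GL", 9);

    CGContextFlush(cg);
    CGContextRelease(cg);
    CGColorSpaceRelease(cs);
}
```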
The reason we're kind of calling it an initial implementation right now is because there are certain Quartz 2D operations that are not yet supported by this context. For one thing, we can't do the high quality LCD text that we have in the system without relying on fragment programming. And since fragment programming is kind of at the high end right now and it's coming down through the system, we can't really kind of turn on Quartz 2D acceleration everywhere right now. The other thing is the PDF 1.4 blend modes, which are also going to require fragment programming.
Also, you have to be using the Core Graphics API - the Quartz 2D API - only. You can't turn around and do some QuickDraw with this thing; you can't ask for locking the port bits. And you can't use high-level frameworks on this context. So you pretty much have to be going right at it with the Quartz 2D API. We do require Quartz Extreme-capable hardware, so we have non-power-of-two texturing, which is important for drawing images and things like that.
And generally, as this path becomes available, I think applications, are going to have to revisit some of their assumptions about the cheapness of accessing the drawing buffer, the window buffer. So it's another thing to be aware of in your usage model if you want to use this kind of acceleration. So the basic way I tell this story is that having a wide pipe is really great, but it also increases the cost of reading back pixels and turning everything around.
So if you imagine you're on a little stream towards the frame buffer and you drop a few pixels in, and you realize you want them back, you're going to have to go back and forth.
Yeah, maybe you can just reach down the stream and grab them, but if you drop them over the top of Niagara Falls, we've got to stop the falls, let all the water fall down, climb down there, get the pixels, bring them back up. You're not going to be running any faster than software, in fact, in some cases slower.
So to use Quartz 2D on OpenGL, you have to buy into the whole asynchronous model: the pixels will show up when they show up, and I never want to look back at pixels I've drawn, except very rarely. Otherwise, it's just not going to be any faster for you. Thank you. So I'll invite Ralph up one last time for a demo of Quartz 2D on OpenGL.
Okay, the first application I'm going to show you is iChat. As you've seen in the keynote, iChat now has this video conferencing feature. The way that video conferencing view is implemented is actually fairly interesting: it is an OpenGL scene. So the video is put on a texture, and this is displayed. Oops, I think. I'll be your cameraman. Please be my cameraman. That's a good point. Okay.
So video is put on a texture and then GL composites that texture. And the reason why that is done is if you have a two-way conference going, you have this little picture-in-picture which has a drop shadow. And sometimes there is content being shown as translucent alert messages lying on top of live video and stuff.
So GL is really good at these kinds of things, and it also does the conversion from video YUV to RGB. So that takes all that load off the CPU, so the CPU is free to do the actual video encoding and the networking stuff that is necessary. So this is a slightly modified version of iChat to make this demo.
What I'm doing here is I'm drawing text on top of it. And this is text drawn into an OpenGL scene. And it's not text on a texture. It is essentially CG show text at point, pointed at that context. So you just draw into it, and you get all the font management, the kerning, and all those kinds of things that are usually very hard to get in OpenGL.
For my second example, I have a PDF document - this one here. It has a little frame and a bit of line art in the corner. By the way, the way I made this PDF document: I wrote a little Quartz scripting script, a Python script, that takes the place of the line art and draws an oval. So, I'm going to take that and drag it into my view.
So we put a little oscillator on it to change size. But the point here is, again, this is not drawn into a texture. Every single frame, we reinterpret that PDF and draw it on top. So the scaling you get here is not scaling of a bitmap. It's vector art scaling.
So, let me show you something about the performance of these things. What I'm running here is a PDF document - it's the AppKit reference manual, which is 1,300 pages long - and I'm just trying to flip through the pages as fast as I can. And this is the standard Quartz software renderer that you get in Panther. And you see that little frame rate meter in the top right corner? Yeah, we're getting, you know, between... it's kinda low.
50 to 60 frames per second. So that's actually, when you think about it, pretty impressive. It's, you know, 60 pages per second. That definitely beats the LaserWriter. Okay, but now I will replace the content view here with an OpenGL view and then point the same PDF file at the Quartz OpenGL renderer.
"It gets a bit better. So, we're around 160 to 180 pages per second now. And literally the piece of code that needed to be done there is something like three lines of setup. So the rest of the actual rendering code is exactly the same." So, why is it fast? Well, it's what Peter said about that asynchronous model.
So, what PDF parsing involves, well, you have to do the parsing and you have to do decompression of the stream, and then you draw. So, what happens in this case, once you did the decompression of a graphics primitive, you just submit it to the GPU, and while the GPU is doing the graphics, you're ready to decompress the next part and do the additional parsing. So, you have now the CPU and the GPU running in parallel very nicely.
So, by the way, that's the little CPU meter you see there. This is a two-CPU machine, so it runs at 50% - one CPU is fully busy. Well, you might wonder now: what if I take all that parsing and decompression stuff out of the loop? Because when my application draws, it doesn't do that; it just calls the CG APIs. And then I will say, well, I'm glad you asked.
Because that's what we did: we essentially took all the graphics primitives out of that PDF file and wrote them into a big list, and then just called the CG APIs to do the same drawing. So it's technically no longer the drawing of a PDF - it's a drawing of the PDF content.
And when I do that, things start to look like this. See, the CPU is still busy - the CPU is still completely saturated with sending these commands over to the GPU. So I would consider that a tuning opportunity, but we'll see. Okay, that's it for this demo. Thanks, Ralph.
Yeah, that's pretty amazing. I think Ralph was telling this story about when we were bringing this demo together. I think he had to go back, what, twice to make the frame counter go higher? 100, that should be enough. No, 200, that should be enough. 400, okay. So anyway, so the recap of kind of the talk today is basically that there's really a new era of innovation in platform graphics that's happening in the industry. And I think Mac OS X is kind of leading the charge here. And it's really kind of a state-of-the-art visual computing platform for all your applications.
So in your apps, please leverage all the great infrastructure we're building in Quartz 2D and OpenGL. And unlike in the old days - there was a period where people had to work around operating system infrastructure to try to do what they wanted because of limitations in QuickDraw or GDI or whatever platform graphics was there.
But I think we're kind of coming upon a new era where the platform graphics is getting good enough that you build on top of it and really can go places. You don't have to go back and reinvent all of your blit loops again. Let us do that work and you guys add great value on top of that.
And with all these new things going on in the industry, particularly with OpenGL and fragment programs, it's really fun to just go learn a few new tricks - crack open the GL book or something like that and teach yourself some new techniques. Because there's a lot of new stuff happening. So for the last few minutes of the talk, I'm going to briefly whiz you through some pointers to the other sessions you'll want to check out, if hopefully we've piqued your interest during the talk today.
I'll start with Quartz 2D. Some of the new things in Panther for Quartz 2D: PDF 1.4 support, the PDF introspection API we talked about, Quartz scripting with the Python bindings, CMYK rendering contexts - so now we can drive raster printers in their native color space - numerous performance optimizations, and of course, as you saw, Quartz 2D on OpenGL.
There's a Quartz 2D in-depth session on Thursday, and then an intro to Quartz Services, which talks about some of the display management infrastructure, on Friday. For Panther, obviously, we're shipping 1.0 of our X11 implementation, as we talked about yesterday. This will be a merge with XFree86 4.3, and it's on your seed. We have double-clickable X11 applications now, and full screen mode is operating. And a lot of bug fixes. So check this out in the seed if you're an X developer - it's all on the disc you have.
Also new for Panther in printing, we've got PostScript sharing support. We're really excited about that - the ability to basically have any Mac OS X-attached printer be seen on the network as a PostScript printer. User interface improvements - we've got an improved version of the old desktop printers coming out in Panther. We have job submission APIs: if you know how to generate your own PDF file or your own PostScript file, you can just hand that to the spool system directly and not have to watch that page-counting dialog go by.
We're merging with CUPS 1.1.19, the latest version of CUPS. And we're also going to be shipping the Gimp-Print drivers this time around, which are great for legacy printers - they support a whole bunch of printers. And the drivers are really first-class citizens in their user interface and integration.
The printing session is on Thursday, so you'll want to check that out. For ColorSync, some of the new things: if you went to the QuickTime talk, you saw Tim talk a little bit about ColorSync and its relation to the QuickTime Graphics Importer applying profiles by default. Camera images are going to get a standard profile embedded by default.
CUPS and Gimp-Print drivers can be integrated with ColorSync and vend profiles and have matching occur. We've got the Quartz PDF filters that I talked about - you'll want to go see those. And sips, which is a command line image processing tool that's meant to interface with AppleScript, so AppleScripters can do basic image operations like crop and rotate without having to launch Preview or whatever.
New APIs for abstract profile generation. ColorSync has the facility to do actual manipulations in abstract color spaces, like the sepia tone and LAB examples. There are going to be new APIs for expressing those kinds of color transformations in Panther. And also a new display calibrator that's tuned for LCDs. So go hear about that on Wednesday.
For image capture, a bunch of new stuff. There are new ways to integrate your applications with image capture. There are automatic tasks. Image Capture services - the Cocoa and Carbon Services menu - you can integrate that with image capture. There's also a common UI layer if you want to do scanning from within an application; you can call on that. There's also network support for sharing and monitoring image capture devices over the web, so that's kind of cool as well. On Wednesday, session 204 is image capture. And OpenGL... excuse me, I need my water.
OpenGL, new for Panther: there's fragment program support, pixel buffer support, and lots of optimizations around CopyTexImage and CopyTexSubImage. We have recoverable GPU support for drivers that support it - the ability to reset the GPU on the fly without bringing the machine down, which is pretty handy, especially if you're debugging drivers.
OpenGL Shader Builder and Profiler. These are two really great tools we have for OpenGL developers, and I think you want to go to the sessions and see those, because those are really making great strides in Panther. Again, the ability to use Quartz 2D in an OpenGL context, and just lots of bug fixes and optimization techniques you can hear about.
So this is a bunch of sessions that we have for OpenGL. I'll point out in particular the last one, Session 212, on Friday. We're actually going to have some of the demo engineers from ATI come and show you how they built their demo engine and run through a couple of the techniques they used in their demos. Last year we had NVIDIA do this, and it was a great session, and it's going to be great again this year.
Another special session later on today, we thought it would be really fun to have the keynote engineers come and talk about Keynote as an application and how they really leverage the platform services in terms of Quartz 2D, OpenGL, QuickTime, Cocoa, to build what I think is really a great application that really kind of swims downstream with all the technology and builds on top of what we're doing in the operating system. So that'll be a really interesting session to attend as well.
And last but not least, our feedback forum, which is at 5:00, the last thing in the conference, in the North Beach conference room. So come there, tell us what we're doing right, what we're doing wrong, and what you'd like to see us do in the future. We always enjoy talking to you guys and getting your feedback.