Application Frameworks • 1:02:27
This session offers a wide variety of tips on improving performance in your Carbon application. Learn faster and more modern replacements for common Mac OS 9 programming tricks. Discover the most efficient ways to look at the file system, draw, handle events, manage memory, and many other typical application tasks.
Speakers: Xavier Legros, Curt Rothert, John Iarocci, Guy Fullerton
Unlisted on Apple Developer site
Transcript
This transcript was generated using Whisper, it has known transcription errors. We are working on an improved version.
I'd like to welcome you to session 416. Today we're going to be talking about performance in your Carbon application. I suppose everybody has seen what we've been doing with the G5. You probably went down to the lab. As far as performance goes, it's a great machine. Applications are going to get a boost in performance. You're probably all very happy about that.
It's very cool. Now you probably had a chance to install Panther and run your application as well. You probably noticed that a lot of operations are very fast in Panther, because our engineers spent a lot of cycles, a lot of resources, and a lot of time to make sure that operations are very fast in Panther.
We talked a little bit with Scott on Monday in the Mac OS X session, the State of the Union, about things such as text drawing being twice as fast. All of this is great. But when our customers buy the next G5, the dual 2 gigahertz, if they launch your application and they get the spinning cursor of death, that's not good.
That doesn't do us any good to have the best machine on the planet, to have the best operating system, if when somebody gets a third party application, the performance level is not there. And this is really the goal of this session, where we're going to be talking about that.
We're going to be talking about how you guys can take your application to the next level. We're going to talk about tools, because now we have actually very good tools to enable you to find out what are the performance issues in your app. We're going to talk about technologies that can help you improve the performance in your application.
We're going to talk about two different kinds of performance today. We're going to talk about the real performance, which is how long does it take to process an image that is 500 megabytes? How long does it take to run a filter? How long does it take to access the web, to download a URL, to access a database? But we'll also talk about what we call perceived performance, which is what happens when a person executes.
Do you give feedback to your user, or does the application look kind of hung? And if there is one thing I want you guys to realize and understand from this session, it's that Apple has now shown you that we've spent a lot of cycles and resources, and we have an OS that is very fast. Mac OS X is way above mature, and as you're going to see with the hardware, we've done an excellent job of looking at what we were doing wrong in performance.
Now I think it's up to you guys to go back to your rooms, to go back to your offices and start using Shark, start using Sampler, moving to Mac OS X and using the latest technologies that we have: in the toolbox, in networking, in file I/O. To really get the best of Mac OS X. And for that, I'd like to invite Curt. Curt is going to be talking about all these topics, and I'll see you for a Q&A.
[Transcript missing]
Thank you, Xavier. So, like Xavier mentioned, performance is critical to your application. If your app doesn't perform well, users are going to make judgments about the quality and the polish of your app and of Mac OS X as a system. So while we've been optimizing parts of the toolbox and of Carbon, we've identified some techniques for finding performance problem areas, and we'd like to share those with you. We've also identified some techniques for improving those areas once you've found the problems, and we'll share those with you as well. We'll also talk about new technologies in Panther and reacquaint you with some technologies that will give you the best performance on Mac OS X.
So there are four problem areas I'm going to talk about today. The first is application launch. If it takes a long time for your app to come up and start processing events, the user's going to feel as if the system is sluggish. We'll talk about low-level services, and that could be your use of memory APIs and how you're allocating memory, and also file and network I/O and threading.
Drawing. Now this is huge. This is a huge aspect of your application. If the user clicks on a background window and it takes a long time for that window to come up and draw its contents, the user is definitely going to get the impression that your app is sluggish or maybe the system is not performing very well. And finally, we'll talk about poor responsiveness. So let's get right into talking about application launch performance. So in identifying what you can do to improve the performance of your app, it's good to figure out what's going on during launch.
Well, there's a lot of file I/O. The second the user clicks on your icon or double clicks on your icon, the system starts loading in the executable for your app, the associated resources. You're probably loading in preferences, maybe registering and initializing plug-ins. And on top of all the file I/O, the system or your app is doing its standard initialization. It's going through and allocating large chunks of memory, maybe doing some table initializations, et cetera.
So how do you find some of these problems? Well, during launch, to identify problems with how you're utilizing the CPU, launch your app with Sampler. Sampler is a utility that's found in /Developer/Applications. When you launch your app from its UI, what Sampler really does is take a periodic snapshot of where your app is in its execution. When you're done sampling, it compiles this all together and you get a list of where most of the time is spent in your app. And so you can identify areas that you can focus on and optimize.
Now in terms of the file I/O, to see what's going on you can use the command-line utility fs_usage. This will allow you to identify what file I/O is happening on your behalf as the system loads in resources and whatnot. And you may find that you're loading in files that you didn't expect to, like document templates or images, that could definitely be deferred to a later time.
So some solutions for this. All those areas that we talked about, all the file I/O and all the initialization that's going on, you have some control over. Now internally, we compile our frameworks with the optimization for size, and we recommend that you do that too. So this will make the binary smaller, the system has to load less in. And you can do this in Xcode, there's a preference for optimization level, and you can use optimize for size, and you can also do that using other compilers as well.
What's really important is to always provide feedback to the user. If it's going to take a really long time for your app to launch, it's a good idea to put up a splash screen. Let them know immediately that your app has been loaded and things are going on. Even better than putting up a splash screen is to provide some type of progress within that splash screen. Let them know that some components are being initialized, new plugins are being registered.
And the best solution of all is to put up your main document window as fast as possible. This gives the user the impression, this perception, that everything's up and running. Even if the foundation and other parts of your app aren't really initialized and ready to go, if you can get that document window up as fast as possible, it gives them the impression that you're ready to go.
You should also avoid I/O during launch. There's a lot, like I said, that the system's doing on your behalf. If you can avoid reading in document templates or some preferences that could be deferred to a later time, that's excellent. And you should always avoid writing during launch. Launch is considered a read-only process. You're reading in parts of your application, getting ready to go. You shouldn't write.
And then also take advantage of lazy initialization techniques, and I'll talk about what that means in a second. It's a technique that allows you to defer allocation and initialization of components to a later time. And closely coupled with this is to avoid static class construction. If you're C++ based and you have a static class instance defined in one of your modules, and what I mean by that is a global that says static, then the class name like Foo, and then the instance name, keep in mind that that instance's constructor is going to be invoked before main is ever called. So before you can even start processing events, there may be a lot of code that's being executed.
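As a rough sketch of the static construction point, the C++ below contrasts a file-scope instance with a function-local static that is only constructed on first use. The names PluginRegistry and sharedRegistry are hypothetical and chosen for illustration; nothing here comes from the session's own code.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Counts constructor runs so the deferral can be observed.
static int gRegistryConstructions = 0;

// Hypothetical component whose constructor does expensive work.
class PluginRegistry {
public:
    PluginRegistry() {
        ++gRegistryConstructions;
        // Stand-in for costly setup such as scanning plug-in folders.
        names_.push_back("example-plugin");
    }
    const std::vector<std::string>& names() const { return names_; }
private:
    std::vector<std::string> names_;
};

// Avoid: a file-scope instance is constructed before main() is called.
//   static PluginRegistry gRegistry;

// Prefer: a function-local static is constructed on the first call only.
PluginRegistry& sharedRegistry() {
    static PluginRegistry registry;
    assert(gRegistryConstructions == 1); // built once, however often we are called
    return registry;
}
```

Callers use sharedRegistry() wherever they would have touched the global, so the constructor cost moves from launch to the first real use.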
So I mentioned lazy initialization. What does that really mean? Well, rather than initializing a component in anticipation that it's going to be used, just defer that until a later time, until that component is actually going to be executed. And it's really as simple as defining a static Boolean that's initialized to false to indicate that your component hasn't been initialized yet. And later on during the execution of your app, before that component is used, check that flag. If it hasn't been initialized, go ahead and initialize it and then toggle the flag.
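The flag-based pattern just described could look something like this in C++. It is a minimal sketch: the lookup table, its size, and the function names are made up for the example, and it is not thread-safe as written.

```cpp
#include <cassert>
#include <cstddef>

static const std::size_t kTableSize = 256;
static bool gTableInitialized = false;  // stays false until the first lookup
static int gSquareTable[kTableSize];    // plain array, so no constructor runs at launch
static int gInitCount = 0;              // only here to make the behavior observable

// Hypothetical expensive setup that we do not want to pay for during launch.
static void initializeSquareTable() {
    for (std::size_t i = 0; i < kTableSize; ++i) {
        gSquareTable[i] = static_cast<int>(i * i);
    }
    ++gInitCount;
}

// Check the flag before use, initialize if needed, then toggle the flag.
int lookupSquare(std::size_t n) {
    assert(n < kTableSize);
    if (!gTableInitialized) {
        initializeSquareTable();
        gTableInitialized = true;
    }
    return gSquareTable[n];
}
```

If the component can be reached from more than one thread, the check and the initialization would need a lock or a once-style primitive around them.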
Now the reason lazy initialization is a good technique on Mac OS X and was maybe not a good technique on classic Mac OS, and what I mean by that is Mac OS 9 and earlier, is because Mac OS 9 had really tight memory constraints and you had to deal with the problem of memory fragmentation.
So it was a good practice on Mac OS 9 and earlier to allocate up front as much as you knew you were going to need. That way, if the allocation were to fail, you could bail out as soon as possible, rather than letting the user work on a document and then hitting what may be a destructive failure later on when you needed this memory.
But on Mac OS X there's a lot more memory, there's a huge virtual address space, and you don't need to worry about that. You should always code defensively still, but be aware that lazy initialization is a good technique to improve launch performance. Now, next, we'd like to talk about the low-level services such as threading, memory, and network and file I/O. And to do that, I'd like to bring up the manager of the core services team, John Iarocci.
Thanks, Curt. So, threading, memory, and file I/O. What do these guys have in common? One of the things they have in common is that they're often not very apparent when you're looking at your application in terms of contributing to a performance problem. But the benefit in looking at problems, performance problems in these areas is immense.
You can get a lot of time back by looking into these three general areas. I'm going to go through these areas, both suggesting what APIs to use for these technologies and also some tips in the areas in terms of figuring out what performance problems you have in them.
So starting with threading. First of all, if any of you haven't heard the message so far, the Cooperative Thread Manager, the Carbon Thread Manager, is really not a... It was supported for compatibility with OS 9, supported for the Carbon transition. It's not a great solution for threading on Mac OS X.
[Transcript missing]
MPThreads and Pthreads are both fully supported OS X APIs. There are some slight differences, but they're very similar. It's mostly your choice on which APIs to use. It's mostly going to be dictated by the code base that you have.
The thing to remember with threading in general is, just because it's a preemptive operating system, don't assume threads are free. They're backed by kernel resources, so use them well. Make sure that you're making good use of these threads. A lot of times, because they're available and they're kind of a nice, convenient construct to use, threads will be overused.
And that will contribute to your performance problems. Sometimes in a way which is kind of hard to find. And I'm going to go into that in the next slide. One good guideline that I'd offer you is think about the work that you're actually going to do on that thread.
Because remember, you have to create the thread. At some point. And you have to message over in some way to the thread to get it to do work. And oftentimes, even if you create the thread and pass it some parameters, you have to still wait for the thread to complete and then message back to your main thread. So there are costs involved just in using threads. Not to mention, you typically have to lock when you're using threads. And there's a cost there too.
Okay, in the tip area for threading, first of all, if you're not already familiar with it, GDB has some support for debugging threaded applications. The thread apply all command lets you run various GDB commands on every thread. The obvious one that most people know is thread apply all backtrace, which gives you a stack crawl so you can see where all the different threads are.
In Panther there were some changes so that the ID that's reported in GDB actually maps to your underlying Mach threads. So if you can't tell from the stack crawl, or if you don't have symbols for whatever reason, you can track it that way. The other tool to be familiar with is top. It'll tell you at least how many threads you're using. This is good because you can actually see if you have threads coming and going, how many threads you're getting at kind of a high-water mark, and how many threads you typically have.
And then look at those threads and see, do you expect them to stay beyond the duration of a certain operation? Sometimes people just leave the threads sitting there. They're not doing anything. There's kind of a trade off but you have to decide in your app if it makes more sense to get rid of that thread or not. There are kernel resources associated with it so if you're done with it, you might want to just clean that up.
And then lastly, I'd really encourage you to look at the Thread Viewer app. It's the best tool on the system for actually visualizing what's going on with the thread. It lets you look on a per thread basis how you're using memory, what kind of locking usage you have, and thread priority.
Okay, after threading, I would really encourage you to take a look at how you're using memory APIs, particularly the Carbon Memory Manager. First thing I'd like to say is that when we looked at all of these APIs at the Carbon transition, we actually looked at some of these and said, "Do we really, really need them because there's some issues there?" HLock and HUnlock, unfortunately, fell in the category of they had to be there. They had to be there so your app would continue to work on 9.
So in order to enable a binary that worked on both 9 and 10, we had to support HLock and HUnlock. There's really no good reason to have them on 10. And there's a certain performance problem with frequent use of HLock and HUnlock, especially when they're doing nothing of use for your app. So take a look at how you're using HLock and HUnlock.
In a lot of our apps internally, we've looked and seen that we continue to use them when we don't really need to. The only time that a handle is going to be changed kind of out from under you, if you will, is when somebody explicitly calls SetHandleSize. So look and see if you're doing that yourself, or look and see if you're relying on the side effect of calling HLock and HUnlock.
And then in general, malloc and free are the APIs that you should be using for your everyday usage. The Memory Manager is another layer on top of these, so it's not going to help having that layer when you're really talking about performance-sensitive code. And malloc and free are what you want to be using when you use our debugging and development tools.
If you have seen Sampler on the Panther seed, you've noticed that there's new support in Sampler. It's really been revamped, so it lets you visualize some of your memory usage patterns. In particular, tracing capabilities are there, which is fantastic. We've done a lot of performance enhancements in Panther based on this technology, being able to look and see how often we're allocating and what kind of allocation patterns we have. You can also use Sampler directly to look for your use of HLock and HUnlock and HSetState and HGetState.
And then of course, there's MallocDebug and the leaks command-line tool. That's both for the performance of your app and the performance of the system in general. And I get in the habit myself of just running top over in a side window, top -w, so it shows you deltas, so you can see if you have a leak or if you're growing larger than you would expect to be growing during a particular operation.
Okay, so threading and memory are areas where you can really improve your performance by looking at them with those tools. Files is another. Files is one where I really recommend that you get involved with the tools, particularly fs_usage. But let me start with the APIs. First of all,
[Transcript missing]
You'll see the benefit of using these APIs on those kind of volumes. And I really encourage you to test your applications on these volumes. You'll see different characteristics. It'll tend to exaggerate your file IOs, and you'll see that it does impact your performance. In general, the stance we've taken internally at Apple is to make sure that we reduce the total number of IOs. Every single IO eliminated is that much more time that you can devote to other things. And on a network, every single IO, particularly on a flaky network, is a potential stall for your app. So consider that.
[Transcript missing]
And then in terms of how to best use the APIs, make sure you use large, page-aligned buffers. That's still a very good thing to do. And when you know you don't want the data cached, say you're just trying to write something out to disk that you already have in memory and you're not going to go back and reread it from disk, use the no-cache bit. That'll stop us from buffering it for you.
Okay, in the area of networking, CFNetwork, which has a pretty in-depth talk Friday at 5, is our first answer for you for most of your networking needs. CFNetwork is designed to be very high performing. It's designed to work very well as an asynchronous API with the run loop, the CFRunLoop or the Carbon event loop.
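To make the buffer advice concrete, here is a small sketch at the POSIX level rather than the Carbon File Manager level the talk is describing. The helper names are hypothetical. posix_memalign, sysconf, and fcntl are standard calls, and F_NOCACHE is a Mac OS X fcntl command that is assumed here to play the same role as the no-cache bit; on systems without it the helper does nothing.

```cpp
#include <cassert>
#include <cstddef>
#include <fcntl.h>
#include <stdlib.h>
#include <unistd.h>

// Round a byte count up to a whole number of pages.
std::size_t roundUpToPage(std::size_t bytes) {
    const std::size_t page = static_cast<std::size_t>(sysconf(_SC_PAGESIZE));
    return ((bytes + page - 1) / page) * page;
}

// Allocate a page-aligned buffer of at least `bytes` bytes.
// Release it with free(). Returns NULL on failure.
void* allocatePageAligned(std::size_t bytes) {
    assert(bytes > 0);
    const std::size_t page = static_cast<std::size_t>(sysconf(_SC_PAGESIZE));
    void* buffer = NULL;
    if (posix_memalign(&buffer, page, roundUpToPage(bytes)) != 0) {
        return NULL;
    }
    return buffer;
}

// Ask the system not to keep this descriptor's data in its cache.
// Only has an effect where F_NOCACHE is defined (Mac OS X).
bool disableCaching(int fd) {
#ifdef F_NOCACHE
    return fcntl(fd, F_NOCACHE, 1) != -1;
#else
    (void)fd;
    return true;
#endif
}
```

A writer would typically allocate one large buffer this way, call disableCaching on the descriptor once after opening it, and then issue a few big writes instead of many small ones.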
There, again, the guidelines are to use large buffers so that you avoid going in and out of the kernel on your IOs. And if you really, really care about optimizing your performance, particularly over varying different link speeds, you probably want to look at using adaptive solutions. So look at the size of your buffer, see how much you read and write, maybe start with something conservative, double it up until you get to a point where you think you're getting a maximum throughput. This is particularly important over slower speeds, slower speed links like modems.
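One way the "start conservative and double it" idea might be sketched in C++ is below. The class name, the 10 percent improvement threshold, and the step-back rule are all assumptions made for the example, not something the session specifies.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper: doubles the transfer buffer size while measured
// throughput keeps improving, then settles on the best size seen.
class AdaptiveBufferSize {
public:
    AdaptiveBufferSize(std::size_t initialBytes, std::size_t maxBytes)
        : size_(initialBytes), previous_(initialBytes), max_(maxBytes),
          bestRate_(0.0), settled_(false) {
        assert(initialBytes > 0 && initialBytes <= maxBytes);
    }

    std::size_t size() const { return size_; }
    bool settled() const { return settled_; }

    // Report one completed transfer that used the current buffer size.
    void record(std::size_t bytesMoved, double seconds) {
        if (settled_ || seconds <= 0.0) return;
        const double rate = static_cast<double>(bytesMoved) / seconds;
        if (rate > bestRate_ * 1.10) {   // arbitrary 10% gain required
            bestRate_ = rate;            // current size is the best so far
            if (size_ * 2 <= max_) {
                previous_ = size_;
                size_ *= 2;              // try a larger buffer next time
            } else {
                settled_ = true;         // reached the cap, keep this size
            }
        } else {
            size_ = previous_;           // the last doubling did not pay off
            settled_ = true;
        }
    }

private:
    std::size_t size_;
    std::size_t previous_;
    std::size_t max_;
    double bestRate_;
    bool settled_;
};
```

The read or write loop would time each transfer, call record with the byte count and elapsed seconds, and size its next request from size().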
And then in the don't-use category of networking APIs, Open Transport is another one of those APIs which we brought to OS X to help you bring your apps over. It is not optimal. Open Transport by itself has several threads in its implementation, and it's there for compatibility. We're not extending it. So the two real replacements are either sockets, if you really need that level of functionality, or CFNetwork.
And then if you're at a higher level and using URL Access, there are two solutions there too: CFNetwork, if you really care about some of the details of the protocol that you're dealing with, so HTTP or FTP, or the new APIs in Web Foundation, NSURLRequest and NSURLResponse.
And then again, you can use fs_usage for measuring network activity. And you'd be surprised, just leaving fs_usage running in a Terminal window and just using your app lets you see how you're using the network. And then the best tool, particularly if you're interested in what's happening at the protocol level, is tcpdump. That's the best tool that we offer to let you actually examine the packets in detail. Okay, and now I'm going to turn that back over to Curt to talk to you about drawing performance.
Thanks John. So next, we'll talk about drawing. Now like I mentioned, this is a huge issue. If you're not drawing as quickly as possible, and this is in terms of the amount of your drawing or the efficiency of your drawing, the user is definitely going to feel a sense of sluggishness. So there's three problems that I'm going to talk about in terms of drawing performance, and that's too much drawing, inefficient drawing, and then there's the text performance aspect of drawing.
So let's talk about too much drawing. That can be broken down into problems of scope and frequency. And that means that in terms of scope, let's say that you need to update one control, and rather than updating that one control's visual state, you're updating the user pane that contains it, or maybe the entire contents of a window.
And then in terms of frequency, maybe that you're drawing periodically, you're drawing too much based on a timer, or maybe you have a real time-based system that's pulling in some information for display, but the user's disabled that, so there's no sense in updating that. And that consumes resources from other apps and utilities running on your system and generally slows down performance.
So how do you identify whether you're drawing too much? Well, an excellent utility for this is Quartz Debug. It's in /Developer/Applications, and it will allow you to identify problems both in terms of scope and frequency. It works with all Carbon drawing, whether you're drawing with QuickDraw or Quartz.
And to find problems in terms of scope, you would check the checkbox for flashing screen updates in yellow. What this does is, every time something is drawn or flushed to the screen, it'll flash in yellow the area that's being drawn to the screen. So you can visually identify what areas are being drawn too much. If you just need to be drawing one control and you're drawing the whole contents of the window, you'll be able to identify that visually.
Now, in terms of frequency, you can check the checkbox for flashing identical updates in red. And this will flash parts of your UI with red every time the bits aren't changing but you're flushing the contents. So it'll help you identify where you're drawing too much and these areas don't need to be flushed.
So once you've identified where you're drawing, how do you resolve this? Well, this is really up to you. You need to identify what it is you need to be drawing and when you need to do it, and you need to go through your own code and figure out these areas for optimization. Now, of course, we have technologies that we'd like you to take advantage of which will help with that, and I totally recommend adopting HIView.
This is for your custom content, and it's also known as compositing mode. The reason I encourage you to adopt it is that it is designed specifically to draw as little as possible. It's architected around an invalidation model, which is superior to a drawing model, because when multiple areas of your view need to change, you invalidate those areas, and then the view system will come along and call your main draw bottleneck, so you'll only be drawing when the system tells you to.
So this encourages the use of, if you need to update multiple parts or many parts of your app or your view all at once, there will only be one net draw operation. It also allows you to specify opacity. And this is important because any view system is a hierarchy. And that your view may need to be drawn after the views behind it. Like if it's embedded in a user pane, which is also embedded in a window.
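The "one net draw operation" behavior can be illustrated with a toy model. This is not the HIView API; ToyView and DirtyRect are invented names, and the model simply unions invalidated rectangles and draws once when asked.

```cpp
#include <algorithm>
#include <cassert>

// Simple rectangle in view coordinates.
struct DirtyRect { int left, top, right, bottom; };

// Toy stand-in for a view system built around invalidation.
class ToyView {
public:
    ToyView() : dirty_(), lastDrawn_(), hasDirty_(false), drawCount_(0) {}

    // Callers never draw directly; they only say what changed.
    void invalidate(const DirtyRect& r) {
        assert(r.left <= r.right && r.top <= r.bottom);
        if (!hasDirty_) {
            dirty_ = r;
            hasDirty_ = true;
            return;
        }
        dirty_.left = std::min(dirty_.left, r.left);
        dirty_.top = std::min(dirty_.top, r.top);
        dirty_.right = std::max(dirty_.right, r.right);
        dirty_.bottom = std::max(dirty_.bottom, r.bottom);
    }

    // Called once per pass by the (imaginary) view system.
    bool drawIfNeeded() {
        if (!hasDirty_) return false;
        lastDrawn_ = dirty_;   // a real view would paint this area here
        ++drawCount_;
        hasDirty_ = false;
        return true;
    }

    int drawCount() const { return drawCount_; }
    const DirtyRect& lastDrawn() const { return lastDrawn_; }

private:
    DirtyRect dirty_;
    DirtyRect lastDrawn_;
    bool hasDirty_;
    int drawCount_;
};
```

Three invalidations followed by one drawIfNeeded call produce a single draw covering the union of the three areas. A real view system would more likely track a region than one bounding rectangle.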
If there's any type of transparency with your view, all of these views behind need to be drawn first. Well, HIView can optimize this if you indicate to it that you are opaque. Your view is opaque, or maybe part of your view is opaque, and the view system can optimize by not drawing the contents behind your view. I encourage you to go see session 425, "HIView in Depth." This will give you more information about HIView.
Okay, well, my content's not in HIView. It's not drawing in a composite window. What can I do then? Well, keep in mind that every time you change the state of your controls or a value of your control, that control's going to need to redraw in a non-composite window. For example, a scroll bar. When you set the minimum, the maximum, and the value of that scroll bar, it's going to be drawing three times.
Another good example is the pop-up button control. When you add items to that control, each time you do that, it's going to redraw. So if you need to do any type of bulk value operations in a non-composite window, hide that control. Do these bulk operations and then show that control.
So rather than having a draw count of n for the n operations you're going to do, you'll have a net draw count of one. Also, you can mimic HIView-like behavior by paying attention to update regions. So don't draw things that are already valid. You can pay attention to update regions if you're handling update events. Or you can mimic that behavior by using the ValidWindowRect or ValidWindowRgn APIs.
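The scroll bar example can be modeled in a few lines to show where the draw counts come from. ToyScrollBar and configureQuietly are hypothetical names, and the class only imitates a control that redraws on every setter while visible; it is not the Control Manager.

```cpp
#include <cassert>

// Toy control that, like a control in a non-compositing window,
// redraws immediately whenever one of its values changes while visible.
class ToyScrollBar {
public:
    ToyScrollBar() : visible_(true), min_(0), max_(0), value_(0), drawCount_(0) {}

    void hide() { visible_ = false; }
    void show() {
        if (!visible_) {
            visible_ = true;
            redraw();          // one draw picks up every pending change
        }
    }

    void setMinimum(int v) { min_ = v; redraw(); }
    void setMaximum(int v) { max_ = v; redraw(); }
    void setValue(int v)   { value_ = v; redraw(); }

    int drawCount() const { return drawCount_; }

private:
    void redraw() { if (visible_) ++drawCount_; }

    bool visible_;
    int min_, max_, value_;
    int drawCount_;
};

// Bulk update: hide, change everything, then show for a single draw.
void configureQuietly(ToyScrollBar& bar, int minimum, int maximum, int value) {
    assert(minimum <= value && value <= maximum);
    bar.hide();
    bar.setMinimum(minimum);
    bar.setMaximum(maximum);
    bar.setValue(value);
    bar.show();
}
```

Setting the three values on a visible ToyScrollBar costs three draws, while going through configureQuietly costs one.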
So next area of drawing problems, performance problems, is inefficient drawing. Now this may be because you're doing too much work to get the bits you need onto the screen. And that could be because you're using the wrong APIs or using APIs inefficiently, or maybe there's some better alternatives on Panther to do your drawing.
So again, how do you identify that you're drawing inefficiently? Well, the best technique is to sample your drawing code. And like I mentioned before, it's going to take periodic snapshots of your app so you can identify where most of your time is being spent and where you can optimize that. Also, visually go through your code, inspect what's going on within your draw path, see where you can optimize areas out, and here's some stuff you can look for.
Triple buffering or multiple buffering. Now again, on classic Mac OS, Mac OS 9 and earlier, you're dealing with constrained, really tight memory and it was generally easier or more efficient to redraw the contents of the window than it was to keep those bits around. Well, Mac OS X obviously has a double buffering system so that it takes care of tearing and flicker for you. So again, like on a classic system, if you wanted to avoid this type of tearing or flicker, you would instead employ double buffering techniques which is to draw into an off screen and then copy the contents of that off screen all at once.
So if you're maintaining code that was originally written for Mac OS 9 or earlier or targeted for those platforms, keep this in mind that if you have application level double buffering, this is completely unnecessary on Mac OS X in a sense to avoid flicker and tearing. Also inspect your drawing code and see if region processing is showing up in your samples.
If you have complex regions, this can be kind of expensive. Now, I should point out that on Panther, that the internal implementation of regions has changed. So it has better performance and there's no 64K limit anymore. But if these are showing up in your samples, it might be a good idea, or is actually a good idea to use simpler regions, like rectangular regions is a good example.
Dirty region maintenance may also show up in your samples. Let me just explain why that is: each time you do a draw operation, QuickDraw needs to maintain a dirty region that it is going to flush to the screen later. So if you're doing a lot of QuickDraw operations, this dirty region maintenance may show up.
So if you know ahead of time that you're going to be doing a lot of operations into a particular region, you can tell QuickDraw ahead of time using the API QDSetDirtyRegion. Then perform all your drawing operations, and it completely removes the overhead of QuickDraw dealing with that dirty region for you.
Now this is another important area. Like I mentioned, on classic Mac OS it was more efficient to regenerate information than it was to keep that information around. Well, redundant calculations may show up in your drawing code. A good example of this is when we were optimizing the tabs control. If you think of the tabs control, it's a user pane, and it's got the tab items in it with text drawing in there.
And each time we would go draw, we were getting the text dimensions for each of those items, figuring out the metrics of each item, and then going through with the draw operation. Well, the metrics of the text and the labels, they're not going to be changing that often, so it doesn't make sense to recalculate that each time. It's a better idea to cache that information. So go ahead and cache to avoid this problem.
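A cache along those lines might look like the following sketch. The measurement function is a made-up stand-in for a real text metrics call, and the 7 pixels per character figure is arbitrary; only the caching pattern is the point.

```cpp
#include <cassert>
#include <map>
#include <string>

// Counts how often the expensive measurement actually runs.
static int gMeasureCalls = 0;

// Hypothetical stand-in for an expensive text measurement call.
static int measureLabelWidth(const std::string& label) {
    ++gMeasureCalls;
    return static_cast<int>(label.size()) * 7;  // pretend 7 pixels per character
}

// Remembers label widths so drawing does not re-measure unchanged text.
class LabelMetricsCache {
public:
    int widthFor(const std::string& label) {
        std::map<std::string, int>::const_iterator it = widths_.find(label);
        if (it != widths_.end()) {
            return it->second;              // cache hit, no measurement
        }
        const int width = measureLabelWidth(label);
        assert(width >= 0);
        widths_[label] = width;
        return width;
    }

    // Call when the font or the labels change.
    void invalidate() { widths_.clear(); }

private:
    std::map<std::string, int> widths_;
};
```

The draw path asks the cache for each tab's width, and the cache is cleared only when something that affects the metrics changes.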
You may be mixing QuickDraw and Quartz drawing. Now, this can be inefficient because the QuickDraw port is not synchronized for you with the Quartz context that backs it. So avoid this type of drawing. This applies, for example, if you're going to be using Appearance Manager APIs for your drawing, which you should avoid, and I'll mention that in a moment.
In that case there needs to be some synchronization behind your back to keep that QuickDraw port and the CG context in sync. If you must mix this type of drawing, QuickDraw drawing and Quartz drawing, do it in blocks for each framework to avoid extra overhead. So draw with all of your QuickDraw APIs first, followed by all your Quartz drawing, or vice versa.
So on the front of using new APIs, using more modern APIs, we recommend using Quartz. I mean, for all of your primitive drawing, you should be using Quartz. So this requires you to generate your information in Quartz native formats, and you can do this at build time. So if you can convert all of your information to Quartz native formats like PDF at build time, that's good. And you can also do this at run time.
For example, there's the Quick Draw API to convert or to draw a PICT into a context. So at runtime, you can take a PICT, you can draw that into a context associated with the PDF context and use that, cache that information, and then use that image for subsequent drawing.
So, you're probably thinking, "Well, Curt, you're telling me to draw with Quartz, but all my drawing code is based on Quick Draw. How do I do this?" Well, to get a CG context from this Quick Draw port, you'd use these APIs: Qt Begin CG Context and Qt End CG Context.
And now what this does is it will cache that context for you, so you don't have to. So this allows you to print with both CG and Quick Draw drawing on the same page. But you should keep in mind that in between these APIs, Quick Draw drawing is going to be disabled.
Now also, you shouldn't use the API QD begin CG context, draw using some courts, and then end that context, and then repeat that over and over. You should consolidate all of your course drawing between these APIs. Also keep in mind that the CG context and the graph port aren't synchronized, so if you need to synchronize the origin or the clipping region, you'll have to do that on your own.
So what's the difference between a flush and a synchronize? Quite simply, if you do CG context flush, it tells the draw system that that context is going to be flushed at the next opportunity it can be. So it's generally an immediate response. and CG Context Synchronize is for a delayed flush.
So this is more like queuing up different contexts to be flushed. Contexts, contuses? So what you would generally be doing is doing all of your drawing and then call the API CG Context Synchronize to let the draw system know that that's going to need to be flushed. And then the HI Toolbox during its draw loop will flush the context for you.
Other APIs you should be using are the new HITheme APIs. Now, these are new in Panther, and it's a simple mapping between the old Appearance Manager APIs and these new APIs. So, for example, if you're drawing some theme primitives, like using the API DrawThemeButton, there's an equivalent API called HIThemeDrawButton.
So I encourage you to check out HITheme.h. And the reason these APIs are now available is because the toolbox is doing all of its drawing with Quartz natively. So if you're drawing with the QuickDraw APIs, the toolbox needs to either create a context for your port or use a cached port and deal with the maintenance of that.
So by exposing these APIs, you're in charge of the context. It requires you to create a context and maintain that context for your drawing. And all of these APIs require an orientation. This is because the High Level Toolbox and QuickDraw deal with a top-left-origin-based view system, and Quartz, due to its PDF background, deals with a bottom-left origin.
So you can use these APIs for doing either type of drawing. And generally you would do this because if you create a context, it's going to be bottom-left-origin based. But if a context is given to you by the toolbox, for example if you have a custom view and the draw Carbon event hands you a CG context, that context will already be transformed to be top-left-origin based. So that would be normal orientation.
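The two orientations differ only in where y is measured from, so converting a rectangle between them is a small calculation. The sketch below is plain C with hypothetical types (`TopLeftRect`, `BottomLeftRect`) made up for illustration; it is not the Carbon `Rect` or `CGRect`, and it assumes both coordinate spaces cover the same drawing area of height `port_height`.

```c
#include <assert.h>

/* Hypothetical rectangle types for illustration only. */
typedef struct { int top, left, bottom, right; } TopLeftRect;   /* y grows downward */
typedef struct { double x, y, width, height; } BottomLeftRect;  /* y grows upward */

/* Convert a rectangle from a top-left-origin space to a bottom-left-origin
   space of the same size. port_height is the full height of the drawing area. */
BottomLeftRect flip_to_bottom_left(TopLeftRect r, int port_height) {
    assert(r.bottom >= r.top && r.right >= r.left);
    BottomLeftRect out;
    out.x = r.left;
    out.y = port_height - r.bottom;   /* distance of the lower edge from the bottom */
    out.width = r.right - r.left;
    out.height = r.bottom - r.top;
    return out;
}
```

A rectangle whose top edge is 10 units below the top of a 200-unit-tall area and which is 30 units tall ends up with its lower edge 160 units above the bottom.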
So also check out session 409, Carbon AHL Toolbox for information about this. So why would you want to do this? Well, the number one reason is performance. I personally went through many of the standard controls in the system and converted them from using the Appearance Manager APIs over to the new HITheme APIs.
In some cases, we were seeing the controls drawing twice as fast as on Jaguar. Most of the common controls will draw 25% faster. So this is solely for performance, and you should definitely be using these APIs rather than the old Appearance Manager APIs if you have those in your code.
Text drawing. We got a lot of feedback about the text drawing performance on Mac OS X. We want it fast, and we've done a lot of work to improve that. So if you're going to be drawing user interface text, rather than using the old Appearance Manager APIs for getting the text dimensions and drawing, which were GetThemeTextDimensions and DrawThemeTextBox, you should instead use the new HITheme APIs, HIThemeGetTextDimensions and HIThemeDrawTextBox. Now, this is over twice as fast on Panther as on Jaguar. And you can also see that the raw performance of DrawThemeTextBox has definitely improved.
You should definitely be using HIThemeDrawTextBox for your user interface drawing. Not only does it give you better performance, but it gives you better control over how you're going to be laying out the text and drawing it in terms of flushness, truncation, and how many lines are going to be drawn.
Now let's say that you're handling text drawing on your own. You're using ATSUI for your drawing. So some tips for using ATSUI. Well, keep in mind that that API is a paragraph-based API. So you should be creating a text layout per paragraph. Also, reuse the layout and the style objects where appropriate.
Rather than destroying a text layout when you're done with it and then recreating another one to look at a new text buffer, keep that layout around and then set it to look at a new run of text using the API ATSUSetTextPointerLocation. So reuse these objects. It eliminates the overhead of a lot of redundant destruction and reconstruction of those data structures.
Also, if you need to lay out a paragraph at a fixed width, use the batch API for breaking lines, which is ATSUBatchBreakLines. Now, of course, this is rather than manually going through a break loop and using the ATSUBreakLine API. This will be much faster for doing that.
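To show what "breaking a paragraph at a fixed width in one call" means in practice, here is a small greedy line breaker in portable C. It is not ATSUI's algorithm and does not call ATSUI: `break_paragraph` is a hypothetical function, and it assumes every character is exactly one unit wide, which real text layout does not. What it shares with a batch API is the shape of the call: one request returns every break for the paragraph, so the caller does not drive a per-line loop.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Greedy line breaking under a simplified metric: each character is one unit
   wide. Writes the offset at which each line after the first starts into
   breaks[] and returns how many offsets were written (at most max_breaks). */
size_t break_paragraph(const char *text, size_t max_width,
                       size_t *breaks, size_t max_breaks) {
    assert(text != NULL && breaks != NULL && max_width > 0);
    size_t len = strlen(text);
    size_t line_start = 0, count = 0;
    while (len - line_start > max_width && count < max_breaks) {
        size_t limit = line_start + max_width;  /* first index that no longer fits */
        size_t cut = limit;
        /* Back up to the last space at or before the limit. */
        while (cut > line_start && text[cut] != ' ') cut--;
        if (cut == line_start) {
            line_start = limit;      /* no usable space: hard break at the limit */
        } else {
            line_start = cut + 1;    /* break at the space and skip it */
        }
        breaks[count++] = line_start;
    }
    return count;
}
```

With the string "the quick brown fox" and a width of 10, the single call reports one break, at offset 10, giving the lines "the quick" and "brown fox".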
Also, when you're drawing text, if you need to get the dimensions of that text, use ATSUGetGlyphBounds. Now, this is a good technique also because under the covers, it's going to cache the layout for you. So a subsequent draw using ATSUDrawText is going to be very fast.
Also, you should be creating and associating some font fallbacks with that layout if performance is a concern. If you don't specify a font fallbacks object, it's going to rely on the system fallbacks, which may be expensive to generate. And again, keep in mind that ATSU styles are not one-to-one with ATSU layouts. You don't have to have one per layout. So what you'll generally want to do is create a style and associate it with a number of different layouts.
Finally, I'd like to talk about just general responsiveness issues on Mac OS X.
Spinning Cursor. Most people are very familiar with this spinning rainbow cursor. That's definitely a problem that we want to avoid. Other parts of your UI may become non-responsive while you're tracking the mouse. The user's tracking a custom control, for example, and the app becomes non-responsive. Parts of your screen aren't updating. Maybe the CPU is pegged while you're tracking this. This is a starvation of resources from other utilities that are running on your system.
Or maybe the app is just slow to respond, it just feels sluggish. Now, you can find some of these problems using Activity Monitor in /Applications/Utilities, and this will allow you to identify the CPU utilization. You can see when the processor is pegged, and many other things. And you can also identify problems using Spin Control in /Developer/Applications. Actually, that's singular.
And this utility, what it will do is when it's launched, whenever the system spinning cursor comes up, it'll start sampling your app or it'll start sampling that app that has caused that to spin. So you can see where the time is being spent and why did it stop responding. Also from the command line, you can use top.
So how do I resolve this problem of the spinning cursor, which seems to be an issue? Let's understand why this is happening. Well, in general, it's being caused because your app is no longer responding to events. So the user has typed some keys on the keyboard.
And in a perfect world, all of those events get picked up by your app and processed immediately. But sometimes your app is busy. You're processing loads of information. You're looking for intelligent life. And so events that are coming in the system are not being picked up by your application.
So what the system does is it puts up this spinning cursor. And that really is correct behavior. What it's doing is it's telling the user that your app is not responding anymore. If you're an event-driven application and you are no longer picking up events, you are not responding. And that's correct behavior.
So what are some solutions for this? Well, the best solution is to call into the event loop and start processing those events. That means that if you're completely Carbon event based and using RunApplicationEventLoop, get into that API and start allowing events to come into the app. So adopt Carbon events, because by adopting Carbon events, the toolbox has its standard event handlers and they'll take care of tracking for you.
And if there are improvements in the system, we can give those improvements to you for free. And thread your app. If you do need to do heavy duty processing, if you need to look for intelligent alien life, consider putting that on a worker thread and then continue processing the user events on your main thread. And also look for polling.
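The "put heavy work on a worker thread" advice can be sketched with POSIX threads, which are available on Mac OS X. This is a minimal illustration, not code from the session: `HeavyJob`, `heavy_job_start`, and `heavy_job_is_done` are hypothetical names, the "heavy" work is just a sum, and a real Carbon application would more likely have the worker post an event back to the main thread when it finishes.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical job: sum the integers 1..n on a worker thread. */
typedef struct {
    unsigned long n;        /* input */
    unsigned long result;   /* output, valid once done is true */
    bool done;
    pthread_mutex_t lock;   /* protects result and done */
} HeavyJob;

/* Worker entry point: does the long computation off the main thread. */
static void *heavy_job_main(void *arg) {
    HeavyJob *job = arg;
    unsigned long sum = 0;
    for (unsigned long i = 1; i <= job->n; i++) sum += i;
    pthread_mutex_lock(&job->lock);
    job->result = sum;
    job->done = true;
    pthread_mutex_unlock(&job->lock);
    return NULL;
}

/* Start the job and return immediately (0 on success), so the caller can
   go straight back to handling user events. */
int heavy_job_start(HeavyJob *job, unsigned long n, pthread_t *thread) {
    assert(job != NULL && thread != NULL);
    job->n = n;
    job->result = 0;
    job->done = false;
    if (pthread_mutex_init(&job->lock, NULL) != 0) return -1;
    return pthread_create(thread, NULL, heavy_job_main, job);
}

/* Cheap, non-blocking check the main thread can make when it is idle. */
bool heavy_job_is_done(HeavyJob *job) {
    pthread_mutex_lock(&job->lock);
    bool done = job->done;
    pthread_mutex_unlock(&job->lock);
    return done;
}
```

The main thread never waits inside the computation; it only takes the lock briefly to read the flag, so event handling stays responsive while the worker runs.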
Of course, you might ask, am I polling? Again, you would use the same utilities as I mentioned earlier. You'd be using Activity Monitor to see when the CPU utilization is at or near 100 percent, or you'd be using top to find out the CPU usage. Now, if the user clicks the mouse and the CPU goes up to being pegged, you've got a problem and you need to figure out what to do about that. So, there are four main reasons you'd probably be polling.
And there are four alternatives for doing this. Now, if you need to track the cursor, for example, the user has clicked in an area of your view or your custom control and you need to track where the position of the cursor is, rather than using the very familiar "while mouse down, get the location of the mouse" loop, you would instead use the API TrackMouseLocation.
And when you call this API, it blocks. So rather than pegging the CPU, you're being extremely quiet. Nothing is happening in your app until some user interaction happens. When the user moves the mouse, this will return and give you an indicator of where the mouse location is and what happened. Why did it return? Did the mouse move or did the mouse button come up? Now, if you need to determine where the cursor is on your screen because you want to provide some type of rollover behavior, maybe you'd be polling the cursor location periodically.
Well, instead of doing that, you would want to use the APIs for tracking regions. Those are the mouse tracking region APIs. And these APIs are incredibly cool because they will allow you to specify a region of your window that you're interested in, and then you will get notifications via Carbon events when the mouse has entered and exited that region.
Another reason you may be polling is because you want to find out when a condition has changed. I want to see when preferences have changed or a seed value has changed so I know to reload those preferences. So, rather than doing that, you want to watch for notifications.
Now, this is as simple as installing a callback, which is essentially what a Carbon event notification is. You're installing a callback, registering for certain notifications to come in, and that callback will get called when that has happened. So you can watch for notifications, like, let the system tell you when conditions have changed rather than asking the system continually.
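Here is a minimal sketch of the "install a callback instead of polling a seed value" pattern in portable C. It is an illustration under assumptions, not the Carbon event API: `PrefsStore`, `prefs_add_observer`, and `prefs_commit_change` are hypothetical names, and the store simply calls every registered callback whenever a change is committed.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_OBSERVERS 8

/* Callback invoked when the preferences change; receives the new seed. */
typedef void (*PrefsChangedCallback)(int new_seed, void *user_data);

/* Hypothetical preferences store that notifies observers on change. */
typedef struct {
    int seed;
    size_t observer_count;
    PrefsChangedCallback callbacks[MAX_OBSERVERS];
    void *user_data[MAX_OBSERVERS];
} PrefsStore;

/* Register interest once; returns 0 on success, -1 if the table is full. */
int prefs_add_observer(PrefsStore *s, PrefsChangedCallback cb, void *ud) {
    assert(s != NULL && cb != NULL);
    if (s->observer_count == MAX_OBSERVERS) return -1;
    s->callbacks[s->observer_count] = cb;
    s->user_data[s->observer_count] = ud;
    s->observer_count++;
    return 0;
}

/* Commit a change: bump the seed and tell every observer about it. */
void prefs_commit_change(PrefsStore *s) {
    assert(s != NULL);
    s->seed++;
    for (size_t i = 0; i < s->observer_count; i++)
        s->callbacks[i](s->seed, s->user_data[i]);
}

/* Example observer: records the latest seed in an int passed as user_data. */
void remember_seed(int new_seed, void *user_data) {
    *(int *)user_data = new_seed;
}
```

The observer does no work at all between changes; the only code that runs is the callback, at the moment the condition actually changes.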
And finally, you may be idling by polling the system clock. And what that means is you want to wait for a specified time to elapse before continuing execution. Well, rather than polling the system clock and spinning, you would want to use timers, Carbon event timers, which you would install and have some code execute after this time has elapsed.
Or just jump right into the event loop. There's an API called RunCurrentEventLoop, and it allows you to specify a timeout. So you'd call that API and your timers will continue to fire at that point, and then the execution will proceed when that function returns.
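The relationship between timers and a run loop with a timeout can be modeled in a few lines of portable C. This sketch uses a simulated clock so it is deterministic: `ModelTimer` and `model_run_loop` are hypothetical, and the model fires due timers immediately, whereas a real event loop would block quietly until the next fire time. It is meant only to show that timers installed earlier keep firing while you wait inside the loop.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef void (*ModelTimerProc)(void *user_data);

/* Hypothetical one-shot timer on a simulated clock (times in seconds). */
typedef struct {
    double fire_time;
    ModelTimerProc proc;
    void *user_data;
    bool fired;
} ModelTimer;

/* Run the model loop from `now` until `now + timeout`, firing every pending
   timer whose fire time falls in that window, earliest first.
   Returns the number of timers fired. */
int model_run_loop(ModelTimer *timers, size_t count, double now, double timeout) {
    assert(timers != NULL && timeout >= 0.0);
    double deadline = now + timeout;
    int fired = 0;
    for (;;) {
        ModelTimer *next = NULL;
        for (size_t i = 0; i < count; i++) {
            if (!timers[i].fired && timers[i].fire_time <= deadline &&
                (next == NULL || timers[i].fire_time < next->fire_time))
                next = &timers[i];
        }
        if (next == NULL) break;   /* nothing else is due before the deadline */
        next->fired = true;
        next->proc(next->user_data);
        fired++;
    }
    return fired;
}

/* Example timer callback: counts how many times it has been called. */
void count_fire(void *user_data) {
    (*(int *)user_data)++;
}
```

Waiting 1.5 simulated seconds with timers due at 0.5, 1.0, and 2.0 fires the first two and leaves the third pending for the next call.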
Finally, I'd like to mention asynchronous window dragging. This is definitely in the realm of perceived performance. This is a new facility for Carbon applications, and it allows the window server to deal with the dragging of the window. Now, this frees up the app from doing this. So, if you are very busy and your application has become unresponsive, the window server will take care of dragging the window for you. So, this will contribute to the feeling that the user is still in control of the system, and there are still things that they can do even when your app is hung.
So you're probably tired of listening to me talk, and so I'd like to bring up the High Level Toolbox demo boy, Guy Fullerton!
Okay, so I want to give you a brief example of how you might be able to track down some performance problems in your application. And to do that, I'm going to use an application we've been working on, which is a game.
Now, we've been working on this game for a little bit, and our beta testers, we have this extensive set of beta testers, and they've all been complaining about two things. One, the app launches slowly. And two, after you play the game, actually there's three things. Two, after you play the game, the system kind of slows down. It gets sluggish. It's like our app is stealing cycles from somewhere. And the third problem is that on slower machines, the game doesn't deliver a very good experience.
The interface gets choppy. But to understand this, let me go ahead and launch the application. I'm going to do it via the command line, just because that's going to allow me to more easily do some other tests in a second. So if you watch down on the dock, this is the application. Watch it bounce, and it's going to take a little while to load up.
Yeah, four or five, yeah. So it took a couple of bounces. So here's the game. It's a pin the tail on the donkey game. And, you know, pin the tail on the donkey is pretty easy. You just have to drag the tail over to the donkey. Oh, I better start a new game first.
So I start the game and the countdown, count up, actually starts. And then I start staring at you. And that's actually the challenge of the game. Can you possibly get the tail on the donkey while I'm staring at you? Well, I did and I got a high score. I'm pretty good at it.
But, you know, I've been writing it. I've got to be pretty good at it. So that's the game. But the main complaint that users have been making is that it takes a while to launch. So let's go take a look at that. And there are two ways I want to look at this.
The first way is through the use of SpinControl. Now, SpinControl is one of our performance test tools. It will detect when an application is unresponsive, and it will automatically sample that application. Now, I know my app is unresponsive during launch, so SpinControl is going to automatically detect that and do a sample. But I wouldn't necessarily have to use SpinControl to generate the information I'm going to generate. I could use Sampler to generate the same information. It just so happens that this is a little bit more convenient for my situation.
So we'll launch it again. And as it's launching, you'll see SpinControl updated its interface, saying, hey, it was unresponsive and it sampled the app. And then I get the results. If I double click on that and open it up, I can see the backtrace and the time various operations took during app bring up. So of course I was in main for a couple seconds and I was setting up my game window and setting up my high scores dialog. Oh, and then I started doing some preferences operations.
Well, you know, it occurs to me that I load my preferences, my application preferences at Bring Up time, but most of my preferences is just a high score list. I store a really big list of all the scores that have happened on the game. And, you know, the high score list really doesn't matter at App Bring Up time.
The only time I consult it is after the user has played a game and I need to compare and see if it deserves to be on the high scores, or if the user displays the high scores window. So this real quickly, by rethinking the way I initialize my preferences, would allow me to cut about a second and a half off of my application launch time.
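The fix Guy describes, loading the high score list only when it is first needed, is a lazy initialization pattern. The sketch below is portable C with hypothetical names (`HighScores`, `high_scores_get`, `high_scores_qualifies`); the "load" is a stand-in that copies three constants, where the real game would read its preferences, and `load_calls` exists only to make the behavior observable.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define SCORE_COUNT 3

/* Hypothetical high score table that is loaded on first use, not at launch. */
typedef struct {
    bool loaded;
    int load_calls;             /* instrumentation for this sketch only */
    int scores[SCORE_COUNT];    /* sorted, highest first */
} HighScores;

/* Stand-in for the expensive preferences read; a real app would hit disk. */
static void high_scores_load(HighScores *hs) {
    static const int stored[SCORE_COUNT] = { 900, 750, 600 };
    for (size_t i = 0; i < SCORE_COUNT; i++) hs->scores[i] = stored[i];
    hs->load_calls++;
    hs->loaded = true;
}

/* Accessor that triggers the load the first time anyone asks for the list. */
const int *high_scores_get(HighScores *hs) {
    assert(hs != NULL);
    if (!hs->loaded) high_scores_load(hs);
    return hs->scores;
}

/* Called after a game ends: does this score beat the lowest stored score? */
bool high_scores_qualifies(HighScores *hs, int score) {
    const int *s = high_scores_get(hs);
    return score > s[SCORE_COUNT - 1];
}
```

Launch does not touch the list at all, and however many games are played afterward, the expensive load happens once.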
So the next thing I want to look at with respect to app launch time is fs_usage. John talked about fs_usage. There's a bunch of different options you can pass to it to generate different amounts of information. And I'm going to pass the -w option, which generates some extra info about errors and file sizes and things like that. And I'm going to pipe that through to grep just so I see the file system operations that happen with respect to my application. So if I turn that on and relaunch my application, it's going to generate a bunch of information.
Now, most of the work that is in this log is stuff happening on my behalf because of the system. You know, my application executable is being loaded, things like that. And that's all fine. I've already gone through this and looked at one interesting thing. Now, I load some images in my application in the form of PNG files.
And the first one that gets loaded-- I don't know if you can read that up there-- but the first one that gets loaded is called ddonkey.png. That's the big donkey image in the window. I need that at app bring up because that's part of my interface. And that's fine. And then I load the donkey tail, and that's cool. But I find another load here, and it's a title PNG. Well, I don't actually use that title anywhere in my main window. I only use it in my rules dialog.
So that's another way I could actually speed up my application launch performance, is just load that PNG file only when the user brings up the rules dialog. Now, some of the extra information that the -w option passes back is the amount of time this particular load took. And you can see that it's just a fraction of a second that it took to open that file. And there is-- Yeah, if I look at the file identifiers, yeah, it's not a whole lot of time spent actually manipulating that file, but every little bit counts.
And if you look for things like this, you might actually find that the loading of files at app bring up time is just a symptom of a larger problem. So you should go through fs_usage, find out what sorts of things your application is doing, and see if they really need to be done at app launch time.
So once the application is running and I play a game, I finish a game, the users say, "Hey, you know what? Subsequent games are either sluggish or my system just completely bogs down." So let's go ahead and launch Quartz Debug here because it's got a little CPU gauge, frame meter, and oh wow, sure enough, it looks like my interface isn't doing much but I'm eating up some CPU and I'm trying to render over 90 frames per second. So something's going wrong here. There's a couple different ways we can look at my interface and figure out what's going wrong. The first way I want to look at is flash identical updates. So let's get this out of the way.
[Transcript missing]
My time display, even though the game's not running, which is a first problem. But another thing I notice is that even though the time display, if I get rid of it, even though my text is only about 24 pixels tall, I'm repainting this entire area every time I repaint the text.
Well, so this is something I could go in and fix in my nib. What I should really do is I should shrink up that field that I'm using to display the time with. And that would save me some processing cycles. The OS would not need to render quite as much.
Another thing I can look at here with Quartz Debug is to flash screen updates. So again, we're going to see this time rendering constantly, which is unfortunate because I'm not playing the game. But I would expect that to flash yellow if I actually start a game. So let's start a game and see what happens. Okay, so far so good. Nothing entirely unexpected. But as I actually start to drag the tail around, wow, I'm seeing a lot of drawing happen here. And you know what? I didn't really intend to do this.
Every time I move the mouse, it's repainting the entire donkey picture. A much better scenario for this game would have been to only repaint the place where the tail was and the place where the tail is now. And that's going to save me a lot of time because, you know what, just the delta between a tail's old position and its new position is pretty small. I'd probably say easily a 40th of that entire image size. So that's going to cut my draw time there down to about a 40th.
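The arithmetic behind "only repaint where the tail was and where it is now" is easy to check. This portable C sketch uses a hypothetical `PixelRect` type and made-up sizes (a 400 by 400 image, a 40 by 40 tail); none of these numbers come from the demo. It takes the bounding box of the old and new tail rectangles, which is a slightly larger area than the two rectangles themselves but is a single rectangle to invalidate.

```c
#include <assert.h>

/* Hypothetical rectangle type for this sketch; edges in pixels. */
typedef struct { int left, top, right, bottom; } PixelRect;

/* Smallest rectangle that covers both a and b. */
PixelRect rect_union(PixelRect a, PixelRect b) {
    PixelRect u;
    u.left   = a.left   < b.left   ? a.left   : b.left;
    u.top    = a.top    < b.top    ? a.top    : b.top;
    u.right  = a.right  > b.right  ? a.right  : b.right;
    u.bottom = a.bottom > b.bottom ? a.bottom : b.bottom;
    return u;
}

/* Area in pixels; the rectangle must not be inverted. */
long rect_area(PixelRect r) {
    assert(r.right >= r.left && r.bottom >= r.top);
    return (long)(r.right - r.left) * (long)(r.bottom - r.top);
}

/* Region to repaint after the tail moves from old_pos to new_pos. */
PixelRect tail_dirty_rect(PixelRect old_pos, PixelRect new_pos) {
    return rect_union(old_pos, new_pos);
}
```

With those assumed sizes, a small drag dirties 2,250 pixels against 160,000 for the whole image, which is in the same ballpark as the "easily a 40th" estimate.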
And that will make my users much, much happier. And then I can spend my time working on better things like good gameplay. So anyway, hopefully now you see a couple ways that you can use our tools to identify and track down some of the typical problems in your applications. Curt?
Thank you, Guy. That was such a great demo. He's like an excellent programmer on the Toolbox team because of that.
So, finally, I'd like to just kind of end up on some general tips. Now, I mentioned earlier that you should cache frequently used or frequently calculated information with your views. Well, this is really important to do for frequently used information or data that's very expensive to calculate.
Keep in mind that there is a time and space tradeoff. And what I mean by that is, if you end up caching 50 megabytes of information, and then the user clicks in another application, and then the system all of a sudden needs that 50 megabytes, it's going to start paging it out to disk.
And then when the user clicks back into your app, it's going to page all that information back in. And it may have been more efficient to just regenerate that information. So just be aware of that. The key here is frequently used information or very expensive data to create.
Again, always provide feedback. Now, if you're just going to be doing a simple operation, put up some chasing arrows. For some time-consuming process, put up a progress bar. I mean, if the user thinks that something's going to take 10 seconds and it's going to take them four hours, that's something that they should probably know. And always allow them to cancel. This is really important to let the user be in control of your application rather than your app being in control of them.
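One common way to give a long operation both progress feedback and a cancel option is to have it call back after each unit of work and stop if the callback says so. The following portable C sketch shows that shape; `run_cancellable`, `ProgressProc`, and `cancel_at_limit` are hypothetical names, and in a real application the callback would update a progress bar and check whether the user pressed Cancel.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Called after each step with the completed and total step counts.
   Return false to cancel the operation. */
typedef bool (*ProgressProc)(size_t done, size_t total, void *user_data);

/* Run up to total_steps units of work, reporting progress after each one.
   Returns the number of steps actually completed. */
size_t run_cancellable(size_t total_steps, ProgressProc progress, void *user_data) {
    assert(progress != NULL);
    size_t done = 0;
    while (done < total_steps) {
        /* One unit of the real, time-consuming work would go here. */
        done++;
        if (!progress(done, total_steps, user_data)) break;  /* user cancelled */
    }
    return done;
}

/* Example callback: asks to cancel once `done` reaches the limit stored in
   user_data, standing in for a user clicking Cancel partway through. */
bool cancel_at_limit(size_t done, size_t total, void *user_data) {
    (void)total;
    size_t limit = *(size_t *)user_data;
    return done < limit;
}
```

Because the operation checks in after every step, the worst-case delay between the user asking to cancel and the work stopping is one step, so the step size is worth choosing with that in mind.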
So finally, I'd just like to point out that you should get familiar with the performance tools we have on the system. There are a lot of utilities on the system, many of which we haven't talked about at all. But get familiar with Quartz Debug, Sampler, Spin Control, MallocDebug, Shark.
Go check out this session on Carbon Performance Tools. Also, become familiar with the new modern APIs. Quartz, which you should definitely be using. HITheme, use that on Panther. Take advantage of Carbon events. This is going to give you more control over your application, better performance. Take advantage of HIView for all of your custom drawing.
And keep in mind that we've spent a lot of time optimizing the toolbox and other parts of Carbon, and it's really everybody's responsibility to keep performance going. So finally, to go through the roadmap and other reference information, I'd like to bring back Xavier.
Right on time. Excellent. Thanks, Curt.
Okay, I really hope that you guys got the meat of what we're trying to achieve here, and it gave you a couple of good ideas to take back to your offices. And I would really, really appreciate it if you could spend some time finding out what's going on in your application. Once again, launch time, try to find out about the drawing. We have great tools. I don't know if we mentioned it, but we have Shark as well, which is a very, very cool application.
A little bit like Sampler, but it's going to give you ideas as well on how you can optimize specifically for the G5. You know, saying maybe you should unroll some loops, maybe you should actually use AltiVec in that part. So remember, we have a brand new set of tools that can help you identify the hotspots in your application. So make good use of it.
All right, just after this session, we have a feedback forum for the toolbox. I'd encourage you to attend if you have any questions, if you want to give us your feedback, things that you like, or things that you really like, or things that you really, really like, and things that you don't like, okay? Maybe. Session 305 on Tuning Software.
The first session of the talk with Performance Tools was Wednesday for those of us that can travel in time. And tomorrow morning, we're going to have a great session, session 310, on debugging and tuning Carbon applications with Apple's tools. We'll be showing you a couple of cool tricks with Xcode on how to improve your debugging experience, how to get the most out of your time while debugging Carbon applications. I encourage you to go there.
Oh, I'm forgetting the best, of course, for the end. Session 425, where Mr. Ed will be talking about HIView in depth. HIView was introduced last year, actually not in this room, but at WWDC. It's really the future of drawing controls on Mac OS X. And if you have any custom controls in your application, you really have to go there.
Should you have any questions, you guys have been changing the slides again. Anyway, should you have any questions, please don't hesitate to send me an email at [email protected] and I'll be more than happy to find out how we can help you. We have some documentation and I encourage you to go on our website on developer.apple.com.
We're completely revamping our website to make it easier for you guys to actually find information. So check it out, see, you know, all the sections that we have on Carbon. We've added in the last month or so, actually, brand new sample code as well for HIView. Look inside the Carbon sample code and you're going to see two different sections. You're going to have HIToolbox and HIToolbox for Mac OS X. So just check out the HIToolbox for Mac OS X section. Great start there.
A couple of technical notes: Mac OS X QuickDraw Performance, to help you find out what's going on with your drawing in your app and what you should be doing now on Mac OS X to get the best performance for drawing. We have things on the File Manager and some Q&As available here for improving ATSUI text drawing.