WWDC05 • Session 637

WebObjects Performance Optimization

Enterprise IT • 57:39

Optimize and tune your WebObjects application. Learn about tools and techniques for collecting and analyzing application performance and identifying areas for improvement. We will give tips for improving WebObjects, EOF, and Java performance.

Speakers: Bill Bumgarner, Max Muller, Ravi Mendis

Unlisted on Apple Developer site

Transcript

This transcript was generated using Whisper; it may contain transcription errors.

I'm Bill Bumgarner. I manage Core Data for Tiger, and I'm doing some Xcode stuff now, and I've always had a love of WebObjects and have been using it forever. And of course, as you know, WebObjects is the world's best rapid application development tool for app servers. And with that is the paradigm of just-in-time engineering. So in the interest of that, we're making sure that two of the other presenters are late. Ravi's here. Max will be here soon. So let's get going. We'll fill in the blanks. I'm Bill Bumgarner. I will be joined by Max Muller and Ravi Mendis. Ravi wrote a wonderful WebObjects book. And both Ravi and Max work on the Music Store. As we dive into this, we really wanted to do something completely new this year, and the first thing we wanted to do was take a step back and really look at the landscape. What's your target market for deploying WebObjects applications and application servers in general? And in the 10, 12, 13-some-odd years that we've been doing this stuff now, there's been a few things that really have changed quite a bit, and there's some surprising things that have stayed the same. In particular, we have oodles of bandwidth now. Sadly enough, the United States has less bandwidth than most of the rest of the countries in the world. But that changes the picture a bit. We need to also look at the current state of the art of HTML-based user interfaces and the growing market of alternative interface technologies. And finally, we really need to look at how this has changed optimization of WebObjects applications and, more importantly, WebObjects solutions, because rarely is it just about one or two applications now. So bandwidth. Yes, there's just tons of bandwidth out there, and it's cheap as dirt these days. Broadband is widely deployed. I mean, you even have broadband to phones now. And it's more and more commonly a requirement in sites, because there's such an emphasis on multimedia content.
We're seeing more and more often there's flash stuff being shoved in there and movies and sounds and you name it. So the little bit of HTML your applications are generating anymore, as long as you're generating it very quickly, the actual bytes on the wire just don't matter that much. Also, server environments have a lot more bandwidth than they ever have before. Your back planes and your server farms are going to typically be gigabit ethernet. They also have a much faster connection to the net at large, or at least you can obtain faster connections for cheaper than ever before. And just basically the whole server infrastructure has a lot more bandwidth and power available to it.

So where's the clicker thingy? There it is. And now let's look at the user interface, though. Now what's interesting about the HTML user interface is that it really hasn't changed at all in the last few years. It's still kind of primitive. We have CSS now, which has allowed us to shuffle some of the bits around for defining the user interfaces. You can push more of the interface architecture, the construction of the UI, into these CSS files, and then apply them across the site so that your customers even can have more control over the look and feel. But by and large, it's still TDs and TRs and forms and inputs and the same stuff we've been doing for a long time. But what has changed is that the baseline feature set of sites has become much more intense. They're much more dynamic. In particular, there's more inferential content. More often now, you have to generate content based on where the user's been or what they may have done in the last year or purchase histories. You may have to be generating content based on partnership agreements and marketing agreements and buying records of all the customers on the site. And these are really the features you need to start implementing in your systems, and not just commercial marketing systems or point of sales systems, but also in back end systems, to stay competitive. As we've seen with Spotlight on Mac OS X, what does Spotlight do? Well, it allows the user to stop thinking about the individual items in their environment and to start thinking about just finding stuff, just letting the system suggest where things are and point out the right directions. You're also required to provide more customization to your users. Given that you may have hundreds of thousands of items that the user is managing at any one time, as opposed to dozens or hundreds, the user needs to have a lot more control over the presentation.

And what all this means is that you now have a requirement to use less static content. And there's also, obviously, the increased integration with external resources. Various different kinds of web services, like Google's Ad Sense and the search features and some of the product matching features and forms and things like that, they have really matured a lot. Some of them are very impressive. And integrating with those is becoming more and more critical to remain competitive.

Now, also at the same time, while HTML hasn't advanced, alternative interface technologies have become huge. It's not just about HTML. Dynamic Flash production is becoming more common, Java applets, all kinds of things. And one of the areas that's really, really important, or has grown hugely in the last few years, has been web services. And really, this is one of those overused buzzwords that gets thrown around; just think of it as an XML API. It's a bunch of methods. You call them with an XML call, and it gives a bunch of XML back. Now, on the XML side, one of the most popular ones is RSS, and you see a lot of new services being pushed out around it. That's really just an alternative user interface. And then, of course, the most popular alternative user interface in the world is the iTunes Music Store, which is really just an XML interface on a WebObjects application with a very nice rendering engine on the other side. Likewise, on Tiger itself, many of the dashboard widgets are actually using calls back to web servers to spew out information. So you have these beautiful photorealistic renderings of otherwise simple XML content.

Now, on the HTML front, while HTML hasn't changed, it's being used in very new and different places. You see, for example, in Xcode, the documentation window is an HTML viewer, and it has integration with the servers at Apple Developer Connection as well as static content on your machine. Plus, the user can option-double-click a piece of code and go straight into the documentation. So now you have source code integrated with HTML. You also have WebKit, and on Windows, you have other widgets that allow you to embed HTML in any random application. So now you suddenly have the need to have very small fragments of HTML that are sitting in the middle of otherwise normal desktop applications. This changes the way your backend applications need to deal with the data.

And of course, app servers now generally have to talk with other app servers to keep track of what's going on. Sometimes it's not sufficient just to do all the communications through a database or through notifications and then faulting in new data. So the end result really here is your application is going to be generating content that's destined for use in all kinds of contexts outside of a big browser window. And that really changes the optimization picture.

In particular, now WebObjects applications more than ever before, application servers more than ever before, are all about the database. And it's about active monitoring of the environment and load balancing adaptively in response to the changing usage patterns of the application server environment and of your customer base. The increased use of inferential content means that you've got a lot of content that's going to be dynamic where it was static before. Now, what's interesting about that is even though it's inferential data, and user interface that needs to be generated in response to users, that doesn't mean it has to be real time. So if you look at things like Akamai and some of the caching technologies that are out there-- and that's when we start talking about really huge sites-- the ability to pre-render and predict where users are going to go, adapt to the popular pieces of your site, and render that stuff statically so that when a user comes in and asks for something that is dynamically generated based on sales data or whatever, it's actually coming from a static cache. That's where you can really achieve a lot of performance. You can vastly reduce the loads on your servers.

The higher complexity applications also will have large traffic loads that exhibit these failure modes that are really weird. It's just, you know, you don't know where your customers are going to come from. You don't know what they're going to focus on at any one time. And because of the high traffic loads combined with this adaptive or inferential content combined with the need to integrate with a bunch of different systems, it's complex. And simulating that kind of load on the desktop environment or on a developer's desktop or even in a pre-production server rack is nearly impossible.

So it's absolutely critical that you instrument your application so that you can understand the load patterns and understand them accurately and adapt accordingly. And finally, with very complex systems, it's extremely difficult to always know what's going to happen. They are chaotic systems, and with any chaotic system, you know, one little variable can change in exactly the wrong way, and the whole thing will go insane. So, you know, peaks happen.

Pathological user behavior happens. You got to deal with it, which is why not only analysis, but then putting the tools in place that detect that kind of thing and figure out what needs to be killed or figure out how to restart stuff or things like that is very important. And in some cases, you know, it's a case of adapting the loaded environment by adding servers on the fly or adding particular kinds. So that's where we are today. We are in a very different world than we were when I was on stage six years ago talking about WebObjects optimization. It's not just the HTML generation. It's a very complex picture. And we are lucky enough to have two people with us today who have worked on the highest traffic WebObjects site in the world. And you can pretty much rest assured that any performance problem you've seen, they've seen it on a scale 100 times larger. So with that, I would like to bring up Ravi Mendis to talk about optimizing design. Thank you, Ravi.

Thank you very much, Bill, for that introduction. Good afternoon, everybody. For those of you who didn't catch it, my name is Ravi Mendis, and I work with Max on the iTunes Music Store. Now, for the first half of this presentation, we thought we'd focus on design optimization. As Bill mentioned just now, typically you optimize an application once it has been built. That is, you identify those bits of your application that are used the most, and you optimize those bits. It's the 90-10 principle that I'm sure most of you are fairly familiar with. However, that does not mean we can't build and design our application with performance and scalability in mind. And to that end, we have three design goals we'd like to focus on today, and the first of these is inspired by a personal hero of mine, the legendary Bruce Lee.

"True refinement seeks simplicity." Now, his philosophy or mantra translates very well into software engineering, in that sometimes, to improve something, we need to make it simpler. And for those of us who can't afford the luxury of re-engineering, it helps to keep it simple to begin with. Design goal number one: keep it simple. Re-engineering is expensive. And this is particularly true of your database and model. It becomes incredibly expensive, if not prohibitively difficult, to re-engineer your database and model once you have thousands upon thousands of users, and your database is gigabytes upon gigabytes in size. So you need to get it right from the start. And we do that by keeping the model simple. Now, we're smart guys, and we like to come up with interesting solutions to problems. So it takes a huge amount of discipline and restraint not to over-engineer anything. As a rule of thumb, it helps to not over-abstract your model. Arguably, a concrete model is superior to an abstract one. And in general, a deeper understanding of your business or your problem domain will result in a more concretely architected model.

And finally, minimize use of inheritance. EOF inheritance adds a significant amount of complexity to your project. And in particular, in applications that scale up, like the iTunes Music Store, you must bear in mind the performance penalty that you might incur if you do implement inheritance. And while we're on that topic, let's compare the three inheritance methodologies.

What we're going to look at here is what happens if you were to perform a fetch of entities (of EOs, rather) from a parent or root entity in an inheritance tree. So for single table inheritance, this would simply require one fetch. But in an inheritance tree modeled using horizontal table inheritance, say an inheritance tree consisting of n concrete subclasses, a fetch into the parent or root entity will result in n fetches. The equivalent fetch using vertical table inheritance will also be n fetches; in fact, n fetches joining over m tables, if your inheritance tree is m levels deep. So very clearly, vertical inheritance is at a disadvantage here. And perhaps that may be why it has gone out of fashion and out of favor in recent years. So typically, you'd be choosing between single table inheritance and horizontal table inheritance. Now, you might also say that, well, I rarely need to fetch the entities, the EOs, in a root or parent entity in a horizontal table inheritance hierarchy, and that just might be the case. It all depends on your requirements. However, if you do choose horizontal table inheritance, do be aware, or be careful, of what are referred to as ambiguous relationships, that is, relationships that point to a parent or root entity in an inheritance tree.

For example, if we have a horizontal table inheritance hierarchy consisting of 20 concrete subclasses, if you were to have a relationship into that parent or root entity, in order to resolve that relationship at runtime, EOF will fire 20 fetches to the database. And in a big application, that can be a significant penalty. So when in doubt, use single table inheritance.
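The fetch arithmetic above can be jotted down in a few lines (plain Java, no EOF involved; the counts just restate what the talk describes):

```java
// Fetch counts EOF issues when fetching from the root entity of an
// inheritance tree, per mapping strategy (illustrative arithmetic only).
public class InheritanceFetchMath {
    static int fetchesFromRoot(String strategy, int concreteSubclasses) {
        switch (strategy) {
            case "single":     return 1;                  // all rows in one table
            case "horizontal": return concreteSubclasses; // one table per subclass
            case "vertical":   return concreteSubclasses; // n fetches, each joining
                                                          // over the m tables to the root
            default: throw new IllegalArgumentException(strategy);
        }
    }

    // Resolving an ambiguous relationship into a horizontal root entity
    // fires one fetch per concrete subclass.
    static int ambiguousRelationshipFetches(int concreteSubclasses) {
        return concreteSubclasses;
    }
}
```

So a single to-one relationship into a 20-subclass horizontal root costs 20 fetches just to resolve, which is the penalty being warned about here.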

Let's discuss a couple more features about inheritance, a couple more pointers and caveats. So the first one is try and use flat inheritance hierarchies. Avoid deep trees. Ideally, keep them simple, keep them one level deep. Use abstract superclasses. Don't use concrete superclasses if you can help it. And actually, the second point is very, very important. Do not combine or mix and match the methodologies. One must remember that EOF is an object-relational tool.

And these three methodologies are approximations to OO in a relational model. And although it is technically possible to mix and match, that is, to have a hybrid hierarchy consisting partly of, say, horizontal table inheritance, you shouldn't. And finally, be economical with inheritance. And by this I don't mean avoid inheritance altogether. By all means, use inheritance if it helps to enrich your model. However, just don't go overboard.

And case in point, as Bill mentioned earlier, the iTunes Music Store is by far most definitely the flagship WebObjects application at the minute, the largest, certainly most visible and talked about. Now, we have over 400 entities defined in a dozen or so models. However, Max and his team of the original iTunes Music Store architects made the incredibly bold, if not quite radical decision to keep inheritance to a minimum. We have just one inheritance tree that is one level deep. And this decision has actually paid off hugely, if not enormously, now that the music store is scaling to the dizzying heights that it is.

And with that, I think we have design goal number two. Exploit the database. Now, these database servers have been around for a long time, two or three decades. They're based on solid mathematical foundations, and they're incredibly good and incredibly efficient at doing what they do. If you want to build a fast application, as Bill mentioned earlier, you want to really leverage the power of your database to drive your application. There are three features of EOF that I'd like to look at today.

I should say that EOF, although we love it for insulating us from the intricacies of SQL, at the same time also obscures from us the richness and the power that the language provides. Subqueries, raw rows, and prefetching can be used to more fully exploit SQL in our applications. So subqueries can be used to implement aggregation.

They're excellent for that. Raw rows are ideal for fast search results, and prefetching can be used on edit and inspect pages. Let's take a look at subqueries in more detail. Given that there's not a lot of documentation on subqueries at the minute, this slide is a quick how-to on constructing an aggregate attribute as a subquery.

For those of you familiar with the Movies model that ships with WebObjects, let's consider for a minute the requirement to add the number of movies to a studio list page. Now typically, this would involve performing a count on the movies relationship. Now there are two disadvantages to this. The first is that when you fetch and display the list of studios, that will perform not only the fetch on the studios table, but will also perform n fetches on the movies table for the n studios being listed. The second disadvantage is that in order to perform the count in memory, EOF has to fetch and store those movies in memory. So we can actually implement this slightly more elegantly, or more efficiently, as an aggregate attribute exploiting a subquery. So on the Studio entity, implement a new attribute called moviesCount. And the SQL that you see up there is actually the subquery that we insert into the column field.

You then set the attribute to be derived, and you make it read-only and not locking. And that's all there is to it. And the advantage here is that when you now display the studios, you will only be performing one select statement. In fact, it will be a select with a subquery. And you won't have to fetch the movies into memory as well. So there are two advantages to this.
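To make the round-trip saving concrete, here is a plain-Java sketch with an in-memory map standing in for the database; the table and column names in the comments (STUDIO, MOVIE, STUDIO_ID, T0) are assumptions for illustration, not taken from the slide:

```java
import java.util.*;

// Round trips for the studio list page: counting through the movies
// relationship versus a derived subquery attribute. An in-memory map stands
// in for the database, and queryCount stands in for SELECTs EOF would issue.
public class MoviesCountSketch {
    static final Map<String, List<String>> moviesByStudio = Map.of(
            "MGM", List.of("Rocky", "Rocky II"),
            "Paramount", List.of("Chinatown"));
    static int queryCount = 0;

    // Naive: fetch the studios (1 SELECT), then fire one movies fetch per
    // studio and count in memory (n more SELECTs, plus n lists of EOs held).
    static Map<String, Integer> countViaRelationship() {
        queryCount++; // SELECT ... FROM STUDIO
        Map<String, Integer> counts = new TreeMap<>();
        for (Map.Entry<String, List<String>> e : moviesByStudio.entrySet()) {
            queryCount++; // SELECT ... FROM MOVIE WHERE STUDIO_ID = ?
            counts.put(e.getKey(), e.getValue().size());
        }
        return counts;
    }

    // Derived attribute: one SELECT whose moviesCount column is a scalar
    // subquery, e.g. (SELECT COUNT(*) FROM MOVIE M WHERE M.STUDIO_ID = T0.STUDIO_ID)
    static Map<String, Integer> countViaSubquery() {
        queryCount++; // the single SELECT with the subquery column
        Map<String, Integer> counts = new TreeMap<>();
        for (Map.Entry<String, List<String>> e : moviesByStudio.entrySet()) {
            counts.put(e.getKey(), e.getValue().size());
        }
        return counts;
    }
}
```

For n studios, the relationship-based count costs 1 + n round trips and faults all the movies into memory; the derived subquery attribute costs one round trip and faults nothing.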

And a couple more advantages and caveats of subqueries. I should say that they're excellent for implementing other sorts of aggregate functions as well, like max, min, and average. They're incredibly fast. These database servers are very good at executing these subqueries. And thirdly, they can be implemented fairly elegantly as EO attributes. There isn't a need to execute raw SQL at the adapter level in order to do things like this.

However, do know that one of the caveats of using subqueries is that on large data sets, there can be a performance penalty. And this happens if the table you're subquerying is too big, or the table that you're subquerying from is too big. In either or both of those circumstances, you will get a significant performance penalty. So do use it with caution; as with any technology, it can also be abused. And as a case study, we use subqueries extensively in the Music Store. In fact, they play a critical role in the content management system that we work on.

So next, let's take a look at raw rows. Many of you, I'm sure, are familiar with raw rows. So we'll go through this really quickly. The advantages are that they're lightweight. They're easy on memory because you fetch them as dictionaries instead of fully-fledged EOs. And the second point is you can generate optimal SQL, as in select or join statements.

However, one of the disadvantages, of course, is that because raw rows are read-only, you need to fault them in order to convert them to an EO before you can edit them. And that requires a fetch to the database. So when to use raw rows? They're ideal, excellent, for fast search pages, list pages, and finally, to cache read-only non-referenced data. Now some applications, front-end web applications, tend to be mostly read-only, in which case it would sometimes make sense to cache some of your non-referenced data in your shared editing context, a feature that we shall talk about very shortly. And as an example of such read-only non-referenced data, you could consider the iTunes Music Store. The front end is essentially a read-only application. So when an iTunes client views albums, artists, and songs, that would be an example of read-only, non-referenced data.
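A minimal sketch of the raw-row idea in plain Java; none of this is EOF API, and the Country class and row keys are made up for illustration:

```java
import java.util.*;

// Sketch of raw rows: fetch results as plain dictionaries rather than
// fully-fledged EOs, and "fault" a row into a full, editable object only
// when you actually need to edit it (a second, targeted fetch in EOF).
public class RawRowSketch {
    static class Country {          // stand-in for a full EO
        String code, name;
        Country(String code, String name) { this.code = code; this.name = name; }
    }

    // Raw-row fetch: one row = one read-only Map, no EO instantiated,
    // so the memory cost per row is small.
    static List<Map<String, Object>> fetchRawRows() {
        return List.of(
                Map.of("code", "US", "name", "United States"),
                Map.of("code", "AU", "name", "Australia"));
    }

    // Faulting: convert one raw row into an editable object on demand.
    static Country fault(Map<String, Object> row) {
        return new Country((String) row.get("code"), (String) row.get("name"));
    }
}
```

A search page would render straight from the dictionaries and fault only the single row the user chooses to edit.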

And with that, we move on to design goal number three. Now, this is-- Design goal number three is also inspired by a personal hero of mine, the aviator Howard Hughes. It can be said that building applications is a little bit like building airplanes, that sometimes to make planes go faster, or for that matter, go further, they make them lighter. Certainly they did that in the days of Howard Hughes in the film "The Aviator".

Howard Hughes breaks the speed record by flying one of his own planes with the fuel tank almost emptied. He only had just enough fuel to break the speed record. Indeed, nowadays we don't have to go to such extremes, but the challenge of modern aviation is no longer about speed or distance; it's about capacity, about building larger planes that can carry more people. And the equivalent could be said of the iTunes Music Store: no longer is performance so much a challenge as is scalability. And in order to build ultra-efficient, highly scalable applications, you make them lighter. So design goal number three: minimize your memory footprint and optimize your session state.

And the logic behind this is the less memory your session consumes, the greater the number of sessions your application will be able to service. So try and implement as many stateless pages as possible. A rule of thumb: if a page does not require authentication, that is, user login, it can then be implemented as a direct action. So use direct actions and stateless components. Number two, share data across sessions. Leverage your EOF shared editing context. It's a relatively new feature of EOF, and I hope many of you do use it. Third, try not to add layer upon layer of middleware to your application. Apart from obscuring the use of standard APIs, it adds to the memory footprint of an application. And as I said before, the more memory your application consumes, the fewer the number of instances you'll be able to run and the fewer the number of sessions you'll be able to service.

And fourthly and lastly, partition your application functionality. A fairly typical partitioning of application functionality is front end, perhaps read-only, and a back end admin tool. And an example of that would be, again, the iTunes Music Store. We have a pseudo web service front end, which is pretty much read-only. And we have a content management tool as a back end. Now these two applications are just the tip of the iceberg. They form a suite of applications, in fact, that make up the iTunes Music Store.
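The memory-footprint logic above is just division, but it is worth making explicit; the heap and per-session sizes below are invented round numbers:

```java
// Back-of-envelope: fewer bytes of state per session means more concurrent
// sessions per application instance. Numbers are illustrative only.
public class SessionCapacity {
    static long sessionsPerInstance(long heapBytes, long bytesPerSession) {
        return heapBytes / bytesPerSession;
    }
}
```

With a 512 MB heap, a 2 MB session caps you at 256 concurrent sessions, while trimming state to 200 KB per session raises that to over 2,600.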

Let's take a look at the first point in a little more detail. Optimize your session state. As I said, we do this by really taking advantage of direct actions and making as much of your application as possible stateless. So use direct action pages wherever possible, and then fetch EOs into local editing contexts.

Now, for example, if we were to have a search page that fetched EOs or search results, into your session's default editing context, those EOs will stay in memory until that session is discarded, that is until the user has logged out. It makes more sense to fetch those sort of results, search results, into editing contexts that are local to the component or the action, because as a result, they will be discarded at the end of the request response loop.

And third, implement stateless WebObjects components. Now, these are singleton components that service the entire application. And they're very handy. They can be used in stateless direct action pages as well as WebObjects pages. And finally, and this is in fact very key, is to minimize your session state. Try not to scope your variables within your session. Only do that when absolutely necessary. And this helps to keep the session memory down to a minimum.

Let's look at the EOSharedEditingContext. Again, I'm sure many of you are fairly familiar with this, so very quickly, the advantages. It's ideal for sharing read-only data between sessions, the emphasis being on read-only. In fact, Bill suggests that "shared editing context" is a bit of a misnomer; it should really be your read-only shared editing context. But there you go. It's also thread-safe. And thirdly, in fact most importantly, it reduces database traffic.

The kind of reference data that we mean, that should be stored in a shared editing context, is things like countries, currencies, states, things like that. Now consider for a minute a page that has a pop-up consisting of countries. Now every time that page is requested, if countries were not shared or cached as shared EOs, EOF will perform a select statement on the database.

In a large application like the iTunes Music Store, these frequent but small fetches could, in fact, throttle the database connection. So shared editing contexts serve a second purpose, and that is to reduce database traffic. And as a case in point, in the iTunes Music Store we have over 40 shared entities, and we experienced a significant performance boost using shared EOs.
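The countries pop-up example can be sketched in plain Java; this shows the caching pattern a shared editing context gives you, not the EOSharedEditingContext API itself:

```java
import java.util.*;

// Sketch of what a shared editing context buys you for reference data: the
// countries list is fetched once and then reused read-only by every session,
// instead of one SELECT per page render. Plain Java, not the EOF API.
public class SharedReferenceData {
    static int databaseFetches = 0;
    private static List<String> countriesCache;

    static synchronized List<String> countries() {
        if (countriesCache == null) {
            databaseFetches++; // the one and only SELECT against the database
            countriesCache = List.of("Australia", "Japan", "United States");
        }
        return countriesCache; // immutable, safe to share across sessions
    }
}
```

A thousand page renders now hit the database once instead of a thousand times, which is exactly the throttling relief described above.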

And finally, we come to the second half of the presentation. But before I introduce Max, who's going to demonstrate and talk about some advanced optimization techniques, I'd just like to make a few points. The key to successful optimization, just like any other science, is observation; careful observation. Monitoring the application usage is what can help identify those bits of the system to be tweaked and optimized. And to that end, there are a set of tools at your disposal to help with this analysis. And the first port of call, as I'm sure most of you are probably aware, is to turn on your SQL logging. And you do that by setting EOAdaptorDebugEnabled on.

The chances are that 20% to 30%, if not more, of your optimization issues will be database-related, in which case you will actually identify them at this point. After this, we need to turn to some more sophisticated tools to optimize our application. The first would be your WOEvents page, to profile your application using WOEvents, something Max is going to show very shortly, and to monitor your usage using WOStats. You can even take that a little further and use web server log analysis tools like Webalizer and WebTrends to monitor usage.

And with that, I think I should introduce you to Max Muller. He's Manager of Content Provisioning at the iTunes Music Store. Everybody, Max. Excellent. Thank you, Ravi. As Ravi mentioned, my name is Max Muller. I am engineering manager for content provisioning and operations at the iTunes Music Store. I've been in the WebObjects field for many, many years and with the Music Store since the inception. So I've experienced all of the different performance problems and tuning problems that we've gone through as the load has increased and the performance has become very critical with the Music Store. So what I thought I would do is actually go through a number of the knobs and screws that you can tune for your entire deployment.

Oftentimes we focus all of our energy on basically getting the WOA to perform exactly the way we want it to, making sure it's not doing extraneous fetches, these types of issues. But then when we actually go to deploy the application, we find that the performance is suffering or the application is not responding in the correct way. We find these kinds of very strange anomalies going on, where we have one application that's taking down all the rest of the applications, and these types of issues. So the first point at which any request comes into your site is through the web server. So this is essentially the WO adapter. This is the piece of code that runs in Apache or one of the other web servers. And it is the Achilles' heel and the choke point for all requests into your application. So it's important to understand exactly how to tune that before you actually even get to your application.

So in terms of the WO adapter, there's a number of different ways in which it can be configured. And the most important bit is that the default configuration is most definitely not what you want for every single one of your applications. It's probably not even what you want for one of your applications.

So within the parameters, the first point to look at are all the timeout settings. Within an application's settings, you have what's called the connect timeout, the send timeout, and the receive timeout. So breaking that down, I thought I'd put up here essentially a snippet from exactly how we have the Music Store configured today for one of our storefront applications; it's a sessionless, read-only application that essentially renders all of the album pages, store pages, these types of pages. So the connect timeout is how long the adapter will wait trying to connect to your application. If your app's down, if it is completely busy processing all the other requests and its queue is completely full, how long is it going to sit there and basically wait on your application? Setting this to something very high, this is where you start to basically get "no instance available" responses back. Setting it to something very low means that if your app is very busy and just doesn't have time to take in the request, then you're going to start snapping connections and the app is going to start getting marked as dead. So playing with this setting is important in knowing how much time to basically allow as a buffer for that. The second one is the send timeout. This one's not nearly as important; it's how long the adapter is going to wait to send the request into your application.

This is more important if you're dealing with large uploads of files, these types of things, because that's how long it's going to wait before it basically snaps the connection. The third one is the receive timeout. This is by far the most important of the three timeouts. This is how long the adapter is going to wait for the response. So setting this to a very high value means that if your application wedges, then essentially it's not going to be getting the response back, and the adapter is going to wait there forever. The most important part about that is that as long as it's holding the connection open, your web server is holding that request open. On OS X, you can handle about 800 requests before the whole box wedges up, and you have to go in and start killing Apache processes.

So having a very high timeout here means that if something goes wrong in your environment and your apps suddenly stop responding quickly, that's essentially your window of time before your web server backs up. For the store, we have it configured 5, 20, and 30: we're going to wait 30 seconds for the page to render, after which point we're going to snap it. The second bit is the retries. If you don't get a response back from an app, how many times should the adapter take the request and say, well, let me go try this application, let me go try this one? Within the store pages, that's fine — they're read-only. But be careful configuring this for a finance application: somebody clicks buy song, the adapter can't get a response back from one finance instance, so it says, hmm, I'll go ahead and hand it off to this one. And that would mean the user might say, well, I only clicked buy song once, and all of a sudden I got charged three times. So you have to be careful with that one, because that's the number of times it's going to retry.
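As a rough sketch, these knobs live in the HTTP adapter's configuration (served by wotaskd, or hand-maintained). The application name, hosts, and especially the attribute spellings below are illustrative assumptions — check the adapter documentation for your WebObjects release for the exact names:

```xml
<!-- Illustrative adapter config fragment; attribute names are assumptions,
     not verified against a specific adapter release -->
<application name="Storefront">
  <!-- 5s to connect, 20s to send the request, 30s to wait for the response -->
  <instance id="1" host="app1.example.com" port="2001"
            connectTimeout="5" sendTimeout="20" receiveTimeout="30"/>
  <!-- For anything with side effects (a buy request), keep retries at zero
       so a slow instance can't cause a duplicate charge -->
</application>
```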

And the third one is the dead timeout — that's the one in the far right-hand corner. When you pull this page up, the dead timeout is actually a countdown. If the adapter can't connect to an application, or the application doesn't respond, the app gets marked as dead, and while an app is dead, no other requests are routed to it. The dead timeout here is 20 seconds. So if, for whatever reason, your box goes offline, the adapter is going to route a request to that instance once every 20 seconds — and that request is for sure going to get rejected and handed off to another instance. Or if you have a finance application and you've just lost a box, the user is going to get a nice little "sorry, the Music Store is not available right now, come back again later." Once every 20 seconds, one user out there is going to get that response for every application that's dead and completely offline. The dead timeout is also very important when you bounce applications. If it takes three minutes for your app to come up and you bounce it, you're going to route nine requests to it before it's even able to process anything. So the temptation might be to bump the dead timeout up very high. But then you start to get this nice swimming effect: with a three-minute dead timeout, if you get hit with a large load and the apps just take a while to process it, everything gets marked as dead.

Until you only have a handful of instances that aren't dead, at which point the adapter zooms in on those few and just hammers them. Those eventually fall over, and then nobody's processing your requests. Your web server backs up, then all the dead timeouts expire at one time, at which point the adapter says "here you go" and throws everything back in, everything gets marked as dead again, and you get this swimming effect: your web servers back up, your app servers go idle, the app servers come back up, and it just oscillates back and forth like that.

So "no instance available" means the adapter tried, up to the number of retries, and wasn't able to get a response back. That's really all it means. I get that question from just about every person who starts on the team, so I thought I'd throw it out there.

So that's the adapter. Those are all the knobs you can adjust at the adapter level for how requests get to your applications. It's important, when you're starting your deployment, to sit down and go through exactly what you want. The third thing I should point out is the instance numbers. If you use the default round-robin load balancer, it's important to get the mixture correct. The first few times we deployed our search application — a very high-traffic, fully multi-threaded WOA — we noticed these ripples that would go through when we'd start to get under load. And it took us forever to figure out what had happened: when the SAs set up the adapter configuration file — we were running four to eight instances per machine — they had simply put instances 1 through 8 on the first machine and 9 through 16 on the second machine. So we've got a whole fleet of machines here, but the adapter just walks sequentially down one box. It's good to get a nice heterogeneous mix: even if you're going to run four or eight instances on the same box, spread out the instance numbers, or else you're going to hit some apps harder and not take advantage of all the instances you have available.
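The numbering fix can be sketched like this — hostnames and IDs are made up, and the XML shape is the same illustrative adapter-config form as above:

```xml
<!-- Bad: round-robin by instance number walks all of hostA before hostB -->
<!--   hostA: instances 1-8,  hostB: instances 9-16 -->
<!-- Better: interleave, so consecutive instance numbers land on different hosts -->
<instance id="1" host="hostA" port="2001"/>
<instance id="2" host="hostB" port="2001"/>
<instance id="3" host="hostA" port="2002"/>
<instance id="4" host="hostB" port="2002"/>
```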

So now, after the adapter, before we've even started processing the request — what are some of the knobs we can tune at the WOA level? The first is the listen queue size. The listen queue is essentially the server-socket thread that sits there and listens for requests. If it's full, the adapter's connect attempts can't get in — that's the connect timeout from before. It's also essentially the complete queue of requests your application will hold to hand off to worker threads. Now, the default is 128, and what we noticed is that this is an incredibly high value. What we would find is that some slowdown would happen within the system — the credit card processing would have a blip, or the web servers would freak out for a moment, or we'd get a database oopsie, something along those lines. And we'd notice it would take 15 to 20 minutes before all of the apps would come back down: CPU up through the roof, apps sitting there processing, doing a whole lot of stuff, web servers still freaking out. We couldn't figure out what this was, because we'd see the one blip that we could definitely identify, and then 15 to 20 minutes of pain afterwards. We're going through all the different settings going, what's going on here? And nothing but broken pipes throughout the logs. What we actually found was that we hadn't adjusted the listen queue size. We're under quite a bit of load, we get a blip — the apps stop for five or ten seconds, waiting on the database to come back, waiting for a lock, something we hadn't anticipated.
Meanwhile, the adapters are sitting there continuing to feed your WOAs, and the listen queues are queuing up.

So then what happens? The lock clears, and your app starts processing again, but it's got 128 requests to work through — not to mention the adapter is ready to give you more. And if the blip was long enough — remember, the receive timeout was set to 30 seconds — you've got 128 requests there that the adapter has long since severed the connection on. Your WOA is going to sit there and diligently chew through every single one: finish rendering the page, get it all ready to go, then go to write it out and — oh, broken pipe. Well, let me grab the next one off the queue. Work through the whole thing, then — oh, broken pipe again. So the queue size is pretty important: it should be roughly the number of requests you can handle in flight and still be within that receive-timeout window. At 128, you're going to need one hefty, hefty WOA with some really fast responses to get 128 of those out in time. All right, next step: after the queue, you get handed off to the worker thread.
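Dialing the queue down is a launch-argument change. `WOListenQueueSize` is the property name I'd expect in WebObjects 5.x (`WOApplication.setListenQueueSize` is the programmatic equivalent), but treat the spelling and the value as assumptions to verify for your release:

```
# Launch-argument sketch (illustrative value):
# size the listen queue to what the app can actually drain inside the
# receive-timeout window, rather than the default of 128
-WOListenQueueSize 16
```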

So the worker threads — these are the ones that actually go through dispatchRequest, generate something, and hand it back. The thread count is interesting in terms of how you build your application. Typically we won't run a worker thread count of more than four within the Music Store, and that's mainly because, even for our multi-threaded apps, we don't want too many of these going at the same time. We're all on Xserves — we've got two processors — and our fully threaded apps are CPU-bound.

So if you get 256 of these threads going on an Xserve, all of them sit there waiting for processor time, and it actually brings down your overall performance. You've got to be careful with the worker thread count, because the trade-off you want to make is between the thread count and the concurrent processing your CPUs actually have available. Even on a single-threaded app, the temptation is to have just one worker thread there. But you typically want two or three, because it sometimes takes a long time for large responses to stream back out, even after you've finished in dispatchRequest. By the way, FYI: dispatchRequest is fully threaded even if you have WOAllowsConcurrentRequestHandling set to false.
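As with the listen queue, this is a launch argument. `WOWorkerThreadCount` is the name I'd expect for the era discussed here (later releases split it into `WOWorkerThreadCountMin`/`WOWorkerThreadCountMax`); again, verify against your version:

```
# Launch-argument sketch (illustrative):
# CPU-bound app on a two-processor box -- keep the worker pool small, but
# above 1 so a slow client writing back doesn't block the next request
-WOWorkerThreadCount 4
```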

So anyway, those are some of the knobs you can tune before you even start processing requests. One word of caution I would point out: we did have cases where we were looking for long-running requests that had screwed something up, and the first implementation was just to nuke the thread. FYI, that will leave EOF in a very nasty state. Best to just System.exit() if you want to do something like that. So that's your WOA — that's how you can tune your WOA server.

So the last bit I'd mention is how you then tune your deployment. You now have your adapter configured correctly, and you know how you want your WOA to behave, so now you're looking at your whole deployment architecture, because you're ready to deploy. One thing I'd mention is that you might want to separate your web servers. Within the Music Store, we never want to stop taking orders, ever. But at the same time, we have other very high-traffic things like search and the store pages.

And if all of your web servers are sharing one Apache conf file and handling both the HTTPS and the HTTP, the bit you have to watch out for is that any one of your applications can take down your web servers. And when your web servers go, that's your Achilles' heel — you're done. So that's one of the things we've had to separate out. If any of you went to the Music Store on the first day, it was about two to three hours of pain getting everything up. It turns out search was completely golden — it was fine, it was like, I'm ready. But we were having problems with downstream systems and other things, and because one of our WOAs was having problems, it took the whole farm down. So, FYI, you can look at segregating things out and having web servers dedicated to each application so you don't have a single point of failure. Also, looking at CPU usage and what things are waiting on, it's sometimes better to have a heterogeneous mixture of applications instead of one box completely dedicated to one type of instance, because otherwise you're not going to get the best resource utilization.

All of our finance applications wait on commerce servers, on PayPal calling out, on AOL — they're hitting all these external services, so they spend most of their time just waiting for external services to come back. Whereas the search applications are all about CPU: they're doing the search, they're rendering out the pages. So mixing the finance and search applications means we get the greatest throughput from the fewest number of boxes. One word of caution I would also note: there's a temptation to look at a box and say, oh, it's got two gigabytes of RAM, therefore I can put four search instances on it, each running at 512 megabytes. This is how we originally configured the systems. And we would get these spontaneous reboots where all of a sudden the servers would be going along, and we're getting all pissed off at the Java team or the OS X team — like, gosh darn it, you guys.

Why are our apps bouncing? We can't keep a server up for more than a week — it's getting really annoying, what's going on? And lo and behold, we kind of had to eat a bit of crow on that one, because even if you say -Xmx512m — the Java heap should not take more than 512 — the actual system memory footprint can be up to about 612 megabytes. So what would happen is that as all the instances got going, eventually one of them would trigger swapping, and the moment the box starts to swap, it's just dead.
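The arithmetic that bit us can be sketched in a few lines. The ~100 MB-per-JVM overhead beyond -Xmx is an illustrative figure taken from the 512-versus-612 observation above, not a guaranteed number:

```java
public class HeapBudget {
    // Observed on our boxes: a -Xmx512m JVM could grow to roughly 612 MB
    // resident (heap plus native/permgen overhead). Treat ~100 MB as an
    // illustrative assumption, not a constant.
    static int residentMb(int xmxMb) {
        return xmxMb + 100;
    }

    public static void main(String[] args) {
        int instances = 4;
        int ramMb = 2048; // a 2 GB box
        int neededMb = instances * residentMb(512);
        // 4 x ~612 MB = ~2448 MB > 2048 MB of RAM: the box will swap,
        // and once it swaps, it is effectively dead.
        System.out.println(neededMb + " MB needed, swap = " + (neededMb > ramMb));
    }
}
```

The design point: budget per-instance memory from observed resident size, not from -Xmx.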

And the box would start swapping so badly that we couldn't even get SSH shells in to figure out what was going on, and eventually they'd have to run an ops guy over to push the button on the front. So we were getting all pissed off about it, but lo and behold, it was actually our fault — we'd misconfigured all the servers. I add that as a little word of wisdom: look at your memory allocation and usage. One of the other bits we've always struggled with is when we should go and add more instances. Every app is different, every deployment is different. But as a general rule of thumb, we watch the CPU loads on the boxes, and we watch the average idle times between responses. We've seen these effects where throughput climbs with load, then flatlines for a while, then sometimes peaks up again and drops back down. What we found was that we were starting to cap out: all the systems are running fine, CPU usage is fine, but looking at the average response times, we were starting to throttle ourselves. The adapter can only hand off so many requests, and there's always a buffer window of just the mechanics of getting a request to you. So even if your WOA is completely fine and you're not CPU-bound, look at your average idle time between requests when deciding whether to add more instances.

And the last bit I'd mention: when you're looking at your deployment scripts for bouncing large applications that are under load, you always want to do rolling bounces. That minimizes the startup hit if you're using a shared editing context or things like that which might carry a heavy database cost on launch, and it also minimizes the dead time. Because remember, once you take an application down, the adapter is going to mark it as dead. And if all your apps get marked as dead because you just did a real fast bounce of everything, you can have periods where no requests are getting in. When no requests are getting in, they usually pile up on your web server — and again, when your web server stacks up, you're dead in the water. So those are some of the aspects you can look at for deployment tuning. One bit I'd just like to go through: my team works very closely with Daryl's team.

And with the Music Store, we've developed a number of diagnostic aids — no promises, but we're working with this team — so I'll go over a few of the things we've done that you can think about as well. One is doing stats collection with standard deviations. Because you'll pull up WOStats and go, oh, buySong took 113 seconds. Well, is that happening often, or just once in a blue moon? So we're looking at the standard deviations and trying to figure out what the norms are. We also added a kind of watchdog that will sit there watching for long-running responses and then start issuing kill -QUITs to the application, because we want to know if something is taking more than 60 seconds to process. Throughout the environment, it could be one particular instance having problems, or it could be a lot of them. So we have these things watching the worker threads — they see when a request gets handed off to dispatchRequest and then just hangs out — and we have scripts that look for problems happening across the board. And then there's detection of large database fetches, which we do down at the database context delegate, where you've got the fetched objects coming back. So we have these things that just look around, and if you fetch 17,000 objects, that's probably not intentional.
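The watchdog idea can be sketched in plain Java. The class and method names here are hypothetical, not the store's actual implementation; the real one reacted by sending kill -QUIT for stack dumps, and — per the earlier warning — it should exit the instance rather than kill the stuck thread, which would corrupt EOF state:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/**
 * Minimal long-request watchdog sketch. Worker threads check in when
 * dispatchRequest starts and check out when it finishes; a monitoring
 * thread periodically asks which requests have run past the limit.
 */
public class RequestWatchdog {
    private final Map<Thread, Long> inFlight = new ConcurrentHashMap<>();
    private final long limitMillis;

    public RequestWatchdog(long limitMillis) {
        this.limitMillis = limitMillis;
    }

    /** Call at the top of request dispatch. */
    public void requestStarted() {
        inFlight.put(Thread.currentThread(), System.currentTimeMillis());
    }

    /** Call when the response has been written (in a finally block). */
    public void requestFinished() {
        inFlight.remove(Thread.currentThread());
    }

    /** Threads whose current request has exceeded the limit as of 'now'. */
    public List<Thread> overdue(long now) {
        List<Thread> result = new ArrayList<>();
        for (Map.Entry<Thread, Long> e : inFlight.entrySet()) {
            if (now - e.getValue() > limitMillis) {
                result.add(e.getKey());
            }
        }
        return result;
    }
}
```

A monitor thread would poll overdue() every few seconds, dump stacks for anything it finds, and — if it decides to act — System.exit() so Monitor/wotaskd restarts the instance cleanly.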

And we were seeing these. This is where the watchdog and the large-fetch detection came into play, because we'd go through the whole QA cycle, get out there, and see these astronomically large fetches come through — 27,000, 25,000 objects — and it was happening as part of a buy request. Well, it turns out we were doing addObjectToBothSidesOfRelationshipWithKey against the table that records every song you're entitled to download. And if you remember a while back, Steve was talking about this one user who had purchased 27,000 songs. Well, it turns out it was that user — a very, very good customer. He keeps buying, and every time he bought, it must have been getting slower and slower for him.

And of course we weren't triggering these large fetches in our QA environment — none of our QA engineers had gotten anywhere near that many purchases, so we hadn't tripped anything, none of this was happening, and we weren't even noticing it. That was a fun one. We've also looked at getting statistical graphs over time using MRTG, an open-source tool for sampling and graphing data over time, which is fairly easy to integrate.
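The large-fetch tripwire itself is tiny. In a WebObjects app it would hang off the EODatabaseContext delegate's post-fetch callback — that wiring, the class name, and the threshold below are all assumptions for illustration:

```java
/**
 * Sketch of a large-fetch detector. A database-context delegate would call
 * recordFetch() with the entity name and row count after each fetch
 * completes; anything over the threshold gets flagged for investigation.
 */
public class FetchSizeMonitor {
    private final int threshold;

    public FetchSizeMonitor(int threshold) {
        this.threshold = threshold;
    }

    /** Returns true (and logs) when a single fetch looks accidental. */
    public boolean recordFetch(String entityName, int rowCount) {
        if (rowCount > threshold) {
            System.err.println("Suspiciously large fetch: " + rowCount
                    + " rows of " + entityName + " -- probably not intentional");
            return true;
        }
        return false;
    }
}
```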

And then just being able to look at editing contexts: any time one is created, it gets recorded, along with how many objects it's holding. We've used this to find leaks where we're calling editingContext() off of an EO. Very handy — except when the EO is in the shared editing context, and then you start using the shared editing context to fetch non-shared EOs in.

So we were watching, and about every five minutes we noticed the shared editing context was bumping up by one over time. And lo and behold, somebody was using a shared EO's editing context to fetch a regular object in — which is just like a tar ball: it's not going away until the app restarts.

SQL stats is one we've looked at for sheer stats collection, because the large fetches are interesting, but if you're doing 125 single-row fetches, you're probably not prefetching something correctly. Even though each of those is very, very fast — the database is like, I've got the whole table in memory now, I can hand them back as fast as you want — the latency of going and getting them one at a time is probably slowing down the request-response loop. Also, being able to look at the total properties you have within the system: we've got 15, 20 frameworks, they all have their own properties files, we have deployment properties, this whole mixture of properties. A lot of the time, when you're trying to debug a problem in the environment, you're going, what is this thing actually configured with? So being able to look at, reload, and touch properties has been very helpful. And again, I mentioned the independent web servers: we had to move to having independent web servers for our monitoring tools, separate from the front end. Because the time you want the information the most is when things are going wrong, and when your web servers are down, you can't monitor. So having a different set of web servers that you use to look into your applications, versus the web servers serving the applications — we found that very beneficial as well. And I apologize — the demo gods were not with me, and my machine is actually kernel panicking when I connect it to this.

I did have a small demo set up to show property turnaround and dumping stacks for long-running requests, so I apologize for that. But that is actually the session. Now, more information — at this point, I think we're opening up to questions. Oh, and we've got a WebObjects feedback session after this.