
WWDC04 • Session 301

Automated Testing on Mac OS X

Development • 58:29

Using practical examples, this session teaches you how adding an AppleScript interface to your application can provide an efficient and powerful way to create thorough automated testing. Real-world techniques are highlighted that you can immediately begin to incorporate into your work. This is an intermediate-level session.

Speakers: John Comiskey, Doug Simons, Jonathan Gillaspie

Unlisted on Apple Developer site

Transcript

This transcript was generated using Whisper and has known transcription errors. We are working on an improved version.

Good morning, everybody. Welcome to session 301, Automated Testing on Mac OS X. I'm John Montbrien from Apple Developer Relations, and I'm pleased to introduce your host, John Comiskey, who will be speaking to you. He's a senior engineer from AppleScript Engineering. And welcome, John Comiskey. Thank you. Thank you, John. Good morning, and thank you for being here. I'm John Comiskey. I'm an engineer in the AppleScript group.

I'm going to talk to you today for about 15 minutes about using AppleScript to test your application, and then some gentlemen from Redstone Software will be talking about Eggplant, a much more high-powered way of testing your application. So this session will cover ways to use AppleScript to test your application. Thorough testing is crucial. You want to make sure that your program works correctly before you ship it to your customers. And the only way to do that is with thorough testing.

And making sure everything works exactly right. We like to encourage people to use AppleScript to test their applications because it has a lot of advantages. It's very flexible: you can test what you want to test, you write the tests yourself, and you control them. It's extensible: every time you add something new to your program, you can add new AppleScripts that test those features. And it can be very complete.

You can probe every single corner of your application, push it into all of its boundary cases this way. It improves your accuracy. You can keep track of what the results of your tests were from one execution to the next and make sure that you're getting the right answer.

You can repeat those tests time after time. Each time you change your application, you can repeat all your tests and prove to your boss that you haven't broken anything. And it's controllable. You say what's going to be tested. You can have a small test that you run every single time that you build the application, a larger test that you run just before you turn it over to your testing organization, and then they can have a massive suite of tests, like I said, that probe every single corner of your application. And the best thing is that it all goes a lot faster than if you're trying to do it by hand.
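
To make that concrete, here is a minimal sketch of the kind of small, always-run test described here, written against a hypothetical scriptable application called "MyApp" (the application name, command, and property are illustrative, not from the session):

on smokeTest()
    tell application "MyApp"
        -- create a document, check one property, clean up
        set theDoc to make new document with properties {name:"smoke-test"}
        if name of theDoc is not "smoke-test" then error "wrong document name"
        close theDoc saving no
    end tell
end smokeTest

smokeTest()

A script like this can run after every build; the larger suites then reuse the same handlers.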

So we're going to cover lots of things today. You have to plan for scriptability. It just doesn't happen all by itself. You're going to want to instrument your code so that when you're driving it with scripts, you know what's happening. You can follow the flow and make sure that that's what you wanted to have happen. You want to ensure that you have code coverage. Like I said, probe into every single corner of your application. Make sure that it all works, not just the most frequently used stuff.

Last year in Session 311, we talked a lot about using Apple's GUI scripting to test your application. I'm going to say a little bit more about that today. And then the most important part is to sustain the effort. It's nice to have a good set of tests, but it's very important to keep them up-to-date as your program changes. And then, like I said, the fellows from Redstone will be talking to you about Eggplant.

Last year, we had a very long session about how to design a dictionary. That's Session 414. If you have the DVDs, you can watch it there. If not, it's available on the developer website. We also promised you some guidelines for putting together your dictionary, and Chris Nebel did an excellent job of that, and those are now available on the developer website as well.

So you should definitely look at the scripting interface guidelines. Even if your program is already scriptable, you should look at the guidelines because it might help you out. Even if all you do is go through and change some of the comments, your dictionary might be easier to use.

If you're using Cocoa to do your scriptability, it's necessary to align the objects in your dictionary with the Cocoa objects inside your program. Even if you're not using Cocoa, you're going to have to have some kind of mapping from your dictionary to your code. In Cocoa, it's very direct and one-to-one.

You're going to want to streamline this though from the user's point of view. You don't necessarily want to show them all the ugly guts of your program. You want to name things what your users call them and make them interact the way your users are familiar with using them through the GUI. If you absolutely have to, you can put in a private test suite that doesn't ship to your customers, which will allow you to get to some of the lower-level things that you know you need to test, but are not likely to be useful to the end users.

And you're going to want to cover all the functionality of your program. You're going to want to make sure that there's some way to test and get at everything your program does. That might be a pretty daunting task at first. So we encourage you to go ahead and phase this operation, and each release add a little bit more scriptability than the time before. And that way you'll get there at least in a couple of steps.

To know what your code is doing inside, it's got to be instrumented. And you can do this lots of different ways. If your company standardizes on a particular kind of profiling software, then of course you should use that. But if not, there's other ways to do it. First thing you're going to want to do is build a skeleton of your code.

If your application already exists, you're going to skip this part. But if you're writing a new application, you're going to need Cocoa objects for each of the objects in your AppleScript dictionary. You're going to have to have accessors for all the properties of those objects. And you're going to have to have accessors for all the elements that can be inside of those objects.

If you have commands in your dictionary, you're going to need methods to handle those as well. If the command is directed at a particular object, such as send mail, then the method goes on the mail object. If it's directed towards the application itself, such as sleep, then that method is going to be coded slightly differently. And it's going to attach to the application rather than one of your objects.
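
From the script's side, the difference looks something like this sketch (the application name and commands are hypothetical, not from the session):

tell application "MyMailer"
    -- object-directed: "send" travels to a message object,
    -- so its handler lives on the message class
    send message 1 of mailbox "Outbox"
    -- application-directed: "sleep" is aimed at the application itself,
    -- so its handler attaches to the application object
    sleep
end tell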

You're going to need to add some kind of instrumentation to your code. What I do is really very simple. I want to create a record of where my program's been. I want to know every routine that I've called, maybe what the input looked like, and maybe whether it passed or failed. So you're going to want to create a record of where you've been. You're going to want to log any relevant error conditions that occur. If a file is supposed to open and it doesn't, you're going to want to log that.

If you've got a problem with leaks in your program, you can track allocation and deallocation and make sure that everything balances. And like I said, it doesn't have to be fancy. This is what I use. It's just a couple of macros, and if you set that switch from zero to one, all of these macros expand. When you run an AppleScript, you'll get a very extensive list of what your program's done. And then if you set the switch back to zero and recompile, it all just goes away. So it's not part of the program that you ship.

When you look at these logs, you're going to want to do a couple of things. You're going to have to understand why your code did what it did and justify it. You don't want to be running off into the weeds and doing a lot of computation that's not necessary. If you generate an error condition, you're going to want to investigate why. You might be testing an error condition, so you're doing it on purpose. Or you might be trying to, like I said, open a file and for some reason it's not there.

You're going to want to balance all of your allocations and deallocations and make sure that you're not leaking. And you're going to want to retain a record of all of this so that when you repeat the test later, you can make sure that it did the same thing the same way. Or, if it didn't do the same thing the same way, that points you to the problem.

And this is an example of what I see when I run one of my programs with the logging turned on. I've got my initials in there because it gives me a nice string to search for that's not likely to occur anywhere else. It lets the people I work with know who it was that put this log message in there in case they want to know what it's about.

And this here is fetching the startup disk from the System Events program. It calls the startup disk

[Transcript missing]

You're going to want to test all the corners of your program. You want to test trivial things, things that you don't expect to work, edge conditions, really large numbers, really small numbers.

You're going to want to force all the errors. What happens when you tell your program to open a file that doesn't exist? You want to make sure that the end user is going to receive a meaningful response that tells him something's gone wrong and you need to change your script.
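
For example, a sketch of what forcing one of those errors might look like from a test script (the application name and file path are hypothetical):

tell application "MyApp"
    try
        open file "Macintosh HD:Users:tester:no-such-file.txt"
        log "FAIL: opening a missing file should have raised an error"
    on error errMsg number errNum
        -- the message should be meaningful enough for an end user to act on
        log "PASS: error " & errNum & ": " & errMsg
    end try
end tell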

One thing that AppleScript can't do particularly well is get at the glue code that you write to bind your view and your model objects together. AppleScript drives your model objects more or less directly. If you've got a lot of glue code that stages things back and forth between your model objects and your view objects, you're going to need a way to test that. One way is to write less code. If you use something like Cocoa Bindings, you'll write a lot less glue code and you'll have to do a lot less testing. You don't have to test the code that you don't write.

We talked about Apple's GUI scripting last year. It can be used to do a lot of things. Mostly it can access things that are not scriptable to begin with and allow you to get past blockages in your workflows. There are several things it does particularly well. It does exercise all of your glue code.

Since it does come in through the view, it's going to exercise your glue code. And if that's what you're trying to do, GUI scripting can be a good way to do that. It recreates end-user scenarios exactly. If you've got a report from the field that if I select this menu and then this menu and then press this button, something bad happens, then you can recreate that scenario exactly in your tests and see what it is that the user is talking about and then repeat that test over and over again.
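
A minimal sketch of that kind of GUI-scripted scenario, driven through System Events (TextEdit stands in for your own application, and the menu path is illustrative):

tell application "System Events"
    tell process "TextEdit"
        set frontmost to true
        -- recreate the reported steps exactly: select the menu item the user selected
        click menu item "New" of menu "File" of menu bar 1
    end tell
end tell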

If you're doing a phased implementation of AppleScript in your program, you're still going to want to test the rest of it. And you can use GUI scripting to do that. You don't have to write any code for GUI scripting to work. You just have to turn on accessibility. And if you're testing your program specifically for 508 compliance, there's a tool that's going to be introduced here this week for testing accessibility compliance. You can also use GUI scripting to do that yourself, starting right away.

You're going to need to use GUI scripting if your workflow depends on other applications that you don't control. Perhaps your application interacts with an application written by another company, and that application either isn't scriptable, or some key part of it that you need isn't scriptable. You can use GUI scripting to get past that. This is what GUI scripting was invented for: unblocking workflows, scripting the unscriptable.

And it may be what your customers are using. If your application's not fully scriptable yet, your end users may have worked around that by writing GUI scripts of their own. If they've integrated those into mission-critical workflows, you're going to have to make sure you don't break those things before you ship a revision of your code.

GUI scripting has got some limitations, though. It shouldn't be used as a substitute for your own scriptability. It's easy to get started on GUI scripting, but it's hard to maintain. One of the reasons is that in a lot of applications, the view layer has a complicated containment hierarchy: there are controls inside of controls inside of controls.

And if you ever revise your application, re-lay out any of the screens, then you're going to have to also revise all of your GUI scripts as well. And whatever it is you're trying to do is going to be bounded by the GUI. You might not be able to generate the stress conditions that you'd like to just through the GUI. And if that's the case, you're going to want to use object model scripting to do that.
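
As an illustration of that fragility, a deep containment path like this one (all names hypothetical) has to be rewritten whenever the window is laid out differently:

tell application "System Events"
    tell process "MyApp"
        -- controls inside of controls inside of controls
        click button "OK" of group 2 of tab group 1 of window "Preferences"
    end tell
end tell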

The most important part, I said, is sustaining the effort. It's easy to get started, add a little bit of scriptability to your program, be very happy with it, use it to do some important testing, and then set it aside. You'll revise your program every year, every couple of months. When you add that functionality, you're going to want to add testing for it, and the way to add testing is to add scriptability at the same time.

Whenever you do revise your program, you're going to want to run all of your prior tests, everything that you've ever done. Make sure that you haven't regressed anything. Make sure that you haven't broken anything. So you should hang on to all of your tests, and you should hang on to the results of all of your tests, so you can run them over again and make sure they do the same thing each time.

And when you add bug fixes to your program, or you get reports from the field from end users, you're going to want to write test cases that cover that. Make sure that the end user has what he needs. You're going to put those in your testing suite as well, and repeat them every single time you revise your program.

So, these are the things I've talked about today. You've got to plan for scriptability. It doesn't just happen. You've got to instrument your code so you know what happened and why. You've got to probe into all of the corners of your code so that you're testing all of it instead of just part of it.

You want to use GUI scripting where it's appropriate, but you're going to want to use your own model object scripting to the greatest extent possible. And you want to keep at it over and over again. Don't give up. And I'd like to thank you and turn this over to Doug Simons from Redstone, and he's going to talk about Eggplant.

Thanks, John. Good to see everyone this morning. I'm Doug Simons, one of the principal developers at Redstone Software, and as John said, we'll be talking about our Eggplant testing tool. We also want to give you a little bit of an overview of some different approaches you might want to consider in testing your application.

For each type of testing we're talking about, we'll look at what are the goals of testing and what value automation brings to that process. And we'll try and give you some concrete examples of each of these to give you some idea of how you want to spend your testing efforts.

So, as I said, we'll talk about Eggplant. Eggplant is an interface-level testing tool, and we'll give you some idea of how that works, because we'll be using that in some of the examples that we're presenting. Some of the different kinds of testing that we're going to talk about this morning are unit testing, which is to test your code at a low level.

We'll look at functional testing, which is testing how a user might interact with your application through use cases. Stress testing, where you test your application in depth and really push it to its limits in a number of different ways. Performance testing, to measure how fast your application is running.

And integration testing, to test your application in the broader context of where it will be running. We'll give you a little bit of an idea of some of the types of information and results you might be looking to get from your testing process. And finally, we'll take a look at some additional uses of automation beyond testing.

So let me tell you a little bit about how Eggplant works. As I said, it's an interface-level testing tool. It, in fact, allows you to drive your application exactly as a user would. So how does that work? Well, it's a two-computer system. Eggplant runs on your Mac OS X machine, and it connects over a TCP/IP network to any computer running a VNC server.

How many of you have used VNC or heard of it? Yeah, quite a few. VNC is a neat piece of open-source software that allows you to access a computer remotely. People use it to access their home computer from work or vice versa. And Eggplant has a VNC client built into it, which allows it to access any computer running a VNC server, which is pretty much any computer out there. The VNC process gives Eggplant a view of the screen of the other computer and allows it to control the keyboard and the mouse.

To that, Eggplant adds scripting, which provides automation, of course. We have a scripting language called SenseTalk, which is a very easy-to-understand, English-like language that gives you all the control that you need to develop some sophisticated tests. And so the end result is that Eggplant can drive any software, it doesn't matter what language it is written in, since we're driving directly through the user interface, and can be running on any operating system.
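
For flavor, a few lines of what SenseTalk scripts generally look like. This is a sketch: the host name and image names are made up, not from the demo.

Connect "test-machine.local"  -- attach to the VNC server on the system under test
Click "LoginButton"           -- find this image on the remote screen, then click it
TypeText "hello", return      -- send keystrokes, then the Return key
if ImageFound("ErrorDialog") then LogError "login failed"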

So let's talk a little bit about unit testing now. The goal here is to verify the correctness of your application at the code level. Ordinarily, unit tests are written to test an individual function or method within your application. And in recent years, a number of agile development methodologies have become popular, such as extreme programming. Any of you extreme programmers or agile developers? Not too many in this crowd.

One common component to many agile approaches to software development is test-driven development. The idea of test-driven development is for developers to write tests to verify their code. Quite often, the tests are written before the code, and then you can develop the code to satisfy those tests. There are some good open-source free tools for doing this kind of testing.

The first one was JUnit, which is used for doing unit testing of Java applications. Since then, there have been a lot of other tools of a similar sort. There's one called OCUnit, for example, to do unit testing of Objective-C code that we use for our unit testing. AppleScript, of course, can also test your code at a fairly low level, interacting either with individual modules or testing the model of your application.

John did a good job of covering a lot of the benefits that automation brings in terms of repeatability and consistency. One of the other great advantages of automation, of course, is that you can build these unit tests into your build process, providing continual feedback to the developers as they're developing and revising the application. So that can be a big benefit.

Functional testing is a little bit higher-level testing. Now we're looking at how users interact with the system. And the goal here is to verify the requirements of your application. Most applications are developed beginning with some requirements that tell you what it is that the application is supposed to be able to do.

And frequently these requirements are written up in the form of use cases, which are individual scenarios of how a user would walk through some particular process within the application. A lot of functional testing is done manually, and frequently this is done by developing test cases that are modeled after those use cases.

Again, just so a manual tester can walk the application through some particular sequence and make sure that it's performing the way that it should. Using an interface-level tool such as Eggplant, you can automate those kinds of tests, and this allows your regression tests to be run, again, after every build for your functional tests, in very much the same way that your unit tests are run.

So I'd like to invite my colleague Jonathan Gillaspie up. He's going to be doing the demos today, and he's going to show you how we'd use Eggplant to do some functional testing. Thanks, Doug. Good morning. As our first demonstration, I'm going to be showing you an example, as Doug said, of a simple functional test script.

But for those of you who haven't seen our product Eggplant before, I also want to just give you an idea of how easy it is to start writing test scripts using it. So as Doug mentioned, Eggplant works by connecting to a remote machine. So let's go ahead and start by connecting to my test system over here.

And you can see here that I'm controlling the remote machine directly, sending it keyboard and mouse events, just like this. So let's say we had a simple use case that we wanted to create a functional test for. For my first demonstration this morning, I'm going to go ahead and just do a simple use case using Apple's iTunes application.

So here you can see we've just got a real simple use case. We want to go ahead and launch iTunes. We want to search the iTunes library for a particular song. We want to play that song. And when that song's finished, we want to go ahead and quit iTunes. Just a typical use case scenario.

So let's see how we'd go about scripting that using Eggplant. The first thing we need to do is create a test suite. A suite in Eggplant is just a collection of scripts and other resources necessary to run a test. And let's go ahead and create our first simple functional test.

Now I could just start writing a test script right here using Eggplant's scripting language, SenseTalk. But instead, let me take advantage of Eggplant's script generation mode that makes it real easy to write a test. So I do that by switching from live mode into capture mode. Now I'm no longer interacting with the remote machine directly. Instead, I have a selection rectangle that I'm using to identify elements of the remote machine that I want to work with, graphically.

So the first step in our use case was we wanted to launch iTunes. So we're going to go ahead and do that the exact same way a user would, by clicking on the iTunes application down here in the dock. So I select the iTunes application, and I tell Eggplant, using the toolbar, that I want to click on that image. Eggplant brings up a Save Image dialog, and I go ahead and name the image and save it. And when that happens, when I do that, three things happen. The first is that Eggplant stores that image over here in our test suite, the iTunes icon.

The second thing it does, and let me make that a little bigger so you can all see it, is it actually writes that script command into our script. And the third thing it does is actually execute that line: click the iTunes icon. When that line is executed, Eggplant searches through the remote screen to find that image, and when it finds it, it clicks on it. You can see here that it's brought up iTunes on the remote system.

So the next step in our use case is we want to search the library for a particular song. As a user, we do that by clicking and entering something up here in the search field. And we identify the search field by this search label right here. Of course, we don't want to click directly on the search label. So Eggplant allows us to set what we call a hotspot, so that we can interact at positions relative to an image. So I'll just move that hotspot directly above the search label and, again, tell Eggplant we want to click there in the search field.

You can see we have an insertion point now. And we can send keystroke commands to the remote machine just as easily. So let's go ahead and see if there are any eggplant songs over here in my iTunes app. And sure enough, there's one. So the next step in our use case is to go ahead and play that song. So we just do that, again, the same way a user would, by clicking on the Play button.
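
At this point the generated script amounts to something like the following. The image names are whatever was typed into the Save Image dialog; these are illustrative.

Click "iTunesIcon"   -- find the Dock icon image on the remote screen and click it
Click "SearchField"  -- the saved "search" label image, hotspot offset up into the field
TypeText "eggplant"  -- type the search term
Click "PlayButton"   -- start the matching song playing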

So we want to continue on with our use case. We want to make sure to wait for that song to end. iTunes changes the play button to a pause button while the song is playing. So what we want to do is wait for that pause button to return back to a play button. That's how we know the song has finished, again, just like a user does. So Eggplant can wait for certain elements to appear on the remote screen using the WaitFor command.

The WaitFor command allows us to specify a maximum period of time for a particular image to show up. In this example, we'll go ahead and just give it a maximum of 90 seconds. If the image doesn't appear in that time, it's going to raise an exception and fail the script. Yeah, I'm just naming these images as I record the script.
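
In script form, the wait described here comes out to roughly one line (the image name is whatever was saved during capture; this reconstruction is illustrative):

-- Wait up to 90 seconds for the play button image to reappear; if it never
-- does, WaitFor raises an exception and the script run fails.
WaitFor 90, "PlayButton"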

Yes, actually if you look over here in our test script, we can actually see all these images being recorded. I could have actually reused the play image from before and that would have been a better way to go. So then the last step in our script is to go ahead and quit the iTunes application. We'll do that here through the menu.

So here we've written our first simple functional test script using Eggplant. And I can go ahead and run it here now. (The song plays: "One, two, three, four, five. If there was a cow from the eggplant that ate Chicago...") You can see just how analogous the functional test script is to the original use case. So that's our first demonstration of functional testing using Eggplant. Thanks, Jonathan. We have slides again.

Let's talk a little bit now about stress testing. Functional testing and unit testing are great to ensure that your application is working and doing what it's supposed to do, what it's designed to do. But you also are going to want to push your application a little harder. As John mentioned, put in some unexpected conditions, some things that are likely to cause errors. Test some of the boundary conditions there in your application.

You don't really expect your user to enter negative numbers, perhaps, for a duration of time or something. But you know they will eventually. And so you want to be sure to test some of those exceptional situations. So the goal here is not just to verify that your application is doing what it's designed to do, but that it's not doing what it's designed not to do. And so you want to push it a bit there. So the goal here is really to find bugs and track them down to their source if you can.

Automation plays a really critical role here. It's always, of course, very nice to be able to automate things, to save some time and effort. But when it comes to this kind of stress testing, you really want to be able to do things repeatedly, and scripts do that a lot better than people do.

So you're going to want to be able to iterate over large sets of data, to try your application with different combinations of input, and also to be able to run tests repeatedly to reproduce any intermittent crashes you might have. How many of you ever had bugs or reports of bugs that were very hard to reproduce? They happen.

In fact, we had one in Eggplant a few months ago. People ask us sometimes whether we use Eggplant to test Eggplant. As a matter of fact, we do. We have a test that we call our EP squared test, or Eggplant squared, in which Eggplant is driving another machine, which is also running Eggplant, to create and run some tests on a third machine.

When we had this occasional crash showing up, we were able to run the Eggplant squared test repeatedly overnight. It takes about three or four minutes for each time through the test, not something that a person would want to sit there and do a hundred times over. By doing this, we were able to find that this particular crash occurred once or twice every 100 runs. We were able to track down where it was occurring, isolate it, and fix the problem. I'd like to ask Jonathan to give us a little demo of a data-driven test.

Thanks again. As Doug and John Comiskey before him pointed out, many times when you're looking to stress test your application, you want to try providing large amounts of varied data, and sometimes you want to try doing the exact same actions over and over again. So for this demonstration, I've created a simple little data file here.

And I'm going to use this data file to drive the address book application to just create some records in address book and then look up the phone numbers once we've done that. So let me bring up this script here. I'll try and make that a little bit larger.

So I'm actually going to walk through this script using a new feature of our upcoming release of Eggplant, which is an interactive debugger. So when I run the script, it's going to pause here after the pause script command. The first line in the script is to call another script. Here you see that Eggplant is fully modular: we can call other scripts, passing parameters. This particular script will launch an application on the remote system using the Finder. Let's step through that. Or actually, I'll continue through that.

So here we are in the Address Book application. It's very easy to parse data files and work with data in our SenseTalk scripting language. All we need to do there is create a simple repeat loop, with a simple construct like "repeat with each line of file my data file".

Then inside that repeat loop we have a number of interactive commands similar to the first script we did. We're just hitting Command-N to create a new card, typing in the first name from the data file, then the last name, and so on, entering all of these through the user interface.
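
A sketch of that loop in SenseTalk, assuming a tab-delimited data file; the path and the field layout are made up, not taken from the demo:

repeat with each line of file "/Users/tester/people.txt"
    TypeText commandKey & "n"                       -- Command-N: a new card in Address Book
    TypeText item 1 delimited by tab of it, tab     -- first name, then Tab to the next field
    TypeText item 2 delimited by tab of it, return  -- last name, then Return
end repeat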

So we've gone ahead and entered all five of our records for this data test. And now, like I said, we want to go ahead and validate the information that Eggplant entered. So we'll go ahead and do that by just stepping in here and using the find feature of the address book. Again, very similar, just interactive commands with the remote system.

So I'll go ahead and let that continue running. A really important point I'd like to make here is that this is all going directly through the user interface, just the way users would. Back to our Eggplant squared bug: it was a bug in the interface of our application, a problem with some thread timing we had, which is one of the reasons it only showed up occasionally.

Our unit tests and our model tests didn't expose this bug, but running it through Eggplant, through the interface, did. This becomes increasingly important with faster machines like the G5s, and even more important with multiprocessor machines, which are really likely to expose threading problems that you might have. So that's a basic example of how you would use Eggplant with some data to do data-driven testing and vary the input and results. Thanks. Thanks, Jonathan.

Let's take a little look at performance testing now. Performance testing, of course, is designed to measure the speed of your application and how fast it's running. Quite often, developers, when they think about performance testing, are going to be concerned with the speed of their code and the cool new algorithm that they've devised to wring the most out of their G5 processor.

But there's another type of performance that's important also, and that's the responsiveness of the application as a user is using it. And this is very important, of course, because you want your users to be happy with your application and not frustrated as things are proceeding. So, for example, you might have a search feature in your application, and you could test that with an eggplant script that clicks on the search button and then waits for up to 30 seconds, perhaps, for whatever the next thing is to show up on the screen to indicate that it's running.

And you can see that the search has finished. And this is fine, but 30 seconds is going to be an awful long time for your users to wait if it's actually taking that long. And this script will only fail if, in fact, it takes longer than 30 seconds.

By adding the lines shown in blue here, you could check to see how long it really takes that search to come back. And if it's greater than five seconds, then you could log a warning message to indicate that there's a problem you may want to look into in your application.
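
Reconstructed (not verbatim from the slide), the added timing check might read something like this; the image names are illustrative:

put the time into startTime
Click "SearchButton"
WaitFor 30, "SearchResults"            -- only fail outright past 30 seconds
put the time - startTime into elapsed  -- subtracting two times yields seconds
if elapsed is greater than 5 then LogWarning "search took" && elapsed && "seconds"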

So, of course, in the case where you're testing performance, it's really important to be able to gather consistent and repeatable timing data that you can compare from one build of your application to the next to see if you're, in fact, slowing down or have made some improvements there.

The final type of testing we'd like to talk about today is integration testing. The goal here is to really ensure a quality experience for your users. Your application may work just fine through your use cases and your unit tests and all that, but chances are nowadays your application doesn't live on its own. It's going to be interacting with other applications. It has to be able to go out and live in the real world. So it's really important that it's tested in that kind of environment.

You may also want to test a complete process end-to-end. If your application, for example, produces data that it can export that can then be read into an Excel spreadsheet, perhaps, or a Quicken or something like that, you're going to want to test not only that your program runs, but that the data that it produces can be read in those other applications and looks the way it should.

You also may have different configurations that you want to test. Perhaps your program has some plug-ins or other modules that are optional. And so you're going to want to test various combinations of things and perhaps run all of your tests in each of these different configurations. And finally, if you have a cross-platform application, obviously you're going to want to test it on the different platforms it runs on.

Perhaps you have a web-based application and there you need to not only test on different platforms, but in each of the different browsers that you support on those platforms. Even if you don't have a cross-platform application, you may want to test your program running on different types of hardware, different Macs. This is a place where a remote testing approach like Eggplant offers can come in handy.

Perhaps you have a G5 that's in a different department at your company and you'd like to be able to run your tests against that when you reach this stage. You may have hardware at a vendor's site even that you'd like to run your tests on, or perhaps at the Apple ADC compatibility labs in Cupertino, and that can be done remotely.

Obviously, again, automation provides consistent and repeatable results. If you're running your tests automatically, you can do much more testing than you would if you only had manual testers. One of the big wins, obviously, with automation always is that you're eliminating a lot of your manual testing time and giving a much more efficient execution. One of our customers develops tape backup software that runs on Unix systems and works with a lot of different tape drives from different manufacturers.

Every time they get a new model of tape drive from one of the vendors or want to revise their application, they have to validate that everything is still working on all of these different devices and on the different operating systems that they support. When they had a manual process to do this, it took them about two weeks to go through all of their validation tests. With Eggplant, they were able to write a script that runs in eight hours.

By being able to run their tests overnight like this, of course, they not only saved a huge amount of time and effort, but they were actually able to modify their approach to developing their software. You can imagine with a two-week testing cycle at the end of every build, you're not going to be able to innovate quite as quickly.

So, Jonathan's going to show us kind of an interesting integration test now on the demo machine. Thanks. So, for this next script, let's say that my company wants to offer a free trial license of its software to all WWDC attendees. We want to allow people to come to the website, fill out a web form, get a trial license key, download the software, put in that license key, run the application, and test the interactions that our software has with the operating system. We can validate this entire process using something like Eggplant. So, let's go ahead and just look at this integration script at a real high level here.

Again, we're going to reuse our launch application command to go ahead and launch Safari. And then we're just going to go through that entire process. We're going to test the free trial download form by going to the URL, filling out the form, and then grabbing the license key off the site.

Then we want to go ahead and download the application using Safari, and make sure it downloads and that the disk image unpacks and installs correctly. Then we want to move on and launch our Eggplant application and put in the license key that we got from the website. Then you could imagine we would actually run a whole battery, a whole series of functional use case tests like we did in the first demonstration. But in particular, we want to do things like test its interactions with the operating system.

All of Eggplant's documentation is written as PDFs, so we want to go ahead and make sure that they all open up properly in the Preview application and can be read and seen. And finally, we want to go ahead and clean up and quit out of that, and even test the uninstallation process. So we can do all of that with Eggplant. So I'm going to get started here.
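
At a high level, the script we're about to run is structured something like this sketch; every name is hypothetical, and each line calls another script in the suite by name:

LaunchApplication "Safari"  -- the reusable launcher script from the earlier demo
TestDownloadForm            -- go to the URL, fill out the web form, capture the license key
DownloadAndInstall          -- download the disk image, verify it unpacks and installs
LaunchApplication "Eggplant"
EnterLicenseKey             -- type in the key grabbed from the website
CheckDocumentation          -- open each PDF in Preview and verify it displays
UninstallAndCleanUp         -- clear the license, quit, and exercise the uninstaller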

We can just run the script. And we can just sort of watch it go through its paces. Again, driving various components all on that remote system. An integration test like this could also be a multi-system test. We could initiate one process on a machine, close this connection, open a connection to another machine, and start processing there. Here, Eggplant automatically agrees to the licensing agreement panel that comes up.

Here we're launching the application and it says it needs a license, so we'll plug in the one that we pulled off the website. Again, we could run a bunch of eggplant tests now, or we can just go and check and make sure that the documentation is coming up correctly.

So there it is, the bookmarks and everything. So having moved through all that, we'll go ahead and just clean up, clear out the license, and quit and close Eggplant. So there we're seeing an example of an end-to-end process test of all of the steps necessary for a full integration. Thanks, Jonathan. That was great.

So I think you can see the power of automation there. In about a minute and a quarter, we were able to do all of that, which is something that you'd like to be able to validate after you're ready to ship your application, to be sure that it's all ready to go.

So some of the information that you might be looking to get as a result of your testing. Of course, the first thing is that you're looking for bugs. You want to find any bugs that you might have in your application and be able to see what those are. Project managers usually are also looking to gather some statistics about how many of your functional tests are passing or failing at any given point in time and hopefully reducing the failures throughout the development cycle.

If you're doing some performance testing, if that's important for your application, of course you're going to want to gather some timing metrics there. So clearly that's an important thing. Another really important piece, though, is to provide some feedback to developers. And I don't know how many of you who are developers may have had situations where testers tell you, "Oh, the thing crashed here," or, "Something didn't work," but they're not very specific always about exactly how that occurred. Or on the other side, of course, testers are sometimes frustrated that they tell the developers that something doesn't work, and they're not always believed.

Being able to provide clear communication of exactly what the steps were that led up to a crash or a bug can be really, really valuable. And that's one of the key things that automation can provide. By having your tests automated and logging the information as they go, it can show you exactly what the steps were and, of course, you can reproduce that for the developers. And, of course, automation also is really critical for being able to reliably compare one run to another and see the progress that you're making during the development cycle. Jonathan's going to give us a quick view now of some results in Eggplant.

Thanks. Eggplant automatically is going to record results any time we run a script. So let's go ahead and just look back at the results it recorded for that last integration demo that I ran for you. So we just come into Eggplant and go over to the results tab.

Here you can see it automatically records each time a script is run, and then it keeps statistics on any errors or warnings or exceptions that were raised. If we actually click on that run, you can see that Eggplant is logging every interaction that it performs with the remote system: what time Eggplant performed the action, what action it performed, and even where on the screen the image was found.

And we can step through all of this and so you get a very clear sense of exactly what was happening during the test, even down to the exact second that it happened. And we can go ahead and double-click on any of these lines to jump back to the script and see what command caused that interaction.

Of course, something that's really important to know about testing is how do errors get reported? How do you tell when something went wrong? I'm going to go ahead and force an error condition here in this integration demo by having us go to a web page that doesn't exist. And then just go ahead and run the script with that.

So it won't take too long before Eggplant tries to move through its script. But lo and behold, the page isn't there. The webmaster did something bad. So Eggplant comes up and says that a script failed. And now if we go back and look at our results, we see that the result for that execution is in red.

And again, we can follow through here everything that Eggplant was doing, including what it was trying to do when it eventually failed. It was trying to find the name field to fill out the form on the website. Any time a script fails, Eggplant automatically grabs a full-size screenshot of exactly what the screen looked like when it failed, so you can easily communicate to the developers exactly what went wrong and exactly what it had done beforehand.

So that's just a simple look at the kind of logging results that are important for testing. Thanks, Jonathan. And I might also note that all of Eggplant's reporting is done through standard text files that you can easily import into other reporting tools to generate whatever kind of reports you might need.

[Transcript missing]

One of our customers was interested in doing some validation of third-party software. They're a big organization with hundreds of Windows machines, and whenever they get a new patch to Windows, before they deploy that patch to all of their Windows boxes, they'd like to know that all of the applications that are critical to their users are still working. So they're interested in using Eggplant to do this kind of validation of other software running on the new version of Windows before they deploy it throughout the organization.

System administrators, of course, do a lot of repetitive tasks in maintaining systems throughout a company.

[Transcript missing]

One of our customers, Fuji Film, has taken a formerly manual process of validating X-ray film and automated that. This is a process that is subject to audit by the FDA and it's very critical that this be done on a regular basis. There's a variety of different kinds of automation that are possible.

Hardware and BIOS-level testing is maybe not something that would immediately pop to mind for a tool that is typically thought of as driving software. But, in fact, most hardware, of course, has some sort of an interface. And if it's got an interface, then Eggplant can interact with it and drive it. We even have one customer who is doing BIOS-level testing. There's a hardware KVM switch that supports VNC, and they're able to use that to have Eggplant drive the machine even at boot time.

We also have come up with an idea of creating movies. So we built a feature into Eggplant that will allow you to capture a QuickTime movie of exactly what's going on on the remote screen, either when you're interacting with it live or under script control. Since you have already developed tests now to walk your application through the various use cases of how users are going to interact with your system, it might be kind of nice to be able to capture that as a movie and present that as part of your documentation. And we're going to have a quick demo now of movies.

Just really quickly for our final demonstration, here's a little script that I've written that actually records a QuickTime movie of that integration demo that we've been working with. The first thing it does is just sets up some timing parameters, and this just slows down the script to make the movie easier for a user to follow exactly what's going on. And then we just say, start movie, and go ahead and call the script that we've already been working with. I'm not going to run it and have you watch the demo again, but I'll go ahead and show you the results of having created this movie just before.
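
The wrapper itself might look something like this sketch; the option setting, file path, and script name are made up, not copied from the demo:

SetOption RemoteWorkInterval, 1.5  -- slow the script down so a viewer can follow along
StartMovie "/Users/tester/Movies/IntegrationDemo.mov"
IntegrationDemo                    -- call the earlier integration script by name
StopMovie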

So again, it's a process that we've already gone through for testing and validating this, but if we want to go ahead and create a simple interactive movie to show users how to do something within our application, it's real easy using Eggplant to just throw a little wrapper around it like this and create a quick 30-second or minute-long QuickTime movie that you can put up on your website.

One of the nice things about creating a movie using a script, of course, is that if something changes in your application or you decide there's a little different way you'd like to present that process to your users, you can easily just modify your script and recreate the movie. It's a very quick process, a lot easier than editing a movie file.

So, just to sum up, we've talked today about a number of different types of testing and the kinds of value that automation can bring to those. Unit testing, which is very important to validate your code at the low level. Functional testing to test your use cases, how the user is going to be interacting with it.

Stress testing is very important to bring out any bugs that may be hiding in the application. And performance testing, of course, to make sure that it's running at an acceptable speed. Integration testing is often overlooked. It's one of those things, though, that really impacts the experience that your users have with your application when they get it.

We also talked a little bit about some of the kinds of results and information that you might be looking for from your testing process, and had a quick look at some different uses of automation beyond testing. So I hope this has been some information that you can use to go home and improve your own testing processes and turn out some great applications for Mac OS X.

This is our contact information here. We also have a booth downstairs in the exhibit area, and we'd be very happy to talk to you later today or tomorrow. There's also a free download of Eggplant that's available to attendees of WWDC this year. The URL is included in the information that's available as part of this session, or you can come talk to us at the booth. And I'd like to open it up to questions. Yes, there's also some more information here about additional resources that are available through the Apple website.