QuickTime • 1:20:48
Encoding media on a large scale can be a daunting task. This session will explain how to make it simple with the right planning and equipment. Learn techniques for all types of projects, including how to manage high volumes of assets, and how to maximize quality across a large-scale production workflow.
Speakers: Aimee Nugent, Jim Baker, Hage Van Dijk
Unlisted on Apple Developer site
Transcript
This transcript was generated using Whisper and has known transcription errors. We are working on an improved version.
Well, good afternoon everybody. Hope you had a good lunch. I'm Aimee Nugent. I'm from the QuickTime product marketing team, so thank you all for coming. First, I just want to make a housekeeping announcement. The QuickTime feedback forum, which will be after this session at 3:30, has moved to Marina, which I believe is down this hallway. So if you'd like to join us for that, then please note that change. It's now in Marina.
Now, on to why we're here. I'm very pleased to introduce to you session 729, Industrial Strength Encoding: Workflow Automation for QuickTime. We have two great presenters for you today: Hage Van Dijk from Discreet and Jim Baker from 21st Century Media. And what you're going to see today, I think, is a great session that's going to pull together a lot of the concepts that you've been learning over the past couple of days.
So, without further ado, I'd like to introduce Hage. And one note about questions: we'd like to ask that you hold them until the end of the session. He's got a lot of material to cover. And when you do ask a question, please come up to the microphone and speak clearly. We do have a translation going on. So now, Hage. Thank you.
[Transcript missing]
Hi, good afternoon. Thank you for joining us this afternoon. Hage's done a great job of introducing what this session is all about. And my part in this session is to explain how we're actually tackling a very specific job. As Hage explained, that job is WWDC 2003.
This is actually the fourth year that my team have produced this content, and each year has been getting a little bit closer to doing it the right way. The first year, Apple asked us to do it, was extremely interesting, I can tell you. I think being immersed in boiling oil would be more fun than going through what I went through that first year. The second year, we tried it a different way.
And the second year, we threw a very, very large amount of SAN storage at the project. And surprisingly, we still got the job done, but it wasn't quite as efficient as I was expecting it to be. And last year, we tried a different approach altogether that didn't involve Macs, which was probably not such a good thing.
And this year, we've decided to build a workflow that is completely Mac OS X based. And I'm delighted to be able to share that with you today. I think it's the most efficient system that we've built so far for dealing with this particular type of project. What's interesting about it is that there are over 170 sessions at WWDC. I think there are 175, something like that, this year.
So there's a very, very large amount of content. It's over 200 hours of material. And just to make things even more complicated, it's in multiple languages. We have the English that you're hearing me speak now. And we have some ladies and gentlemen in the little box in the back who are translating everything that I'm saying into Japanese. So we're going to be receiving tapes that have both Japanese and English on them.
What we're doing is we're encoding them for streaming over the web. You'll be able to review and view the sessions as ADC members. You'll be able to view the sessions on the ADC TV website at some point in the near future. And then, as you know, if you've been attendees in the past, you receive a box set of tapes.
And you'll be able to view the DVD-ROMs that have all of those sessions encoded on them as well. And the interesting thing about the DVD-ROMs is those are in multiple languages, too, on those disks. And I'll be explaining how those work. We're using MPEG-4 this year. It's the first year that we've used MPEG-4.
And Hage's going to go into much more detail about why we're using MPEG-4 and the advantages of doing that and the improvements that we've seen in using that as a codec. As I mentioned before, it's completely Mac OS X-based now, which really streamlines the workflow. It leverages a lot of the functionality that you find in Mac OS X and in applications that run on Mac OS X. And we do some assembly as well. And that's pretty much what the AppleScripting is all about. It's not just about encoding. It's about once you've encoded files, you're taking some stuff from here and some stuff from there and putting it all together and creating this finished content.
So this is pretty much what the workflow looks like. Essentially, we're capturing from DV cam. We get the tapes like this. Fortunately, we're not editing them. The first year we edited them, and that was one of the big hells that we went through. Fortunately, somebody else does that this year, and we just get the tapes all fully edited, which saves an enormous amount of time. So pretty much, we just set an in point and out point and capture it in. We then encode it at multiple bit rates.
We're then assembling it, as I said already, in multiple language versions, English and Japanese, standalone English, standalone Japanese for streaming, and then a combined Japanese and English version for DVD. So you have one set of DVDs in two languages. We have a quality control stage, very important to make sure that everything you've encoded looks right, functions correctly before it goes on to the next stage, which is then delivering it up to the customer, the streaming version, and then off for replication on the DVDs.
So obviously when you're planning a workflow, you've really got to think about how long this is going to take. You've got to figure out how long it's going to take to capture this content in the first place. You've got 175 sessions. You've got to physically take those tapes, put them in tape machines, set the in point, set the out point, capture it.
That's going to take some time. In fact, it's real time. So it's going to take over 200 hours to capture that on one tape machine, right? So that's obviously you need to spread the workload. You've got to calculate the encoding times as well, per bitrate per minute. You can just do that with some math. We'll look at that math in just a second.
So once we've figured that out, it's like, well, how much power, how much horsepower do I need to throw at this to get it done in the time that I've got? Or the time that my client wants me to do it in? So we need to figure out how many CPUs we're going to need to hit that deadline. And then we've got to figure out how fast our network needs to be.
What kind of network topology do we need to build that is going to be able to suck digital video in off these decks, and push it over a network, and get it to all those encoders, and then get it to the assembly stations? So these are all the sorts of considerations that we've got to take into account to figure out how long this is all going to take. How much space do we need, storage space, to capture all of this? Quite a lot, actually. As Hage said, it's just under 4 megabytes a second for DV, and over 200 hours. That's a lot of storage. That's terabytes worth of storage.
Do you really want to do this manually? No, I tried doing that, and it was a really bad idea. So automation is a really good way to do it, and AppleScript saves the day for that. We'll be looking at some of the AppleScripts that we've written in just a moment. Are my applications scriptable? Fortunately, yes, some are, and unfortunately, some aren't. And I'll highlight why that is important, that applications that developers develop should be scriptable. I'll script that. I'll script that.
And then lastly, how does the client actually get to see this stuff, to sign off on it? And we'll be looking at a database-driven system for doing that. Let's look at some of the math very quickly. As I said before, DVCAM, it's 720x480, about 3.5 megabytes a second, times 200 hours. That's about 2.5 terabytes of storage. And if you want to have any kind of redundancy, you're up to 5 terabytes of storage that you should have available for doing this kind of a project. Now, each one of these WWDC sessions is about 65 minutes.
Some are a little shorter. People who don't have much to say take about 35 minutes, 40 minutes. Some who have far too much to say can take an hour and a half. But the average time is about 65 minutes. So those metrics allow us to figure out how long it's going to take to capture these. If it takes 5 minutes to put the tape in, mark the in point, spool to the end of the tape, mark the out point, rewind, hit batch capture, it's about 5 minutes to set that up.
So about 70 minutes per session to get that ingested into your system. So your overall production metrics: you've got 171 sessions. I think that figure's now gone up; it's 175 or so. So that's about 12,000 machine minutes, roughly 200 hours, to process this. But if you now split that load across multiple machines for the capture, you could actually capture all of this content in just 50 hours. So you can suddenly see how distributing this workload really is plowing through this material in double quick time.
This is what the encoding mathematics looks like. It takes about three minutes to encode one minute of source as a 900 kilobit per second stream. We encode the material at 900 kilobits a second for DVD-ROM. You may ask, "Why that particular bit rate?" Well, we want to try and fit all of these sessions onto a certain number of DVDs.
So that kind of tells us what bit rate we must use for the video. Remember, this is DVD-ROM, not DVD-Video. You could encode the DVD-ROM QuickTime movies at a much higher bit rate, but they're going to chew up much more disk space. We want to be as economical as possible and keep the production costs down on the DVDs.
Then on the broadband, we're looking at about 1 minute 30 to encode 1 minute at 300 kilobits a second, and just 44 seconds per minute to encode the low-end broadband at 100 kilobits a second. So you can see it's pretty fast to encode this stuff. That's the beauty of MPEG-4, very quick, and Hage will go into more detail about the settings that we use a little later. These times are on a dual 1.33 gigahertz Xserve.
So once we've figured out how long it takes to compress each clip per minute, we can figure out how long it's going to take to compress a session. It's just under six hours to compress one session on a single machine. So 171 sessions takes 1,000 machine hours, or 42 days if you're just using one encoder. So now spread that across 20 Xserves, and there's that magical figure again, 50 hours.
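To make those back-of-the-envelope numbers concrete, here is a small illustrative AppleScript that just does the arithmetic described above. The per-minute encode times and the four capture stations are the approximate figures from the talk, not measured values.

```applescript
-- Rough production math, using the approximate figures from the talk.
set sessionCount to 171
set avgSessionMinutes to 65

-- Capture is real time, plus about 5 minutes of tape handling per session.
set captureMinutes to sessionCount * (avgSessionMinutes + 5) -- roughly 12,000 machine minutes
set captureHoursOneDeck to captureMinutes / 60 -- about 200 hours on a single deck
set captureWallClock to captureHoursOneDeck / 4 -- about 50 hours across four capture stations

-- Encoding: minutes of encode time per minute of source, summed over the three bit rates.
set encodeRatio to 3.0 + 1.5 + 0.75 -- 900 kbps, 300 kbps, and 100 kbps passes
set encodeHoursPerSession to (avgSessionMinutes * encodeRatio) / 60 -- just under 6 hours
set totalEncodeHours to sessionCount * encodeHoursPerSession -- about 1,000 machine hours
set encodeWallClock to totalEncodeHours / 20 -- about 50 hours across 20 Xserves

display dialog "Capture: " & (round captureWallClock) & " hours.  Encode: " & (round encodeWallClock) & " hours."
```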
You could potentially encode this entire conference in 50 hours, or even less if you threw more machines at it. But the reality of it is, it's never going to happen in that amount of time. And the biggest bottleneck of that is, we are just not going to get these tapes from the editors in 50 hours.
It's just not going to happen. These guys have got to sit down with the source, they've got to check it for content, they've got to set the in and out points, they've got to top and tail it with graphics. They've got to do all sorts of things. So it's really the flow of the content that's coming into this.
It's the workflow that is going to be the key bottleneck. And then at the other end, the client, in this case Apple, has actually got to look at this material and say, "Oh, well, yes, this is okay. Oh, did he really say that?" And then that may change that session. They may have to go back and re-edit it, or they may have to cut something. Or the audio levels might be off, things like that.
You know, there may be dropouts, and that happens all the time. Nothing's ever perfect, so you do have to take some things into account. So we reckon that realistically, this project could be done in two weeks, which is not bad for over 200 hours of encoding. And editing, and all of that sort of production. I'll let you know if we actually achieve that or not next year. Okay, so then there's the assembly.
We've actually got to piece all these things together. We're creating English language movies at 100 and 300 kilobits a second. And then a different set of Japanese. So if you want to watch in Japanese, you go and watch those online over here. And if you want to watch in English, you go over here. You can't actually serve a single movie with dual language tracks over RTSP streaming. Alas, progressive download, yes. RTSP streaming, no, at this current time.
However, on DVD, we can deliver a single movie that has multiple audio tracks in it. And it will intelligently switch out the audio tracks, depending on what you're looking at. We'll look at how we do that. So what we do is we encode the Japanese audio separately as an audio-only file. And then we encode the English video and the English audio together.
And then we have an automated process that opens up the English one, strips out the English audio, pops in the Japanese audio, and saves it out, and there's your Japanese file. And then for DVD, it opens up the Japanese audio, copies it, opens the English DVD file, pops in the Japanese, sets up the multiple language tracks thing, adds annotations, does all kinds of cool stuff, and saves it out. And I'll show you those scripts shortly. So it's all really completely automated, which doesn't take very much time at all. If this was a manual process and you had to have somebody sitting there doing this, it would be a nightmare. And AppleScript really solves all of that.
So how are we doing it? What does the work unit look like? And I call this a work unit, and you'll see why I refer to it as a unit in just a second. This is kind of what it looks like. We're capturing off Sony DV cam decks.
And really, just remember, this is just playback only, so it doesn't have to be super high-end. It can be something that suits your budget. If you can afford to use a very, very high-end DV cam deck, fantastic. DSR-1500 is a very nice deck if you want to use one of those. In this case, we're using 25s and 45s, which have little built-in monitors that allow you to see the session as it's being digitized.
Really nice. JVC also make a great unit that has a built-in monitor as well. This is coming in over FireWire into an Xserve capture station. And then that's going over 2-gigabit Fibre Channel to one side of an Xserve RAID. So one machine is sharing a single Xserve RAID unit, but is dedicated to half of it for capturing. So it's pulling in off the tape the video, the English on Channel 1, and the Japanese on Channel 2, all simultaneously. On that capture station, we've got QuickTime 6.3 and Final Cut Pro 4.
As Hage mentioned earlier, establishing a naming convention is critical, particularly if you're automating a process. Everything has to be named identically. It's got to have the right extensions on it so that the scripts can pick it up, detect it, and know what to do with that specific file. Because all of that scripting is totally dependent on what that file name is. So we're pulling in that video and the audio. We're then exporting it out of Final Cut Pro by reference, so we don't have to go through a long rendering time.
Once that session is captured, we just go export, boink, it saves it out in less than a minute. It renders the audio, but not the video. And then for the Japanese audio, we're just exporting the audio alone, no video. So we end up with two files for each session that we capture. And they sit on the Xserve RAID, ready to be encoded. You'll notice the naming conventions there.
So what we're doing is they come out of Final Cut with the session ID, an underscore, and then the language identifier. And that language identifier is used throughout the process to make sure that each file goes down the right path through the automation.
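The parsing itself isn't shown in the session, but as a hedged sketch of how a name like 601_E.mov or 601_E_300.mov (session ID, language, bit rate) can steer a file down the right path, something along these lines works in AppleScript. The single-letter language codes here are my assumption, not necessarily the codes the team actually used.

```applescript
-- Hypothetical illustration: pull the session ID and language code out of a
-- file name like "601_E.mov" or "601_E_300.mov" and decide which branch of
-- the automation the file should go down.
on routeFile(fileName)
	set oldDelims to AppleScript's text item delimiters
	set AppleScript's text item delimiters to "_"
	set sessionID to text item 1 of fileName
	set languageField to text item 2 of fileName -- "E", "J", or "E.mov" etc.
	set AppleScript's text item delimiters to oldDelims
	if languageField starts with "E" then
		return sessionID & ": English path (video plus audio)"
	else if languageField starts with "J" then
		return sessionID & ": Japanese path (audio-only, swapped in later)"
	else
		return sessionID & ": unknown language, flag for manual review"
	end if
end routeFile

routeFile("601_J_300.mov") -- "601: Japanese path (audio-only, swapped in later)"
```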
So a full work unit comprises the entire workflow from capture through to encode and output. And this is what one work unit looks like. So we've already covered that capture stage. What happens after it's captured? Well, it gets read off the RAID by the encoders. And each work unit, in other words, each capture station, has five dedicated Xserves to do the encoding. And it's doing the encoding over gigabit Ethernet.
You might ask, why gigabit Ethernet? And why not just stick with fiber? It's purely a financial decision. If you think about how many fiber cards you would need to support, well, you'd need 20 if you're going to have four work units. It's extremely expensive. They're $500 or $600 each.
So from a cost point of view, when we analyzed how much data we were actually pushing through this network, we realized that we didn't really need to go to 2 gigabit fiber. We're not pushing that whole digital video file over to the encoders. It's just reading it in little bits and doing the encoding.
And a switch like the Asante switch is extremely good for doing digital video encoding, DV anyway, encoding over a network. So this is pretty much what a work unit looks like. You can scale this, you see. So now you can take that work unit and you can duplicate that as many times as you like to cope with however much content you're trying to encode.
So what's on these clusters that you see down here? We're running Discreet Cleaner 6. Still very much the industry standard for video encoding. It does multiple formats; obviously, we're just encoding QuickTime MPEG-4. The great thing about Cleaner, though, is it's fully scriptable. And, in fact, there really isn't a competitor to Cleaner that is fully scriptable, and that's one of the main reasons why we chose it, because of its scriptability and also its great workflow functionality using watch folders, which Hage's going to go into more detail about in a minute, and which is key to the whole process of our encoding. We're also using a very neat little thing called Whistleblower. Some of you may be familiar with Whistleblower. Whistleblower is essentially an application which allows you to monitor servers and other such things remotely.
I got in touch with the guy, James Sentman, who wrote Whistleblower, and I told him what we were doing. And I said that what I really wanted was not something to monitor the machine, but I wanted something to monitor a very specific application that was running on the machine. Because if a machine freezes, then you can detect that, right? If a machine freezes, Whistleblower pings it.
If it doesn't respond to the ping, then the machine's crashed, and it notifies you that it's crashed, and it restarts. But an application might crash, but the machine's still running, and Whistleblower couldn't detect that. So what James did is he wrote up this great little client, which is now released as part of Whistleblower, I believe, and that runs on the client machine, and it can actually do process monitoring, and Whistleblower can monitor those processes.
So it sits there going, "Is Cleaner alive? Is Cleaner alive? Is Cleaner alive? Is Cleaner alive?" And if Cleaner's dead, it says, "Oh," and it reports back to Whistleblower, and it pages you or calls you on your cell phone or, you know, wakes up your wife or something like that.
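Whistleblower's client does that monitoring for real; purely as an illustration of the "is Cleaner alive?" idea, a check like the following could be scripted. The process name and the dialog at the end are assumptions for the sketch, not how Whistleblower actually reports.

```applescript
-- Minimal sketch of an "is Cleaner alive?" check. Whistleblower's client does
-- this properly; this just shows the idea. The process name is an assumption.
on cleanerIsRunning()
	try
		do shell script "ps -axc -o comm | grep -q 'Cleaner'"
		return true
	on error
		return false -- grep found no matching process (or ps failed)
	end try
end cleanerIsRunning

if not cleanerIsRunning() then
	-- In the real setup this is where Whistleblower would page someone.
	display dialog "Cleaner is not running on this encoder!" giving up after 30
end if
```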
I'm going to hand back to Hage in just one second, and we're going to take a closer look at why we do the encoding in the way that we do. But this is a very quick, high-level overview of what we've done this year. Last year and in previous years, we've done 56K settings. Hands up, any of you who actually watched any of the conferences on 56K modems? Perfect.
Okay, good. So we made the right decision in killing that. And so this year, we're just doing 100 kilobits per second for those of you who are on like a dual ISDN or the sort of low-end broadband connectivity. The mid-range broadband, which is probably what most people watch, around 300 kilobits a second, much better quality. Obviously, both of these are RTSP streaming.
They're not downloading. They're streaming in real time. And then the DVD-ROM, I already explained about that, at 900 kilobits, because we do it that way so we can fit that many sessions on the disks. So I am going to hand back over to Hage, who's going to tell you how we do the encoding. Thank you, Hage.
[Transcript missing]
As a customized output. That way all of the files that I'm making can go right there to that folder. I know where they're going. And why this is really important is, back to the network, we really want to output these files to the folder for the QuickTime Streaming Server on the Xserve, a network share point, to be picked up by the next step in the process, etc. So having custom locations for your projects is an important and powerful feature inside Cleaner.
Now let's go ahead and actually look at the settings. I'll double-click on the settings window and I have my basic settings. This morning Jim gave me some more settings. He had actually updated the settings we've been working on for this. So he gave me the folder here, and I'm gonna go ahead and take a look at those. What I can do is he just placed them on my desktop and again I'll just drag them right into the settings area. One of the great things about Cleaner 6 for Mac is settings management.
I can go ahead and drag settings in and out if I wanted to give Jim a copy of my MPEG-4 test. I can just drag those off to his FireWire drive here, and he's got a copy of those. So again, it's very flexible about moving these settings. I don't need to restart in order to use those settings again. So what I'll do is we'll take a look at the 900k bit setting first. I'm gonna go ahead and apply that to the project just initially so then we can step through the process here.
Just make sure I did that right, so select my setting and apply. There we go. I'm actually left-handed, so I'm going to move to the other side here. This will be a little more efficient. You'll notice that there is a crop on there, and it gives me a little indication of the crop in the project column there. So, the settings window inside Cleaner is really divided into two parts. The side on the left, the column, are all of the default presets that come inside Cleaner 6, as well as all of the settings that I've created as I've worked with various clients.
On the right side is really a hierarchy of the various functions for the output format that you're choosing to use here inside Cleaner. In this case, we'll be discussing QuickTime, and we're creating MPEG-4 movies in QuickTime. Again, we could have gone just with a straight .mp4 file, but we decided that we really wanted to contain that media inside a QuickTime movie using the MPEG-4 codec, so we could take advantage of Fast Start.
So again, that's why we actually have it in a QuickTime movie. And you can see listed there some of the other formats that Cleaner does support. So again, because it's a QuickTime movie, the file suffix will be .mov. A little bit of a legacy thing at this point, but we always flatten for cross-platform Fast Start. You'll notice that I don't have the prepare hint for streaming server checked.
That's because there is a problem currently with the QuickTime exporter for MPEG-4, so we have to take care of that in the AppleScript, and we'll show you that demo a little bit later. So we do have a way of working through that, and we're all going to go to the feedback forum and request that they fix that for us, so we don't have to do that in the next session here.
And compressing the movie header, again, a little more of a legacy thing at this point, but trying to make the file size as small as possible means faster download, means better user experience. The tracks tab inside Cleaner is just an acknowledgement of the different types of tracks. If I was using Flash or MPEG, I could choose to copy or process those tracks. So I'm not really doing a whole lot with the tracks tab. So the image tab, now we're going to start to get into it a little bit. You'll notice that Jim has placed a numeric crop on the video.
What I can do is go ahead and double-click on the project window and bring that up again. And you'll see that what he's done is he set a framing in here to just cover the edges and to get rid of any possible tape anomalies that might be in that. So again, I just discussed cropping as an aesthetic thing, but you really want to take a look. Every piece of content has its own consideration, so you really have to take it on a case-by-case basis with regard to cropping there.
You can also do a manual crop if you need to. Image size is basically dictated in this case by the client, if you will, so we're working with those sizes. The next parameters are the filters inside Cleaner, and now we get into this whole idea of pre-processing.
We talked a little bit the other day about pre-processing for video. And again, pre-processing is really important to preserve the quality as much as possible. We're capturing these sessions on a DV cam tape. It looks very good, and what we're trying to do is compress that down. So we always want to really preserve the quality, maintain as much of the color balance. And really, I'd go a little farther than maintain.
They used to say that editorial touches every frame. Encoding touches every frame as well. And increasingly, assets are delivered in a variety of formats. And especially in bandwidth-restricted avenues, like the Internet for video, you really need to do everything you can to make the video look as good as possible.
And again, pre-processing is critical. It's the key to that. At Discreet, we've identified this process as media mastering. And it's about getting the most out of any particular piece of content as you deploy it to a variety of formats and locations, sizes, and bit rates. So getting into the pre-filtering, the first thing I'll talk about a little bit is deinterlacing.
Deinterlacing is the ability to -- these video cameras shoot in alternating fields. So if you eliminate half of that, it's a good way of reducing data. The adaptive deinterlace here, what that does is only deinterlaces the pixels that move. So, well, what good is that? I'll be happy to show you. What I'm going to do is just scroll through here where our subject is kind of throwing around his glass of water there. And then I'm going to bring up the dynamic preview window.
So this is giving me a preview. I have a before and after slider here. I can also go to an A/B-- oh, I'm going to say my joke-- for an apples-to-apples comparison.
[Transcript missing]
Test. Again, save that back on the desktop. So what I've done is created just a little piece of media that I can run through my watch folder to create, to make sure that everything is working.
You'll notice that when I double-click on the folder, there's, I'm giving my secret away. Well, I'm just going to go ahead and drag. We're going to put that file into the watch folder. And what's going to happen is Cleaner's going to pick that up and encode it. What it will also do is create a folder structure.
[Transcript missing]
Thanks, Hage. That was great. For those of you who haven't seen that kind of nested watch folder or cascading watch folder functionality before, it's a real eye-opener. As Hage said, this is an incredible way to distribute your encoding across multiple machines that you never really have to touch. And it's definitely been absolutely key to the entire way in which we built the workflow for this particular -- Switch back to the slides, please.
Oh yeah, slide please. Thank you. Sorry, I didn't see that. Thanks. So how do we actually take this process that Hage's just described and apply it to our problem? Well, first of all, those exports that I described earlier, the Japanese and the English exports: as soon as we've exported those from Final Cut Pro, we're just dropping them into a receptacle, a folder.
Just drop them in and bang, off they go down the river being encoded. So that's the last time you have to see them until essentially you assemble them and review them. So really it's going to go through all of that encoding process, being renamed and being taken through the workflow all with hands-off. It's all completely unattended.
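The receptacle is just an ordinary folder, but if you wanted to script the hand-off itself, a Folder Action is one way to do it. This is a hypothetical sketch, not the script used on the project, and the share and folder names are invented.

```applescript
-- Hypothetical Folder Action: when a Final Cut Pro export lands in the
-- receptacle folder, copy it into Cleaner's watch folder on the encoder share.
-- The share and folder names below are invented for illustration.
on adding folder items to thisFolder after receiving addedItems
	tell application "Finder"
		repeat with anItem in addedItems
			set itemName to name of anItem
			-- Only hand off finished exports, identified by the naming convention.
			if itemName ends with ".mov" then
				duplicate anItem to folder "Watch" of disk "EncoderShare"
			end if
		end repeat
	end tell
end adding folder items to
```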
Once they've been encoded through that cascading process, they get then pushed up to another server, which we call the repository, and there they sit ready to be assembled into the different languages. As I mentioned before, it's a completely headless process. You don't need to have monitors attached to these computers.
They don't even have to be in the same room or even in the same building. You can utilize any computers that happen to be around that you can install Cleaner on. They can be added to your workflow. And Whistleblower, as I described earlier, is going to let you know if there are any problems with those encoders. If Cleaner crashes for whatever reason or something goes wrong with that computer and things always happen, you will be notified.
You'll be the first person to know. And it's very easy to set up a system so that you do get paged or it sends an SMS message or it rings your cell phone. And it works really efficiently. As I said, you don't have a monitor on the encoder, so how do we get to them? Well, we have Apple Remote Desktop on every single encoder.
And that enables me, wherever I am, or whoever is administrating the project, to be able to look on and see at any particular machine. If there's been a problem, he or she can log in and see whether it's a problem that can be dealt with manually through Apple Remote Desktop or whether the machine has to be restarted remotely or something like that.
And then lastly, because we're using Xserves, we can also take full advantage of Server Status, which is a little app, a little utility, that comes with the Xserve and reports, wherever you happen to be, everything about that particular machine and its hardware configuration, and everything else. So let me give you a closer look at how this works.
This is encoder monitoring. This is automatic notification with Whistleblower. That little client that I was describing that James Sentman wrote for us, the Whistleblower process monitor, you can see that up in the top left-hand corner there. That's running. It just sits there in the background. Very lightweight app. Doesn't take any processor cycles really.
And it's just sitting there looking at what's going on. And if it has a problem, it's going to notify the Whistleblower main software. Which is running on a totally different machine. Could be anywhere in the world. Doesn't matter. That's just sitting there looking at, in fact, all of our machines are being monitored constantly. Both repositories, all the encoders, all the capture machines.
Everything's being monitored 24/7. So you've got a complete update on those. And you can see how it looks. If a machine goes down there, you can see encoder 4 has reported an error. It's not responding. And it immediately sends out a signal to a pager saying, oh, encoder 4 is down.
Come and fix me, please. So that's the use of Whistleblower. Apple Remote Desktop. This is a great application. And really well integrated, of course, with Mac OS X. In this particular example, I'm looking at four different encoders there. I can just move around. I can take one encoder, let it fill the screen. Or I can just have four.
Or I can cycle. I prefer the cycling thing where it just steps through. So every now and then I can have it up on a monitor, like on a plasma or something. And you can just keep an eye out in your administration room. As to what's going on in every encoder.
And if it's like a kernel panic, then you know you've got to do something with that particular machine. And then, of course, there's Server Status. And this is really reporting everything about the hardware of the machine that you're running on. So not specifically the Cleaner application. It's pinging that IP address.
It's letting you know what's going on with it. Things like temperature. Whether you've got full power. Whether the blowers are on correctly. You can restart the machines remotely. And if you've identified a problem, you know, you'll see in a minute how many of these Xserves we've got in racks.
And it's like, jeez, which one is which? You can actually set a little flag here, the system identifier. Like you click that button for encoder one. And then you turn to your rack. And the little light will be on. So you'll be able to see, oh, that's the encoder number. 418. And that's the one that you need to replace the drives on.
Okay, let's talk about the assembly process, because I don't want to run out of too much time. We've still got a bit to tell you about. Automation via AppleScript. This is really sort of the glue. It's very specific to what Apple wanted, which was these multiple language versions that have to be assembled and annotated as QuickTime movies. So as I mentioned before, what we're doing is we're coming out of the encoding system up to the central repository. So here are all these movies. They're in folders. They've got all the correct names. They're all ready to have stuff done to them.
We build the Japanese and the DVD versions in QuickTime Player using AppleScript. It's all completely automated with those scripts, and we log everything about what we've done into a text log as well. We could be parsing that text log if we want and bringing that information back into an online database so that we can see what's going on. In fact, we do something similar to that, although we don't parse the log. We have another AppleScript that sort of gives us a health check on the whole workflow, so I'll show you what that looks like. So what do the scripts do? Okay, well, they assemble.
They assemble from the individual files or from folders of files. I'm going to show you the example in a second, just assembling a single file. Of course, we don't do it manually, do this one, do that one. We just batch process everything that gets dropped into a folder. We check for possible errors.
So how does a script check for errors? Well, typically, we found that errors might be just corrupted files where they didn't get encoded properly. And what we do is we do some basic checks. We look at the exact sample length of each track, and we compare it to the other ones.
And if they're even slightly out, we're told about that, because they should be pretty much the same, certainly within like a second or two, right? Otherwise, your video track is going to be out of sync with your audio track. We add annotations. Remember, Hage said earlier that we don't do that in Cleaner. We do that afterwards in this script.
It's much easier just to batch process all of these annotations because we're hinting, and you have to do annotating before you do your hinting, not after you do your hinting, because if you do, then your annotations don't get carried over into the hints. We hint. We do the streaming versions, and then we set these flags for the DVD versions, these alternate language tracks, which is a pretty cool process. I'll show you how that works, too. Oh, time for me to show you how that works. Okay.
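Before the demo, here is a minimal sketch of the kind of length check just described, written against the classic QuickTime Player scripting dictionary of that era. The paths are invented, and the two-second tolerance is just the ballpark mentioned above; the real script works on whole batches and logs rather than showing dialogs.

```applescript
-- Minimal sanity check: do the encoded English movie and the Japanese
-- audio-only movie have (roughly) the same length? Paths are illustrative.
set englishMovie to POSIX file "/Volumes/Repository/601_E_300.mov"
set japaneseAudio to POSIX file "/Volumes/Repository/601_J_300.mov"

tell application "QuickTime Player"
	open englishMovie
	set videoSeconds to (duration of movie 1) / (time scale of movie 1)
	close movie 1
	open japaneseAudio
	set audioSeconds to (duration of movie 1) / (time scale of movie 1)
	close movie 1
end tell

if (videoSeconds - audioSeconds) > 2 or (audioSeconds - videoSeconds) > 2 then
	display dialog "Track lengths differ by more than 2 seconds -- wrong audio file?"
end if
```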
Let's have this laptop up here. So here are some scripts. Let's first of all look at what we've done here for the English one, because this is the simplest AppleScript of all. I'm just going to open this up, and I'm really not going to bother to go through this line by line for two reasons. First of all, I don't know what it means, line by line.
But secondly, I don't really want to bore you with all the sort of intricacies of this script. There's a lot of sort of setup and checking and things like that. What we're doing is we're just setting up the script to start with, then we have some code that allows us to set up whether you're choosing a file or choosing a folder. And then going back down through here, we've got the sort of core code about how to process a file or how to process a folder. Then we have an interesting thing here. This is actually pretty neat. This is a hack. God bless QuickTime and AppleScript.
This is a hack whereby we actually use a thing called a QT export settings file to be able to set up parameters for, in this case, hinting a movie. We actually have a file that tells the AppleScript what to do within QuickTime when it's doing an export, because when you export from QuickTime, you have lots of different options. And usually what QuickTime Player does is it just remembers the last thing that you did, but what if you did something different the last time? You have to set up a template.
Well, currently, the only way to do that is a little hack called a QTES file, and this is where we set up the parameters for that, and I'll show you how I use a QTES file in just a second. We've got sort of open and then do this, do that and do this, and let me just scroll down. You see it's quite a big script. And basically, we go down and then we log everything here as well. So essentially, what this script is doing is it's opening the movie up, it's adding some annotations, and then it's saving it out as a hinted QuickTime movie.
So if I just go ahead and look at some source, here's some source right here. So here's the input. Here's some stuff that has just come fresh out of our encoder. So you'll see here, I'm going to do the 300K file because it's just shorter.
That's the naming convention: the session ID, the language, and the bit rate. That's going to change in a minute, but this is what it looks like on the way in. And we use different naming conventions on this side of the script so as not to confuse them with the files that come out the other end.
So I'm going to process that particular movie, and so I can just manually run this just for the sake of it. So I'm going to just run my script. I choose a file. I'm going to choose that English 300-kilobit-per-second one. It's going to process it. It's asking for that QTES template.
There it is. That's telling it that I wanted it to be a hinted movie as opposed to a not hinted MPEG-2 movie or something else. So you can set up all these templates. You see it's very neat. And now I choose the output folder. You only have to do that once, and then it's going to do it for every other thing you do. And it's done. It's exported it. It's ready to go. And you can see it's dropped into the output folder. There's my movie. It's been opened. It's been hinted. It's been exported and annotated.
If I open that up, you can see that it's got a name now, and we've intelligently given it that name. We looked at the file name, and then we extracted this number, and we created a title for it. So all of that's done in an automated fashion. And if we actually look here, you can see that it's now a hinted movie as well, whereas it wasn't on the way in.
So that's how we do the English. Now, it gets a little bit more complicated now as we get into the Japanese, because what we're doing with the Japanese is we're opening the English version, stripping out the English audio, pasting in the Japanese audio, comparing the length of the Japanese audio with the length of the video, make sure they're not too different, i.e., is it the wrong audio track? And then we're adding annotations and hinting it. And again, fully automated. So we're going to do this. Puppy up.
Hit run. Run the script. Choose the file. This time we're going to choose the Japanese. Remember, this is just audio only. This is not video. It's just an audio only file. And just open up the Japanese. Select the export settings that tells it exactly what we want to do with the file. And choose the output folder. And off it goes. It opens up the Japanese audio, copies it, pastes it, done. Hinted, annotated, completely done. Less than a second to do that entire thing. And there it is dropped into the output folder.
We open that up. Again, it's got the correct name. It's hinted it correctly. Got the information. And it's done. So we can just look at the -- no, we'll look at the log in a minute. Now let's look at the last one, the DVD assembly, which is more complicated still.
This is doing quite a bit more, because this is setting alternate language flags as well, and it has two language tracks in it. And if we zip all the way down to about here, you can see right here, we have, this is the code that sets track one, the language track of track one, which is the English, and the language track of track two, and then the various alternates of... So, if it detects that you're on a Japanese computer, it will automatically play the Japanese audio. If you put the disk into an English computer, it will automatically play the English audio. So it's totally intelligent. Let's run that.
Run the script. Choose the file. Choose the input. So we're choosing the Japanese 900 kilobit per second audio. You notice I'm not choosing the English audio, the English video. It's intelligently saying, oh, you want session number 601 Japanese? Then that means you need session 601 English, right? Yes, OK, let's do it. You can see it's piecing it all together. It's moving some tracks around. It's going to rewind. It's going to select none. It's done. And now we can close that script.
And we can go and look in the log here. There's -- now you notice it's given all of these different names, you see, as well. So we can identify now that that's the finished movie, whereas that's the input movie. There's the DVD movie. It used to be called 900, but what does that mean? The important thing is that it's for DVD. And if we look at this, we can see now -- there we go. Two soundtracks, ordered correctly.
If we look at soundtrack one, at the alternate, the alternate of soundtrack one is soundtrack two. And if we look at soundtrack two, the alternate of soundtrack two, which is language, is one. So it's like if not Japanese, then English, or if not English, then Japanese. And if you wanted to do this manually, you could play this delightful clip of Ian Ritchie here and hear him being simultaneously translated. Good morning, everybody. Welcome out to the QuickTime session 601, how to-- 601.
Oh, there you go. Perfect. OK, pretty cool. The EQ on the Japanese was nothing to do with us, I should point out. That was how it came to us. So that gives you an idea of how these assembly scripts work. Now obviously what I've done here is I've just gone, let's select this one, let's select that one. These scripts I just put together purely for this demonstration. This is completely faceless.
You never see these scripts. They just run in the background. They're watching folders. Stuff gets dropped in, bang. They just automatically process this stuff. There's never any human interaction with these scripts at all. And the logs, because I like to keep paper trails and records of all these things, I get the logs emailed to me so I can see everything's been done correctly. There we go. Japanese was done. Yeah, there was no difference in the audio. So all those were great.
They lasted 20 seconds. So we have a good record of exactly what was done. And as I said, you could be parsing these into a database, but we actually have a different system, which I'm not going to bore you with because it's incredibly proprietary. And so-- So what am I showing you now? Oh, yes. So I'm going to go back over to the slides. Thank you. Okay.
So this is, now we've got out of the encoding system and done all this, where does all that happen? Well, this is where it happens. It happens in this one rack. So in the middle there, below the Asante switch -- and thank you to the team at Asante for that switch. Oh, and the big thing on the top, the PowerFile R200, is a really, really amazing device.
You should really look at this. If you're very serious about doing any kind of archiving, the PowerFile is a DVD RAM-based jukebox. It takes 200 disks, and we've automated the whole archiving process out to DVD. So when these files have been finished and assembled, they're automatically written out to DVD, so we don't have to put disks in. 200 should be plenty for this project.
These are the compressed files, not the uncompressed files. You'd need a lot more than 200 disks for that. But the PowerFile R200 is great, and what makes it possible to add to this workflow is just last week, PowerFile came out with the Mac OS X version of their MediaFinder software that makes that possible. So thanks to PowerFile for giving us one of those.
Okay, so before we sort of wrap up, which we should do fairly soon, the workflow monitor is a really neat thing. We have a big room where we're doing all of this work, and we have a big drop-down screen with an LCD projector, and this is projected up on that screen. So at any time, I can look over and I can say, okay, what exactly is the status of this track or that track? As you know, there are multiple tracks in the WWDC conference.
In this example, I'm using track 700 from this year, which is the QuickTime track. This is just populated with bogus data, because obviously we haven't started encoding yet. But this is what it would look like, hopefully just a few days from now, where I can look up at the chart and say, okay, well, I can see session 700 has been fully processed, captured, encoded at all bit rates, assembled at all bit rates. It hasn't been approved yet, and so thus it hasn't been archived. And I can look down and say, ooh, we haven't got very far in session 702, but I'd needle someone about that. 704, great, that's been approved. It's been archived.
We can put that to bed, and so on and so on. And this is all totally AppleScript and FileMaker driven, this display system, and it's web-based on a browser. So a very, very nice way of being able to see exactly what's going through the process. So if my phone starts to ring and I have someone from developer relations giving me a hassle about where session 700 is, I can tell them it's their fault because they haven't approved it yet.
Okay. Last but not least, and unfortunately, this is not quite as fully fledged a demo as I would have liked. Thanks to the wonderful security of Apple's firewall, I was unable this afternoon to actually punch through to be able to show you this server, which is running down in Cupertino, alas. But I can give you the general gist of it, and I can show you the actual FileMaker source. What I can't show you is it actually running in a browser.
I just couldn't get access, even with VPN. So, okay, so what does this do? Well, the whole FileMaker-based thing enables the client to look at all of this content and give feedback and accept or reject sessions. They can be inside or outside the firewall, hopefully outside the firewall by next week. And so the Apple reviewers can be anybody, and they can be sort of assigned tasks. I'm sure, Aimee, you'll be doing some reviewing at some point.
You know, everybody gets to review sessions, and you really have to, because you're looking for content. You're listening to what people are saying, making sure they didn't make mistakes. It's very important that the message is communicated correctly. So it's not just about, does the encoding look okay? Of course the encoding looks okay. It's about the content of that particular session.
So, and there's an admin mode, which allows people like Hage and myself to sort of get in there and fix things. If Apple say, oh, this one's rejected, we can use that same interface to resubmit that particular thing back into the workflow for whatever reason. And this web-based review process really does speed up the whole review process. And again, it's one of the bottlenecks that we talked about earlier. The bottleneck is either on the input, the tapes coming in, because we can't control when we get them, so that's a bottleneck.
Or it's in getting them approved. So what we hope to see is lots of green lights in that workflow display. Red lights are okay in the approval column, but red lights are not okay in the other columns that are our responsibility. So if we can just go back to the laptop very quickly, I'm just going to show you what this looks like in FileMaker.
It doesn't look very much. It looks much more sexy in a web browser. But this is basically how the database sort of hangs. It hangs together. We have the session ID and the name of the session at the top. We have a status here that lets us see whether it's been accepted, rejected, or reviewed. The length in there, whether it's been pushed to the FTP or not.
Whether it's been released. In other words, whether it's something that we can now archive. We have the URLs to the sample. Where do these movies actually live? Well, they have to live at a particular URL. And don't try hacking that because I invented that IP just for this session.
And so that's where the movies live. The database knows where the movie is. It doesn't have to be on the same machine. It's on a separate server altogether. There's just some sort of internal stuff here. Internal notes. And these are the notes that somebody at Apple would submit, not through this interface, but through the web interface. They're sitting there watching a session. And they'll say, "Ian Ritchie just said something that he can't possibly say in that session." And they'll write that in here.
And we will get notified of that. And we can then make the arrangements for that thing to be re-edited. And then below that we just have some sort of globals. Access information to prevent unwanted people getting in. And all these sorts of things as well. And there are some AppleScripts that we use to sort of complicate -- to simplify, I should say; it is quite complicated -- the process.
So, you know, the great thing is that this is not super, super high-end stuff. And this is what I wanted to show you. This is not a particularly complicated database. But it's this database, with the AppleScript, with the web-based browser, with that monitoring, with all these other Mac OS X apps, all fitting together in this framework, that is literally saving us, the contractor, weeks of production time.
And saving Apple, the customer, an enormous amount of money. Because we're able to do it fast. Well, we're able to do it quicker than we did it in previous years. More efficiently. And with less problems arising during the production process. If we could go back to the slides, please.
Just to wrap up, this is what the complete system looks like. If you are interested to know, we're using four complete work units for this project. It doesn't take up very much floor space, just rather a lot of hardware. But it really works flawlessly. We've been hammering it pretty solid for a while now.
We're looking forward to getting our hands on this year's content. What we do each year is we just use last year's content, so we've just been plowing through the 2002 tapes to make sure that this all works and we're ready to receive the next ones. I'm going to hand over to Hage. Thank you. Thank you.
Could I get the computer going just for one second? I wanted to finish off with just one. I've been noticing there's a debate about codecs going on. So what I did this morning was took my example and... What I'm going to do is show you two movies. What I like to do when I compare these, I will put them both in loop mode. And what I'm going to do is I'm going to start one and then tap to the other one.
Just so you can look, I'll turn off the audio, so you can see one image and then the next image, and your eye can kind of move from one to the other. So this was taking an MPEG-4 example, the setting that we're using, and last year's codec, and putting them together.
And again, I've been noticing that there's a lot of people using, well, people are reluctant to upgrade to MPEG-4. So I'd like to take a few minutes and just challenge some quality assumptions that I've been hearing about regarding these various codecs. Let me just bring up the info window here.
Okay, so these two movies look different. They look different to me. They look different probably to most of you. And if we sat here for a while, we could discuss the merits, advantages, and disadvantages. But clearly, they do look different. And so as viewers, we can make an assessment about the quality. So, all right, that's fine.
I appreciate that, Hage. Great, thanks. So what? So what I did was I took a look at a little bit of brightness and contrast. Again, the idea of media mastering and getting the best out of those. And again, with these examples, if there was enough time, I'd be happy to.
I'm not going to run people through this. But we want to take some time for some questions. So what I'll do is we'll give my contact information. If people are interested in repeating these types of examples, I'm happy to send you my settings and tell you how I do this.
Okay, so what I did here was I did a little brightness and contrast and a gamma adjustment. What I did was my MPEG-4 I kept with no filters. And what I did was I modified my Sorenson Pro Codec to look like my MPEG-4. And so now I'd like everyone to take a look at that.
And I'm going to show you a little bit of the difference between these two movies. Well, I can hide this. But, you know, I challenge people. I know that there's a few eyes in this room that can see the difference between these two movies. But I'd hazard to say that they do look very, very similar. And the point of this is to show you that MPEG-4 at the same bit rate and frame size and frame rate is very compelling.
Especially when you figure that for Jim and I, this entire encoding process will be finished in one-third less time. Okay, I'm going to say that again. One-third less time. That means out of that 50 hours, we could add a few days. And maybe one of those days would be a weekend. And we could all take three or four weeks to do a gig like this. So, again, encoding in one-third less time. Oh, and not having to buy the Pro Codec and put it on all those machines for a quality debate.
Okay, if you look at the lines here and the edge, again, it is possible to tell the difference between these two clips. You'll notice that there's a little bit of edginess, a little difference in some of the lines. That they both have a certain amount of artifacts. They both do have artifacts. I will admit that. But one is a little sharper and one is a little less capable in that transition. So, again, just leaving those up for a little bit of example there.
Then, okay, yes, Virginia, this is the Sorenson and that is the MPEG-4 over there. And, again, I just wanted to finish with this because this is sort of important. I hear a lot of people debating these technologies and these codecs. So I wanted to further that debate, because I work for Discreet. We make Cleaner. So we live in that debate. And I wanted people to consider some of that while they're choosing a codec.
And, lastly, just before we finish up and take questions, I'd just like to point this out, too: that increasingly we're going to see devices, mobile devices, like this Sony CLIÉ. And next year you'll probably all be watching the session on something like this. Again, I have the same video encoded with the Kinoma exporter that's inside Cleaner 6, off to my PDA. So I just want to sort of remind people that this is where all this is going.
And that, again, this whole idea of media mastering, of workflows, of encoding is really a matter of taking our media and deploying it in the highest possible quality in the most diverse number of places, to reach the largest number of people. So I just wanted to remind people of that. And I'll bring up Aimee and we'll do the wrap-up. Thank you.