Digital Media • 1:04:46
This session presents overviews of several advanced OpenGL rendering techniques now supported with the current generation of Apple display hardware. These techniques include projective shadow mapping, texturing from a render surface, and rendering effects generated with vertex and pixel programs.
Speakers: Geoff Stahl, Simon Green
Unlisted on Apple Developer site
Transcript
This transcript was generated using Whisper; it has known transcription errors. We are working on an improved version.
Hello, good afternoon. Welcome to session 513, OpenGL Advanced 3D. This session is a real treat for all of us 3D enthusiasts. We're going to show the process and the very latest techniques for high-end CGI generation using vertex programs. Well, the treat is not only in the content, but also in the delivery.
We have a very special celebrity guest today. NVIDIA demo engineer Simon Green is here to deconstruct the Wolfman demo and share with us his expertise and knowledge about the subject. So to deliver the first part of session 513, OpenGL Advanced 3D, I would like to introduce to the stage Apple OpenGL engineer Geoff Stahl.
The first thing I'm going to do is talk specifically about what advances we've made in OpenGL for Jaguar. Then we're going to hand it over to Simon to talk specifically about the Wolfman demo. So that's me, Geoff Stahl, and Simon will be handling the second half from NVIDIA. Again, I'm going to talk about what's new with Jaguar. We're going to walk through the Wolfman and then take your questions and answers both on Jaguar and how the Wolfman was made after the session.
So what optimizations do we have in Jaguar? The first thing we did was realize that while read pixels and draw pixels are not the optimum path to get things to the card, there are still legacy applications that need to use them, and there are valid reasons to use them. So we optimized those.
Read pixels and draw pixels are now supported on most cards through DMA engines, relieving the CPU of having to move all that data. In apps that use these, you'll definitely see improvements. We still recommend using a texturing path where you would normally use draw pixels, but if you need to use it, note that we have done some implementation optimizations for it.
Also, copy tex sub-image. This now does not do a round trip across the bus and will stay on the GPU, so you may get rates up to 9 or 10 gigabytes per second, which is native on the card, for your copy tex sub-image. So if you need to do this, you should be seeing tremendous increases in throughput.
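As a minimal sketch of that GPU-side copy, here's how a render-then-copy update might look; the texture name and region size are hypothetical, and the texture is assumed to have been allocated already:

```c
/* Hedged sketch: copy a 256x256 region of the current framebuffer into an
 * existing texture entirely on the GPU, with no readback to the CPU.
 * "sceneTex" is a placeholder name. */
#include <OpenGL/gl.h>

void update_texture_from_framebuffer(GLuint sceneTex)
{
    glBindTexture(GL_TEXTURE_2D, sceneTex);
    /* target, mip level, dest offset (0,0), source origin (0,0), 256x256 */
    glCopyTexSubImage2D(GL_TEXTURE_2D, 0, 0, 0, 0, 0, 256, 256);
}
```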
Display lists. Display lists are now optimized in your seed, and we're going to continue to optimize them further as we lock down Jaguar. What we're looking to do is move display lists through a vertex array range scheme. So what happens is, when you have primitives that you run through a display list, we'll take those primitives, optimize them, put them into a vertex array range, and let the card pull those directly from AGP memory into the GPU. We've seen tremendous increases using this, and John Stauffer will talk about it in the optimization session.
He'll talk specifically about, and show you, what kind of display list increases we've made. So especially for something like an architectural application that is set up to use display lists and draws a thousand spheres through display lists, those kinds of applications will definitely see some speedups. For general purpose applications where you have a large array of data, as they do with the Wolfman, vertex array range is your friend, and you want to move through that path.
Image processing. We've added the image processing extension for all machines; that includes machines that only support the software renderer. So you'll be able to get the image processing extensions, and we'll go through in a minute what the optimizations there are. Again, I mentioned vertex arrays. Last year in this session we talked about how to get data through the system quickly.
And we said compiled vertex arrays were the best way to do it. So you take a vertex array, compile it, and that's the best way. Well, we're moving past that now. We're actually looking at vertex array range, where you set up a section of memory, we dynamically map it into AGP, and the GPU will DMA the vertices from that memory without having to involve the CPU in the process. That is going to be the fastest path in Jaguar for you to use. So look at vertex array range.
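A minimal sketch of that path, assuming the GL_APPLE_vertex_array_range entry points and enums from <OpenGL/glext.h>; the vertex buffer here is just a placeholder:

```c
/* Minimal sketch, assuming the GL_APPLE_vertex_array_range entry points;
 * "verts" is a placeholder buffer of packed XYZ floats that the GPU is
 * allowed to DMA from. */
#include <OpenGL/gl.h>
#include <OpenGL/glext.h>

void setup_vertex_array_range(GLfloat *verts, GLsizei numVerts)
{
    GLsizei bytes = numVerts * 3 * sizeof(GLfloat);

    /* Tell GL which block of client memory the GPU may DMA from. */
    glVertexArrayRangeAPPLE(bytes, verts);
    glEnableClientState(GL_VERTEX_ARRAY_RANGE_APPLE);

    /* Point the usual vertex array machinery at that same memory. */
    glEnableClientState(GL_VERTEX_ARRAY);
    glVertexPointer(3, GL_FLOAT, 0, verts);
}

void draw_after_cpu_update(GLfloat *verts, GLsizei numVerts)
{
    /* After modifying the vertices on the CPU, flush the range so the
     * GPU sees the new data, then draw. */
    glFlushVertexArrayRangeAPPLE(numVerts * 3 * sizeof(GLfloat), verts);
    glDrawArrays(GL_TRIANGLES, 0, numVerts);
}
```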
It's only a couple of additional calls if you're already using vertex arrays, and it'll really give some great speedups for your applications. As far as things we've added that aren't extensions: surface texture. Surface texture is supported in both Cocoa and Carbon. Carbon has AGL surface texture; Cocoa has create texture.
I talked about this earlier in one of my sessions. And specifically, this allows you to render to a surface, which can be a window that you just don't show, or it can just be a window, and use the content of that as a texture. The requirements are very simply whatever requirements were there originally for the GPU for textures.
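As a hedged illustration of the Carbon path, a sketch like the following could bind one context's surface as a texture in another; the aglSurfaceTexture signature shown here is an assumption you should check against agl.h on your seed, and the contexts and texture name are placeholders:

```c
/* Hedged sketch of the Carbon path, assuming an aglSurfaceTexture entry
 * point of the form (context, target, internalFormat, surfaceContext).
 * offscreenCtx renders the content; mainCtx uses it as a texture. */
#include <AGL/agl.h>
#include <OpenGL/gl.h>
#include <OpenGL/glext.h>

void bind_surface_as_texture(AGLContext mainCtx, AGLContext offscreenCtx, GLuint tex)
{
    aglSetCurrentContext(mainCtx);
    glBindTexture(GL_TEXTURE_RECTANGLE_EXT, tex);
    /* The contents of offscreenCtx's drawable become this texture's image. */
    aglSurfaceTexture(mainCtx, GL_TEXTURE_RECTANGLE_EXT, GL_RGBA, offscreenCtx);
}
```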
So if your GPU supports rectangular textures, then you can do a surface texture of any arbitrary size. If it doesn't support rectangular textures, you're still going to need a power-of-two surface, and you'll just pass that in as you would normally. Last, and probably the best news, is extension support.
One of the things we concentrated on for the 10.1 release was getting a very good, stable OpenGL release that you all could work on, could build your applications on, and that really didn't have any downsides or flaws in it. We think we really achieved that with 10.1. For 10.2, we said we really need to add the features you're asking for. So we had 34 extensions coming into today, and Jaguar has an additional, at this point, planned 30 extensions. So we're bringing the total up to 64 extensions, and I'll go over what they are. And there may be some more in the works as we move toward the final.
So one of the main areas that's really increased in visibility in the commercial market over the past year is programmability. Programmability is both on the pixel side and on the vertex side. We support it through a set of about seven extensions. We have one unified extension, ARB vertex program, which we talked about a few days ago.
And that will take care of the vertex programming, along with providing you a great tool, the OpenGL Shader Builder, which will allow you to work on vertex programs today on the Jaguar seed that you have. This gives us the vertex programming functionality. It's available for the ATI Radeon 8500, and the NVIDIA GeForce 3 and 4 all support vertex programming in hardware.
All other machines that support OpenGL have a software implementation that's optimized for the CPU. What this means is your PowerBooks, whatever generation they are that you have out there, if they support OpenGL, you can use that OpenGL shader builder and build shaders and have that optimized software implementation.
On the pixel program and fragment program side, from NVIDIA we have all five of their pixel-side programmability extensions: register combiners 1 and 2, and texture shader, texture shader 2, and texture shader 3, which support the full pixel-programmable path. On the ATI side, we're targeting ATI fragment shader as the support there, with the unified extension.
So, vertex array range. I think we talked about this already, but to reiterate, it allows the GPU to DMA the vertex arrays directly from client memory. You don't have to involve the CPU. So this allows you to relieve the CPU of that burden, use the CPU for other things, and utilize the GPU to its fullest. It's supported on both Radeon cards and all NVIDIA cards, but it does require hardware TCL.
Apple vertex array object is a very simple extension that extends this, adding the idea of multiple vertex array objects, like you have texture objects. If you have texture objects, you may want vertex array objects. So for the different objects in your scene, you could have a vertex array object for each one instead of having to manipulate one huge vertex array with sub-ranges inside it. It operates very similarly to texture objects, with generate and bind.
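A minimal sketch of that usage, assuming the GL_APPLE_vertex_array_object entry points; the vertex data is a placeholder:

```c
/* Minimal sketch, assuming GL_APPLE_vertex_array_object: one array object
 * per scene object, bound like a texture object before drawing. */
#include <OpenGL/gl.h>
#include <OpenGL/glext.h>

GLuint make_object_array(const GLfloat *verts)
{
    GLuint vao;
    glGenVertexArraysAPPLE(1, &vao);
    glBindVertexArrayAPPLE(vao);

    /* The client state and pointer setup are captured in the object. */
    glEnableClientState(GL_VERTEX_ARRAY);
    glVertexPointer(3, GL_FLOAT, 0, verts);
    return vao;
}

void draw_object(GLuint vao, GLsizei numVerts)
{
    glBindVertexArrayAPPLE(vao);    /* switch objects with a single bind */
    glDrawArrays(GL_TRIANGLES, 0, numVerts);
}
```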
Apple texture range is similar to vertex array range, but it lets clients specify memory that gets mapped for texture data, giving you very fast texture uploads; use it for maximum texture throughput performance. John Stauffer will also cover its performance in the session following this one.
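A hedged sketch of that idea, assuming the GL_APPLE_texture_range entry points and storage hints, plus the related client storage pixel store flag; the texture size, format, and buffer are placeholders:

```c
/* Hedged sketch, assuming GL_APPLE_texture_range and GL_APPLE_client_storage. */
#include <OpenGL/gl.h>
#include <OpenGL/glext.h>

void setup_fast_texture_upload(GLuint tex, GLubyte *pixels, GLsizei bytes)
{
    glBindTexture(GL_TEXTURE_2D, tex);

    /* Declare the client memory that backs this texture's uploads so the
     * driver can map it for DMA. */
    glTextureRangeAPPLE(GL_TEXTURE_2D, bytes, pixels);
    glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_STORAGE_HINT_APPLE,
                    GL_STORAGE_SHARED_APPLE);

    /* With client storage, GL keeps referencing "pixels" rather than
     * making its own copy, saving a memcpy on upload. */
    glPixelStorei(GL_UNPACK_CLIENT_STORAGE_APPLE, GL_TRUE);
    glTexImage2D(GL_TEXTURE_2D, 0, GL_RGBA, 512, 512, 0,
                 GL_BGRA, GL_UNSIGNED_INT_8_8_8_8_REV, pixels);
}
```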
Apple fence basically inserts a token into the command stream, so it allows you to do finer synchronization, for example if you have audio/video synchronization you want to do, or synchronization between OpenGL and some other process. The Apple fence extension lets you handle the fence both synchronously and asynchronously.
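A minimal sketch, assuming the GL_APPLE_fence entry points:

```c
#include <OpenGL/gl.h>
#include <OpenGL/glext.h>

void fence_example(void)
{
    GLuint fence;
    glGenFencesAPPLE(1, &fence);

    /* ... issue drawing or texture upload commands here ... */

    glSetFenceAPPLE(fence);         /* drop a token into the command stream */

    if (!glTestFenceAPPLE(fence))   /* asynchronous: poll whether the GPU reached it */
        glFinishFenceAPPLE(fence);  /* synchronous: block until it has */

    glDeleteFencesAPPLE(1, &fence);
}
```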
So, some additional extensions. For texturing, we have texture mirror repeat, which basically doubles the size of your texture with a mirrored copy of it. Texture env crossbar, which allows you to reference texture units more arbitrarily when you're doing combining. Texture mirror once, very similar to mirror repeat, except that you have a single mirror without the repeating edges. And SGIX depth texture and SGIX shadow, which are something Simon will talk about later; they were critical for the Wolfman to get the self-shadowing. These allow you to do shadow mapping techniques, and these two extensions are critical for that.
So, rendering extensions. Something that's been there before but that I think we should really mention is multisample. Multisample has been in since 10.1. It's fully supported through both Cocoa and Carbon and allows you to do anti-aliasing both in windowed mode and full screen. So you can have full-screen anti-aliasing support on all cards that support multisample. Secondary color is an extension that's been around for a while; people have been asking for it to support secondary colors both in vertex programs and through our normal path.
Fog coordinates allow you to pass explicit fog coordinates. Draw range elements: to those of you who know OpenGL, that may seem like an old extension, so why is it new for Jaguar? We wanted to make sure it was in there so that on machines that don't report OpenGL 1.2 and only report 1.1, you can specifically check for it. It allows you to draw a range of elements instead of the entire array.
Stencil wrap for stencil shadows: this is key for wrapping your stencil buffer values around. Fog distance allows you to specify an eye-radial fog distance instead of just a planar distance from the eye point. So it allows you to have more correct-looking fog, which I know Austin will be very happy that we have.
Multisample filter hint: as you may have read, NVIDIA has a multisample technique called Quincunx. To use it, you would specify the four-tap multisample and then say nicest for the filter hint, and it would pick the five-tap Quincunx algorithm. So what the hint does is allow you to pick nicest or fastest for your multisample algorithm. And depth clamp: in some of the shadowing algorithms covering shadow volumes, you want depth clamping rather than clamping to the actual frustum.
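A short sketch of how that might look once you've requested a multisampled pixel format (for example, one sample buffer with four samples in your AGL or NSOpenGL attributes); the enums assume ARB_multisample and NV_multisample_filter_hint:

```c
#include <OpenGL/gl.h>
#include <OpenGL/glext.h>

void enable_fsaa(void)
{
    glEnable(GL_MULTISAMPLE_ARB);
    /* GL_NICEST lets NVIDIA hardware pick the five-tap Quincunx filter;
     * GL_FASTEST keeps the plain four-tap pattern. */
    glHint(GL_MULTISAMPLE_FILTER_HINT_NV, GL_NICEST);
}
```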
Pixel transfer: we talked about ARB imaging, histogram, blending, min-max, and some convolution filters; those are all in there in every OpenGL configuration, so on any CPU you have, ARB imaging will be supported. We also have some new ATI extensions: blend equation separate, which allows a separate equation for the alpha and for the color, which can be very useful at times, and blend weighted min-max, which lets you apply weighting values in your blending.
And then lastly, blend square: for some lighting calculations, that allows you to square values in the blending to bring out the highlight if that's needed. Point parameters: both point parameters and point sprite are supported on certain hardware, and they allow you either to specify OpenGL points for something like a particle system, or to put a texture at that point and show the texture, so you can do billboarded textures very simply.
So, that completes kind of what's new for Jaguar. We think that we really have a great OpenGL and one of the driving factors behind some of the improvements has been some of the research that Nvidia has done, including the Wolfman demo. We need to make sure that we have all the features there, we can run this, and I guarantee that what you have in your seed is exactly what we have up here on stage running the Wolfman demo. So all the features that Simon's going to talk about are in Jaguar, in the seed, and are planned for the final release. So, without further ado, Simon Green from Nvidia to talk about the anatomy of the Wolfman.
Thank you, Geoff. So I'm not sure if I'd really consider myself a celebrity, but I do work for NVIDIA. I actually work in the demo team. For those of you that aren't familiar with that, NVIDIA actually has an internal group whose job it is basically to create demos that show off NVIDIA's hardware. So today I'm going to be talking a little bit about the Wolfman demo, the anatomy of the Wolfman. And it's all about the advanced fur rendering techniques that we used using OpenGL. And that's a little preview of the sound there.
So we have a little motto in the NVIDIA demo team. We make the marketing lies come true. Now that's really just a joke, but it has a serious point behind it. Because when NVIDIA comes out with a new piece of hardware, we always have a number of new features.
So it's very much our job to demonstrate those new features, demonstrate what you can do with the extra performance that the new product provides. It doesn't really matter how good your technology is. If you don't have something that demonstrates it and shows developers what's possible, it's really not worth anything. So that's our little motto.
Okay, so I'm going to give you a brief overview here. So the Wolfman was one of four GeForce 4 demos that we had at launch, and it was actually running on the Macintosh on OS X at our launch event in San Francisco. And as I said, the main reason for these demos is really to showcase the performance and also the programmability of the GeForce 4, because GeForce 4 had several new features that kind of enhanced the programmability of the part. Clearly, the main focus of the Werewolf demo was to demonstrate this volumetric fur rendering.
And the whole thing was animated using the vertex shaders, and I'll talk about that a little later. And also the lighting was done using the pixel shader technology that we have. And the whole thing is also, as Geoff mentioned, is shadowed using shadow maps. And it runs using OpenGL with NVIDIA extensions. So how did we do it? So before we get to that, first of all, why fur? There were lots of different demos we could have done.
But if you look in the real world, a lot of things in the real world are fuzzy. It's very easy to do plastic and very hard materials in computer graphics, but it's much harder to do things that are kind of fluffy or fuzzy or hairy, those kind of general effects.
In addition to that, fur wasn't something that people had seen in real time very much. There had been a couple of Microsoft demos with some quite simple fur rendering techniques, but people hadn't seen it before on a kind of fully animated character model. So we thought that would be a cool thing to try.
So how do you render fur? Well, there are two basic methods. The first method is probably the most obvious one, and that's just to use geometry. So clearly fur is made up of a number of different kind of, well, a lot of hair strands, basically. So the most obvious way to model it is just with geometry. So for each individual hair in the fur, you basically just draw a curve, and that curve would be made up of line strips, basically.
So the obvious problem with that method is there's a lot of hairs in fur, right? I don't know how many of you have seen Monsters, Inc., the Pixar movie, but apparently the Sully character in that actually has three million hairs on him. So, you know, our hardware's good, but it's not quite at the point where it can actually render three million individual hairs and light them in real time.
So instead, we actually used what's known as a volumetric method. And the basic idea behind this is the fur is represented using textures. So rather than using geometry, you're using images. And it kind of approximates the kind of gross look of the fur without actually having to draw each individual hair. So that's the method that we use.
So I'm just going to give you a brief history of fur rendering here. Probably the first and most influential reference in the literature to do with fur rendering is this paper called "Rendering Fur with Three-Dimensional Textures." This is one that Jim Kajiya did back at SIGGRAPH 1989. And, you know, 1989 is a long time ago in computer graphics.
It's very much the kind of golden age of computer graphics, so this is a pretty old paper. So the basic idea here was that rather than using geometry, you just represent the density of the fur using a 3D volume texture. And that basically just means if you imagine an image and just generalize it into three dimensions, you have a volume texture.
So rather than having a 2D array of pixels, you have a 3D array of what actually he called "texels," although that's a slightly confusing term because we now also use "texels" to mean pixels within textures. But that's just a detail. His other main contribution was this idea of actually lighting the hairs based on the tangent direction. Now, all the tangent direction means is that's just the direction that each individual hair strand is pointing in.
So what he did with all this was he did quite a cool image of a teddy bear. And if I go on I can show you that picture. And as you can see, it's a very nice image. And in some ways, the fur rendering here is better quality than the stuff that we do in real time.
But you have to bear in mind that this was rendered on a network of 16 IBM mainframes in total, or at least there were 16 processors. And it took two hours just to create this one image. So it's nice, but it took a lot of time, and it's a long way from being real time.
So now we're going to jump ahead 10 years or so. This was the next major paper on fur rendering, a paper called "Real-Time Fur over Arbitrary Surfaces" by a guy called Jed Lengyel at Microsoft Research. He introduced this concept of shell and fin rendering. I'll get into what exactly that means later on. But the basic idea is you create these concentric shells. Imagine taking the base polygon mesh of the character and then extruding it out to create a number of concentric shells.
It's kind of like a Russian doll with one inside the other. Then you texture each of those shells with a different image, and that approximates the fur volume. We'll talk about that more later on. The second part is this idea of fins, which is basically just extra geometry that you use to improve the silhouette edge. The image that he created, or at least the most famous one that he created, was this furry bunny rabbit right here.
And you can see, again, this is a pretty impressive image. This, I think, had about 5,000 polygons in it. And he was actually running this on the original GeForce. It was like a GeForce DDR. And I believe it ran at about 12 frames a second. So pretty good. I mean, interactive, but not as fast as it could be. And you can see there isn't really much in the way of lighting on the fur, either. It looks relatively flat. There's no kind of gloss to it.
So we decided we wanted to do a fur demo, but teddy bears and bunnies, they're really not NVIDIA's style. So we had a meeting and we were thinking about different things. We considered doing a gorilla, we considered doing a yeti, the Sasquatch. But in the end we decided on a werewolf.
So this is actually one of the concept sketches that our artist Daniel Hornick did. And I have to give him full credit, because these demos are really all about the artwork. If it wasn't for the art, you'd probably be looking at a furry torus, or maybe a furry teapot, rather than the furry werewolf right now. Okay, so now I'm going to go ahead and actually show you the demo, so if we can switch to that machine.
I'm not sure how many of you have seen this before, but this is our werewolf demo. As you can see, he just kind of happily walks along this street. He has a number of different animations that he goes through. This is very much the money shot here, I think, where he does the howl at the moon. If I just wait a little while.
There's several interesting things that you should be looking at here. First of all, the fur itself. If we just pause it here, you notice there's a pretty convincing sense of furriness here. You notice when I move the light around here, you notice that there's a very subtle sheen that goes across the fur as we move the light. That's all done using the pixel shader hardware, and I'll talk about that in a minute.
The second obvious thing is the whole scene is shadowed using shadow maps. So you notice as the werewolf walks past the lights here, he is casting shadows not only on the ground, but actually on his body as well. If I run it in slow motion here, you notice there is, for instance, a shadow. There is a shadow of his arm on his leg, for instance. So, you know, his body is casting shadows on itself, not just on the ground.
The other major feature we're showing off here is bump mapping. Now, other demos have done this, but if you look at the quality of the bump mapping on his face here, you'll see there's a lot of detail if we look at the wireframe. One of our marketing guys described this as an irresponsible use of polygons.
I think that's an apt description. When you look at the wireframe and it just looks completely white, that's really what we aim for. But if you look at the geometry on the face here, you'll notice that a lot of that detail really comes from the bump map. It's not in the geometry itself.
Okay, so that gives you an idea of what the demo looks like. If we can switch back to the slides. Oh, we're on. So just a few statistics about that. There are about 100,000 polygons in the model and the scene in total, so that's 100,000 polygons per frame. And the demo runs at about 30 frames a second. So it's pretty fast.
So, rendering fur with shells and fins. So how do you actually use this technique? So as I said earlier, the basic idea is to generate these concentric shells by scaling the base skin mesh along the vertex normal. Now, I'm not sure how familiar you people are with 3D graphics, but that's a relatively simple thing to do.
Once you have the polygon mesh, you just duplicate it several times and scale it along the vertex normal. Perhaps later on we might go back to the demo and I can actually show you it within the demo as well. So once you have these shells, you texture each shell with a separate 2D texture that describes a slice through the fur geometry.
So you have this fur geometry that describes all the individual pieces of hair in the fur, and then you generate these 2D textures that kind of slice it through that geometry. So once you have those fur textures, you apply them to the shells, you blend it all together using blending, and that gives you the final result of this semi-transparent furry volume.
And one other trick that you can use to kind of improve the illusion of depth, if you like, is to just shade the lower layers in the fur a little bit darker than the top layers. And that kind of simulates the self-shadowing of the fur, because clearly light doesn't get so deep into the lower layers.
And in this demo, we actually just used eight layers. Now, you could use more, but clearly each extra layer you use costs a bit of performance. So we experimented with different numbers, but eight was the best kind of balance between the look of the fur and the performance.
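As a rough illustration of the shell pass just described, here is a minimal C sketch; drawSkinMesh and the texture handles are hypothetical, and in the actual demo the extrusion along the normal happens in the vertex program rather than on the CPU:

```c
/* Conceptual sketch of shell rendering: eight copies of the skin mesh,
 * each pushed out a little further along the vertex normal, textured
 * with its slice of the fur volume, alpha-blended, and darkened toward
 * the roots to fake self-shadowing. */
#include <OpenGL/gl.h>

#define NUM_SHELLS 8

extern void drawSkinMesh(float normalOffset);   /* hypothetical helper */

void draw_fur_shells(const GLuint furSliceTex[NUM_SHELLS], float shellSpacing)
{
    glEnable(GL_BLEND);
    glBlendFunc(GL_SRC_ALPHA, GL_ONE_MINUS_SRC_ALPHA);

    for (int i = 0; i < NUM_SHELLS; ++i) {
        float offset = i * shellSpacing;                    /* extrusion distance  */
        float shade  = 0.5f + 0.5f * i / (NUM_SHELLS - 1);  /* darker lower layers */

        glBindTexture(GL_TEXTURE_2D, furSliceTex[i]);
        glColor3f(shade, shade, shade);
        /* In the demo the offset is applied in the vertex program as
         * position + normal * offset; here it is just passed along. */
        drawSkinMesh(offset);
    }
    glDisable(GL_BLEND);
}
```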
So you might be asking yourself the question, how do you create these fur textures? And we spent quite a lot of time on this, and we ended up actually doing a special kind of custom tool just to generate the fur. A lot of the initial demos we had just used very simple fur that looked very kind of combed and flat. And, you know, we decided that wasn't really the look we wanted for a werewolf.
You know, you have to have a lot of hair. You expect werewolf fur to be more kind of matted and dirty. So we actually came up with this fur design tool that enabled us to tweak the fur parameters, like the curliness. And, you know, you can apply noise to it to basically make it more random. And this is a screenshot of the tool, and you can see the sliders at the top there.
So I just went through most of this. The hairs themselves are actually defined using something like a particle system. So that just means they're kind of points moving almost under gravity, and that's what we use to define the path that each individual hair takes. And then in the tool, those hairs are previewed using line strips. And then once you're happy with the look of the fur, you press a button and it actually voxelizes that geometry into a volume texture, and then it writes out those textures to disk.
So this is what those fur textures actually look like. You can see, so 0 to 7 here, this is actually going up through the fur volume. So you can see at level 0 there, they just look like a lot of kind of small points because the hairs are kind of pointing straight upwards.
And then as we go up, they kind of move over, the fur is kind of combed to the right. So as we move up through those textures, you kind of see more and more of the fur. And then by the time we get to level 7, a lot of the hairs have kind of faded out. So as we go up through the volume, the density decreases.
So we actually used relatively small fur textures in this demo, just 256 by 256 pixels. And the main reason for that is because fur is relatively random, you can actually repeat it several times over the surface, and you don't really notice that it's a repeating pattern. And it's good to keep textures small because it also improves the cache coherency in the texture mapping hardware.
So we use 256 by 256. And what you're actually seeing in these images here is actually the alpha channel. So that represents the density of the fur, effectively, how much fur there is at any one point. What you're not seeing is the RGB components of the texture, which actually store a per pixel tangent vector. So that basically describes which direction each individual fur strand is pointing in. And we use that to do the lighting, which is what gives the fur that kind of glossy look.
So this is just an image of the fur without any kind of color or lighting on it. And you can see it looks pretty good, but it looks like an albino wolf or something. So if we go on to the next image here. So to give the fur color, we actually use a totally separate fur color texture.
And this fur color texture covers the whole of the surface, whereas the fur textures themselves are repeated like, you know, ten times over each surface. The fur color texture covers the whole surface and allows us to kind of, you know, give it this colored, kind of striped look. So when you put the fur textures together with the color texture and the lighting, you get something that looks like that.
So there's only one problem with using this shells technique. And as you can imagine, when you look at the shells from the side, especially on the silhouette of the character, you start to see the gaps between those shells. And I'll show you what that looks like in a second.
So the solution that Jed Lengyel came up with to improve this was to add this geometry that he calls fins. And they're called fins because they literally kind of stick out from the surface. And the way we generate this geometry is, for each edge in the original polygon mesh, we create an additional quadrangle that just kind of sticks up straight from that mesh. And these fins are textured with a totally separate image that just has a kind of generic cross section of the fur in it. I'll show you what that looks like in a minute.
Now, there's several ways you could do this. You could try and dynamically generate these fins just on the silhouette, but that would be quite expensive. So, it turns out the easiest way to do it is just to create fins everywhere in the model and then just fade them in and out based on the angle between the surface normal and the view direction, so that they only -- they're always there, but they only appear on the silhouette edges. So, this is what the actual fin texture looks like.
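A tiny sketch of that fade, assuming a hypothetical per-fin alpha that you feed into the blend: fins are always drawn, but they only become visible near the silhouette, where the surface normal is roughly perpendicular to the view direction:

```c
/* Sketch of the fin fade; the function name is hypothetical and both
 * vectors are assumed normalized. */
#include <math.h>

float fin_alpha(const float normal[3], const float viewDir[3])
{
    float ndotv = normal[0]*viewDir[0] + normal[1]*viewDir[1] + normal[2]*viewDir[2];
    float a = 1.0f - fabsf(ndotv);   /* 0 when facing the eye, 1 edge-on */
    return a * a;                    /* square to sharpen the falloff    */
}
```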
Okay, so we're just going to step through a few images here. This is just a close-up of the werewolf's arm, and you can see the polygon mesh there. This is what it looks like with the first shells that I was talking about earlier. You can see they're just copies of the base mesh, but just extruded out along the vertex normal.
And then finally we add the fins in it, and it's a little hard to see where the fins really are there, but you can see that they kind of poke out from the silhouette of the character. So when you put it all together, it looks something like that.
And in this image, at the edges here, you do start to see the gaps between the shells. And it's pretty obvious that the fins and the shells aren't really perfectly lined up. But when you look at that from a distance, you really can't tell. So it's, like all computer graphics, it's a hack, right? It's an illusion, but... It works. So actually at this point I might just switch back to the, to demo machine three if I could. Because this stuff is easier to understand.
[Transcript missing]
So I'm going to try and zoom up here and show you actually how the fur works. So first of all, I'm going to switch off the fins, and I'm going to remove all the fur layers. So this is what a naked werewolf looks like if you're into that kind of thing. One of the slightly crazy things actually is that even underneath his fur, this werewolf is bump mapped, which may seem a bit over the top, but you can see as I move the light around, you can see all his ribs there and his spine.
So I'm going to add back in the fur layers one by one. You can kind of see the effect that each one has. So that's the base layer. That's level zero that we saw in the images earlier. Yeah, it doesn't really look like fur yet. So that's layer 1, 2, 3, 4, 5, 6, 7.
So you can see as you add in the layers, you get more and more depth. And we can also scale the distance between each of those shells. You know, this gives the wolf something of a blow-dried look, I always think.
But yeah, if you go too far, then the illusion is really shattered, so we usually keep it relatively modest like that. So this is without fins, and you can see that if you get too close,
[Transcript missing]
So you see, because the light is behind the character here, the fins are lit in such a way that they actually receive light from behind, so it gives him a nice kind of rim lighting highlight around his edge there. Okay, so hopefully that gives you a good idea of how the fur works. We can switch back to the slides.
Okay, so how do we model and animate this character? So the whole thing was modeled in Maya, which is, of course, now available on OS X. So you can all rush home and model your own werewolves. There's no excuse. One of the interesting things about our workflow is we do model everything as NURBS, NURBS surfaces, which are a curved surface description. And the nice thing about that is it kind of gives us the flexibility to change the number of polygons in the model.
So we model it as NURBS, and then once it looks good, we then convert it to polygons. And then if we later decide that we can get away with more polygons, or perhaps we need to scale back and we need fewer polygons, then we can just change the tessellation in Maya, and it'll generate that many polygons for us.
So it is nice to have that flexibility. The base mesh of the character I showed you has about 20,000 polygons in total. Now, not all of those, as you noticed, are furry. Not everywhere on the werewolf is furry, mainly his back and his arms, and there's a bit on the legs. So when you take those furry polygons and multiply them by eight, that's where the 100,000 polygons comes from.
Now, as for the animation, the whole thing is animated using a skeleton, and that skeleton has 61 bones in it. So, you know, all the spine, the arm, the legs, and it even has, like, all the fingers and thumbs. So, you know, if we wanted to, this werewolf could play the piano.
An interesting thing that our artists pointed out to me is that in some ways these characters are comparable to the complexity that people are using in film and television production. The number of polygons perhaps isn't, but the character setup is. In total there were about a thousand frames of animation that was keyframed by an animator that we contracted, and it runs at 30 frames a second.
So this is just a quick screenshot of the Wolfman in Maya. You can see the skeleton there and the fingers. The yellow surfaces that you see are the surfaces that we applied fur to. And the red ones, I think, are just the surfaces that weren't skinned. So that means they don't actually bend at all, like the claws are always just static objects.
So that's what our werewolf looks like in Maya. So once we have this stuff in Maya, you may be asking the question, how do we make it run in real time? In the demo team, we actually have our own proprietary NVIDIA demo engine, which we imaginatively call NVDemo.
This is used for all of the in-house, well, pretty much most of the demos that we do in-house at NVIDIA. And it basically provides us with a scene graph library. So this is something that kind of manages all the lights and materials and cameras and all that kind of stuff. And it also, it takes care of managing all that scene data, also doing stuff like culling and sorting.
Basically all that kind of tedious legwork that you have to do when you're doing a real time 3D application. And the other big part of it is we also have a Maya plug-in that will take the data from Maya and convert the geometry and the lights and the materials to our own custom file format, which is then loaded into the demo engine for display in real time.
Okay, so how do we actually make this werewolf move? So all the animation is done using vertex shaders. And I believe in previous sessions they have talked about vertex programs a little bit. So I'm not going to go into too much technical detail here. But vertex shaders are really cool. So they basically give you total control over the hardware processing of geometry.
So you can do your own transformations. You can do your own lighting calculations. If you want to do some weird kind of deformation, you can basically write that code yourself. And it's exposed as a relatively simple assembly language, which does scare off some people. But I personally like writing in assembly language, but that's just my personal preference.
But one interesting thing is there are going to be higher level languages coming out soon. There's OpenGL 2.0 and there's various other efforts that are going on at this point. So at some point you will be able to write in a C-like language and then it will be compiled to the actual vertex program assembly language.
So we use vertex shaders in the werewolf demo for several things. First of all, to do the skinning; I'm going to talk a little bit about what skinning is in a minute. But basically that's the process of making the skin deform as the skeleton moves.
Secondly, for scaling the fur layers along the vertex normal, that's actually a very important thing, but it's a very simple thing to do in a vertex shader. That's just a single instruction that takes the vertex position and basically adds a fraction of the normal onto that position for each of the fur layers.
Thirdly, we use vertex shaders for the setup calculations for the per-pixel lighting, for the bump mapping and for the fur shader that you saw. Lastly, we use them to do the texture coordinate generation for the shadow mapping. So I don't think I'm going to go through this in too much detail, but this is just a little extract from one of the vertex programs that we used in the werewolf demo.
And it starts off with the skinning. So what we do here is we're actually using a technique called matrix palette skinning here. So that means there's several matrices, and then for each vertex there's an index which describes which of those matrices we're going to use. So anyway, so basically we transform the vertex by the first bone, and then we also have to transform the normal by that bone matrix.
And each of these is weighted as well, so that's what those MULs there do. We also have to do that for the binormals as well, which is part of how the per-pixel lighting is done. So next you see we have the MAD here, and that's what does the scaling of the vertex along the normal, and that's just a single instruction. Next, we actually have to project that vertex to the screen. So these four DP4s take that coordinate in eye space and transform it to clip space.
So once we've actually figured out the position of the vertex, we have to do some calculations to figure out the lighting. So we work out the view vector, so that's basically the vector from the eye to the position of the vertex that we're lighting. So we work that out.
Then we work out the half-angle vector. This is a kind of standard way of doing Blinn shading. It's basically a vector that's halfway between the eye vector and the light direction in this case. We have to transform that into tangent space. I'm not going to go into a huge amount of detail, as I said earlier. But basically, once we've done all that setup, we take those values and we store them in the output colors.
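To make those steps concrete, here is a plain C restatement of roughly what that vertex program does for each vertex; the types and helper functions are hypothetical, the math is simplified (two bones, a row-major matrix layout), and in the demo all of this runs on the GPU:

```c
#include <math.h>

typedef struct { float x, y, z; } Vec3;

static Vec3 add(Vec3 a, Vec3 b)    { Vec3 r = { a.x + b.x, a.y + b.y, a.z + b.z }; return r; }
static Vec3 scale(Vec3 v, float s) { Vec3 r = { v.x * s, v.y * s, v.z * s }; return r; }
static Vec3 normalize(Vec3 v)      { float l = sqrtf(v.x*v.x + v.y*v.y + v.z*v.z);
                                     return scale(v, 1.0f / l); }

/* Transform a point by a row-major 4x4 matrix (uses the translation). */
static Vec3 xform_point(const float m[16], Vec3 v)
{
    Vec3 r = { m[0]*v.x + m[1]*v.y + m[2]*v.z  + m[3],
               m[4]*v.x + m[5]*v.y + m[6]*v.z  + m[7],
               m[8]*v.x + m[9]*v.y + m[10]*v.z + m[11] };
    return r;
}

/* Transform a direction (normal) by the same matrix, ignoring translation. */
static Vec3 xform_dir(const float m[16], Vec3 v)
{
    Vec3 r = { m[0]*v.x + m[1]*v.y + m[2]*v.z,
               m[4]*v.x + m[5]*v.y + m[6]*v.z,
               m[8]*v.x + m[9]*v.y + m[10]*v.z };
    return r;
}

Vec3 process_vertex(Vec3 pos, Vec3 normal,
                    const float bone0[16], const float bone1[16],
                    float w0, float w1,          /* per-vertex bone weights   */
                    float shellOffset,           /* which fur shell we are on */
                    const float modelViewProj[16],
                    Vec3 eyePos, Vec3 lightDir,  /* lightDir points at light  */
                    Vec3 *halfAngleOut)
{
    /* Matrix-palette skinning: weighted blend of the bone transforms. */
    Vec3 skinnedPos = add(scale(xform_point(bone0, pos), w0),
                          scale(xform_point(bone1, pos), w1));
    Vec3 skinnedNrm = add(scale(xform_dir(bone0, normal), w0),
                          scale(xform_dir(bone1, normal), w1));

    /* The single MAD: push the vertex out along the normal for this shell. */
    Vec3 shellPos = add(skinnedPos, scale(skinnedNrm, shellOffset));

    /* Lighting setup: view vector and the Blinn half-angle vector. */
    Vec3 view = normalize(add(eyePos, scale(shellPos, -1.0f)));
    *halfAngleOut = normalize(add(view, lightDir));

    /* The four DP4s: transform to clip space (returned as a Vec3 here
     * for brevity; the real program writes a 4-component position). */
    return xform_point(modelViewProj, shellPos);
}
```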
So that gives you a feel, at least, for what a real vertex program looks like and the kind of operations it does. So the first part of the code there that we saw was doing the skinning. So the justification behind skinning is, you know, we want to have a smooth skin that covers the whole of the character, right? And we want that skin to move as the character animates.
Now, there are several ways you could do that. You can imagine the easiest way to do it would just be to store the positions of each of those vertices for every frame in the animation, right? And that would give you a lot of flexibility. You know, you could do muscles bulging and have jiggle and all that kind of stuff.
But that would be a very expensive way of doing it, both in terms of storage -- actually, mainly just in terms of storage. So the idea behind skinning is rather than storing all that data, we basically just store a static skin for the character. And then we animate the skeleton. So the skeleton is what I showed you earlier in Maya.
So we animate the skeleton and we use the skeleton to deform the skin in real time. And the vertex shading hardware of the GeForce 4 is actually doing that deformation in real time. So the basic idea is, as we saw in the code, is you transform each vertex by multiple different transformations.
And they're basically the transformations of the nearby bones. So if you imagine in a werewolf's arm, this position on his skin is clearly affected by the bone of the forearm here and also by this kind of upper arm bone and maybe the shoulder as well. So the final position of the vertex is basically a kind of weighted blend between those transformations. And we store those weights actually with each vertex. There are tools within Maya where you can actually paint vertex weightings. And most Maya users are familiar with this technique. I think it's called smooth skinning within Maya itself.
One of the things about skinning is it's always a lot of fun to debug because when you're--it's one of those things where either you get it right or it's totally broken. So this is just a funny image, which is one of the--where the skinning went wrong. I must confess I had nightmares about this for weeks afterwards. It's something about the way his hand is kind of distorted and he has these really long fingers. Yeah, I still can't look at that actually.
This is another kind of one of these outtake images. This actually wasn't a bug in the code. This was because the animator we had, Geoff Bell, who did a great job, by the way, was more of a kind of traditional 2D animator. So he was animating this character in Maya, mainly looking at it from the side.
So he wasn't really, you know, from the view you're normally looking at it at, it's not clear that the hand was actually going through its face. But when we looked at it from this angle, it was a lot more obvious that that wasn't a good position for his hand to be in.
Okay, so now we're going to go on and talk about pixel shaders a little bit. Again, I'm not sure how much experience you guys have with pixel shaders, but... The basic idea is pixel shaders offer you programmability at the per pixel level. Now, at this point in time, they're not quite as flexible as the vertex shaders.
You have some control over what you can do with pixels and you can do a lot of blending operations, but it's more of a kind of configurable hardware than actual programmability per se. Now, in OpenGL on NVIDIA hardware, pixel shaders are exposed using two extensions. The first one is NV texture shader. Now, that's what gives you control over the kind of texture addressing operations. So you can do dependent texture lookups.
That is kind of using the results of one texture lookup to affect the lookup in a subsequent texture lookup. And you can also do things like you can do dot products between texture coordinates. You can do that by adding color values. So quite often you use that for lighting.
The nice thing about texture shader as well, everything happens in floating point precision. So you have a much more precise result. The second part of pixel shaders in OpenGL on NVIDIA hardware is the NV register combiner extension. Now, this is basically a programmable means of combining texture and color results together.
And it's actually very flexible on the GeForce 4 because you actually have eight combiner stages that you can use. So that means that there's a lot of math you can do in the register combiners. And the anisotropic lighting model that you saw on the fur there was actually all done in the register combiners. The only disadvantage of the register combiners is that they happen in fixed point precision. So I think it's actually a nine bit precision. But they are very, very fast. So that's the kind of trade-off.
[Transcript missing]
OK, so let's talk a bit about shadows. Everything in the demo is shadowed using shadow maps. For those of you that aren't familiar with shadow maps, shadow maps are the same technique that Pixar uses, for instance, in RenderMan to do all their shadows. The great thing about shadow maps is that they're an image space technique. You may also be familiar with stencil shadow volumes, which can be a great way of doing your shadows for some applications. But the great thing about shadow maps is that because they're an image space technique, all you have to do is one extra pass.
As I say here, the other good thing is that performance is linear with the complexity of the scene. You really don't have to do any kind of pre-processing of the scene. If you can render something, then you can shadow it. They are relatively easy to implement once you have the code for reading back the depth buffer.
Now, in OpenGL, the shadow mapping hardware on the GeForce 3 and 4 is exposed using the GL ARB shadow extension. This demo is actually using the SGIX extension, which provides pretty much the same functionality, but that functionality is now being rolled into the ARB shadow extension.
The only disadvantage of using shadow maps is the aliasing. You might have noticed in the demo that I showed you that you do occasionally see some kind of blockiness in the shadows. That's just a fact of it being an image space technique: where the shadow map texture is magnified, you are going to start seeing those kind of magnified texels.
So I'm just going to briefly go over the shadow map algorithm here. So as I said, the first pass when you're using shadow maps is to render the scene from the light's point of view. So if you imagine where the lights were in that scene, the street lights, we're actually in a separate pass rendering the scene from that light's point of view, kind of looking down on the werewolf at the street.
So, once we've done that render, we copy the depth information. And that's an interesting point, because you only really need the depth values. You don't have to render color or textures or any of that kind of stuff. You just copy the depth information to a texture. So that texture basically is our shadow map texture.
So in the second pass, when we're actually drawing the character with color, we project that shadow map texture back onto the scene, and this is using projective texturing in OpenGL, which is a fairly well-known technique. And if you look in your OpenGL Redbook, it will explain how projective texturing works. And I should also mention that if you go to the NVIDIA website, we do have a lot of examples and a lot of presentations about how to do shadow mapping and how to use pixel shaders and vertex shaders.
But anyway, you project that shadow map texture back onto the scene, and then the hardware actually does this comparison. So it does a comparison between the depth of the pixel that you're rendering at that point and the corresponding depth from the point of view of the light. So I really should have had a diagram here, but the basic idea is if the value in the shadow map is less than the depth value of the pixel we're rendering, then that means there must be something between us and the light; therefore, the point we're looking at is shadowed. On the other hand, if those values are roughly equal, that means there's nothing in the way, and therefore the point is visible, so it's unshadowed.
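As a hedged sketch of the two passes, here is how the setup might look using the ARB names that the SGIX functionality is being rolled into (ARB_depth_texture and ARB_shadow); the scene-rendering call and the map size are placeholders:

```c
#include <OpenGL/gl.h>
#include <OpenGL/glext.h>

void create_shadow_map(GLuint shadowTex, GLsizei size)
{
    /* Pass 1: render the scene from the light's point of view, depth
     * only (no color or textures needed). */
    /* renderSceneFromLightView();   hypothetical */

    /* Copy the depth buffer into a depth texture: this is the shadow map. */
    glBindTexture(GL_TEXTURE_2D, shadowTex);
    glCopyTexImage2D(GL_TEXTURE_2D, 0, GL_DEPTH_COMPONENT, 0, 0, size, size, 0);

    /* Hardware compare: sampling this texture returns 1 (lit) only when
     * the stored light-space depth is >= the projected fragment depth,
     * i.e. nothing sits between the point and the light. */
    glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_COMPARE_MODE_ARB,
                    GL_COMPARE_R_TO_TEXTURE_ARB);
    glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_COMPARE_FUNC_ARB, GL_LEQUAL);

    /* Pass 2 (not shown): draw the scene normally with this texture
     * projected from the light using projective texgen. */
}
```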
So that's how shadow maps work. And they're a great technique, and I recommend you use them. So here I just thought I'd give you a little bit of a look into what the actual pixel shader code does. Now, there are various different ways of expressing pixel shaders, and I've actually just given you pseudo code here because the actual register combiner setup code would be very long and it would fill like pages and pages.
So this is really just pseudo code, but it gives you a good idea of basically how we're doing the bump mapping in the werewolf demo. So I showed you a lot of the kind of textures that were kind of made up for earlier, but this is concentrating on the bump mapping. So in texture zero we have the color map. So that's basically just the color of the surface.
One interesting detail is the alpha channel of that color map contains a kind of shininess map. So that basically represents how shiny each point is. So, for instance, on the mouth of the werewolf, we want that to look wet and shiny, so that would have a very high shininess value. I'll actually show you some of the textures later on.
So in texture unit one, we have the bump map; in actual fact, it's a normal map, and I'll show you what that looks like in a second. In texture unit two, we actually have a special texture that encodes... it's not really a Phong specular map, actually, it's more of a kind of Blinn-Phong specular map. But that basically determines how the light interacts with the surface, and how big the specular highlight looks. So if you look at that texture, it basically just encodes a curve, a kind of power curve that goes up like this.
And texture 3 contains the shadow map. So, one of the cool things about the GeForce 4 was that it actually exposed a new texture-shadow operation, which meant that we could do the color map and bump map with specular in just three texture units, whereas previously it took four. That allowed us to also include a shadow map, so for the first time we could actually do color map and bump map stuff with a shadow map all in a single pass. That's what we're using in the demo here.
So, in the primary and secondary colors, we send down the light direction and the half-angle vector. And then basically in the register combiners, we're computing the diffuse lighting, so that just computes n.l, basically. That gives you your diffuse lighting. Then we calculate the shadow factor. One of the things you'll come across with shadows is you don't want your shadows to look completely black. Although, in the real world, that's how they would be. If you have a point light up here and something's in the way, then this part is completely black.
There's a lot of ambient lighting that we're not really simulating in real time. To compensate for that fact, we make the shadows only slightly darken the color of the surface at that point. When something's in shadow, it just gets 50% darker. It doesn't actually go to black. That's what that shadow factor there means. Then we just multiply the diffuse lighting by the color map.
Now, at the same time, the specular lighting is being computed in the texture shaders. So, the texture shaders, if you look at the bottom here, are actually doing this dot product texture 1D operation, which basically does a dot product between the half angle vector that we sent down and the normal value that came out of the normal map, and then looking up in a 1D texture that kind of gives us the exponent. So, the 1D texture is what gives us that to the power of p there.
So, in the register combiners, we take that specular value, we multiply it by the shininess, and that gives us our specular lighting. And that's what gives you the kind of highlights on the bump maps. So, and then, almost done, we multiply both of those by the shadow factors.
So one interesting thing is you do want the specular term to completely disappear when the surface is in shadow. So you see we multiply the diffuse by the shadow factor, and then we just multiply the specular by the actual shadow value out of the shadow map, which is just 0 or 1. And then finally, we do the fog. I'm not sure if you noticed in the demo, but the street does actually kind of fog out to transparent. So that's what gives you that effect.
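Putting those pieces together, here is the same combiner math written out as plain C, with hypothetical variable names; on the hardware this all happens per pixel in the texture shaders and register combiners:

```c
typedef struct { float r, g, b; } Color;

Color shade_pixel(float NdotL,        /* diffuse term from the normal map       */
                  float specPower,    /* (N.H)^p looked up in the 1D texture    */
                  float shininess,    /* alpha channel of the color map         */
                  float shadow,       /* 0 or 1 from the shadow map comparison  */
                  Color colorMap,     /* RGB of the color map                   */
                  float fog,          /* 1 = no fog, 0 = fully fogged           */
                  Color fogColor)
{
    /* Shadows darken rather than black out: 0.5 when occluded, 1.0 when lit. */
    float shadowFactor = 0.5f + 0.5f * shadow;

    float diffuse  = (NdotL > 0.0f ? NdotL : 0.0f) * shadowFactor;
    /* Specular is killed entirely in shadow (multiplied by the raw 0/1). */
    float specular = specPower * shininess * shadow;

    Color out = { colorMap.r * diffuse + specular,
                  colorMap.g * diffuse + specular,
                  colorMap.b * diffuse + specular };

    /* Fog blend; in the demo the street actually fades toward transparency,
     * but a blend toward a fog color is the same idea. */
    out.r = out.r * fog + fogColor.r * (1.0f - fog);
    out.g = out.g * fog + fogColor.g * (1.0f - fog);
    out.b = out.b * fog + fogColor.b * (1.0f - fog);
    return out;
}
```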
Okay, so here I'm just going to show you some of the texture maps that are used in the demo. These were all actually mainly painted within Photoshop. We also used DeepPaint 3D for some work. But these are really, really detailed maps. Some of these go up to 2K by 2K.
One of the nice things, GeForce 4 has 128 meg of memory, so you have plenty of memory to waste on really big textures. So that's what the color map looks like. So this is obviously the face we're looking at here. The eyes are separate surfaces, so you don't see the eyes. The purple bit at the bottom is actually the mouth.
So this is the shininess map that I was talking about earlier. So you can see the mouth is almost completely white because we want that area to be very shiny. And the nostrils also tend to be a bit kind of greasy and therefore shiny. And his eyes, well the eye sockets at least aren't shiny at all.
So this is what a normal map looks like. Now, for those of you who haven't seen a normal map before, this probably looks a little bit unusual. The way this works is we actually don't author these normal maps looking like this. We author them as basically a height field.
So they're just drawn as a black and white image where black represents kind of a low part in the surface and white represents a high part. And then we run them through a special filter that kind of takes that height field and then works out what the normal would be at each point.
So, and then that normal is encoded as an RGB color. So that's why this looks a little unusual. The reason it's mainly blue is because the normals are mainly pointing out of the page, right, which means their Z value is very big, and a big Z value translates to a big blue value. And so it mainly looks blue. And then depending on the direction that normal is pointing, it'll have different amounts of red and green in it.
The other way you can think about it is it's like it was a real surface that's lit with a red, green, and a blue light in different positions. So that's what a normal map looks like. And then you put it all together and that's what the final result looks like.
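A small sketch of that height-field-to-normal-map filter; the function and buffer names are hypothetical, and this just uses simple neighbor differences rather than whatever filter the demo team's tool actually used:

```c
#include <math.h>

/* Convert an 8-bit height field into an RGB normal map: the slope gives
 * the X/Y tilt, the normal is normalized, and each component is packed
 * into a byte with 128 meaning zero. */
void height_to_normal_map(const unsigned char *height, unsigned char *rgb,
                          int w, int h, float bumpScale)
{
    for (int y = 0; y < h; ++y) {
        for (int x = 0; x < w; ++x) {
            /* Neighboring heights (clamped at the edges). */
            float hl = height[y*w + (x > 0   ? x-1 : x)];
            float hr = height[y*w + (x < w-1 ? x+1 : x)];
            float hd = height[(y > 0   ? y-1 : y)*w + x];
            float hu = height[(y < h-1 ? y+1 : y)*w + x];

            /* Tangent-space normal: mostly +Z, tilted by the slope. */
            float nx  = (hl - hr) * bumpScale;
            float ny  = (hd - hu) * bumpScale;
            float nz  = 255.0f;
            float len = sqrtf(nx*nx + ny*ny + nz*nz);

            unsigned char *p = &rgb[(y*w + x) * 3];
            p[0] = (unsigned char)(128.0f + 127.0f * nx / len);
            p[1] = (unsigned char)(128.0f + 127.0f * ny / len);
            p[2] = (unsigned char)(128.0f + 127.0f * nz / len);
        }
    }
}
```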
So we're almost out of time here. Just a little few words about future extensions to this technique. One of the things that we thought about but didn't really have time for were the fur dynamics. It would have been nice to actually make the fur move as the character animated.
You could do this by moving the shell geometry independently from the underlying skin. You could do that either based on a real physical simulation or maybe just make it lag behind the skin a little bit. That's one of the really cool things that Pixar do. Their fur has a full physics simulation run on it so it bounces around.
The other thing that's going to be interesting on future hardware is this idea of ray marching, which is basically similar to the shells and fins technique, but you're actually doing all the blending within the pixel shader. I don't think it's any secret that future hardware is going to be a lot more programmable and a lot more flexible and a lot faster.
You'll actually be able to do all that blending within the pixel shader itself and actually march through a 3D texture to give you that illusion of depth without using the shells. The nice thing about that is that reduces the amount of blending that you're having to do in the frame buffer, which is very bandwidth intensive. And then lastly, the holy grail is just to do with geometry. So just draw curves for each individual strand of hair, as I said. But it's very hard to do the anti-aliasing and the shadowing with that. But we'll get there.
Okay, so just a final summary. Programmable vertex and pixel shaders allow you to control the hardware. And I think it is true in some sense that real time and offline production rendering are converging. What people were doing offline in movies a couple of years ago, we are kind of approaching doing in real time now. And the tools are getting a lot better as well, so it's becoming a lot easier to just author something in Maya and then just display it in real time and animate it. So I think it's a very exciting time for real time 3D graphics.
One of the other interesting things is that this programmable graphics hardware is not only useful for graphics these days, not only games and those kind of things, but also, as we've seen in the Jaguar UI work that's been done at Apple, it's also very useful for doing 2D operations as well now. 2D imaging, filters, video processing, and also user interfaces. In fact, people are even using pixel shaders to do 3D rendering.
And there's simulation work as well, doing things like fluid dynamics in pixel shaders. So there's a lot of possibilities there. And as I said, the next generation hardware is going to be faster, even more programmable. So it's only going to get better. So, yeah, you should start learning this stuff now.
I don't want to make this sound like an Oscars speech, but I'm just going to give a few credits here. Curtis Beeson and Joe Demers did the engine code. Daniel Hornick did all the modeling and texturing. Geoff Bell did the character animation. Ken Crete-Aditz, sound design. I just did some additional code on the shaders.
Mark Daly's my boss, so I have to say thanks to him. And that's just another outtake image where the werewolf's mouth for some reason popped out of his head. And there are some references, should you want to pursue this in your own time. And that's all I have. Thank you.
Thank you very much, Simon. I just want to mention here a quick roadmap. This is a must-attend session: 514, OpenGL Performance and Optimization. And the feedback forum tomorrow afternoon, the last session of the conference, for you to let us know what else you need. Here's a couple of contacts, Simon for NVIDIA and myself, Sergio at Apple.com, for any inquiries you may have.