Digital Media • 21:57
This session provides key insights into the use of complex geometries within OpenGL applications. Topics include hierarchical key-framed model storage, retrieval, and animation within an interactive application.
Speaker: Todd Previte
Unlisted on Apple Developer site
Transcript
This transcript was generated using Whisper; it has known transcription errors. We are working on an improved version.
All right. Welcome to session 405, Geometry and Modeling. My name is Todd Previte. I'm the 3D graphics DTS engineer for Apple. What we're going to be going through today is a study of the implementation of complex animated geometries. Any of you who have downloaded any of our sample code from the web are probably familiar with the multicolored square. For most of our OpenGL demonstrations, there's always a spinning square that we use to illustrate any particular concept. Well, today, I'm going to move from that into a complex animated geometry in the form of a model. You see our little marine here on the screen.
I'm going to explain basically what goes into rendering one of these things to the screen. I'll get into coordinate systems first. One of the things that was kind of a challenge for me when I was first getting into OpenGL was that there are so many different coordinate systems that go into any type of 3D rendering. You have your texture coordinates, which are normally described in 2D space as they apply to a 2D texture. These days, though, we also have what are called 3D textures, which are layered textures, so you can actually have three-dimensional texture coordinates.
But the ones we'll be discussing today are two-dimensional. We also have model coordinates, also called object coordinates, which describe an object within the scope of its own space, relative to the centroid of the model. World coordinates are a global reference; they really define how objects relate to one another in the scope of the world you are rendering them in.
And finally, eye coordinates are the result of a Model-View matrix transformation on model coordinates to put them in pseudo-world coordinate space. This slide gives you an example of coordinates. You can see the triangular slice out of the texture in the left-hand image as it applies to the model here; it kind of illustrates what texture coordinates do. These vertices here at the edges all have a texture coordinate that maps directly into that image. I just happened to select that portion of it as it maps onto this model here.
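To put that in a formula, the eye-space position of a vertex is simply its model-space position run through the current Model-View matrix, which itself combines the modeling and viewing transforms (my notation, not from the slides):

    v_eye = M_modelview * v_model = M_view * M_model * v_model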
So what's in a model format? Well, you have vertices, texture coordinates, triangles, and textures. Those are the basic things that you need to store a model, load it back from disk, and render it on the screen. Your vertices are just an array of three floats, or doubles, however you want to represent your coordinates.
In a moment, I'll discuss a particularly interesting technique that was used by id Software to render their models for Quake 2. But that's really all it is. It's just a simple array of three floats that describe a vertex in 3D space. Texture coordinates, as I went through just a moment ago, are an array of two floats, or doubles, whatever you prefer to use. These apply directly to the texture in its own space.
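As a rough sketch of what those two arrays might look like in C (the type names here are my own, not from any particular format):

    /* One position in model space: three floats, or doubles if you prefer. */
    typedef struct {
        float x, y, z;
    } vertex_t;

    /* One texture coordinate: two floats in the texture's own space. */
    typedef struct {
        float s, t;
    } texCoord_t;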
Now, with texture coordinates, their number is not always equal to the number of vertices. That doesn't sound right: how can you have more than one texture coordinate per vertex? I'll show you. As you can see, I've highlighted the vertices on these two triangles here, with the red line down the middle illustrating the shared edge the two triangles have. You can either have texture coordinates that are assigned per vertex, or you can have them assigned on a per-triangle-vertex basis.
If you have them per vertex, and you have a different texture on each triangle, you're going to end up with some pretty interesting results unless you go out of your way to make it work. You're going to find that the texture coordinates from one don't map directly into the image of the other.
So you'll come out with some pretty interesting looking textures if you share texture coordinates amongst vertices. The alternative to that, to make things look right with two different textures, is to have texture coordinates assigned on a per-triangle vertex basis. At that point, each triangle vertex, even though it's shared, has its own set of texture coordinates. So therefore, you can have two different textures on the two different triangles and not screw up the texture coordinates.
Again, we'll go back to this. All your triangles are an array of vertex coordinates and texture coordinates; they're usually indices into the aforementioned arrays of vertices and texture coordinates. That keeps things fairly easy, and for loading purposes, you're better off using indices instead of actual hard-coded vertices. Textures themselves are just the image data, stored in whatever format you happen to prefer.
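As a sketch of that triangle indexing (my own layout; MD2 itself stores these indices as shorts, as we'll see shortly):

    /* A triangle stores indices rather than copies of the data,
       so shared vertices are stored only once. */
    typedef struct {
        short vertexIndices[3];    /* into the vertex array             */
        short textureIndices[3];   /* into the texture coordinate array */
    } triangle_t;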
I do know that id for Quake 2 happened to use the PCX format. These days, a lot of people are going back to using BMPs and GIF files. JPEGs are not used a lot because of the compression schemes that are involved in them. You tend to lose a lot of image quality, so they tend to be avoided.
So as I said, I've been kind of mentioning id here and there, but the case study we're going to be going through today is for the Quake 2 model format, which is .md2. I chose this format for a number of reasons. It's publicly available. There's a ton of information on the web that you can get. It is fairly well documented, although as I learned just recently, it's not quite as well documented as I thought it was. There's also a lot of models that are readily available. You can download them almost anywhere, and there's a ton of them out there.
So what does the header for an MD2 file look like? Well, that's it. Doesn't look like there's a whole lot there. You'll notice that it's actually a structure of 17 ints, which makes it really easy to load in. It comes out to 68 bytes. So all you have to do is put your file pointer at the beginning of the file, load in the first 68 bytes, and you have a complete description of what the rest of the file looks like.
I've highlighted here the elements that are actually relevant to our discussion today. Magic and version, those are, I believe, specific to id. The offsets are kind of interesting, though. They specify the offsets of the various data from the beginning of the file. Well, if you read the file in its entirety into memory, they also become the offsets to that data in memory.
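For reference, the commonly documented layout of that 68-byte header looks like this (field names vary slightly between sources):

    typedef struct {
        int magic;              /* identifies the file as MD2                 */
        int version;            /* format version                             */
        int skinWidth;          /* texture width in pixels                    */
        int skinHeight;         /* texture height in pixels                   */
        int frameSize;          /* size in bytes of one frame                 */
        int numSkins;
        int numVertices;        /* vertices per frame                         */
        int numTexCoords;
        int numTriangles;
        int numGlCommands;
        int numFrames;
        int offsetSkins;        /* byte offsets from the start of the file... */
        int offsetTexCoords;
        int offsetTriangles;
        int offsetFrames;
        int offsetGlCommands;
        int offsetEnd;          /* ...which double as offsets into memory once loaded */
    } md2Header_t;              /* 17 ints, 68 bytes */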
So how is this data actually represented? Well, there are a lot of supporting data structures that go into the MD2 format, and I'll be going through some of those right now. The triangleVertex_t is, as I mentioned on the overview slide, the structure that describes any given triangle vertex.
What you have is three vertex coordinates, each stored in a byte. And you've also got a byte which describes the light normal index, which is not relevant to our discussion today. The triangle data is then two arrays of indices into the vertex and texture coordinate data.
Now, I know I said that vertices are floats or doubles, some kind of floating-point number that you can use, right? What these are is encoded byte coordinates. In order to conserve space, id decided to pack all of its vertex data into bytes. This makes it really easy to go cross-platform when you're switching from big-endian to little-endian, because you don't have to do anything with these.
It's just a byte: no swapping required, and it can be loaded by any machine. Also, having the vertices in such a small space means that they can be very, very tightly packed. You'll notice that each of these structures is actually only four bytes in length. Well, four bytes, 32 bits, the size of a long.
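That packed per-vertex structure looks roughly like this (the field names follow the common MD2 documentation):

    typedef struct {
        unsigned char vertex[3];           /* compressed x, y, z in the range 0..255 */
        unsigned char lightNormalIndex;    /* not used in this discussion            */
    } triangleVertex_t;                    /* 4 bytes total                          */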
The other aspect of it is, why do they encode them in a byte? Because it allows for very easy scaling and translation of any given vertex. So Jeff mentioned earlier in his presentation about how OpenGL is very, very good at scaling and rotating things with just a simple transform matrix. Same concept applies here.
Our texture data is stored as file names. It's pretty basic. You just load in the texture as you would any other file. The preferable method on Mac OS X is to use QTNewGWorldFromPtr, which sets up a non-padded GWorld from which you can direct OpenGL to pull in the texture and apply it to your model.
Some little caveats about this: for any given MD2 file, the skin width and skin height will be the same. You can have as many skins as you like, but they all have to be the same size. Your texture coordinates are a small struct of two shorts, the S and T coordinates. There's really not much more to it than that. Those describe the coordinates of the slice of the texture that you're going to be using for whichever vertex the texture coordinates apply to.
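The texture coordinate structure is correspondingly small, something like:

    typedef struct {
        short s;    /* divide by skinWidth  to get the 0..1 s coordinate for OpenGL */
        short t;    /* divide by skinHeight to get the 0..1 t coordinate for OpenGL */
    } textureCoordinate_t;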
This is relevant to animation, which is your frame data. On a per-frame basis, this is what you get when you're dealing with the MD2 format. You've got a scale and a translate, and those are the two that I want to focus on for a moment. As I mentioned, OpenGL is very good at scaling and translating just using very simple matrix transforms. These are the values that we'll use and apply to the byte vertices to get them to the actual positions that we want them.
The name segment of this is just whatever you happen to want to name the frame for tracking your animations. And then there are the triangleVertex_t vertices. This is kind of interesting, because it's declared as only one vertex. But what it actually is, is a pointer to a whole array of vertices that are allocated on a per-frame basis. So what you end up with here is that each frame will have the exact same number of vertices for any given MD2.
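Put together, one frame looks roughly like this (the 16-character name length is from the common MD2 documentation):

    typedef struct {
        float            scale[3];       /* per-axis scale used to decode the byte vertices   */
        float            translate[3];   /* per-axis translation used to decode them          */
        char             name[16];       /* e.g. "run1", for tracking animations              */
        triangleVertex_t vertices[1];    /* actually numVertices entries, one block per frame */
    } frame_t;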
This is how I chose to set up storage for an MD2 file. Four elements: we have the header, frames, triangles, and texture coordinates. You don't have to set it up this way, and I'd like to point out that this does leave out a significant chunk of the MD2 file, that being the GL commands, which are not exactly relevant, at least to this discussion.
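My in-memory container, then, is little more than the header plus pointers to those arrays; one way to sketch it, reusing the structures above:

    typedef struct {
        md2Header_t          header;
        frame_t             *frames;        /* numFrames entries, each frameSize bytes */
        triangle_t          *triangles;     /* numTriangles entries                    */
        textureCoordinate_t *texCoords;     /* numTexCoords entries                    */
    } md2Model_t;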
So here's a little pseudocode about loading in an MD2. The first thing, obviously, is you want to find and open the file. Then you'll read in the header, as described on the slide here. Once you have your copy of it in memory, you can directly access it to get to any of the data that you need, such as the offsets of the frames, like this. Then you can allocate your memory for your frames here, and read the data in from the file into your data structures in memory.
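Filled in, that pseudocode might look something like the following sketch (error handling and the byte swapping of the header fields are omitted for brevity):

    #include <stdio.h>
    #include <stdlib.h>

    static int loadMD2(const char *path, md2Model_t *model)
    {
        FILE *fp = fopen(path, "rb");
        if (fp == NULL)
            return 0;

        /* The first 68 bytes describe the rest of the file. */
        fread(&model->header, sizeof(md2Header_t), 1, fp);

        /* Each frame is header.frameSize bytes (not sizeof(frame_t)), so the
           frame block is walked with byte arithmetic rather than frames[i]. */
        model->frames = malloc(model->header.numFrames * model->header.frameSize);
        fseek(fp, model->header.offsetFrames, SEEK_SET);
        fread(model->frames, model->header.frameSize, model->header.numFrames, fp);

        model->triangles = malloc(model->header.numTriangles * sizeof(triangle_t));
        fseek(fp, model->header.offsetTriangles, SEEK_SET);
        fread(model->triangles, sizeof(triangle_t), model->header.numTriangles, fp);

        model->texCoords = malloc(model->header.numTexCoords * sizeof(textureCoordinate_t));
        fseek(fp, model->header.offsetTexCoords, SEEK_SET);
        fread(model->texCoords, sizeof(textureCoordinate_t), model->header.numTexCoords, fp);

        fclose(fp);
        return 1;
    }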
So drawing models. Well, there's a really basic loop that you want to use whenever you want to draw your models on screen. After setting everything up, you have your GL context, you have your viewport, you perform your camera transformations. So you're looking at the proper place in space.
Then you're going to push that matrix so you can save it. You'll transform your model to place it where you want it. You'll draw it. You're going to pop that matrix back off, and that will return you to the basic camera transformation, so you're right back where you started. And then you continue the next iteration of the loop, and you can go through it for however many models you happen to have.
This is kind of a graphical representation of what it looks like. After the camera transform, you push the matrix, transform your model, draw it, pop the matrix back off, and come right back around again for however many models you've got. After you perform all of this, you continue with any other rendering that you want to do. This is kind of a pseudocode representation of what a scene rendering routine would look like. The highlighted code shows the loop I just described for you, where the push matrix and pop matrix surround the model drawing routine.
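In OpenGL terms, that loop boils down to something like this sketch (drawModel, the model positions, and the camera distance are placeholders of my own):

    /* Inside the scene rendering routine, after the context and viewport are set up. */
    glMatrixMode(GL_MODELVIEW);
    glLoadIdentity();
    glTranslatef(0.0f, 0.0f, -cameraDistance);       /* the camera transform           */

    for (int i = 0; i < numModels; i++) {
        glPushMatrix();                               /* save the camera transform      */
        glTranslatef(pos[i].x, pos[i].y, pos[i].z);   /* place this model in the world  */
        drawModel(&models[i], frameNum);              /* the per-model drawing routine  */
        glPopMatrix();                                /* back to the camera transform   */
    }

    /* ...continue with any other rendering for the scene... */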
One thing I didn't talk about was animation. This would actually only render one frame of any animation, which happened to be frameNum. If you wanted to render each individual frame, the model itself would probably have to have some kind of rendering routine that would animate through whatever frame sequence you wanted to perform the given animation.
So how are we going to draw that model? Well, this is, again, a pseudocode representation of how you might do it. First, we start with the vertex indices, vi, and the actual vertex we're going to use, vtx. And we start with the triangles. Now, these loops are kind of strange, and while we were writing the code for this, the language that we used to try and describe it became very, very convoluted very quickly, as you'll find out momentarily. So what we have here is: for each triangle in the header. Skip down one: now, for each vertex of each triangle in the header.
But wait, there's more. So, for each vertex index in each triangle. And now we're on to: for each vertex, in each vertex index, in each triangle, perform the following operations. Those would be: you get the vertex itself, you multiply by the scale, you add the translation, and you have your vertex. Calling glVertex down at the bottom, you specify your vertex so OpenGL knows where you want it, and you're good to go.
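Untangled, those nested loops come out to roughly the following sketch, using the structures from earlier (texture coordinates are left out here, just as they are in the pseudocode):

    static void drawModel(const md2Model_t *model, int frameNum)
    {
        /* Frames are frameSize bytes apart, so step to the requested one by hand. */
        const frame_t *frame = (const frame_t *)
            ((const unsigned char *)model->frames + frameNum * model->header.frameSize);

        glBegin(GL_TRIANGLES);
        for (int t = 0; t < model->header.numTriangles; t++) {        /* each triangle        */
            for (int v = 0; v < 3; v++) {                             /* each triangle vertex */
                int vi = model->triangles[t].vertexIndices[v];        /* index into the frame */
                const triangleVertex_t *pv = &frame->vertices[vi];

                /* Decode the byte vertex: multiply by the scale, add the translation. */
                float x = pv->vertex[0] * frame->scale[0] + frame->translate[0];
                float y = pv->vertex[1] * frame->scale[1] + frame->translate[1];
                float z = pv->vertex[2] * frame->scale[2] + frame->translate[2];
                glVertex3f(x, y, z);                                  /* hand it to OpenGL    */
            }
        }
        glEnd();
    }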
So as I mentioned, this rendering loop does omit any sort of animation. You'd have to add quite a bit of code to get the animations actually working here. What you'd want to do in that case is, as you drop down and start rendering the vertices, execute that loop for each frame of each animation that you wanted to display at any given time. Alternatively, you could increment the frame number every time a frame renders through your rendering engine, or whatever application you've got, and continue with your rendering loop that way.
As I mentioned, on a per-vertex coordinate basis, each vertex is an encoded byte. You multiply by a scale, you add the translation, and you end up with a floating point resultant vertex that you can pass to OpenGL. So a little bit about animation here. I'm going to discuss key-framed animation with vertex interpolation between the key frames.
What exactly is key-frame animation? If you're not familiar with it, key-frame animation is a series of snapshots. For instance, if you are at a racetrack and you're watching a car travel around in a circle, and you took various pictures of it as it moved around the track, those would be your key frames.
So if you were to view them in order, you would actually see the quote-unquote "animation" of the car moving around the track. Well, what's wrong with that? It's going to look really choppy. So what do you do? That's where vertex interpolation comes in. Between any two given key frames, you can divide up the amount of time that it takes to render each frame and then interpolate a number of vertices in between the key-frame vertices, so you end up with a much smoother animation.
Each frame in a key-frame animation sequence contains a complete set of vertex data. It has all the texture coordinates, all the vertex coordinates for any model that it applies to. As I mentioned, it is a series of snapshots over a given period of time. Key-frame animation is very good for things like simple rigid body motion or for rendering any sort of fixed animation in a game context.
For other things, such as models that are going to be interacting with your environment and other dynamic sources of energy, you would probably want to move to something more complex, such as skeletal-based animation. Key-frame animation will do the job, but it's probably going to look pretty choppy, and it's going to be hard to modify your key frames on the fly.
This is kind of a graphical representation of what a key-frame animation looks like. Each of these would be considered a snapshot. And as you progress through, our marine here goes from a standing position all the way down to lying down. If you were to play these through one at a time at any amount of speed, it's probably going to look pretty choppy. Again, that's where the vertex interpolation would come in, and you could smooth out the animation to make it look much better. What we're going to do is interpolate the vertices linearly.
As in, there will be a fixed amount of time between any two interpolations. The number of actual interpolated vertices is arbitrary. It is purely based on your frame rate at the time, or you can actually hard-code it to where you can lock it in to a particular rate.
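A minimal sketch of that per-vertex linear blend, assuming two decoded key frames and a blend factor that runs from 0 to 1 between them (the function name and parameters are mine, not from the session):

    /* Decode one coordinate of the same vertex from two key frames and blend.
       blend = 0 gives frame A, blend = 1 gives frame B; values in between are
       the interpolated positions that smooth out the animation. */
    static float lerpCoordinate(const frame_t *frameA, const frame_t *frameB,
                                int vertexIndex, int axis, float blend)
    {
        float a = frameA->vertices[vertexIndex].vertex[axis] * frameA->scale[axis]
                + frameA->translate[axis];
        float b = frameB->vertices[vertexIndex].vertex[axis] * frameB->scale[axis]
                + frameB->translate[axis];
        return a + blend * (b - a);
    }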
So the basic equation is that the number of interpolations is equal to the animation speed from frame to frame.
[Transcript missing]
So for any given vertex, you would have 15 more in the middle of them. That would smooth out your animation sequence. I've kind of gone through how to define a model format and the necessary data that you would want to have, went through loading, storage, and memory representation, and showed some basic algorithms for how to display and animate a model.
If you want a little more information on this kind of thing, you can go to Apple's website and the Apple OpenGL website, as well as OpenGL.org. Two really good books on modeling are Real-Time Rendering and 3D Graphics File Formats, which is where I ended up learning most of my information from over the course of time.
The Principles of 3D Computer Animation is another very good book, and Advanced Animation and Rendering Techniques is very good for just that, animation. As far as the roadmap goes, 408 tomorrow is OpenGL optimization. Jeff discussed that earlier. Advanced rendering will be given by Troy Dawson tomorrow, and our feedback forum is tomorrow at 5:00 PM. My contact information for anybody who wants to get a hold of me.