Recap:
- Part 1 – Overview of GLKit
- Part 2 – Drawcalls, and how OpenGL code is architected
- Part 3 – Vertices, Shaders and Geometry
- Part 4 – (additions to Part 3); preparing for Textures
…but looking back, I’m really unhappy with Part 4. Xcode5 invalidates almost 30% of it, and the remainder wasn’t practical – it was code-cleanup.
So, I’m going to try again, and do it better this time. This replaces my previous Part 4 – call it “4b”.
NB: if you’re reading this on AltDevBlog, the code-formatter is currently broken on the server. Until the ADB server is fixed, I recommend reading this (identical) post over at T-Machine.org, where the code-formatting is much better.
December 2013: I’ve converted the sample code from these articles into a standalone library on GitHub, with the code from the articles as a Demo app. It uses the ‘latest’ version of the code, so the early articles are quite different – but it’s an easy starting point.
Drawing multiple 2D / 3D objects
A natural way to “draw things” is to maintain a list of what you want to draw, and then – when the OS / windowing library / whatever is ready to draw, you iterate over your “things” something like:
void draw( CanvasLayer layer )
{
    foreach( Drawable nextItem in myDrawables )
        layer.draw( nextItem );
}
As explained in Part 2 … OpenGL doesn’t work that way. Instead, for each item, you need to go through the complete setup and tear-down of every configurable parameter that might affect the drawing. Under the hood, most windowing API’s do this too – but they hide it from you.
We’ll create multiple triangles, each with their own unique geometry, and display them all at once.
Multiple objects: A VAO per draw call
A VAO / VertexArrayObject:
VertexArrayObject: stores the metadata for “which VBOs are you using, what kind of data is inside them, how can a ShaderProgram read and interpret that data, etc”
We’ll start with a new class with the (by now: obvious) properties and methods:
GLK2VertexArrayObject.h
[objc]
#import <Foundation/Foundation.h>
#import <GLKit/GLKit.h>
@interface GLK2VertexArrayObject : NSObject
@property(nonatomic, readonly) GLuint glName;
@property(nonatomic,retain) NSMutableArray* VBOs;
@end
[/objc]
…and add this to the Draw call:
GLK2DrawCall.h
[objc]
#import "GLK2VertexArrayObject.h"
…
@property(nonatomic,retain) GLK2VertexArrayObject* VAO;
[/objc]
We upgrade our rendering call to actively switch between the VAO’s on a per-draw-call basis:
ViewController.m
[objc]
-(void) renderSingleDrawCall:(GLK2DrawCall*) drawCall
{
…
/** Choose a ShaderProgram on the GPU for it to execute while doing this draw-call */
…
if( drawCall.VAO != nil )
glBindVertexArrayOES( drawCall.VAO.glName );
//else PROBLEM: unbinding causes us to lose a texture unexpectedly, I don't know why yet…
//glBindVertexArrayOES( 0 /** means "none" */ );
…
glDrawArrays( GL_TRIANGLES, 0, 3 );
}
[/objc]
VBOs revisited; VBO vs. BO
I deliberately avoided going into detail on VAO (Vertex Array Objects) vs. VBO (Vertex Buffer Objects) until now.
Previously, I said:
- A VertexBufferObject:
- …is a plain BufferObject that we’ve filled with raw data for describing Vertices (i.e.: for each Vertex, this buffer has values of one or more ‘attribute’s)
- Each 3D object will need one or more VBO’s.
- When we do a Draw call, before drawing … we’ll have to “select” the set of VBO’s for the object we want to draw.
- A VertexArrayObject:
- …is a GPU-side thing (or “object”) that holds the state for an array-of-vertices
- It records info on how to “interpret” the data we uploaded (in our VBO’s) so that it knows, for each vertex, which bits/bytes/offset in the data correspond to the attribute value (in our Shaders) for that vertex
Note that “VAO” and “VBO” are independent: you can have multiple VAO’s sharing 1 VBO. You can have 1 VAO using multiple VBO’s. And any combination of many-to-many.
But a Draw call only ever uses a single VAO. A single VAO is mapping “a bunch of metadata, plus a big chunk of VRAM on the GPU … to a single draw-call”.
It’s important to understand that a VBO “is” a BO. Everything you can do with a VBO you can also do with any BO. We give it a different name because to us, as programmers, it’s easier to think about that way. There is one caveat: the GPU is allowed to do under-the-hood optimizations based on what you first use a BO for. In practice: you’ll rarely need to re-use a buffer for a different purpose, so don’t worry about it.
With a generic BO (BufferObject), some of the method calls in OpenGL will require a “type” parameter. Whenever you pass-in the type “GL_ARRAY_BUFFER”, you have told OpenGL to use that BO “as a VBO”; it has no special meaning beyond that.
Vertex Buffer Objects: why plural?
A BufferObject is simply a big array stored on the GPU, so that the GPU doesn’t have to keep asking for the data from system-RAM. RAM -> GPU transfer speeds are much slower than GPU-local-RAM (known as VRAM) -> GPU upload speeds.
As soon as you have BufferObjects, your GPU has to start doing memory-management on them. GPU’s are OK at this, but not great – it’s a complex problem and requires a lot of code and theory at the level of “building a new Operating System”.
With poor hardware you can get noticeable speed gains by having “only one” VBO for your entire app and doing your own memory-management on its contents. It’s messy in code / debugging terms (no isolation of data), but sometimes worth it.
On the flip-side … GPU’s have lots of gotchas to do with “replacing” the data inside an existing BO/VBO. OpenGL is hiding the multi-threaded reality from you – but when you write to the GPU’s local VRAM, from CPU, it’s easy to get “blocked” waiting for the render threads to complete. This is a particular problem on PVR, as used in all iOS devices.
For instance, Imagination (PowerVR manufacturer) has a blog post on avoiding massive performance drops when writing to a VBO, by using a couple of VBO’s, and swapping between them on alternate frames. The PVR chip has to wait for the “TA” to go through before it can render, stalling the process.
This affects dynamic data in your app – but also simple stuff like running out of memory: if you only have one VBO, you’re screwed. You can’t “unload a bit of it to make more room” – a VBO is, by definition, all-or-nothing. You have to dump the whole thing, then re-upload a subset of it.
Taken all together … the sweet-spot for OpenGL ES 2 on iOS is somewhere around “slightly more than 1 VBO dedicated to each VAO”.
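To make the “swap between a couple of VBO’s on alternate frames” idea concrete, here is a minimal sketch of just the bookkeeping, in plain C with no OpenGL calls. It is not Imagination’s code: the struct and function names are hypothetical, and the two buffer names are assumed to have been created elsewhere (e.g. with glGenBuffers).

```c
#include <assert.h>

/* Hypothetical holder for two GL buffer names (e.g. from glGenBuffers( 2, ... )). */
typedef struct {
    unsigned int bufferNames[2];
} DoubleBufferedVBO;

/* The buffer the CPU should upload into on this frame: alternates 0,1,0,1,... */
unsigned int vboToWriteForFrame(const DoubleBufferedVBO *vbos, unsigned long frameNumber)
{
    assert(vbos != 0);
    return vbos->bufferNames[frameNumber % 2];
}

/* The buffer written on the previous frame, which the GPU may still be reading. */
unsigned int vboInFlightForFrame(const DoubleBufferedVBO *vbos, unsigned long frameNumber)
{
    assert(vbos != 0);
    return vbos->bufferNames[(frameNumber + 1) % 2];
}
```

Each frame you would upload into the buffer returned by vboToWriteForFrame, and leave the other one alone so that the CPU is never writing to memory the renderer is still busy with.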
Refactoring our old code into a new “VBO class”
We’re going to name this “BufferObject” instead of “VertexBufferObject”, since the “vertex” part is merely a property that could be set or unset on any instance:
GLK2BufferObject.h
[objc]
…
@property(nonatomic, readonly) GLuint glName;
@property(nonatomic) GLenum glBufferType;
…
[/objc]
We have our standard “glName” (everything has one), and we have a glBufferType, which is set to GL_ARRAY_BUFFER whenever we want the BO to become a VBO.
To refactor, start with a recap on our previous code, which I previously glossed-over:
(from previous blog post)
glGenBuffers( 1, &VBOName );
glBindBuffer(GL_ARRAY_BUFFER, VBOName );
glBufferData(GL_ARRAY_BUFFER, 3 * sizeof( GLKVector3 ), cpuBuffer, GL_DYNAMIC_DRAW);
The first two lines create a BO/VBO, and store its name. From now on, we’ll automatically set the “GL_ARRAY_BUFFER” argument using our self.glBufferType. Looking at that last line, the second-to-last argument is obviously “the array of data we created on the CPU, and want to upload to the GPU”.
… but what’s the second argument? A hardcoded “3 * (something)”?
(Ouch – very bad practice, hardcoding a digit with no explanation. Bad me :(.)
(glBufferData’s 2nd argument): The total amount of RAM the GPU needs to allocate … to store this array you’re about to upload
Three definitions of “size”
In our case, we were uploading 3 vertices (one for each corner of a triangle), and each vertex was defined using GLKVector3. The C operator “sizeof” measures “how many bytes does a particular type use-up when in memory?”.
But we’re not done yet… when we later told OpenGL the format of the data inside the VBO, we used the line:
(from Part 3)
glVertexAttribPointer( attribute.glLocation, 3, GL_FLOAT, GL_FALSE, 0, 0);
The 2nd argument there is also called “size” – but it’s a different number.
And, finally, when we issue the Draw call, we use the number 3 again, for a 3rd kind of ‘size’:
(from Part 3)
glDrawArrays( GL_TRIANGLES, 0, 3); // this 3 is NOT THE SAME AS PREVIOUS 3 !
- glBufferData: measures size in “total number of bytes needed to store the whole array” (i.e. number of vertices × bytes per vertex)
- glVertexAttribPointer: measures size in “number of floats required to store one Attribute-value”
- glDrawArrays: measures size in “number of vertices to draw, out of the ones in the VBO” (you can draw fewer than all of them)
The final one – glDrawArrays (how many vertices to “draw”) – we’ll store in the GLK2DrawCall class itself, but the other two need to be associated with the VBO, so that we can make sure we use the right kind of “size” at each moment.
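As a sanity check on the three kinds of “size”, here is a small sketch in plain C that computes all three for our triangle. It assumes a Vec3 struct of three floats as a stand-in for GLKVector3; the struct and function names are made up for this illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in for GLKVector3: three floats (x, y, z). */
typedef struct { float x, y, z; } Vec3;

/* The three different "sizes" that OpenGL asks for. */
typedef struct {
    size_t bufferBytes;        /* glBufferData: total bytes for the whole array */
    int    componentsPerValue; /* glVertexAttribPointer: floats per attribute value */
    int    verticesToDraw;     /* glDrawArrays: how many vertices to draw */
} TriangleSizes;

TriangleSizes sizesForVertexCount(int numVertices)
{
    TriangleSizes sizes;
    assert(numVertices >= 0);
    sizes.bufferBytes        = (size_t)numVertices * sizeof(Vec3);
    sizes.componentsPerValue = (int)(sizeof(Vec3) / sizeof(float));
    sizes.verticesToDraw     = numVertices;
    return sizes;
}
```

For 3 vertices this gives 3 × sizeof(Vec3) bytes for the buffer (36 with 4-byte floats), 3 floats per value, and 3 vertices to draw: the same digit twice, for two unrelated reasons.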
Multiple Attributes per VBO / Interleaved Vertex data
As part of configuring your Draw call, you use glVertexAttribPointer to tell OpenGL:
Use the data in BufferObject (X), interpreted according to rule (Y), to provide a value of this attribute for EACH vertex in the object
Earlier, I stated that you can put “all” your data for a draw-call into a single VBO. So far, we filled a single VBO with values for a single Attribute. There are many cases where that’s the right approach (one VBO contains data for one attribute) – but your starting point for a new 3D object is to cram all data, for all its Attributes, into one VBO.
(from Apple’s Techniques for Working with Vertex Data)
How?
It’s back to that glVertexAttribPointer method again:
glVertexAttribPointer( attribute.glLocation, 3, GL_FLOAT, GL_FALSE, 0 WAT?, 0 WAT (x2)? );
This method is IMHO one of the worst-designed ones in the OpenGL API. When newbie OpenGL programmers screw-up and can’t work out what’s gone wrong, it’s usually a misunderstanding of the arguments of this method. This is often made worse because tutorials put “0” and “GL_FALSE” into the arguments, and don’t explain why.
One by one, using OpenGL’s glVertexAttribPointer docs, the arguments are:
- “index of the generic vertex attribute to be modified.” — i.e. the GLK2Attribute.glLocation we fetched from the ShaderProgram after linking
- “number of components per generic vertex attribute. Must be 1, 2, 3, 4” – i.e. how many values your buffer holds for each vertex. If your shader source code had this attribute as a “vec4”, this will normally be “4”. If it had a “vec2”, it would be “2”. For a simple float: “1”.
- “the data type of each component in the array” — Apple’s GLKit uses floats for everything, so unless you start optimizing data-formats, this will always be GL_FLOAT
- “specifies whether fixed-point data values should be normalized (GL_TRUE) or converted directly as fixed-point values (GL_FALSE)” – this only matters if you send integer data (e.g. colours stored as bytes): GL_TRUE rescales those values into the range (0…1) (or (-1…1) for signed types), GL_FALSE uses the data you gave it as-is. With GL_FLOAT data it has no effect. Hence: always GL_FALSE.
- “the byte offset between consecutive generic vertex attributes” — Aha. Interesting.
- “offset of the first component of the first generic vertex attribute in the array” — Hmm. Interesting.
If you only have one “vertex attribute” in the array … then the “offset between” them will be the size of the attribute in bytes (i.e. read one attribute, then move ahead “the size of one attribute”, and read the next). But OpenGL allows a pointless optimization here which only confuses people: if you provide “0”, OpenGL assumes the values are tightly packed and works out the step itself, so it magically works in the special case of “only one” attribute.
…and with only one Attribute: your “offset” will be “0” – i.e. “start at the start”.
It gets interesting with more than one. If you have e.g. two vertex attributes in your array of data:
- “offset between” — each one has to read ahead “the size of both attributes”: i.e. add together their TOTAL size in bytes
- “offset of the first component” — the first Attribute starts at the start – 0. The second Attribute will have to skip ahead a little to find its first value. i.e. add together “size of one each of the Attributes that were before this one in the array”.
e.g. if your attributes were a vec4 (4 floats, of 4 bytes each) and a vec2 (2 floats, of 4 bytes each):
- Offset between: (4×4 + 2×4) = 24
- Offset of first:
- …for first Attribute: 0
- …for second Attribute: (4×4) = 16
- …(a third Attribute would be: 16 + (2×4) = 24)
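The worked example above can be checked by letting the C compiler do the arithmetic. This is a minimal sketch, assuming an interleaved layout of one vec4 followed by one vec2; the struct, field and function names are hypothetical, not part of the GLK2 classes.

```c
#include <assert.h>
#include <stddef.h>

/* One vertex as it would sit in an interleaved VBO: a vec4 followed by a vec2. */
typedef struct {
    float position[4]; /* vec4: 4 floats */
    float texCoord[2]; /* vec2: 2 floats */
} InterleavedVertex;

/* "Offset between": how far to step to get from one vertex to the next. */
size_t strideInBytes(void)
{
    return sizeof(InterleavedVertex);
}

/* "Offset of first": where attribute number `index` starts inside one vertex. */
size_t offsetOfAttribute(int index)
{
    assert(index == 0 || index == 1);
    return (index == 0) ? offsetof(InterleavedVertex, position)
                        : offsetof(InterleavedVertex, texCoord);
}
```

With 4-byte floats, strideInBytes() is 24, and the two offsets are 0 and 16, matching the numbers above. Those are the values that would go into the last two arguments of glVertexAttribPointer (with the offset cast to a pointer).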
The easy way to encapsulate all this info: A “buffer format”
GLK2BufferFormat.h:
[objc]
…
@property(nonatomic) int numberOfSubTypes;
…
[/objc]
We store the “bytes per item” and the “floats per item” into a pair of arrays, and access them by index:
GLK2BufferFormat.m:
[objc]
@interface GLK2BufferFormat()
@property(nonatomic,retain) NSMutableArray* numFloatsPerItem, *bytesPerItem;
@end
…
-(GLuint)sizePerItemInFloatsForSubTypeIndex:(int)index
{
/** Apple currently defines GLuint as “unsigned int” */
return [((NSNumber*)[self.numFloatsPerItem objectAtIndex:index]) unsignedIntValue];
}
-(GLsizeiptr)bytesPerItemForSubTypeIndex:(int)index
{
/** Apple currently defines GLsizeiptr as “long” */
return [((NSNumber*)[self.bytesPerItem objectAtIndex:index]) longValue];
}
[/objc]
Note how you can auto-convert data into those arrays using Apple’s new “@( … )” Objective-C autoboxing syntax – but when you extract them, you have to explicitly cast them to correct types. NSNumber does the magic for us, in both cases.
Each BufferObject will now need to have a “current Buffer Format”, and each time it changes we’ll add up the sizes of all the items and cache that:
GLK2BufferObject.h
[objc]
@property(nonatomic,retain) GLK2BufferFormat* currentFormat;
@property(nonatomic,readonly) GLsizeiptr totalBytesPerItem;
[/objc]
GLK2BufferObject.m
[objc]
…
-(void)setCurrentFormat:(GLK2BufferFormat *)newValue
{
[_currentFormat release];
_currentFormat = newValue;
[_currentFormat retain];
self.totalBytesPerItem = 0;
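for( int i=0; i<self.currentFormat.numberOfSubTypes; i++ )
{
GLsizeiptr bytesPerItem = [self.currentFormat bytesPerItemForSubTypeIndex:i];
NSAssert( bytesPerItem > 0 , @"Invalid GLK2BufferFormat");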
self.totalBytesPerItem += bytesPerItem;
}
}
[/objc]
We can add the last bit of VBO/BO code into our buffer-object, using the buffer format etc:
GLK2BufferObject.m
[objc]
…
-(void) upload:(void *) dataArray numItems:(int) count usageHint:(GLenum) usage
{
glBindBuffer( self.glBufferType, self.glName );
glBufferData( self.glBufferType, count * self.totalBytesPerItem, dataArray, usage);
}
[/objc]
VBO is now done, yay! But we still have to finish VAO – that glVertexAttribPointer call needs cleaning up.
Multiple Attributes per VBO: configuring the VertexArrayObject
Taking all the above, and putting it together, we get a single method on the VAO that allows us to:
- Provide:
- a set of Attributes
- an array of data (e.g. filled with GLKVector3’s)
- a buffer-format (with one entry per Attribute, saying how many bytes it is, and how many floats)
- the number of vertices in the array
- …and have the VAO do for us:
- Create a VBO on the GPU
- Upload the data to the new VBO
- Store inside itself (the VAO) the mapping from “this VBO” to “what the shader expects”
- Handle data for “one attribute per VBO” equally well as “multiple attributes per VBO”
There’s quite a bit of code here, but it’s mostly housekeeping and there for clarity – the concepts are nothing new:
GLK2VertexArrayObject.m
[objc]
-(GLK2BufferObject*) addVBOForAttributes:(NSArray*) targetAttributes filledWithData:(void*) data inFormat:(GLK2BufferFormat*) bFormat numVertices:(int) numDataItems updateFrequency:(GLK2BufferObjectFrequency) freq
{
/** Create a VBO on the GPU, to store data */
GLK2BufferObject* newVBO = [GLK2BufferObject vertexBufferObject];
[self.VBOs addObject:newVBO]; // so we can auto-release it when this class deallocs
/** Send the vertex data to the new VBO */
[newVBO upload:data numItems:numDataItems usageHint:[newVBO getUsageEnumValueFromFrequency:freq nature:GLK2BufferObjectNatureDraw] withNewFormat:bFormat];
/** Configure the VAO (state) */
glBindVertexArrayOES( self.glName );
GLsizeiptr bytesForPreviousItems = 0;
int i = -1;
for( GLK2Attribute* targetAttribute in targetAttributes )
{
i++;
GLuint numFloatsForItem = [newVBO.contentsFormat sizePerItemInFloatsForSubTypeIndex:i];
GLsizeiptr bytesPerItem = [newVBO.contentsFormat bytesPerItemForSubTypeIndex:i];
glEnableVertexAttribArray( targetAttribute.glLocation );
glVertexAttribPointer( targetAttribute.glLocation, numFloatsForItem, GL_FLOAT, GL_FALSE, newVBO.totalBytesPerItem, (const GLvoid*) bytesForPreviousItems); // cast needed because GL API is overloaded too much in C
bytesForPreviousItems += bytesPerItem;
}
glBindVertexArrayOES(0); //unbind the vertex array, as a precaution against accidental changes by other classes
return newVBO;
}
[/objc]
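Stripped of the OpenGL calls, the loop in that method is just a running total. Here is a sketch of the same arithmetic in plain C, assuming a simplified BufferFormat struct as a stand-in for GLK2BufferFormat (the struct, the MAX_SUBTYPES limit and the function names are invented for this illustration):

```c
#include <assert.h>
#include <stddef.h>

#define MAX_SUBTYPES 8

/* Simplified stand-in for GLK2BufferFormat: bytes used by each attribute, in order. */
typedef struct {
    int    numberOfSubTypes;
    size_t bytesPerItem[MAX_SUBTYPES];
} BufferFormat;

/* The stride: total bytes for one vertex, across all attributes. */
size_t totalBytesPerItem(const BufferFormat *format)
{
    size_t total = 0;
    assert(format->numberOfSubTypes <= MAX_SUBTYPES);
    for (int i = 0; i < format->numberOfSubTypes; i++)
        total += format->bytesPerItem[i];
    return total;
}

/* The offset: bytes used by every attribute that comes before `index`. */
size_t offsetForSubType(const BufferFormat *format, int index)
{
    size_t bytesForPreviousItems = 0;
    assert(index >= 0 && index < format->numberOfSubTypes);
    for (int i = 0; i < index; i++)
        bytesForPreviousItems += format->bytesPerItem[i];
    return bytesForPreviousItems;
}
```

For a format of { 16, 8, 12 } bytes, the stride is 36 and the three offsets are 0, 16 and 24. Every attribute in the VBO shares the same stride; only the offset changes from one glVertexAttribPointer call to the next.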
The only special item here is “usage”. Previously, I used the value “GL_DYNAMIC_DRAW”, which doesn’t do anything specific, but warns OpenGL that we might choose to re-upload the contents of this buffer at some point in the future. More correctly, you have a bunch of different options for this “hint” – if you look at the full source on GitHub, you’ll see a convenience method and two typedef’s that handle this for you, and explain the different options.
(but “uploading to a live BufferObject” is a complex topic of its own, and I’m not going into any further detail right now)
Source for: GLK2BufferFormat.h and GLK2BufferFormat.m
- GLK2BufferFormat.h – link to GitHub because it would make the blog post too long to insert it here
- GLK2BufferFormat.m – link to GitHub because it would make the blog post too long to insert it here
Source for: GLK2BufferObject.h and GLK2BufferObject.m
- GLK2BufferObject.h – link to GitHub because it would make the blog post too long to insert it here
- GLK2BufferObject.m – link to GitHub because it would make the blog post too long to insert it here
Source for: GLK2VertexArrayObject.h and GLK2VertexArrayObject.m
- GLK2VertexArrayObject.h – link to GitHub because it would make the blog post too long to insert it here
- GLK2VertexArrayObject.m – link to GitHub because it would make the blog post too long to insert it here
Gotcha: The magic of OpenGL shader type-conversion
This is also a great time to point-out some sleight-of-hand I did last time.
In our source-code for the Shader, I declared our attribute as:
attribute vec4 position;
…and when I declared the data on CPU that we uploaded, to fill-out that attribute, I did:
GLKVector3 cpuBuffer[] =
{
GLKVector3Make(-1,-1, z)
…
Anyone with sharp eyes will notice that I uploaded “vector3” (data in the form: x,y,z) to an attribute of type “vector4” (data in the form: x,y,z,w). And nothing went wrong. Huh?
The secret here is twofold:
- OpenGL is forgiving and smart when it feeds buffer data into a shader attribute; if you give it 3 components where the shader declares a vec4, it will up-convert automatically, filling in the missing w with 1 (a missing y or z would be filled in with 0)
- We told all of OpenGL “outside” the shader-program: this buffer contains Vector3’s! Each one has 3 floats! Note: That’s THREE! Not FOUR!
…otherwise, I’d have had to define our triangle using 4 co-ordinates – and what the heck is the correct value of w anyway? Better not to even go there (for now). All of this “just works” thanks to the code we’ve written above, in this post. We explicitly tell OpenGL how to interpret the contents of a BufferObject even though the data may not be in the format the shader is expecting – and then OpenGL handles the rest for us automagically.
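If you want to picture what that up-conversion does, here is a CPU-side emulation in plain C. This is only a sketch of the rule (missing components come from the defaults 0, 0, 0, 1), not code that OpenGL exposes; the function name is hypothetical.

```c
#include <assert.h>

/* Emulates how a 1..4 component attribute value is widened to a vec4:
 * components that were not supplied are taken from (0, 0, 0, 1). */
void expandAttributeToVec4(const float *source, int numComponents, float out[4])
{
    static const float defaults[4] = { 0.0f, 0.0f, 0.0f, 1.0f };
    assert(numComponents >= 1 && numComponents <= 4);
    for (int i = 0; i < 4; i++)
        out[i] = (i < numComponents) ? source[i] : defaults[i];
}
```

Feeding in one corner of our triangle, (-1, -1, -0.5), produces (-1, -1, -0.5, 1): the shader sees a complete vec4 even though we only uploaded three floats per vertex.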
Those “multiple” triangles…
First, we’ll re-write ViewController to use all the new classes above:
ViewController.m
[objc]
-(NSMutableArray*) createAllDrawCalls
{
…
draw1Triangle.VAO = [[GLK2VertexArrayObject new] autorelease];
[draw1Triangle.VAO addVBOForAttribute:attribute filledWithData:cpuBuffer bytesPerArrayElement:sizeof(GLKVector3) arrayLength: draw1Triangle.numVerticesToDraw];
/** … Finally: add the draw Call 2 into the list of draw-calls we’re rendering as a “frame” on-screen */
[result addObject: draw1Triangle];
…
}
[/objc]
Hang on – how come so little has changed?
This is the purpose of VAO’s: they encapsulate (at the OpenGL / GPU level) all the data surrounding a bunch of VBO’s. That means “the raw values of the Attributes”, but also “the metadata about the VBO’s”. By modifying and re-writing and refactoring our VBO/BO/BufferFormat code … we have no effect on the rest of the app; only the VAO code needs to change.
To add some triangles, we’ll simply “add more draw calls” – and let our existing rendering code automatically handle everything else. Replace the code for creating the “draw1Triangle” object with this:
[objc]
GLK2ShaderProgram* sharedProgramForBlueTriangles = [GLK2ShaderProgram shaderProgramFromVertexFilename:@"VertexPositionUnprojected" fragmentFilename:@"FragmentColourOnly"];
for( int i=0; i<4; i++ )
{
GLK2DrawCall* draw1Triangle = [[GLK2DrawCall new] autorelease];
/** … Upload a program */
draw1Triangle.shaderProgram = sharedProgramForBlueTriangles;
glUseProgram( draw1Triangle.shaderProgram.glName );
GLK2Attribute* attribute = [draw1Triangle.shaderProgram attributeNamed:@"position"]; // will fail if you haven't called glUseProgram yet
/** … Make some geometry */
GLfloat z = -0.5; // must be more than -1 * zNear, and ABS() less than zFar
draw1Triangle.numVerticesToDraw = 3;
GLKVector3 cpuBuffer[3] =
{
GLKVector3Make(-1 + i%2, -1 + i/2, z),
GLKVector3Make(-0.5 + i%2, 0 + i/2, z),
GLKVector3Make( 0 + i%2, -1 + i/2, z)
};
/** … create a VAO to hold a VBO, and upload the geometry into that new VBO
*/
draw1Triangle.VAO = [[GLK2VertexArrayObject new] autorelease];
[draw1Triangle.VAO addVBOForAttribute:attribute filledWithData:cpuBuffer bytesPerArrayElement:sizeof(GLKVector3) arrayLength: draw1Triangle.numVerticesToDraw];
/** … Finally: add into the list of draw-calls we're rendering as a "frame" on-screen */
[result addObject: draw1Triangle];
}
[/objc]
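The “i%2” and “i/2” in that geometry are doing the layout: because i is an int, the division rounds down, so the four triangles land in a 2×2 grid. Here is that arithmetic on its own, as a plain C sketch (the Offset2D struct and the function name are invented for the illustration):

```c
#include <assert.h>

/* How far triangle number `i` is shifted from the first triangle. */
typedef struct { float x, y; } Offset2D;

Offset2D offsetForTriangle(int i)
{
    Offset2D offset;
    assert(i >= 0);
    offset.x = (float)(i % 2); /* column: 0, 1, 0, 1, ... */
    offset.y = (float)(i / 2); /* row: integer division gives 0, 0, 1, 1, ... */
    return offset;
}
```

So the offsets for i = 0, 1, 2, 3 are (0,0), (1,0), (0,1) and (1,1). If you changed i to a float, the division would stop rounding down and the rows would come out at half-steps instead.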
End of part 4 (b)
Next time – I promise – will be all about Textures and Texture Mapping. No … really!