Imagination PowerVR SDK Blog

glDrawRangeElements


#1

Hi...

Two questions in one thread: does the PowerVR MBX provide an extension for glDrawRangeElements?

If not, is there a reference on how to emulate this functionality in a performant way?

With the best regards,

Andi


#2

There is no extension for OpenGL ES that would introduce glDrawRangeElements. Simply use glDrawElements instead.


#3

OK, the problem is that I have vertex arrays and index arrays coming from a "full OpenGL" application, so I am looking for a way to render these arrays, which were designed for glDrawRangeElements, in OpenGL ES.

Any conclusion?


#4

As I wrote before, simply use glDrawElements. Drop the start and end parameters to glDrawRangeElements, the rest is the same.





This is a perfectly valid implementation of glDrawRangeElements:


Code:
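/* start and end only state the range the index values lie in, so they can be ignored. */
void glDrawRangeElements(GLenum mode, GLuint start, GLuint end, GLsizei count, GLenum type, const GLvoid *indices)
{
    (void)start;
    (void)end;
    glDrawElements(mode, count, type, indices);
}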
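
For readers porting desktop data: start and end do not select a sub-range to draw. They are the caller's promise that every value in the index array lies between start and end, which is why dropping them does not change what is drawn. A minimal sketch of how such bounds could be computed for a 16-bit index array (index_range and IndexRange are hypothetical names, not part of any GL API):

```c
#include <stddef.h>

/* Hypothetical helper type: smallest and largest value in an index array.
   These are the values a desktop OpenGL caller would pass as start and end. */
typedef struct {
    unsigned short min_index;
    unsigned short max_index;
} IndexRange;

/* Scans an unsigned 16-bit index array once and returns its value range.
   For an empty array the result has min_index > max_index. */
static IndexRange index_range(const unsigned short *indices, size_t count)
{
    IndexRange r = { 0xFFFF, 0 };
    size_t i;
    for (i = 0; i < count; ++i) {
        if (indices[i] < r.min_index) r.min_index = indices[i];
        if (indices[i] > r.max_index) r.max_index = indices[i];
    }
    return r;
}
```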

#5

Xmas… Thanks again for the help. I was really confused about the start and the end.



#6

Since I now use glDrawElements for these range elements, rendering is really slow.

I need about 10 ms per draw call, which means around 400 ms for 40 batches and a frame rate of about 2 frames per second.

Is there any way to speed this up? What could be making it that slow?

Best regards
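
As a quick check of the figures in this post: 40 batches at 10 ms each come to 400 ms per frame, which is 2.5 frames per second. A small sketch of that arithmetic, with a second helper for the per-batch budget at a target frame rate (both function names are made up for illustration):

```c
/* Illustrative helpers; the names are hypothetical. */

/* Frames per second when every batch costs ms_per_batch milliseconds. */
static double frames_per_second(int batches, double ms_per_batch)
{
    return 1000.0 / (batches * ms_per_batch);
}

/* Milliseconds each batch may cost if all batches must fit into one
   frame at target_fps frames per second. */
static double ms_budget_per_batch(int batches, double target_fps)
{
    return 1000.0 / (target_fps * batches);
}
```

With 40 batches, a target of 25 frames per second would leave about 1 ms per batch.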


#7

What device are you using? What are the arguments for the draw calls, i.e. primitive type, number of vertices? What is your vertex data layout?


#8

Hi xmas... to your questions:

All tests are running on a Nokia N95 8GB.

The primitive type is GL_TRIANGLES.

The number of vertices varies, because the batches are built flexibly as spatial groups and by texture. It is somewhere between 12 and 2000 vertices per batch.

The vertex data layout is built from my own Vector3 and Vector2 classes, each holding floats in an array. The vertex array itself is a strider, containing a Vector3 for the position followed by a Vector2 holding the texture coordinates. A separate strider is used for the U16 indices.

So, I hope I answered all the questions.

Best regards,
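
A layout like this could be sketched in plain C as follows. This assumes the Vector3 and Vector2 classes are simple float containers without extra members; the struct and constant names are illustrative and not taken from the actual application:

```c
#include <stddef.h>

/* Assumed plain-float equivalents of the poster's classes. */
typedef struct { float x, y, z; } Vector3;
typedef struct { float u, v; } Vector2;

/* One interleaved vertex: position followed by texture coordinates. */
typedef struct {
    Vector3 position;
    Vector2 texcoord;
} Vertex;

/* Stride and byte offsets as they would be handed to the GL pointer calls. */
enum {
    VERTEX_STRIDE   = sizeof(Vertex),
    POSITION_OFFSET = offsetof(Vertex, position),
    TEXCOORD_OFFSET = offsetof(Vertex, texcoord)
};
```

In OpenGL ES 1.x these values would typically be passed as glVertexPointer(3, GL_FLOAT, VERTEX_STRIDE, base + POSITION_OFFSET) and glTexCoordPointer(2, GL_FLOAT, VERTEX_STRIDE, base + TEXCOORD_OFFSET), where base is a byte pointer to the first vertex.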

Andi


#9

That really seems to be quite slow. How do you measure the time for each draw call? Do you have blending or alpha test enabled? What texture format are you using?


#10

The texture format is PVRTC. These batches use no alpha test and no blending.

The time is measured with Symbian system calls and written to a debug file after all drawing is finished.


#11

So, no further ideas?

Maybe ideas on how to find the bottleneck? If you say this sounds quite slow to you (I feel the same), there should be a bottleneck somewhere. What do you think?


#12

Please try reducing the workload in specific areas to find the bottleneck, e.g. disable texturing, reduce the viewport size, reduce the frustum size, use smaller vertex data types, or submit only one triangle per draw call.





How exactly are you measuring the time for each draw call?


#13

Time is measured like this:

Code:

#ifdef DEBUG_PUSHING_BATCH
TTime starttime, endtime;
TTimeIntervalMicroSeconds rendertime;
starttime.HomeTime();
#endif

// Drawcall here

#ifdef DEBUG_PUSHING_BATCH
endtime.HomeTime();
rendertime = endtime.MicroSecondsFrom(starttime);
#endif

I already tried disabling texturing (no measurable difference).

Best regards
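
A single timed call can be noisy, and a GL driver may queue work rather than execute it inside the call, so the number measured around one draw call is not necessarily the rendering cost. A rough sketch of averaging over several runs in portable C follows. Here clock() stands in for the Symbian TTime calls, average_ms and busy_work are hypothetical names, and busy_work is only a placeholder for the real draw call:

```c
#include <time.h>

/* Placeholder workload standing in for a draw call. */
static void busy_work(void)
{
    volatile int sink = 0;
    int i;
    for (i = 0; i < 1000; ++i)
        sink += i;
}

/* Runs work() `repeats` times and returns the average processor time
   per run in milliseconds; returns 0 for a non-positive repeat count. */
static double average_ms(void (*work)(void), int repeats)
{
    clock_t begin, end;
    int i;
    if (repeats <= 0)
        return 0.0;
    begin = clock();
    for (i = 0; i < repeats; ++i)
        work();
    end = clock();
    return 1000.0 * (double)(end - begin) / CLOCKS_PER_SEC / repeats;
}
```

In a real measurement the callback would issue the draw call and then glFinish(), so that any queued work is included in the measured interval.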


#14

Did you try the other suggestions as well?


#15

Reducing the triangle count does (of course) speed up the draw calls. The vertex buffers that the ranges are drawn from are rebuilt flexibly, meaning that parts of them change at runtime. Is it possible that memory gets fragmented and memory access becomes that slow as a result?



#16

Are you re-allocating the vertex array memory every frame? Otherwise, memory fragmentation should not be an issue. Did you try reducing the viewport size and reducing the frustum size?


#17

Sometimes the buffers are reallocated every frame, sometimes not for a long time, depending on what is happening on screen and with the specific objects.

Halving the viewport dimensions brings only a very small improvement, maybe 5 to 10 percent. Reducing the frustum size is the same: any improvement is very small.


#18

This would seem to indicate that your application is vertex limited. Do you have lighting enabled, are you using texture matrices or anything else that may need to be calculated per vertex?





Have you tried using shorts instead of floats (scaling the shorts to the appropriate range using the modelview and texture matrices)?
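
One way the conversion to shorts could look is sketched below. The helper maps the value range of a float array onto 16-bit shorts and returns the scale and offset needed to undo the mapping. Quantization and quantize_to_shorts are hypothetical names, and a real application would probably run this once per coordinate axis:

```c
#include <stddef.h>

/* A stored short s represents the float value s * scale + offset. */
typedef struct {
    float scale;
    float offset;
} Quantization;

/* Packs `count` floats into shorts spanning [-32767, 32767] and returns
   the scale and offset that map the shorts back to the original range. */
static Quantization quantize_to_shorts(const float *in, short *out, size_t count)
{
    Quantization q = { 1.0f, 0.0f };
    float lo, hi;
    size_t i;
    if (count == 0)
        return q;
    lo = hi = in[0];
    for (i = 1; i < count; ++i) {
        if (in[i] < lo) lo = in[i];
        if (in[i] > hi) hi = in[i];
    }
    q.offset = 0.5f * (lo + hi);                        /* centre of the range */
    q.scale = (hi > lo) ? (hi - lo) / 65534.0f : 1.0f;  /* range per short step */
    for (i = 0; i < count; ++i) {
        float t = (in[i] - q.offset) / q.scale;
        out[i] = (short)(t < 0.0f ? t - 0.5f : t + 0.5f);  /* round to nearest */
    }
    return q;
}
```

In OpenGL ES 1.x the array would then be submitted with GL_SHORT as the type, and the mapping undone by calling glTranslatef with the offsets followed by glScalef with the scales on the modelview matrix (or on the texture matrix for texture coordinates).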


#19

Hi,

It looks like the performance drop was caused by disabling alpha writes with the glColorMask function. This seems to be a real performance killer for the hardware. Maybe you can explain in more detail why this happens.

Thx for now!


#20

Disabling writes to any colour channel means the hardware has to treat the object rendered as transparent for hidden surface removal, since the value for the masked channel has to come from the surface behind the current object. This means you get additional overdraw which requires a lot of fragment fillrate.
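
The effect on fillrate can be illustrated with a deliberately simplified model. This is not a description of the actual hardware pipeline: it assumes a stack of full-screen layers, where hidden surface removal leaves one shaded fragment per pixel for opaque geometry, while layers treated as transparent are all shaded. The function name is made up for illustration:

```c
/* Simplified model, not a hardware specification: number of fragments
   shaded for `layers` full-screen layers covering `pixels` pixels. */
static long fragments_shaded(long pixels, int layers, int treated_as_opaque)
{
    if (layers <= 0)
        return 0;
    /* Opaque: only the front-most layer is shaded per pixel.
       Transparent: every layer contributes a shaded fragment. */
    return treated_as_opaque ? pixels : pixels * (long)layers;
}
```

For example, on a 240 × 320 screen with four overlapping layers, this model gives 76,800 shaded fragments when the layers are opaque and 307,200 when they must be treated as transparent. Leaving all channels enabled with glColorMask(GL_TRUE, GL_TRUE, GL_TRUE, GL_TRUE) for opaque batches avoids the second case.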





It is strange that reducing the viewport size did not result in better performance, though.