GL_POINTS: is the tile accelerator the bottleneck?

I am new to developing on PowerVR devices and tried PVRTune on my app yesterday. It reported that the TA (tile accelerator) unit was the bottleneck at over 90% busy, with no other unit above 50% busy and the shader units well under 50%. The device is an SGX540.

Can anyone suggest why this unit is so heavily loaded? Here is a description of my app:


* Draws a large mesh of vertices using glDrawArrays with GL_POINTS and gl_PointSize = 1.0, rendering to a framebuffer that is used for render-to-texture.
* No depth test, depth buffer, or similar.
* GL_POINT_SMOOTH is not used.
* glViewport is used to set the clipping region.
* Moderately complex vertex shader, simple fragment shader.





Thanks!

Any unit (in this case the TA) will naturally become the bottleneck if the load on all other units is very low.

True. :)





I was hoping there might be something wrong in how I set things up in my app. Also, PVRTune said that my app was processing about 2M vertices per second, and I think the GPU is supposed to handle 20M per second, so something seems off. It would be great to hear ideas for making the app more TA-friendly, perhaps through mesh ordering or vertex size/alignment.
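
To put the two numbers above side by side, here is a rough sketch of the arithmetic in C. The 2M figure is the PVRTune reading and the 20M figure is only what I recall as the quoted peak, so treat both as assumptions. The function names are made up for illustration, and any frame rate passed in is a hypothetical value rather than a measurement.

```c
#include <assert.h>

/* Fraction of a quoted peak vertex rate that a measured rate represents.
   Both rates are in vertices per second. */
static double vertex_rate_fraction(double measured_per_sec, double peak_per_sec)
{
    assert(peak_per_sec > 0.0);
    return measured_per_sec / peak_per_sec;
}

/* Average number of vertices submitted per frame at a given frame rate.
   The frame rate is supplied by the caller; it is not taken from PVRTune. */
static double vertices_per_frame(double measured_per_sec, double frames_per_sec)
{
    assert(frames_per_sec > 0.0);
    return measured_per_sec / frames_per_sec;
}
```

With 2.0e6 measured and 20.0e6 as the assumed peak, `vertex_rate_fraction` gives 0.1, so the app would be at roughly a tenth of the quoted rate while the TA is already over 90% busy. At a hypothetical 30 frames per second, `vertices_per_frame(2.0e6, 30.0)` works out to about 66,667 points per frame.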