Texture Uploading with PVRTC2

Are there any performance gotchas to watch out for when subloading PVRTC2 data into textures? Currently we’re seeing this as a bottleneck.

Is the use of PBOs for texture data subloading recommended? Any recommendations on upload strategies that work well with the PowerVR driver?

Does subloading on 4x4 (or 8x4 for 2bpp) boundaries work efficiently?

And, while I’d prefer vendor-neutral methods, are there any more efficient methods to upload data to GLES textures on PowerVR? I see references in the archives to an “IMG texture streaming” extension (GL_IMG_texture_stream2), but can’t find much information on it.


Regarding the subloading question, are you texture streaming there?

Yes, this would be streaming texture data off disk to the GPU at render time, with the goal of minimal impact on total frame time in the render thread.

By the way, when does a user get exempted from this forum feature:

“You have posted 1 times within 30 seconds. A spam block is now in effect on your account. You must wait at least 600 seconds before attempting to post again.”

When working with PBOs, it’s generally recommended to keep a circular buffer of PBOs. This lets you write to one buffer while the graphics driver reads from another, so neither blocks the other during its operation. Also consider multithreading the texture upload part of the process.

As long as the GPU is constantly being fed data, rendering should stay efficient.
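For anyone following along, the circular buffer of PBOs described above could be tracked with something like the sketch below. It only models the slot bookkeeping and makes no GL calls; `PboRing`, `PBO_RING_SIZE` and the `pbo_ring_*` functions are hypothetical names for illustration, not part of any PowerVR or GLES API. The comments mark where the real GLES 3.0 calls would presumably go.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Assumed ring depth; the right value depends on the application. */
#define PBO_RING_SIZE 3

typedef struct {
    bool busy[PBO_RING_SIZE]; /* true while the CPU fills or the GPU may still read the slot */
    size_t next;              /* slot that will be tried on the next acquire */
} PboRing;

void pbo_ring_init(PboRing *r)
{
    assert(r != NULL);
    for (size_t i = 0; i < PBO_RING_SIZE; ++i)
        r->busy[i] = false;
    r->next = 0;
}

/* Returns the slot index to fill, or -1 if the next slot is still busy.
 * In a real uploader the caller would then bind that slot's buffer to
 * GL_PIXEL_UNPACK_BUFFER, map it with glMapBufferRange, copy the PVRTC2
 * blocks in, unmap, issue glCompressedTexSubImage2D, and create a fence
 * with glFenceSync. */
int pbo_ring_acquire(PboRing *r)
{
    assert(r != NULL);
    size_t slot = r->next;
    if (r->busy[slot])
        return -1; /* caller can skip the upload this frame instead of stalling */
    r->busy[slot] = true;
    r->next = (slot + 1) % PBO_RING_SIZE;
    return (int)slot;
}

/* Marks a slot reusable. In a real uploader this would be called once
 * glClientWaitSync reports that the slot's fence has signaled. */
void pbo_ring_release(PboRing *r, int slot)
{
    assert(r != NULL);
    assert(slot >= 0 && slot < PBO_RING_SIZE);
    r->busy[slot] = false;
}
```

Returning -1 rather than waiting is a deliberate choice in this sketch: the render thread never blocks on a buffer the GPU is still reading, and simply retries on a later frame.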

Flood control applies to all users.

Thanks PaulL! Re PBO ring buffer, I understand.

Question: re “multithreading the texture upload part”, do you mean filling mapped PBOs in a background thread? I’m assuming you don’t mean having two threads each with their own GL context pointing at the same GPU/display which compete to talk to the same GPU/driver? Could you confirm?

Also, any recommendations on upload strategies, or confirmation that subloading PVRTC2 on 4x4 (or 8x4 for 2bpp) boundaries should be supported with good performance?

Filling the buffer in a background thread is useful in the way you expect, but having multiple threads, each with its own context, also allows the graphics driver to distribute its workload efficiently. The usual multithreading caveat applies: depending on your application, it isn’t always more efficient.

Our blog post on multithreaded rendering has more detail about it:


Subloading on those boundaries is fine; there are no performance issues there.
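As a rough illustration of the boundaries discussed above, the sketch below checks that a sub-region lands on PVRTC block boundaries (4x4 pixels at 4bpp, 8x4 at 2bpp) and computes how many bytes of compressed data it covers, using the 8-byte PVRTC block size. The function names are hypothetical, and it assumes the region lies fully inside a texture whose dimensions are themselves block multiples.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum {
    PVRTC_BLOCK_BYTES  = 8, /* each PVRTC block is one 64-bit word */
    PVRTC_BLOCK_HEIGHT = 4  /* blocks are 4 pixels tall at both bit rates */
};

/* Block width in pixels: 8 at 2bpp, 4 at 4bpp. */
int pvrtc_block_width(int bpp)
{
    assert(bpp == 2 || bpp == 4);
    return bpp == 2 ? 8 : 4;
}

/* True if the region's origin and size are whole multiples of the block size. */
bool pvrtc_region_is_block_aligned(int x, int y, int w, int h, int bpp)
{
    int bw = pvrtc_block_width(bpp);
    return x >= 0 && y >= 0 && w > 0 && h > 0 &&
           x % bw == 0 && w % bw == 0 &&
           y % PVRTC_BLOCK_HEIGHT == 0 && h % PVRTC_BLOCK_HEIGHT == 0;
}

/* Compressed byte count for a block-aligned region of w x h pixels;
 * this is the value that would be passed as imageSize to
 * glCompressedTexSubImage2D. */
size_t pvrtc_region_bytes(int w, int h, int bpp)
{
    int bw = pvrtc_block_width(bpp);
    assert(w > 0 && h > 0 && w % bw == 0 && h % PVRTC_BLOCK_HEIGHT == 0);
    return (size_t)(w / bw) * (size_t)(h / PVRTC_BLOCK_HEIGHT) * PVRTC_BLOCK_BYTES;
}
```

For example, a 64x64 region works out to 2048 bytes at 4bpp and 1024 bytes at 2bpp, while a 2bpp region starting at x = 4 would be rejected because it starts in the middle of an 8-pixel-wide block.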

Good info — thanks, PaulL.