Clarifications on some PowerVR architecture tech

I hope one of you can answer some of my ‘blind concepts’. I’m a game developer by profession and have been working with OpenGL for the past year. Recently I’ve become very interested in GPU internals, especially the advancements in PowerVR products related to TBDR (Tile Based Deferred Rendering). But I’m having trouble understanding how it works underneath.


Here are the topics I humbly request you to answer...

1) Tile Based Deferred Rendering:
Some articles say TBDR doesn't need any Z-buffer. Is this really true? The GPU must have at least a small buffer to store the depths for comparing with later results, right?

If the implementation really doesn't have a Z-buffer, does the driver simulate the depth buffer (since calls to depth-related functions give correct results, just as on non-TBDR architectures)? Am I missing anything here?

2) Do SGX cards keep the frame buffer memory, color buffer and depth buffer (if any) in MAIN MEMORY, or in memory hard-wired into the GPU?
As far as I know, SGX cards use shared (main) memory. I can't see how this can be BETTER in terms of 'performance' (as main memory is usually 'SLOW').

Would texture memory in MAIN RAM plus a FASTER BUS (between the GPU and main RAM) do the trick?
Finally, what is the situation for MBX architectures with regard to memory?

3) In the fixed pipeline, since a 'glReadPixels' call on the frame buffer is really slow, how about this kind of implementation?
If I do something like "maintaining a CUSTOM frame buffer (pixel array), doing depth sorting, alpha testing and other tests on my own with an ACCELERATED LIBRARY (something like OpenCL or VecLib), and finally loading the result into the GPU frame buffer", will that give any performance gain over using glReadPixels intensively?

Or does asking this just make me look like a noob? :(

In one of your (Thobias) comments, it was mentioned that glReadPixels on a render texture will be a performance gain. Why does this difference arise, given that everything needs a full flush when we request a read?

4) How is triangle clipping more optimal in TBDR compared with traditional rendering (say, immediate mode rendering)?

I still have a lot of confusion, but I would be happy to learn more if you are willing to answer. Eagerly waiting for your reply.

P.S.: Apologies for making the post so long.
Thanking you,


Are you aware of the documents available here:

The “SGX Architecture Guide for Developers” is probably of most interest.

I think it is very unlikely that writing what is essentially your own rasterizer using something like OpenCL or VecLib is going to be more efficient than using SGX through a dedicated graphics API. glReadPixels is only slow in that it can interrupt the parallelism of the GPU and CPU: the GPU must finish processing the target to be accessed (causing the CPU to wait), and the GPU must then wait for the CPU to finish before it can continue. By using render to texture (e.g. FBO) then it can be possible to mitigate this effect as the GPU may be able to continue to process a different render target while the CPU is working.
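
To make the point about parallelism a bit more concrete, here is a toy timing model in Python. It is only a sketch, not a measurement of SGX or of any real driver: the function names, the per-frame costs (10 ms of GPU work, 6 ms of CPU work) and the assumption that a free render target is always available are all made up for illustration.

```python
def serialized_ms(frames, gpu_ms, cpu_ms):
    """Read back from the one target the GPU is rendering to.

    The CPU stalls until the GPU has finished that target, and the GPU
    then sits idle until the CPU is done with it.
    """
    t = 0
    for _ in range(frames):
        t += gpu_ms  # CPU waits for the GPU to finish the target
        t += cpu_ms  # GPU waits for the CPU to finish with the target
    return t


def overlapped_ms(frames, gpu_ms, cpu_ms):
    """Read back from a separate render target (e.g. an FBO).

    Assumption (hypothetical): the GPU always has another target to move
    on to, so it never waits for the CPU. The CPU still has to wait for
    the target it wants to read.
    """
    gpu_free = 0  # time at which the GPU finishes its current target
    cpu_free = 0  # time at which the CPU finishes its current readback
    for _ in range(frames):
        gpu_free += gpu_ms                   # GPU starts the next target right away
        cpu_start = max(cpu_free, gpu_free)  # CPU needs that target completed
        cpu_free = cpu_start + cpu_ms
    return cpu_free


if __name__ == "__main__":
    print(serialized_ms(4, 10, 6))  # 64
    print(overlapped_ms(4, 10, 6))  # 46
```

In this model the read itself still has to wait for the GPU to finish the target being read, so the saving comes only from the GPU not sitting idle while the CPU works.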

Hi Gordon,
Thank you very much for your reply.
“By using render to texture (e.g. FBO) then it can be possible to mitigate this effect as the GPU may be able to continue to process a different render target while the CPU is working.”
Can you please explain this in more depth? What I’m thinking is that even render to texture will need a flush (of the pending GL draw calls) by the GPU, right?

Yes, I have gone through the docs. But the questions above are still not clear to me, as the docs discuss them in a rather abstract way :( .
I would be very thankful if you could clear up my doubts.