Vertex Cache Size for Series 66XT

What are the sizes of vertex caches in PVR 66XT?

Is it the same as in 5MP?

Hi,



Unfortunately, cache sizes are not discussed in any of our public SDK documentation. If you are encountering performance issues that you believe are related to cache sizes, let us know and we can help you investigate.



Thanks,

Joe

I don’t have any performance issues caused by vertex cache misses. But I have vertex cache optimization in my content pipeline, and I want to run it with the proper cache size. This is an optimization that comes for free, and it would be a crime not to use it.

I’ve discussed this with the engineer who implemented PVRGeoPOD’s triangle sorter. Rogue devices do not have a dedicated vertex cache (a dynamic amount of space is allocated from a generic cache). Additionally, when optimizing Parameter Buffer writes/reads, it’s extremely difficult to account for other variables in the graphics pipeline, such as clipping and culling (back face, small triangle, etc.).



Our recommendation is to use a generic triangle sorting algorithm for Rogue graphics cores.

StiX, if you search the web a bit, you’ll find info from PowerVR on SGX (Series 5) that states the number of verts you can fit in the post-transform vertex cache is a function of how much varying data you output per vertex shader execution: more varyings == fewer cached vertices.



This sounds like the same scheme desktop GPUs built on unified shader cores have been using for quite a while, where there’s a fixed amount of space dedicated to the vertex cache, and it’s “divvied up” among the transformed verts based on how much space each needs (total_verts = total_bytes / bytes_per_vert).
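To make that formula concrete, here is a minimal Python sketch. The 1024-byte cache size is a made-up number for illustration only (as noted above, real cache sizes aren't public), and the function name and per-vertex sizes are my own assumptions, not anything from PowerVR docs.

```python
def cached_vertex_capacity(cache_bytes: int, bytes_per_vertex: int) -> int:
    """Whole number of transformed vertices that fit in a fixed-size cache."""
    if bytes_per_vertex <= 0:
        raise ValueError("bytes_per_vertex must be positive")
    return cache_bytes // bytes_per_vertex


# Hypothetical cache size, purely illustrative; not a real PowerVR figure.
ASSUMED_CACHE_BYTES = 1024

# Assumed vertex shader outputs, using 4-byte floats:
#   position only (vec4)                    -> 16 bytes
#   position + normal (vec3) + uv (vec2)    -> 16 + 12 + 8 = 36 bytes
for label, size in [("position only", 16), ("position + normal + uv", 36)]:
    print(label, cached_vertex_capacity(ASSUMED_CACHE_BYTES, size))
```

Under those assumptions, the fatter vertex format fits fewer than half as many vertices in the same space, which is the "more varyings == fewer cached vertices" effect.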



Don’t know for sure, but the same is probably true for Series 6.



In any case, I’d suggest you use a vertex cache optimizer that isn’t hard-wired to a specific vertex cache size, degrading gracefully with smaller cache sizes. For instance, Tom Forsyth’s optimizer is a good one:


“... is a function of how much varying data you output per vertex shader execution: more varyings == fewer cached vertices.”

I don't see the relationship between the number of varyings and the number of cached vertices.
The PS Vita documentation I have clearly states that the vertex cache stores only the indices of the last N vertices, so you should optimize your meshes for that number.

That’s very interesting, StiX. Websearching a bit, sounds like the Vita is a Series 5XT (SGXMP; SGX543). In your doc quote, do they give a fixed number for “N”?



However, in the “POWERVR SGX OpenGL ES 2.0 Application Development Recommendations” (also found through a websearch), it says:


3.4.1. Vertex Processing FIFO
POWERVR SGX has a cache for processed vertices. This cache will hold a few previously transformed vertices so keeping them together will increase the Vertex Shader throughput. The smaller the transformed vertex format the more effective the FIFO will be so try to reduce the amount of data passed from the vertex shader to the fragment shader (varyings).
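
Since the real FIFO depth isn't known, one way to sanity-check a mesh ordering is to simulate a simple FIFO of processed vertices at several assumed sizes and see how the miss rate changes. This is only a rough sketch: the function name is mine, the cache sizes are guesses, and a plain FIFO is an assumption about the hardware, not a documented behaviour.

```python
from collections import deque


def acmr(indices, cache_size):
    """Average cache miss ratio: vertex shader runs per triangle,
    assuming a plain FIFO cache of `cache_size` transformed vertices."""
    if cache_size <= 0:
        raise ValueError("cache_size must be positive")
    if not indices or len(indices) % 3 != 0:
        raise ValueError("indices must be a non-empty triangle list")
    fifo = deque(maxlen=cache_size)  # oldest entry is dropped when full
    misses = 0
    for idx in indices:
        if idx not in fifo:
            misses += 1       # vertex has to be transformed again
            fifo.append(idx)  # a hit does not reorder a FIFO
    return misses / (len(indices) // 3)


# Two triangles sharing the edge (1, 2).
quad = [0, 1, 2, 2, 1, 3]
for size in (1, 4):  # assumed sizes, not real hardware figures
    print(size, acmr(quad, size))
```

Running the same index buffer through a range of sizes shows whether an ordering degrades gracefully as the effective cache gets smaller, for example when the vertex format grows.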