Hello,
I have a few questions about the PowerVR Shader Editor shader cycle counts and emulated cycle counts.
Assuming:
SGX 543 Single Core at 200 MHz
30 FPS - Frame time = 33 ms
Total SIMDS/ALU - 4
Total MADS/ALU - 4
Total MADS/Core - 16
Number of Vertices - 128 x 64 = 8192 vertices
Frame Buffer - 1024 x 768 = 786432 pixels
Vertex Shader Cycle Count - 100 (emulated)
Pixel Shader Cycle Count - 20 (emulated)
The cycle counts generated are for a single ALU (according to the documentation).
Please tell me if my calculations below are correct
(assume the vertices cover the entire screen, for simplicity):
Total Vertex Cycles - 128 x 64 x 100 = 0.819 MCycles
Total Pixel Cycles - 1024 x 768 x 20 = 15.7 MCycles
So total rendering cost = ~16.5 MCycles
Does this mean that for an SGX 543 it is 16.5/4 = 4.13 MCycles?
Also, the document does not mention the cost of each cycle in milliseconds.
Is the cost of one cycle = 1/200 MHz = 0.005 µs (5 ns)?
If so, then the total rendering time is 4.13M x 0.005 µs = ~20 ms
For an iPad 2 (SGX 543MP2)
this translates to 20/2 = 10 ms,
or 30 percent of the frame time.
Any thoughts on this would be very helpful.
Thanks,
Ganaboy
Hi Ganaboy,
Sorry for the delayed response. Your question spawned a little discussion here. I can confirm that the emulated output of the compilers within PVRShaderEditor is correct. Sadly, some of your maths isn’t quite right, and your assumption regarding rendering time isn’t quite accurate either.
To explain: yes, you are correct that the total rendering cost would be ~16 MCycles, and, given that the SGX543 has 4 pipes, you can quarter that for an approximate cost per frame of ~4 MCycles. We don’t give the cycle cost in milliseconds because it isn’t something we have control over; the clock speed of the chip will vary from customer to customer and from SoC to SoC. If the 543 was running at 200 MHz you would be roughly correct at 20 ms, or 10 ms on an MP2…roughly.
This doesn’t, however, cost you a third of your frame time in drawing. The GPU works in parallel with the CPU, and most (nearly all) of the driver calls are non-blocking. There is a small overhead in time to submit your calls to the GPU, but the GPU then processes them while the CPU is doing other work. So on the CPU it only costs as much time as it takes to submit the driver calls, not as much time as it takes for the GPU to render.
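For anyone wanting to reproduce the arithmetic, here is a minimal Python sketch of the estimate discussed above. The function name and parameters are made up for illustration and are not part of any PowerVR tool. It assumes ideal conditions: the work divides evenly across pipes and cores, each pixel is shaded exactly once (no overdraw), and nothing stalls on memory or texture fetches, so treat the result as a rough lower bound rather than a measurement.

```python
def estimate_gpu_frame_ms(vertices, vs_cycles, pixels, ps_cycles,
                          pipes, cores, clock_hz):
    """Idealised GPU time per frame, in milliseconds (hypothetical helper)."""
    # Total shader cycles if everything ran on a single ALU pipe.
    total_cycles = vertices * vs_cycles + pixels * ps_cycles
    # Assumption: the work splits perfectly across all pipes and cores.
    cycles_per_pipe = total_cycles / (pipes * cores)
    # One cycle lasts 1 / clock_hz seconds; convert seconds to milliseconds.
    return cycles_per_pipe / clock_hz * 1000.0


# Numbers from this thread: 128 x 64 vertices at 100 cycles each,
# 1024 x 768 pixels at 20 cycles each, 4 pipes, 200 MHz.
single_core_ms = estimate_gpu_frame_ms(128 * 64, 100, 1024 * 768, 20,
                                       pipes=4, cores=1, clock_hz=200e6)
dual_core_ms = estimate_gpu_frame_ms(128 * 64, 100, 1024 * 768, 20,
                                     pipes=4, cores=2, clock_hz=200e6)

print(f"{single_core_ms:.1f} ms, {dual_core_ms:.1f} ms")  # 20.7 ms, 10.3 ms
```

Note that one cycle at 200 MHz is 5 ns (0.005 µs), which is how roughly 4.1 MCycles ends up at about 20 ms.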
I would recommend you take a look at our Hardware Architecture Overview on our documentation page for more information on how the hardware works.
Hope that helps.
Hey, thanks for the answer…
I had done the math taking into account that the GPU and CPU work in parallel.
I implemented double buffering to avoid those nasty stalls.
I mean, whatever the case, the cost is 10 ms of GPU time.
A GPU can only do 33 ms worth of work every frame (targeting 30 fps).
But the answer has been very helpful.
I am doing almost all my development on a Windows machine (using the OpenGL ES 2.0 emulator and a GUI for updating the scene). I wanted a rough estimate of how far I can push the GPU. I am thinking of adding a small statistics panel to the editor GUI for these calculations.
I am using the same code base for the actual device and the GUI editor,
since I don't want to test every small update to the scene on the iPad 2 (time consuming).
Thanks and regards,
Ganaboy
Hi again,
Yeah, the cost would be 10 ms. As I say, that assumes a 200 MHz clock speed. Most chips in the wild are, however, clocked higher than that. I can’t comment on the iPad 2, I’m afraid.
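To illustrate how the figure moves with clock speed, here is a small Python sketch that rescales the 10 ms estimate and expresses it as a share of a 30 fps frame. The helper names are hypothetical, and the 250 MHz and 300 MHz values are arbitrary examples, not the clock of any particular device. It assumes GPU time scales inversely with clock speed, which ignores memory bandwidth and other limits.

```python
def scale_to_clock(gpu_ms, from_hz, to_hz):
    """Rescale a GPU time estimate to another clock (assumes inverse scaling)."""
    return gpu_ms * from_hz / to_hz


def frame_budget_percent(gpu_ms, target_fps):
    """Share of the per-frame time budget used by the GPU, in percent."""
    frame_ms = 1000.0 / target_fps  # about 33.3 ms at 30 fps
    return gpu_ms / frame_ms * 100.0


BASE_MS = 10.0      # estimate from this thread for an MP2 at 200 MHz
BASE_CLOCK = 200e6

# 250 MHz and 300 MHz are made-up example clocks, purely for illustration.
for clock_hz in (200e6, 250e6, 300e6):
    ms = scale_to_clock(BASE_MS, BASE_CLOCK, clock_hz)
    pct = frame_budget_percent(ms, target_fps=30)
    print(f"{clock_hz / 1e6:.0f} MHz: {ms:.2f} ms, {pct:.0f}% of frame")
```

A statistic like this could sit in an editor GUI as a rough guide, with the clock speed left as a user-set parameter since it varies by SoC.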