About GPU counters

Hi, I want to measure my 3D game's GPU resource usage in order to optimize it, but I am a little confused about how the values you explained relate to what I observe in PVRTune. In PVRPerfServer I can set the sample period to 1 ms, so it queries the driver for values every 1 ms. The graph data, however, can show that a functional unit such as the TA was busy for 0.642 ms or even less. So for that 1 ms interval, the TA load would be calculated as about 64.2%, am I right? And is the value calculated by PVRPerfServer, or does the driver update it every 1 ms?

The PVRTune documentation appendix explains that the value you get is based on whether the unit is active or not during the sample period, so I am a little confused about what "sample period" means here: is it a fixed sample period internal to the driver, or the period at which the external routine queries the driver? If it is the driver's internal fixed period, then I could understand it as the driver checking the hardware every 0.001 ms and determining that it was active for 0.642 ms. The driver could then keep up to several ms of records, so PVRPerfServer could get this information for the last 1 ms or 2 ms.

Am I right?

Thank you very much!

This discussion was continued over email. For the benefit of other forum users, I’ve posted an overview of our answers below.


PVRPerfServer's sample frequency value specifies how often values are retrieved from the graphics driver. The data pulled from the driver consists of all counter values that were captured by the hardware since the last sample. Once PVRPerfServer has retrieved the counter values, the driver's circular buffer is cleared to free up space.

The default sample rate has been carefully chosen as the minimum number of retrievals that can be made without losing data (i.e. the circular buffer should not fill during the sample period). For this reason, it's unlikely that you will need to alter the sampling rate to profile your target device.
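To make the arithmetic in the question concrete, here is a minimal Python sketch. It assumes the load figure is simply busy time divided by the time between two retrievals, which is the reading proposed in the question rather than a confirmed description of how PVRTune or the driver computes it. The function name `load_percent` is hypothetical.

```python
def load_percent(busy_ms: float, period_ms: float) -> float:
    """Share of the sample period during which a unit was busy, in percent.

    Assumption: load = busy time / sample period. This mirrors the
    interpretation in the question, not a documented PVRTune formula.
    """
    if period_ms <= 0:
        raise ValueError("period_ms must be positive")
    return 100.0 * busy_ms / period_ms


# A unit busy for 0.642 ms within a 1 ms sample period.
print(f"{load_percent(0.642, 1.0):.1f}%")  # prints "64.2%"
```

Under the same assumption, the same 0.642 ms of busy time spread over a 2 ms period would read as roughly half that load, which is why the length of the period matters when comparing values.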

Thanks,
Joe

What if I choose 25 ms as the sample period in my PVRScope code? Would the values I get then only cover roughly the last 2 ms of counter data, if the default is 2 ms? Another question is how to get per-PID FPS using PVRScope; I can see this counter in PVRTune.

The less frequently you sample, the greater the chance that the driver’s circular buffer will have overwritten old values. We can’t give exact numbers for how low a frequency you can get away with, as it depends on the GPU’s workload, its clock speed and the size of the circular buffer in the driver (which may vary between driver versions or even be customized by the platform provider).
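The effect of sampling too rarely can be illustrated with a small Python simulation. This is only a sketch under made-up assumptions: the buffer capacity, the event rate and the function name `simulate` are hypothetical and do not reflect the real driver, whose buffer size and counter traffic are not stated here.

```python
from collections import deque


def simulate(capacity, events_per_ms, sample_period_ms, duration_ms):
    """Return (retrieved, lost) event counts for a fixed-size circular buffer.

    The writer adds events_per_ms entries every millisecond. The reader
    drains the buffer once every sample_period_ms milliseconds.
    """
    buf = deque(maxlen=capacity)  # when full, the oldest entry is overwritten
    produced = 0
    retrieved = 0
    for t in range(1, duration_ms + 1):
        for _ in range(events_per_ms):
            buf.append(produced)
            produced += 1
        if t % sample_period_ms == 0:
            retrieved += len(buf)  # the reader only sees what is still there
            buf.clear()
    lost = produced - retrieved - len(buf)
    return retrieved, lost


# Hypothetical numbers: 16-entry buffer, 4 events per ms, 100 ms run.
print(simulate(16, 4, 2, 100))   # (400, 0): nothing is overwritten
print(simulate(16, 4, 25, 100))  # (64, 336): most entries are overwritten
```

With these invented numbers a 2 ms period keeps up with the writer, while a 25 ms period only ever sees the most recent 16 entries of each interval. Where the real threshold lies depends on the factors listed above.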

If you start getting strange results at your chosen sampling frequency, then you should sample more often (i.e. use a shorter sample period). The most reliable solution is to run a background thread for the sampling. That way, you’ll be able to sample at a consistent frequency (e.g. every 2 ms).
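A background sampling thread could be structured roughly as follows. This is a Python sketch of the pattern only, not PVRScope code: `BackgroundSampler` and the `read_counters` callback are hypothetical names, and the callback stands in for whatever counter read call your application already makes. In a real application this would more likely be a native thread.

```python
import threading
import time


class BackgroundSampler:
    """Calls read_counters() at a fixed period on its own thread."""

    def __init__(self, read_counters, period_s=0.002):
        self._read = read_counters      # hypothetical callback supplied by the caller
        self._period = period_s         # 2 ms, the example period from the reply
        self._stop = threading.Event()
        self._lock = threading.Lock()
        self._samples = []
        self._thread = threading.Thread(target=self._run, daemon=True)

    def start(self):
        self._thread.start()

    def stop(self):
        self._stop.set()
        self._thread.join()

    def take_samples(self):
        """Hand over everything collected so far and start a fresh list."""
        with self._lock:
            out, self._samples = self._samples, []
        return out

    def _run(self):
        deadline = time.monotonic()
        while not self._stop.is_set():
            reading = self._read()
            with self._lock:
                self._samples.append(reading)
            # Schedule against absolute deadlines so small delays do not add up.
            deadline += self._period
            delay = deadline - time.monotonic()
            if delay > 0:
                self._stop.wait(delay)
            else:
                deadline = time.monotonic()  # fell behind, so resynchronise
```

The render loop can then call `take_samples()` once per frame, or every 25 ms, without affecting how often the driver is actually read. Note that Python's timer resolution makes an exact 2 ms period unrealistic; the sketch shows the structure rather than the achievable precision.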



Per-PID stats are calculated in PVRTune using the timing data it collects. As timing data is not available to PVRScope, it’s not possible to expose these counters through the PVRScope interface.



Thanks,

Joe