Why is the "GPU clock speed" measured by PVRScope inconsistent with the result measured by PVRTune?

Hello.

The “GPU clock speed” differs significantly between PVRTune and PVRScope, as shown below:

The first curve was obtained using the PVRScopeStats API and the second curve was recorded with PVRTune, both while running the 3D Mark graphics program.

I guess PVRTune somehow fixes the clock speed. Is there a way for me to set a fixed clock speed when I’m using PVRScope?

Hi niubaty,

Thanks for your message, and welcome to the PowerVR Developer Forum!

After speaking with the Tools Team, I can confirm that PVRTune does not change the GPU clock speed when recording from a PowerVR device. The results you are obtaining with PVRScope are likely due to how often you are sampling (possibly every frame?). You could try sampling at a lower rate, or filtering out the high-frequency variation in the values.
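As an illustration of the filtering suggestion above, here is a minimal sketch of an exponential moving average, one simple way to damp high-frequency variation in sampled counter values. The names `ExponentialSmoother` and `smoothSeries` and the `alpha` parameter are hypothetical and not part of PVRScope, and this is not necessarily the smoothing PVRTune applies.

```cpp
#include <cassert>
#include <vector>

// Exponential moving average: a simple low-pass filter for noisy samples.
// alpha close to 1 follows the raw signal; alpha close to 0 smooths more.
// (Hypothetical helper, not part of the PVRScope API.)
class ExponentialSmoother
{
public:
    explicit ExponentialSmoother(float alpha) : m_alpha(alpha)
    {
        assert(alpha > 0.0f && alpha <= 1.0f);
    }

    // Feed one raw sample and return the smoothed value.
    float update(float sample)
    {
        if (!m_hasValue)
        {
            // Seed the filter with the first sample to avoid a start-up ramp.
            m_value = sample;
            m_hasValue = true;
        }
        else
        {
            m_value = m_alpha * sample + (1.0f - m_alpha) * m_value;
        }
        return m_value;
    }

private:
    float m_alpha;
    float m_value = 0.0f;
    bool m_hasValue = false;
};

// Convenience helper: smooth a whole series of recorded samples.
std::vector<float> smoothSeries(const std::vector<float>& raw, float alpha)
{
    ExponentialSmoother smoother(alpha);
    std::vector<float> out;
    out.reserve(raw.size());
    for (float v : raw)
        out.push_back(smoother.update(v));
    return out;
}
```

Each value read for a given counter could be passed through `update()` before it is printed or plotted. A suitable `alpha` depends on the sampling interval and would need to be tuned by eye.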

As for the results in PVRTune, the tool has an option to smooth the plot of all displayed GPU counters, which would produce results similar to the ones you are obtaining:

Best regards,
Alejandro

Hi AlejandroC,

Thanks for your reply.

I sample with PVRScope at a 1000 ms interval. This is my code:

#include <chrono>
#include <thread>
#include "PVRScopeStats.h"

#if defined(__ANDROID__)
#include <android/log.h>
#define LOGV(...) __android_log_print(ANDROID_LOG_VERBOSE, "PVRScope", __VA_ARGS__)
#define LOGD(...) __android_log_print(ANDROID_LOG_DEBUG  , "PVRScope", __VA_ARGS__)
#define LOGI(...) __android_log_print(ANDROID_LOG_INFO   , "PVRScope", __VA_ARGS__)
#define LOGW(...) __android_log_print(ANDROID_LOG_WARN   , "PVRScope", __VA_ARGS__)
#define LOGE(...) __android_log_print(ANDROID_LOG_ERROR  , "PVRScope", __VA_ARGS__)
#else
#include <cstdio>
#define LOG(...) printf(__VA_ARGS__)
#define LOGV(...) LOG(__VA_ARGS__)
#define LOGD(...) LOG(__VA_ARGS__)
#define LOGI(...) LOG(__VA_ARGS__)
#define LOGW(...) LOG(__VA_ARGS__)
#define LOGE(...) LOG(__VA_ARGS__)
#endif

using namespace std::chrono_literals;

static const char* InitErrorToStr(EPVRScopeInitCode eError)
{
    switch (eError)
    {
    case EPVRScopeInitCode::ePVRScopeInitCodeOk:                            return "Ok";
    case EPVRScopeInitCode::ePVRScopeInitCodeOutOfMem:                      return "Out of memory";
    case EPVRScopeInitCode::ePVRScopeInitCodeDriverSupportNotFound:         return "Driver support not found";
    case EPVRScopeInitCode::ePVRScopeInitCodeDriverSupportInsufficient:     return "Driver support insufficient";
    case EPVRScopeInitCode::ePVRScopeInitCodeDriverSupportInitFailed:       return "Driver initialisation failed";
    case EPVRScopeInitCode::ePVRScopeInitCodeDriverSupportQueryInfoFailed:  return "Driver information query failed";
    default:                                                                return "Unknown";
    }
}

int main(int argc, const char *argv[])
{
    SPVRScopeImplData *PVRScopeStatsData = nullptr;
    const EPVRScopeInitCode eInitRet = PVRScopeInitialise(&PVRScopeStatsData);

    if (ePVRScopeInitCodeOk == eInitRet)
    {
        LOGI("Initialised services connection.\n");
    }
    else
    {
        LOGE("PVRScope failed to initialise with error: %s.\n", InitErrorToStr(eInitRet));
        return 1;
    }

    SPVRScopeCounterDef *counterDefinitions = nullptr;
    SPVRScopeCounterReading counterReading{};

    SPVRScopeGetInfo info;

    PVRScopeGetInfo(PVRScopeStatsData, &info);
    printf("nGroupMax %d", info.nGroupMax);



    for(int no = 0; no <2; no++) {
        constexpr unsigned int nGroup = 0U;
        constexpr unsigned int NUMBER_OF_LOOPS = 60U;
    
        PVRScopeSetGroup(PVRScopeStatsData, nGroup);
    
        unsigned int numCounters = 0U;
        unsigned int nIteration = 0U;


        while (nIteration++ < NUMBER_OF_LOOPS)
        {
            LOGI("Iteration %d\n", nIteration);
    
            if (PVRScopeReadCounters(PVRScopeStatsData, &counterReading))
            {
                unsigned int iReadingIdx = 0U;
    
                for (unsigned int iCounter = 0U; iCounter < numCounters && iReadingIdx < counterReading.nValueCnt; iCounter++)
                {
                    if (counterDefinitions[iCounter].nGroup == nGroup)
                    {
                        const char* const counterName = counterDefinitions[iCounter].pszName;
                        const float counterValue = counterReading.pfValueBuf[iReadingIdx++];
    
                        if (counterDefinitions[iCounter].nBoolPercentage)
                            printf(" %s: %f%%\n", counterName, counterValue);
                        else
                            printf(" %s: %f\n", counterName, counterValue);
                    }
                }
    
                // If we have too many results, there may be new counters available
                if (iReadingIdx < counterReading.nValueCnt)
                {
                    if (PVRScopeGetCounters(PVRScopeStatsData, &numCounters, &counterDefinitions, nullptr))
                        LOGI("%d counters enabled\n", numCounters);
                }
            }
            else
            {
                printf("No data\n");
            }
    
            std::this_thread::sleep_for(1000ms);
        }
    }


    LOGI("Shutting down\n");
    PVRScopeDeInitialise(&PVRScopeStatsData, &counterDefinitions, &counterReading);
    return 0;
}

I print the values to stdout every 1000 ms and collect the test results afterwards.
This file (aa.txt) is my test result:
aa.txt (533.1 KB)

I changed the smoothing to 0, but the GPU clock speed is still flat.


This file (smooth0.csv, packed in a zip) is my PVRTune test result:
smooth0.zip (24.6 KB)

I plotted them together as follows:

I’m focusing on GPU Clock Speed because the data I obtain with PVRScope shows more volatility than the data from PVRTune. For example: GPU PowerVR Series8XEP, GE8320, Device 0/GPU memory write bytes per second; GPU PowerVR Series8XEP, GE8320, Device 0/Renderer active (%); GPU PowerVR Series8XEP, GE8320, Device 0/Shader/Shaded pixels per second …



I tried 3 devices. On all of them, the data collected with PVRScope was more volatile.

Did I make a mistake somewhere?

Hi niubaty,

Thanks for your message.

Regarding the GPU clock speed GPU counter not showing any information, please make sure to:

  • Install PVRPerfServerDeveloper on the device you are taking recordings from.
  • If you have used more than one PVRTune version, make sure that the version installed on the device matches the PVRTune version you are using to take recordings.

Regarding data volatility, the Renderer Active GPU counter is a binary counter (it either has a value of 0% or 100%):

I can see entries like “Renderer active: 77.362137%” in the file provided. For that GPU counter, please treat a value of 0 as renderer not active and any value > 0 as renderer active.
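The rule above can be sketched as a small helper applied to each “Renderer active” reading before comparing it with PVRTune. The function names `isRendererActive` and `toBinaryPercent` are hypothetical and not part of the PVRScope API.

```cpp
#include <cassert>

// Interpret a "Renderer active" reading as a binary state:
// 0 -> renderer not active, anything above 0 -> renderer active.
// (Hypothetical helper, not part of the PVRScope API.)
bool isRendererActive(float rendererActivePercent)
{
    assert(rendererActivePercent >= 0.0f);
    return rendererActivePercent > 0.0f;
}

// Map the reading onto the 0% / 100% values expected of a binary counter.
float toBinaryPercent(float rendererActivePercent)
{
    return isRendererActive(rendererActivePercent) ? 100.0f : 0.0f;
}
```

With this mapping, a reading such as 77.362137 is treated as 100% (active) and a reading of 0 stays at 0% (not active).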

Also, for comparing the values from PVRTune and the values from PVRScope, are you exporting the values from PVRTune as a .csv? (using File -> Export -> Export raw data to .csv). In that case, there is a PVRTune issue which was reported on a different thread ( How to get Geometry active value from exported .csv file? - #9 by AlejandroC ) and was fixed internally. The fix will be included in the next PVRTune release (I can also provide an engineering drop if you need it).

Best regards,
Alejandro