G6200 - OpenCL memory copy speed is extremely slow

I ran a memory copy test on the MTK 6595 with the PowerVR G6200, and it is extremely slow.
Details are below:
Host to Device : 1.42 GByte/s
Device To Host : 0.06 GByte/s
Device To Device : 0.12 GByte/s

In comparison, the Qualcomm Snapdragon 800 with Adreno 330 is much faster:
Host to Device : 2.44 GByte/s
Device To Host : 1.71 GByte/s
Device To Device : 6.72 GByte/s

Such a big gap is hard to believe, especially for device-to-device copies.

The test tool app is here:
https://play.google.com/store/apps/details?id=com.robertwgh.opencl_z_android
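
For anyone reproducing these figures without the app: a GByte/s number like the ones above is just bytes copied divided by elapsed time. The following is a minimal sketch in Python, not OpenCL-Z's actual code (how the app measures is not stated in this thread). The helper names `bandwidth_gbytes_per_s` and `time_copy` are made up for illustration, 1 GByte is assumed to mean 1e9 bytes, and the host-side `bytearray` copy is only a stand-in for a real OpenCL transfer.

```python
import time


def bandwidth_gbytes_per_s(num_bytes, seconds):
    """Convert a byte count and an elapsed time into GByte/s (assuming 1 GByte = 1e9 bytes)."""
    if seconds <= 0:
        raise ValueError("elapsed time must be positive")
    return num_bytes / seconds / 1e9


def time_copy(copy_fn, num_bytes, repeats=5):
    """Call copy_fn several times and return the bandwidth of the fastest run."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        copy_fn()  # must not return before the copy has actually finished
        elapsed = time.perf_counter() - start
        best = min(best, elapsed)
    return bandwidth_gbytes_per_s(num_bytes, best)


if __name__ == "__main__":
    size = 16 * 1024 * 1024  # 16 MiB test buffer (arbitrary choice)
    src = bytearray(size)
    dst = bytearray(size)

    def host_copy():
        # Stand-in for an OpenCL buffer copy: a plain host memory copy.
        dst[:] = src

    print(f"host copy: {time_copy(host_copy, size):.2f} GByte/s")
```

With a real OpenCL binding, `copy_fn` would have to enqueue the copy and then wait for it to complete before returning. Otherwise only the cost of enqueueing the command is timed, not the transfer itself.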

OpenCL-Z-Android Report

Phone/Tablet Information

Device model: Meizu MX4
Android OS version: 4.4.2
Kernel version: 3.10.35+, Fri Mar 06 02:45:45 2015
Build number: Flyme OS 4.2.2.2A
CPU ABI: armeabi-v7a
CPU ABI 2: armeabi

OpenCL Information

Found 1 OpenCL platforms.
[Platform] INDEX: 1
[Platform] NAME: PowerVR Rogue
[Platform] VENDOR: Imagination Technologies
[Platform] PROFILE: EMBEDDED_PROFILE
[Platform] VERSION: OpenCL 1.2
[Platform] EXTENSIONS: cl_khr_byte_addressable_store;cl_khr_global_int32_base_atomics;cl_khr_global_int32_extended_atomics;cl_khr_local_int32_base_atomics;cl_khr_local_int32_extended_atomics;cl_khr_egl_image;cl_khr_spir

Found 1 devices:
[Device] INDEX: 1
[Device] TYPE: GPU
[Device] NAME: PowerVR Rogue Han
[Device] VENDOR: Imagination Technologies
[Device] VENDOR_ID: 0x1>>>
[Device] DRIVER_VERSION: 1.3@3304414
[Device] PROFILE: EMBEDDED_PROFILE
[Device] VERSION: OpenCL 1.2
[Device] MAX_CLOCK_FREQUENCY: 598 MHz
[Device] MAX_COMPUTE_UNITS: 2
[Device] AVAILABLE: true
[Device] COMPILER_AVAILABLE: true
[Device] EXTENSIONS: cl_khr_byte_addressable_store;cl_khr_global_int32_base_atomics;cl_khr_global_int32_extended_atomics;cl_khr_local_int32_base_atomics;cl_khr_local_int32_extended_atomics;cl_khr_egl_image;cl_khr_spir;cl_img_cached_allocations
[Device] MAX_WORK_ITEM_DIMENSIONS: 3
[Device] MAX_WORK_ITEM_SIZES: (512, 512, 512)
[Device] MAX_WORK_GROUP_SIZE: 512
[Device] ADDRESS_BITS: 32
[Device] MAX_READ_IMAGE_ARGS: 8
[Device] MAX_WRITE_IMAGE_ARGS: 1
[Device] MAX_MEM_ALLOC_SIZE: 67108864
[Device] IMAGE2D_MAX_WIDTH: 16384
[Device] IMAGE2D_MAX_HEIGHT: 16384
[Device] IMAGE3D_MAX_WIDTH: 0
[Device] IMAGE3D_MAX_HEIGHT: 0
[Device] IMAGE3D_MAX_DEPTH: 0
[Device] IMAGE_SUPPORT: true
[Device] MAX_PARAMETER_SIZE: 1024
[Device] MAX_SAMPLERS: 8
[Device] MEM_BASE_ADDR_ALIGN: 512
[Device] MIN_DATA_TYPE_ALIGN_SIZE: 64
[Device] SINGLE_FP_CONFIG: CL_FP_INF_NAN;CL_FP_ROUND_TO_ZERO;CL_FP_FMA;
[Device] HOST_UNIFIED_MEMORY: Unified
[Device] GLOBAL_MEM_CACHE_TYPE: read/write cache
[Device] GLOBAL_MEM_CACHELINE_SIZE: 64
[Device] GLOBAL_MEM_CACHE_SIZE: 32768
[Device] GLOBAL_MEM_SIZE: 268435456
[Device] MAX_CONSTANT_BUFFER_SIZE: 1048576
[Device] MAX_CONSTANT_ARGS: 4
[Device] LOCAL_MEM_TYPE: local
[Device] LOCAL_MEM_SIZE: 4096
[Device] ERROR_CORRECTION_SUPPORT: false
[Device] PROFILING_TIMER_RESOLUTION: 1000 nanoseconds
[Device] ENDIAN_LITTLE: true
[Device] EXECUTION_CAPABILITIES: CL_EXEC_KERNEL;
[Device] QUEUE_PROPERTIES: CL_QUEUE_OUT_OF_ORDER_EXEC_MODE_ENABLE;CL_QUEUE_PROFILING_ENABLE;
[Device] PREFERRED_VECTOR_WIDTH_CHAR: 1
[Device] PREFERRED_VECTOR_WIDTH_SHORT: 1
[Device] PREFERRED_VECTOR_WIDTH_INT: 1
[Device] PREFERRED_VECTOR_WIDTH_LONG: 0
[Device] PREFERRED_VECTOR_WIDTH_FLOAT: 1
[Device] PREFERRED_VECTOR_WIDTH_DOUBLE: 0
[Device] PREFERRED_VECTOR_WIDTH_HALF: 0
[Device] NATIVE_VECTOR_WIDTH_CHAR: 1
[Device] NATIVE_VECTOR_WIDTH_SHORT: 1
[Device] NATIVE_VECTOR_WIDTH_INT: 1
[Device] NATIVE_VECTOR_WIDTH_LONG: 0
[Device] NATIVE_VECTOR_WIDTH_FLOAT: 1
[Device] NATIVE_VECTOR_WIDTH_DOUBLE: 0
[Device] NATIVE_VECTOR_WIDTH_HALF: 0

I agree that those results are unusually different.

I can suggest using a couple of our tools to help get a better picture of what is going on. PVRTune will give you real-time performance data from the hardware counters, and PVRShaderEditor can help you identify any subtle potential issues in your kernels.

Otherwise, it may be worth contacting MediaTek regarding any known issues with OpenCL data transfer speeds for that chipset.

@Joe, any suggestions?

Unfortunately, there isn’t any advice we can give you beyond what Paul has already said. If you contact MediaTek, they may be able to provide you with the peak system memory performance statistics to compare with the results you are seeing.

Running PVRTune on the target will help you verify that your tests are working as expected. The latest version of PVRTune also includes statistics for the GPU’s System Level Cache (SLC) reads/writes, which should help too.

I have tested another product that also uses the G6200: the Allwinner A80.

The result is the same. Device-to-device memory copy is as slow as on the MTK 6595.

So I think the problem may be caused by the OpenCL library from Imagination.

MediaTek and Allwinner probably just compile Imagination's source files to generate the .so file.

Another low-end chipset, the MTK 6735 with a Mali-T720 MP3 GPU, performs better than the G6200.
The gap is about 20 times!
Details for the MTK 6735 are below:
Host to Device : 1.09 GByte/s
Device To Host : 1.19 GByte/s
Device To Device : 2.61 GByte/s

The Mali-T720 beats the G6200 on device-to-device memory copy speed by roughly 20 times (2.61 vs. 0.12 GByte/s).
I think that is unreasonable!
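
To put numbers on the gap, here is a small Python sketch that divides each device's figures by the G6200's. The GByte/s values are the ones posted in this thread; the dictionary layout and the `speedup` helper are just illustrative names.

```python
# GByte/s figures copied from the posts in this thread.
RESULTS = {
    "G6200 (MTK 6595)": {
        "host_to_device": 1.42, "device_to_host": 0.06, "device_to_device": 0.12,
    },
    "Adreno 330 (Snapdragon 800)": {
        "host_to_device": 2.44, "device_to_host": 1.71, "device_to_device": 6.72,
    },
    "Mali-T720 MP3 (MTK 6735)": {
        "host_to_device": 1.09, "device_to_host": 1.19, "device_to_device": 2.61,
    },
}

BASELINE = "G6200 (MTK 6595)"


def speedup(baseline, other):
    """Return how many times faster `other` is than `baseline`, per copy direction."""
    return {direction: other[direction] / baseline[direction] for direction in baseline}


if __name__ == "__main__":
    base = RESULTS[BASELINE]
    for name, figures in RESULTS.items():
        if name == BASELINE:
            continue
        for direction, ratio in speedup(base, figures).items():
            print(f"{name} vs {BASELINE}, {direction}: {ratio:.1f}x")
```

On these figures the Mali-T720 is about 22 times faster than the G6200 for device-to-device copies and about 20 times faster for device-to-host, while the Adreno 330 is about 56 times faster for device-to-device. Host-to-device is the only direction where the G6200 is in the same range as the others.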

This is curious. We will continue to look into this when we have the opportunity.

Do you have any Android device with a G6200 on hand?

Here is a link to one: http://www.amazon.com/MX4-Unlocked-Smartphone-MTK6595-Gorilla/dp/B00NWBQ9VI
This device supports OpenCL.

We don’t currently have any devices suitable for testing this ourselves.

Could you use PVRTune to record performance and attach the recording file? We can investigate from there.

Where can I get PVRHub.apk?
Is it available somewhere like https://github.com/powervr-graphics/PVRMonitor?
I am following the steps from this blog post:
https://www.imgtec.com/blog/powervr-developers/powervr-graphics-sdk-tools-explained-quickstart-guide-running-pvrtune-android
The step is:
adb install /path/to/PowerVR_SDK/PVRHub/Android_armeabi_armeabi-v7a_x86_mips/PVRHub.apk

The full SDK is too big: https://www.imgcommunity.local/files/mac-osx-offline-installer-powervr-tools-sdk-3-5/
And the PVRHub page has no download link for the APK file:
https://www.imgcommunity.local/developers/powervr/tools/pvrhub/
Could you please update that page with an APK download link?

You can download the online installer for the PowerVR SDK from: https://www.imgcommunity.local/developers/powervr/installers/

The online installer will let you selectively download and install only the components you need.

We’re looking into updating the community page now to make access to PVRHub more convenient.

I used PowerVRSDKSetup-3.5 to download PVRHub.
I checked “PowerVR Tools --> PVRHub” and clicked Next.
But I can’t find PVRHub under /Users/Shared/Imagination/PowerVR_Graphics.
Any suggestions?

Could you please send me PVRHub.apk?
I have sent you my e-mail address by private message.

Thank you.

PVRHub.apk is created under: PowerVR_Graphics\PowerVR_Tools\PVRHub\Android

Thanks.
I have sent you a link to a zip file containing the *.pvrtune and *.pvrtrace recordings by private message.
Please take a look when you can.

We have identified an issue in our OpenCL drivers that was leading to poor data transfer speeds. Please note this only applies to clEnqueueCopyBuffer. It has been filed on our internal ticketing system with the reference: RDI5386.

Great!!!
Thanks PauIL.
By the way, I found another OpenCL example APK that also runs slowly but does not use clEnqueueCopyBuffer.
The source code is here:
http://developer.sonymobile.com/downloads/code-example-module/opencl-code-example/
And the trace file is here:
http://lockscreen.mobi/RDI5386_OpenCL_pvrtrace_batch3_batch4.zip
I tried to log in and submit the trace file via https://pvrsupport.imgtec.com/new-ticket,
but the code is not shown to me when I submit the form!

Could you please pass the above information on to the OpenCL team?
Thanks.