I am running GEMM on a PowerVR GT7400 (Meizu Pro 7 Plus). I use the half data type. The kernel compiles without errors, but when it runs on the device it fails with the message: NDRANGE_KERNEL executed abnormally.
What can I do to solve it?
Hi,
It looks like it may be a driver issue. Would you be able to provide more information on the application you are trying to run?
Thank you.
Best regards,
Omar.
Hi,
Thank you for helping me! I have been testing this problem for nearly a month and still have no result.
Below I describe the problem in detail, followed by my questions.
First, I wrote a matrix multiplication (GEMM) API that runs on the GT7400 GPU (Meizu Pro 7 Plus). The data type I am using is half (half4, half8).
When I test this API on its own, the results are correct. I then use it to implement the convolutional layers of a CNN. When I run a large CNN prototxt, the results go wrong. The errors occur randomly in the GEMM of different convolutional layers, not at a fixed location. I checked the input and output of the GEMM at the failing location: the inputs are correct, but the output is wrong.
Second, when running the CNN, the GEMM kernel occasionally reports an error with the message *** NDRANGE_KERNEL executed abnormally ***, which is not a standard OpenCL error message. I am not sure what causes it. I initially suspected an out-of-bounds memory access. Later I confirmed that no access goes out of bounds, yet the problem remains. This error is also intermittent and its location is not fixed. For example,
half4 var = (aa == NULL) ? 0 : vload4(0, aa[i]); // no out-of-bounds access, no error
half4 var = (bb == 0) ? 0 : vload4(0, aa[i]); // no out-of-bounds access, but error; here aa == NULL when bb == 0, and aa != NULL when bb == 1
I think the two forms are equivalent, but the second one fails with the message *** NDRANGE_KERNEL executed abnormally ***. No error is reported at compile time; the error appears only when the program runs.
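To double-check the out-of-bounds suspicion on the host side, a small check like the one below may help. It is only a sketch: `vload_in_bounds` is a hypothetical helper name, not part of OpenCL. It relies on the documented behaviour that vloadn(offset, p) reads n elements starting at p + offset * n, so the load stays inside a buffer of buf_elems elements only if (offset + 1) * n <= buf_elems.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper (not an OpenCL API).
 * Returns true if vloadn(offset, p) stays inside a buffer that holds
 * buf_elems scalar elements starting at p.
 * vloadn reads elements [offset * n, offset * n + n - 1], so the
 * condition is (offset + 1) * n <= buf_elems. It is written as a
 * division so the multiplication cannot overflow size_t. */
bool vload_in_bounds(size_t buf_elems, size_t offset, size_t n)
{
    if (n == 0) {
        return false; /* a vector width of 0 is never valid */
    }
    return offset < buf_elems / n;
}
```

For a half4 load, n is 4; for half8, n is 8. Running this check over the offsets a kernel would use for a given matrix shape can show whether the last partial row or column of a GEMM tile is the one that steps past the end of the buffer.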
Also, I have a few questions.
First, when I query the device information, the device does not report cl_khr_fp16, and the vector width for half is 0. Therefore, I am not sure whether the device supports half computation.
Second, do you have any GEMM kernels for this GPU or for other PowerVR GPUs that I could study as a demo? Or any OpenCL documentation for PowerVR GPUs? I have read some of the documentation on the official website but did not find much detail about OpenCL. I am not sure whether PowerVR's OpenCL support differs from that of other OpenCL devices. Is there any special or obvious difference that could cause GPU computation errors?
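On the first question, the extension check can be sketched as follows. The extension list returned by clGetDeviceInfo with CL_DEVICE_EXTENSIONS is a space-separated string, so the name should be matched as a whole token rather than as a substring. `has_extension` is a hypothetical helper name used for illustration; it takes the string that the host code has already retrieved. As far as I know, the OpenCL specification also requires CL_DEVICE_PREFERRED_VECTOR_WIDTH_HALF to be 0 when cl_khr_fp16 is not supported, which would match the values reported above.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not an OpenCL API).
 * Returns true if `name` appears as a whole, space-delimited token in
 * `ext_list`, the string obtained from CL_DEVICE_EXTENSIONS. */
bool has_extension(const char *ext_list, const char *name)
{
    if (ext_list == NULL || name == NULL) {
        return false;
    }
    size_t len = strlen(name);
    if (len == 0) {
        return false;
    }
    const char *p = ext_list;
    while ((p = strstr(p, name)) != NULL) {
        /* The match must start at the beginning or right after a space... */
        bool starts = (p == ext_list) || (p[-1] == ' ');
        /* ...and end at the end of the string or right before a space. */
        bool ends = (p[len] == '\0') || (p[len] == ' ');
        if (starts && ends) {
            return true;
        }
        p += 1; /* keep searching after this partial match */
    }
    return false;
}
```

If has_extension(extensions, "cl_khr_fp16") is false, kernels that declare half variables or do half arithmetic are relying on behaviour the driver does not advertise, even if they happen to compile.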
Finally, thank you sincerely. I hope you can help.
Thank you !
The hardware definitely has FP16 processing capability; however, it might not be exposed for OpenCL on that platform for some reason. Can you check whether GLSL compute gives you the same result?
We do have some demos in our SDK; please see here: PowerVR SDK - Imagination Developers
We also have documentation on OpenCL and the Rogue Instruction Set Reference in general, available here:
https://www.imgtec.com/developers/powervr-sdk-tools/documentation/
PowerVR has a unique architecture, but that’s mostly relevant if you are doing rasterization. For compute you need to take into consideration the specifics of our ALUs.
@2know Have you already solved this issue?