Hi,
I had a look for you: the GE8300 (and the GE8320) supports the SOPMAD instruction.
Generating efficient code is not so straightforward, but I think you’ve got a good strategy there (it seems to be working). It depends on a number of things, usually on whether you can route your data through the internal datapaths so that all “phases” of the ALU get used, which you guessed correctly.
I don’t have much experience with OpenCL on PVR, but with GLSL the key is using the mediump precision qualifier.
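
As a rough illustration of the mediump point, here is a minimal GLSL ES fragment shader sketch that keeps a multiply-add entirely in mediump. The names (uTex, uScale, uBias, vUV) are hypothetical and only there for the example. Whether the compiler actually maps this to SOPMAD is not guaranteed, so it is worth checking the generated disassembly on your target.

```glsl
#version 300 es
// Default float precision for this fragment shader: mediump.
precision mediump float;

// Hypothetical inputs, named for illustration only.
uniform sampler2D uTex;
uniform mediump vec4 uScale;
uniform mediump vec4 uBias;

in mediump vec2 vUV;
out mediump vec4 fragColor;

void main() {
    // Every operand is mediump, so the multiply-add below can stay
    // in reduced precision instead of being promoted to highp.
    mediump vec4 c = texture(uTex, vUV);
    fragColor = c * uScale + uBias;
}
```

Mixing in a single highp operand would typically promote the whole expression to highp, so it helps to keep qualifiers consistent across the expression.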