Getting at the backbuffer


Hi all,

I'm interested in mobile vision-based augmented reality, see the video here:
http://mi.eng.cam.ac.uk/~sjt59/hips.html

My device is a Samsung i8910, but I'd like to avoid getting too device-specific. Unfortunately I still can't achieve 30fps. The biggest contributor to the frame cost is uploading the video as a texture to the GPU. I've read lots of threads in various places on the net about this, but no one has yet found a good solution as far as I can tell.

Here are the options I've come up with, and the reasons they don't work for me:

1. EGL supposedly allows "native" rendering to the same surfaces as the "client API" (OpenGL ES in this case).
-> None of the EGLConfigs report EGL_NATIVE_RENDERABLE, at least on S60 5th Ed. (see the probe sketched after this list).

2. OpenVG and OpenGL ES can share surfaces, and OpenVG has a vgWritePixels function which should skip most of the texture processing steps and give more direct access to the backbuffer.
-> eglQueryString(..., EGL_CLIENT_APIS) just returns "OpenGL_ES", i.e. hardware OpenVG is not supported on S60 5th Ed. (the probe below checks this too).

3. Use OpenGL ES to render into a normal-memory pixmap where I've dumped the video frame previously.
-> I don't think the hardware GL ES implementations support pixmap surfaces. [disclaimer: I haven't tried this one yet, but that was the answer in a Forum Nokia thread I saw]

4. Have a texture for the video which is updated every frame and drawn by GL ES.
-> This is what I'm currently doing, but it's slow (roughly the upload path sketched at the bottom of this post).

5. Do all the rendering in software and display the bitmap with Direct Screen Access.
-> I haven't tried this yet, but I suspect it would be faster for simple graphics. It seems silly not to use the GPU at all though, and I'd need to write a software rasterizer...

6. Render the GL stuff into a pbuffer, copy it to normal memory, and do the overlaying in software.
-> I haven't tried this either, but since it might avoid the texture format swizzling it could turn out to be faster.
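
For what it's worth, below is a minimal sketch of the capability probe behind options 1 and 2 (my own code, nothing official; the 64-config cap is arbitrary, and I think older SDKs ship the header as GLES/egl.h):

#include <EGL/egl.h>   /* <GLES/egl.h> on older SDKs, I believe */
#include <stdio.h>

/* Probe what the EGL implementation offers: configs with
   EGL_NATIVE_RENDERABLE set (option 1) and the supported
   client APIs (option 2). */
static void probe_egl(void)
{
    EGLDisplay dpy = eglGetDisplay(EGL_DEFAULT_DISPLAY);
    eglInitialize(dpy, NULL, NULL);

    EGLConfig configs[64];
    EGLint n = 0;
    eglGetConfigs(dpy, configs, 64, &n);

    for (EGLint i = 0; i < n; ++i) {
        EGLint nativeRenderable = EGL_FALSE;
        eglGetConfigAttrib(dpy, configs[i], EGL_NATIVE_RENDERABLE,
                           &nativeRenderable);
        if (nativeRenderable == EGL_TRUE)
            printf("config %d is native-renderable\n", (int)i);
    }

    /* EGL_CLIENT_APIS needs EGL 1.2+; a full answer would look like
       "OpenGL_ES OpenVG", but on my phone it is just "OpenGL_ES". */
    const char *apis = eglQueryString(dpy, EGL_CLIENT_APIS);
    printf("client APIs: %s\n", apis ? apis : "(null)");
}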

Option 1 would be the best, but unfortunately it seems not to be possible. Isn't all the memory shared anyway on the OMAP3? Really I'd just like to know the exact format of the backbuffer, and I'd happily dump the video straight in there before rendering the GL augmentations.
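
The closest I can get through the public API is asking EGL for the channel layout of the config behind the window surface. It won't hand over a pointer or the physical memory layout, but it would at least confirm the pixel format (a sketch, same includes as above; config is whichever one the surface was created with):

/* Report the colour layout of a config; 5-6-5 at 16 bpp would
   suggest an RGB565 backbuffer. */
static void print_config_format(EGLDisplay dpy, EGLConfig config)
{
    EGLint r = 0, g = 0, b = 0, bpp = 0;
    eglGetConfigAttrib(dpy, config, EGL_RED_SIZE, &r);
    eglGetConfigAttrib(dpy, config, EGL_GREEN_SIZE, &g);
    eglGetConfigAttrib(dpy, config, EGL_BLUE_SIZE, &b);
    eglGetConfigAttrib(dpy, config, EGL_BUFFER_SIZE, &bpp);
    printf("R%d G%d B%d, %d bpp\n", (int)r, (int)g, (int)b, (int)bpp);
}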

Doing the compositing in software (option 6) might end up being the best bet. The new ARMs have NEON, which seems pretty quick: I have code to convert the 320x240 YUV from the camera into RGB565 in around 1.5ms, which is why the ~20ms cost of uploading that frame to the GPU is starting to get annoying!
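
For reference, the per-frame upload in option 4 looks roughly like the sketch below (the names, the 512x256 texture size and the filtering are my choices, nothing special). In principle, matching the glTexSubImage2D format/type to the ones the texture was allocated with should let the driver skip conversion, but it still costs me ~20ms:

#include <GLES/gl.h>

#define VID_W 320
#define VID_H 240

/* One-time setup: GL ES 1.x needs power-of-two textures, so allocate
   512x256 RGB565 and only use the top-left 320x240 of it. */
static GLuint create_video_texture(void)
{
    GLuint tex;
    glGenTextures(1, &tex);
    glBindTexture(GL_TEXTURE_2D, tex);
    glTexParameterx(GL_TEXTURE_2D, GL_TEXTURE_MIN_FILTER, GL_LINEAR);
    glTexParameterx(GL_TEXTURE_2D, GL_TEXTURE_MAG_FILTER, GL_LINEAR);
    glTexImage2D(GL_TEXTURE_2D, 0, GL_RGB, 512, 256, 0,
                 GL_RGB, GL_UNSIGNED_SHORT_5_6_5, NULL);
    return tex;
}

/* Per frame: overwrite the video sub-rectangle with the RGB565
   pixels coming out of the NEON converter. */
static void upload_frame(GLuint tex, const unsigned short *frame565)
{
    glBindTexture(GL_TEXTURE_2D, tex);
    glTexSubImage2D(GL_TEXTURE_2D, 0, 0, 0, VID_W, VID_H,
                    GL_RGB, GL_UNSIGNED_SHORT_5_6_5, frame565);
}

The background quad then gets drawn with texture coordinates scaled by 320/512 and 240/256 so only the video region is sampled.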

Thanks for any help, or suggestions that I haven't thought of yet!

Simon

simontaylor1 2009-09-16 16:16:52