Reduction: many texel lookups VS Multiple renders

wonwoolee · September 4, 2012, 2:10pm

Hello.

I'm currently implementing a kind of reduction.

Generally, a reduction is done by rendering rectangles multiple times, halving their size on each pass.

I have to sum values in each 11x11 patch in a texture.

My basic implementation computes a horizontal sum over 11 texels, and then does the same thing vertically.
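
A minimal CPU-side sketch of the two-pass idea described above, written in Python rather than shader code so it can be checked directly. The function names and the zero-outside-the-image border rule are assumptions made for illustration; they are not from the original post. It shows that an 11-tap horizontal sum followed by an 11-tap vertical sum gives the same result as summing the full 11x11 window.

```python
def box_sum_direct(img, r):
    """Sum every (2r+1)x(2r+1) window directly; texels outside the image count as 0."""
    h, w = len(img), len(img[0])
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            total = 0
            for dy in range(-r, r + 1):
                for dx in range(-r, r + 1):
                    yy, xx = y + dy, x + dx
                    if 0 <= yy < h and 0 <= xx < w:
                        total += img[yy][xx]
            out[y][x] = total
    return out


def box_sum_separable(img, r):
    """Same sum in two passes: horizontal first, then vertical on the intermediate result."""
    h, w = len(img), len(img[0])
    # Pass 1: each output texel sums up to 2r+1 neighbours along its row.
    horiz = [
        [sum(row[xx] for xx in range(max(0, x - r), min(w, x + r + 1))) for x in range(w)]
        for row in img
    ]
    # Pass 2: each output texel sums up to 2r+1 row sums along its column.
    return [
        [sum(horiz[yy][x] for yy in range(max(0, y - r), min(h, y + r + 1))) for x in range(w)]
        for y in range(h)
    ]
```

With `r = 5` the window is 11x11. Each pass reads 11 texels per output, so a sliding window costs 22 reads per pixel instead of 121, at the price of one intermediate render target.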

However, reading 11 texels in a fragment shader is generally known to be slow.

In this case, could I improve the speed by doing the sum in the typical way?

(By padding each 11x11 patch to a 16x16 patch and rendering multiple times)

Actually, the two methods do not seem much different in computational complexity to me.
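
As a rough way to compare the two approaches, the texel fetches per patch can simply be counted. This is only a sketch under stated assumptions: patches do not overlap, the two-pass version produces one row sum per patch row and then one final value, and the multi-pass version halves a padded 16x16 patch on each pass with a hypothetical `taps_per_output` fetches per output texel. Fetch count is not the same thing as run time, which also depends on bandwidth and per-pass overhead.

```python
def fetches_two_pass(n):
    """Fetches per n x n patch for a horizontal pass followed by a vertical pass."""
    horizontal = n * n  # n row sums, each reading n texels
    vertical = n        # one final output reading the n row sums
    return horizontal + vertical


def fetches_halving(size, taps_per_output=4):
    """Fetches per size x size patch when each pass halves the patch in both directions."""
    total = 0
    while size > 1:
        size //= 2
        total += size * size * taps_per_output
    return total


print(fetches_two_pass(11))                     # 132
print(fetches_halving(16))                      # 340 (four point fetches per output)
print(fetches_halving(16, taps_per_output=1))   # 85  (one filtered fetch per output)
```

Under these assumptions the two methods are within a small constant factor of each other, so which one is faster is likely to depend on pass count and memory traffic rather than on arithmetic.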

marco · September 4, 2012, 3:00pm

Hi WonwooLee,

Did you have a look at the Bloom training course from our SDK?
There we reduce the number of texture fetches by using the filtering capabilities of the texture units.
If your filter coefficients allow it, then you can save quite a few texture lookups that way.
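
The filtering trick can be sketched on the CPU as follows. This is an illustrative emulation, not code from the SDK: `fetch_linear` is a hypothetical stand-in for a linearly filtered texture fetch in one dimension, with texel `i` centred at coordinate `i + 0.5` and clamp-to-edge addressing. Sampling exactly halfway between two texel centres returns their average, so an equal-weight 11-texel sum needs 6 fetches instead of 11.

```python
import math


def fetch_linear(row, u):
    """Emulate a 1D linearly filtered fetch at texel-space coordinate u (clamp to edge)."""
    p = u - 0.5
    i0 = math.floor(p)
    f = p - i0
    last = len(row) - 1
    a = row[min(max(i0, 0), last)]
    b = row[min(max(i0 + 1, 0), last)]
    return a * (1.0 - f) + b * f


def sum11_filtered(row, c):
    """Sum texels c-5 .. c+5 using five filtered fetches plus one point fetch."""
    total = 0.0
    for k in (-5, -3, -1, 1, 3):
        # Halfway between texels c+k and c+k+1: the fetch returns their average.
        total += 2.0 * fetch_linear(row, c + k + 1.0)
    # The eleventh texel has no partner, so it is fetched at its centre.
    total += fetch_linear(row, c + 5 + 0.5)
    return total
```

This works here because a plain sum gives every texel the same weight. On real hardware it assumes linear filtering is enabled on the texture, and the result is only as exact as the filtering precision of the texture unit.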

Do you require an 11x11 kernel to downsample your image?
You will most likely be bandwidth bound, depending on your input image size.

Best regards,
Marco
