Vertex shader

Hi WonwooLee,

Due to a limitation in our earlier shader compilers, packing texture coordinates into vec4s is generally a bad idea, as it forces a dependent texture read. Equally, building a vec2 from two different varyings can cause a dependent texture read, because the vec2 isn't created until the fragment shader runs. In both cases the shader can't fetch any of the texture data ahead of time, so your fragment shader ends up entirely limited by memory speed, which for 11 samples is a huge slowdown. For optimum performance, the best thing to do is simply to have 11 vec2 varyings, each containing the correct coordinates.

E.g.

uniform sampler2D tex;

varying lowp vec2 myVaryings;
varying lowp vec2 myVaryings1;
varying lowp vec2 myVaryings2;
varying lowp vec2 myVaryings3;
varying lowp vec2 myVaryings4;
varying lowp vec2 myVaryings5;
varying lowp vec2 myVaryings6;
varying lowp vec2 myVaryings7;
varying lowp vec2 myVaryings8;
varying lowp vec2 myVaryings9;
varying lowp vec2 myVaryings10;

void main()
{
    lowp vec4 texels[11];

    texels[0] = texture2D(tex, myVaryings);
    texels[1] = texture2D(tex, myVaryings1);
    texels[2] = texture2D(tex, myVaryings2);
    texels[3] = texture2D(tex, myVaryings3);
    texels[4] = texture2D(tex, myVaryings4);
    texels[5] = texture2D(tex, myVaryings5);
    texels[6] = texture2D(tex, myVaryings6);
    texels[7] = texture2D(tex, myVaryings7);
    texels[8] = texture2D(tex, myVaryings8);
    texels[9] = texture2D(tex, myVaryings9);
    texels[10] = texture2D(tex, myVaryings10);

    gl_FragColor = texels[0] + texels[1] + texels[2] + texels[3] + texels[4] + texels[5]
                 + texels[6] + texels[7] + texels[8] + texels[9] + texels[10];
}
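
For completeness, here is a minimal sketch of a vertex shader that could feed the fragment shader above. It is not from the original post: the names inVertex, inTexCoord, mvpMatrix and texelStep are hypothetical, and the symmetric 11-tap layout is only an assumption about how the coordinates are spaced. The point it illustrates is that every coordinate is fully computed per vertex, so the fragment shader passes each varying straight to texture2D without touching it. Whether 11 varyings fit at all depends on the device's varying limit (GL_MAX_VARYING_VECTORS).

```glsl
// Hypothetical inputs; rename to match your own application.
attribute highp vec4 inVertex;
attribute mediump vec2 inTexCoord;

uniform highp mat4 mvpMatrix;
// Assumed step between taps, e.g. (1.0 / textureWidth, 0.0) for a horizontal pass.
uniform mediump vec2 texelStep;

varying lowp vec2 myVaryings;
varying lowp vec2 myVaryings1;
varying lowp vec2 myVaryings2;
varying lowp vec2 myVaryings3;
varying lowp vec2 myVaryings4;
varying lowp vec2 myVaryings5;
varying lowp vec2 myVaryings6;
varying lowp vec2 myVaryings7;
varying lowp vec2 myVaryings8;
varying lowp vec2 myVaryings9;
varying lowp vec2 myVaryings10;

void main()
{
    gl_Position = mvpMatrix * inVertex;

    // All offsets are applied here, so the fragment shader does no
    // arithmetic on the coordinates before sampling.
    myVaryings   = inTexCoord - 5.0 * texelStep;
    myVaryings1  = inTexCoord - 4.0 * texelStep;
    myVaryings2  = inTexCoord - 3.0 * texelStep;
    myVaryings3  = inTexCoord - 2.0 * texelStep;
    myVaryings4  = inTexCoord - texelStep;
    myVaryings5  = inTexCoord;
    myVaryings6  = inTexCoord + texelStep;
    myVaryings7  = inTexCoord + 2.0 * texelStep;
    myVaryings8  = inTexCoord + 3.0 * texelStep;
    myVaryings9  = inTexCoord + 4.0 * texelStep;
    myVaryings10 = inTexCoord + 5.0 * texelStep;
}
```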

That said, 11 texture fetches may be too many to pre-fetch; there's a limit to how many samples can be issued ahead of time, and 11 is probably over it. Reducing this number would speed your shader up regardless, even if all of them were pre-fetched. If there's any way you could reduce this number, I'd suggest looking into it. A common approach is to sample only one side of each axis rather than both (see the EdgeDetection training course for an example), though whether you can do this will depend on your use case.
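
As a rough illustration of one-sided sampling, the sketch below detects edges from three samples (centre, one neighbour to the right, one neighbour above) instead of five. It is not taken from the EdgeDetection training course; the varying names centreCoord, rightCoord and upCoord are hypothetical, and the three coordinates are assumed to be computed in the vertex shader.

```glsl
uniform sampler2D tex;

// Hypothetical varyings, each written in full by the vertex shader.
varying lowp vec2 centreCoord;
varying lowp vec2 rightCoord;
varying lowp vec2 upCoord;

void main()
{
    // One sample per axis direction rather than one on each side.
    lowp vec3 centre = texture2D(tex, centreCoord).rgb;
    lowp vec3 right  = texture2D(tex, rightCoord).rgb;
    lowp vec3 up     = texture2D(tex, upCoord).rgb;

    // Forward differences along x and y approximate the local gradient.
    lowp vec3 diff = abs(right - centre) + abs(up - centre);
    lowp float edge = max(diff.r, max(diff.g, diff.b));

    gl_FragColor = vec4(vec3(edge), 1.0);
}
```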

Thanks,

Tobias

Tobias 2012-08-21 11:15:53

Hi WonwooLee,

There’s actually a problem in the older compilers (which are currently the majority of what’s out there, unfortunately) where they won’t repack the vec2s for you. So for one pair of these texture coordinates, you would have to pack the varyings into a vec4 yourself. For example:

uniform sampler2D tex;

varying lowp vec2 myVaryings;
varying lowp vec2 myVaryings1;
varying lowp vec2 myVaryings2;
varying lowp vec2 myVaryings3;
varying lowp vec2 myVaryings4;
varying lowp vec2 myVaryings5;
varying lowp vec2 myVaryings6;
varying lowp vec2 myVaryings7;
varying lowp vec2 myVaryings8;
varying lowp vec4 myVaryings9_10;

void main()
{
    lowp vec4 texels[11];

    texels[0] = texture2D(tex, myVaryings);
    texels[1] = texture2D(tex, myVaryings1);
    texels[2] = texture2D(tex, myVaryings2);
    texels[3] = texture2D(tex, myVaryings3);
    texels[4] = texture2D(tex, myVaryings4);
    texels[5] = texture2D(tex, myVaryings5);
    texels[6] = texture2D(tex, myVaryings6);
    texels[7] = texture2D(tex, myVaryings7);
    texels[8] = texture2D(tex, myVaryings8);
    texels[9] = texture2D(tex, myVaryings9_10.xy);
    texels[10] = texture2D(tex, myVaryings9_10.zw);

    gl_FragColor = texels[0] + texels[1] + texels[2] + texels[3] + texels[4] + texels[5]
                 + texels[6] + texels[7] + texels[8] + texels[9] + texels[10];
}

Also, any further varyings you need to pass should be packed into your existing varyings, so adding them as the .zw components of one of the texture coordinate varyings would be the best approach. I should clarify that the dependent texture read on a vec4 that I mentioned previously only affects texture coordinates taken from the .zw components of the vector, or equally the .z component of a vec3.
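
A minimal sketch of that packing, assuming two extra per-vertex values need to reach the fragment shader. The varying name texCoordAndExtras and the meaning given to its .z and .w components (a brightness factor and an alpha factor) are hypothetical. The texture coordinate stays in .xy, and .zw is only read as plain data, never used as a texture coordinate.

```glsl
uniform sampler2D tex;

// xy: texture coordinate, written by the vertex shader.
// zw: hypothetical extra values (brightness factor, alpha factor).
varying lowp vec4 texCoordAndExtras;

void main()
{
    // The lookup uses .xy directly, with no arithmetic beforehand.
    lowp vec4 texel = texture2D(tex, texCoordAndExtras.xy);

    // .zw carries ordinary data, so no texture fetch depends on it.
    lowp float brightness = texCoordAndExtras.z;
    lowp float alpha = texCoordAndExtras.w;

    gl_FragColor = vec4(texel.rgb * brightness, texel.a * alpha);
}
```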

Thanks,

Tobias