Packing varying variables as vec4s?




Hello. 

In my vertex shader, I compute 11 varyings that the fragment shader uses as texture coordinates.

Texture coordinate varyings are usually declared as vec2, but in my case the coordinates differ only along the x axis.

So one reference vec2 plus 10 floats hold all the values I need (12 floats in total).

Declaring the varyings as vec2 myTexCoords[11] therefore seems inefficient, since it would interpolate 22 floats.


Is it possible to declare them as vec4 myVaryings[3] and use them like this?

 

Vertex shader 

[CODE]
attribute vec4 myVertexCoords ;
attribute vec2 myTexCoords ;
uniform float texelStepU ;

varying vec4 myVaryings[3] ;

void main()
{
  // Reference coordinate: u in .x, v in .y
  myVaryings[0].xy = myTexCoords ;

  // Only u is offset; each remaining component holds one shifted u value
  myVaryings[0].z = myTexCoords.x + texelStepU ;
  myVaryings[0].w = myTexCoords.x + 2.0 * texelStepU ;

  myVaryings[1].x = myTexCoords.x + 3.0 * texelStepU ;
  myVaryings[1].y = myTexCoords.x + 4.0 * texelStepU ;

  // ... and so on ...

  gl_Position = ... ;
}
[CODE]

Fragment shader 

[CODE]
precision mediump float ; // GLSL ES fragment shaders have no default float precision

uniform sampler2D tex ;
varying vec4 myVaryings[3] ;

void main()
{
  vec4 texels[11] ;
  float uy = myVaryings[0].y ; // v coordinate shared by every sample

  texels[0] = texture2D(tex, myVaryings[0].xy) ;
  texels[1] = texture2D(tex, vec2(myVaryings[0].z, uy)) ;
  texels[2] = texture2D(tex, vec2(myVaryings[0].w, uy)) ;
  texels[3] = texture2D(tex, vec2(myVaryings[1].x, uy)) ;
  texels[4] = texture2D(tex, vec2(myVaryings[1].y, uy)) ;
  texels[5] = texture2D(tex, vec2(myVaryings[1].z, uy)) ;

  // ... and so on ...

  gl_FragColor = ... ;
}
[CODE]



Hi WonwooLee,

Due to a limitation in our earlier shader compilers, packing texture coordinates into vec4s is generally a bad idea, as it forces a dependent texture read. Likewise, building a vec2 from two different varyings can cause a dependent texture read, because the vec2 does not exist until the fragment shader runs. In both cases the hardware cannot fetch any of the texture data ahead of time, so your fragment shader would be limited entirely by memory speed, which for 11 samples is a huge slowdown. For optimum performance, the best approach is simply to have 11 vec2 varyings, each containing the correct coordinates.

E.g.

[CODE]
uniform sampler2D tex;

varying lowp vec2 myVaryings;
varying lowp vec2 myVaryings1;
varying lowp vec2 myVaryings2;
varying lowp vec2 myVaryings3;
varying lowp vec2 myVaryings4;
varying lowp vec2 myVaryings5;
varying lowp vec2 myVaryings6;
varying lowp vec2 myVaryings7;
varying lowp vec2 myVaryings8;
varying lowp vec2 myVaryings9;
varying lowp vec2 myVaryings10;

void main()
{
  lowp vec4 texels[11];

  texels[0] = texture2D(tex, myVaryings);
  texels[1] = texture2D(tex, myVaryings1);
  texels[2] = texture2D(tex, myVaryings2);
  texels[3] = texture2D(tex, myVaryings3);
  texels[4] = texture2D(tex, myVaryings4);
  texels[5] = texture2D(tex, myVaryings5);
  texels[6] = texture2D(tex, myVaryings6);
  texels[7] = texture2D(tex, myVaryings7);
  texels[8] = texture2D(tex, myVaryings8);
  texels[9] = texture2D(tex, myVaryings9);
  texels[10] = texture2D(tex, myVaryings10);

  gl_FragColor = texels[0]+texels[1]+texels[2]+texels[3]+texels[4]+texels[5]+texels[6]+texels[7]+texels[8]+texels[9]+texels[10];
}
[CODE]
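
For reference, a vertex shader feeding the fragment shader above could look roughly like the sketch below. It assumes the ten extra samples are offset in +u by multiples of texelStepU, as in the original question, and that vertex positions are already in clip space (the thread does not show how gl_Position is computed). Both are assumptions, so adjust them to your case.

```glsl
// Vertex shader sketch (GLSL ES 1.00); a starting point, not a tuned implementation.
attribute vec4 myVertexCoords;
attribute vec2 myTexCoords;
uniform float texelStepU;

varying lowp vec2 myVaryings;
varying lowp vec2 myVaryings1;
varying lowp vec2 myVaryings2;
varying lowp vec2 myVaryings3;
varying lowp vec2 myVaryings4;
varying lowp vec2 myVaryings5;
varying lowp vec2 myVaryings6;
varying lowp vec2 myVaryings7;
varying lowp vec2 myVaryings8;
varying lowp vec2 myVaryings9;
varying lowp vec2 myVaryings10;

void main()
{
  // One texel step along u; v is left unchanged.
  vec2 stepU = vec2(texelStepU, 0.0);

  // Each varying carries a complete (u, v) pair, so the fragment shader
  // can pass it to texture2D without building a vec2 itself.
  myVaryings   = myTexCoords;
  myVaryings1  = myTexCoords + stepU;
  myVaryings2  = myTexCoords + 2.0 * stepU;
  myVaryings3  = myTexCoords + 3.0 * stepU;
  myVaryings4  = myTexCoords + 4.0 * stepU;
  myVaryings5  = myTexCoords + 5.0 * stepU;
  myVaryings6  = myTexCoords + 6.0 * stepU;
  myVaryings7  = myTexCoords + 7.0 * stepU;
  myVaryings8  = myTexCoords + 8.0 * stepU;
  myVaryings9  = myTexCoords + 9.0 * stepU;
  myVaryings10 = myTexCoords + 10.0 * stepU;

  // Assumption: positions are already in clip space.
  gl_Position = myVertexCoords;
}
```

The v coordinate is duplicated in every varying on purpose: the extra interpolated floats are the price of keeping each lookup free of per-fragment coordinate arithmetic.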



That said, 11 texture fetches may be too many to pre-fetch: there is a limit to how many samples can be issued ahead of time, and 11 probably exceeds it. Reducing the number of samples would speed up your shader regardless, even if all of them were pre-fetched. If there is any way to reduce it, I'd suggest looking into it. A common technique is to sample only one side of each axis rather than both (see the EdgeDetection training course for an example), though whether you can do this depends on your use case.

Thanks,

Tobias

Tobias, 2012-08-21 11:15:53

Hi WonwooLee,

There's actually a problem in the older compilers (which unfortunately make up the majority of what's out there at the moment): they won't repack the vec2s for you. So for one pair of these texture coordinates, you would have to pack the varyings into a vec4 yourself. For example:

[CODE]
uniform sampler2D tex;

varying lowp vec2 myVaryings;
varying lowp vec2 myVaryings1;
varying lowp vec2 myVaryings2;
varying lowp vec2 myVaryings3;
varying lowp vec2 myVaryings4;
varying lowp vec2 myVaryings5;
varying lowp vec2 myVaryings6;
varying lowp vec2 myVaryings7;
varying lowp vec2 myVaryings8;
varying lowp vec4 myVaryings9_10;

void main()
{
  lowp vec4 texels[11];

  texels[0] = texture2D(tex, myVaryings);
  texels[1] = texture2D(tex, myVaryings1);
  texels[2] = texture2D(tex, myVaryings2);
  texels[3] = texture2D(tex, myVaryings3);
  texels[4] = texture2D(tex, myVaryings4);
  texels[5] = texture2D(tex, myVaryings5);
  texels[6] = texture2D(tex, myVaryings6);
  texels[7] = texture2D(tex, myVaryings7);
  texels[8] = texture2D(tex, myVaryings8);
  texels[9] = texture2D(tex, myVaryings9_10.xy);
  texels[10] = texture2D(tex, myVaryings9_10.zw);

  gl_FragColor = texels[0]+texels[1]+texels[2]+texels[3]+texels[4]+texels[5]+texels[6]+texels[7]+texels[8]+texels[9]+texels[10];
}
[CODE]


Also, any further varyings you need to pass should be packed into your existing varyings, so adding them as the .zw components of one of the texture coordinate varyings would be the best approach. I should clarify that the dependent texture read on a vec4 that I mentioned previously only affects texture coordinates taken from the .zw components of the vector, or equally from the .z component of a vec3.
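
As a rough illustration of that packing advice, the fragment shader sketch below keeps the texture coordinates in .xy and carries two extra per-vertex values in .zw. The varying name uvAndExtras and the meaning of the extra values (a brightness scale and an alpha scale) are hypothetical and chosen only for the example.

```glsl
// Fragment shader sketch (GLSL ES 1.00); names and extra values are hypothetical.
uniform sampler2D tex;

// .xy: texture coordinates, passed to texture2D unchanged.
// .zw: extra interpolated data (brightness scale, alpha scale), not coordinates.
varying lowp vec4 uvAndExtras;

void main()
{
  // The coordinates come from .xy, which per the note above is the case
  // that does not trigger a dependent texture read.
  lowp vec4 texel = texture2D(tex, uvAndExtras.xy);

  // .zw is used only as ordinary arithmetic input after the fetch.
  gl_FragColor = vec4(texel.rgb * uvAndExtras.z, texel.a * uvAndExtras.w);
}
```

The matching vertex shader would write the coordinates to uvAndExtras.xy and the two extra values to uvAndExtras.zw, so no additional varying is needed for them.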


Thanks,


Tobias