eglSwapInterval = 0 tearing problem

Should we expect tearing/flickering when eglSwapInterval is set to 0, or is it (possibly) a display driver bug or a missing vsync implementation?

I am trying to understand how the driver handles eglSwapInterval = 0.

The documentation says that when eglSwapInterval is set to 0, buffer swaps are not synchronized to a video frame.

So, does that mean there should be tearing, or not?

For example, calling glClear() with alternating RGB clear colors and then eglSwapBuffers() shows tearing on some targets and doesn’t on others. So which behaviour is normal?

For example, drawing at 300 fps on the PVRFrame emulator shows no tearing/flickering, but on some targets it does tear.

Does the driver have some internal synchronization that prevents tearing regardless of eglSwapInterval, is it just a matter of luck, or does it depend on the display manufacturer?

Thanks in advance.

Setting eglSwapInterval to 0 will allow tearing. In the cases where there is no visible tearing, it’s possible that frames complete in sync with the display refresh rate by chance.

Additionally, some platforms may ignore an eglSwapInterval setting at the driver level, as defined in the specification:

interval is silently clamped to minimum and maximum implementation dependent values before being stored; these values are defined by EGLConfig attributes EGL_MIN_SWAP_INTERVAL and EGL_MAX_SWAP_INTERVAL respectively.

Are you 100% sure?

What does the VSYNC do then?

Because on some devices it doesn’t tear at all at 200 fps (not on a single frame) with various kinds of scenes, as if something like triple buffering were implemented on those devices. And the targets that do show tearing tear even at lower framerates.

For example, the PVRFrame 10.0 emulator on my PC never shows any tearing at any framerate (with eglSwapInterval = 0).

Does the windowing system or the display driver have something to do with this tearing?

Say the interval is clamped to 1 on the targets where I don’t see any tearing
(since 0 is supposed to allow tearing).

In that case I shouldn’t get more than 60 fps, right?

But how am I getting ~300fps without tearing on some systems?

Try this code.
I’m not convinced that all of the frames always match with the VSYNC just by chance.
[pre]for (int i = 0;; i++)
{
    // Cycle the clear colour through pure red, green, blue.
    int colorFlag = 0x01 << (i % 3);
    glClearColor((float)(colorFlag & 0x01),
                 (float)((colorFlag & 0x02) >> 1),
                 (float)((colorFlag & 0x04) >> 2),
                 1.0f);

    glClear(GL_COLOR_BUFFER_BIT);
    eglSwapBuffers(display, surface);
    Sleep(rand() % 17); // random delay so swaps drift relative to the vsync phase
}[/pre]

Does your desktop/window manager utilize a compositor?

If so, investigate how you can disable it or bypass it.

Background:

If you’re running on a system without a compositing window manager (Aero/DWM on Windows is one example of a compositor), or with the compositor disabled, then your app can render as fast as possible with SwapInterval 0, and you will see tearing artifacts. Essentially, you’re writing directly to an on-screen display/surface, and the buffer swaps you request can occur in the middle of scanning out the video signal for that display.

However, when there’s a compositor between your application and the GPU video output, and it’s enabled, it is in charge of the on-screen display/surface content, not your application, and it is the compositor that synchronizes with VSync directly, not your application. It decides which frames from your application to display and when. So it can prevent tearing simply by choosing when to take a new frame from your application.

Essentially, you are (unbeknownst to you) not actually rendering to the framebuffer that’s being scanned out but to an off-screen buffer. With SwapInterval 0 you can still render as fast as possible, but it’s up to the compositor to decide which of your many frames actually get copied to the on-screen display/surface and when. In other words, though you’re requesting SwapInterval 0, the compositor is only pretending that it’s off for your app; vsync is still really enabled for the entire display, and the compositor is still synchronizing with it.

If this is your situation, I would investigate disabling your compositor or figuring out how to bypass it. Sometimes there’s a setting to disable it. Sometimes full-screening your window will bypass it. etc.

@Dark_Photon, I think that is exactly what I wanted to know.
I didn’t know about the compositor.
It seems that the compositor is preventing PVRFrame from tearing at 200 fps.
And actually, I don’t want to bypass it. I want to activate it in another target where it seems to be off.

This is because, let’s say my app takes around 20 ms to render a frame. If I set eglSwapInterval = 1, then I should get 30 fps, because the buffer swap will wait for the vsync every time. But if I set eglSwapInterval = 0, hopefully eglSwapBuffers will not wait for the next vsync, and drawing the next frame will start instantaneously.

That’s when tearing may be visible and that is what I am trying to avoid.

So, If a compositor can take care of the tearing then we can happily use eglSwapInterval = 0 :slight_smile:

Is the presence of a compositor completely platform dependent? Is there any other way to prevent halving the framerate when the time to draw a frame is more than the vsync interval?

Thanks.

@Zulkarnine: Yes, the presence (and behavior) of a compositor is dependent on the platform and the system configuration. So it’s not something cross-platform where you can depend on it always being available and working a particular way.

[blockquote]Let’s say my app takes around 20 ms to render a frame. If I set eglSwapInterval = 1, then I should get 30 fps, because the buffer swap will wait for the vsync every time. But if I set eglSwapInterval = 0, hopefully eglSwapBuffers will not wait for the next vsync, and drawing the next frame will start instantaneously.[/blockquote]

If your app takes 20ms to render and (presumably) your VSYNC rate is 30Hz (33.3ms), why would you want to start rendering the next frame early? It’s just going to waste CPU and GPU cycles, memory bandwidth, power, and generate wasted heat.

Let’s take a more extreme example to illustrate. Suppose it only takes you 4 ms to render with a 33.3 ms VSYNC period. With SwapInterval 1, you render 1 frame every 33.3 ms, which is exactly the rate at which frames are scanned out to the video output. Result: no wasted work. However, with SwapInterval 0, you’d render 8 frames in the time it takes 1 frame to be scanned out. Result: 7 completely wasted frames, plus the extra cost in the compositor of picking up one of those 8 frames and blending it onto the desktop.

@Dark_Photon, I was actually trying to give an example for a system with a 60 Hz VSYNC rate, where the interval is around 17 ms and my app takes around 20 ms to render. So ultimately the swaps will occur at the 34 ms point, 68 ms, and so on. Hence it makes us wait an extra ~14 ms before starting to render the next frame. Here, having swap interval 0 would be beneficial, but if tearing cannot be prevented then that is not an option.

Instead, we may have to think about triple buffering in this case. Or is there any better solution?

On PowerVR drivers, I think you’ll get what you want. But I hope Joe or one of the PowerVR folks here can confirm.

The way it was explained to me by PaulS is that SwapBuffers will only block if the front buffer hasn’t been scanned out at least once. So for instance in the case where your CPU render time takes “less” than the VSync interval, SwapBuffers will limit you to the VSync rate, which is what you want:

[pre]VSYNC CLOCK: |______|______|______|
CPU        : |N__|S_|N+1|S_|[/pre]

However, when your CPU render time takes “more” than the VSync interval, you enter a situation where when you call Swap, the front buffer has already been scanned out once, so SwapBuffers can return immediately letting your CPU get busy on submitting the next frame early:

[pre]VSYNC CLOCK: |___|___|___|___|
CPU        : |N____|S|N+1__|S|[/pre]

(Arggh. The preformatted option totally munged what I put within the pre tags… Let’s try underscores instead of dashes. …that works.)

Above |S| is SwapBuffers and | is the clock pulse.

That said, I haven’t actually witnessed this last case in PVRTune myself, so I defer to ImgTech or to what you’re seeing in your profiling.

Haha underscores did the job.

So, doesn’t that mean that even if I don’t use triple buffering, I would get a similar framerate with plain double buffering?
Also, is this specific to PowerVR GPUs, or is it part of the OpenGL/EGL specification?

To summarize: if I use eglSwapInterval = 1, it won’t halve the framerate if a frame takes a little bit longer than the VSYNC interval, right? And it would also avoid the overhead of triple buffering.

@PaulS or @PaulL, could you please confirm whether triple buffering is needed to avoid halving the framerate with eglSwapInterval = 1 (when the time taken to render one frame is a little bit over the VSYNC interval) on PVR GPUs?

@Dark_Photon , After giving some thoughts, it seems there is a catch.

[pre]Please check the attachment below[/pre]

[attachment file=“GPUTimingDiagram.PNG”]

See, the framerate will ultimately be halved even if the above condition
[blockquote]when your CPU render time takes “more” than the VSync interval, you enter a situation where when you call Swap, the front buffer has already been scanned out once, so SwapBuffers can return immediately letting your CPU get busy on submitting the next frame early[/blockquote]
is true, because the GPU finishes its work well after the eglSwapBuffers call returns.

In the above diagram, W means the GPU is waiting for the VSYNC to start scanning out.
S is the eglSwapBuffers call.
So the very first swap will return immediately, but every swap after that will eventually have to wait for the VSYNC, halving the framerate.

Please correct me if I am wrong.

Yes, that appears correct.

The choice of v-sync options all require a compromise of some kind.

Disabling v-sync will give you the most straightforward performance, but it will be susceptible to screen tearing.
Double-buffered v-sync will stall frames that miss the v-blank interval until the next one. If you can render consistently at or above the display refresh rate, there are no drawbacks to enabling it.
Triple-buffered v-sync avoids stalling frames, but it introduces an extra frame of latency. Additionally, maintaining the extra buffer is costly memory-wise.