Shadow Layers, and learning by failing

A hot topic for Firefox at the moment is the new out-of-process rendering, but is it common knowledge that this has already been in Firefox Mobile for a long time? For mobile, there’s what we call a ‘chrome’ process (which handles and renders the main UI) and then ‘content’ processes, which handle the rendering of the pages in your tabs. There are lots of fun and tricky issues when you choose to do things like this, mostly centering around synchronisation – and recently, I was trying to add a feature that’s led me to writing this post.

You may have already heard about how Firefox accelerates the rendering of web content. In a nutshell, a page is separated into a series of layers (say, background, content, canvases, plug-ins, etc.). These layers are then pasted onto each other, in what tends to get called composition. If you’re lucky and have decent drivers, or you run on Windows, this process of composition is accelerated by your video card. It turns out video cards are very good at composition, so this is often a nice bonus. We also try to accelerate the rendering of these layers too, but that’s another topic…

These layers are arranged in what’s known as a layer tree – when something on the screen needs to update, this tree is traversed and painted to the screen. But how is this affected by out-of-process rendering? You can’t have both processes painting to the screen simultaneously without some kind of coordination, and there are often rules on memory sharing and protection that limit how sharing can happen too. We chose to let the chrome process handle getting things to the screen. It’s important, however, that the content process can’t hold up the chrome process too readily. But if we want the page to render correctly and respond to user input, we need the page’s layer tree… So how do we go about solving this?
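To make the idea concrete, here’s a minimal sketch of a layer tree and the depth-first traversal that composites it. This is purely illustrative Python with invented names (`Layer`, `composite`) – the real layer tree lives in C++ inside Gecko – but it shows the ordering: parents are painted first, so children end up on top.

```python
class Layer:
    """A node in a simplified layer tree (background, content, canvas, ...)."""
    def __init__(self, name, children=None):
        self.name = name
        self.children = children or []

def composite(layer, painted=None):
    """Traverse the tree depth-first, 'painting' each layer onto the output.
    A parent is painted before its children, so children appear on top."""
    if painted is None:
        painted = []
    painted.append(layer.name)
    for child in layer.children:
        composite(child, painted)
    return painted

# A toy tree for a page: background at the bottom, a canvas on top of the
# content layer, and a plug-in layer alongside.
root = Layer("background",
             [Layer("content", [Layer("canvas")]),
              Layer("plugin")])
```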

We use what we’ve called ‘shadow’ layers – the chrome process has a mirror copy of the content process’s layer tree, and the content process can update it when it’s ready. In the meantime, we have something we can paint, and the page remains responsive – at least to the extent that you can read it, scroll it and zoom it. We render a larger area of the page than is visible so that while the content process is busy rendering, we don’t appear to ‘fall behind’ (when we do, you see the checkerboard background, similar to the iPhone).
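The ‘render more than is visible’ trick boils down to simple rectangle arithmetic. Here’s an illustrative Python sketch (the names and sizes are made up, not Gecko code): if the viewport scrolls outside the area the content process last rendered, there’s nothing to show there, so we checkerboard.

```python
def rect_contains(outer, inner):
    """True if the (x, y, w, h) rect `inner` lies entirely inside `outer`."""
    ox, oy, ow, oh = outer
    ix, iy, iw, ih = inner
    return ox <= ix and oy <= iy and ix + iw <= ox + ow and iy + ih <= oy + oh

def must_checkerboard(viewport, rendered_area):
    """The chrome process can only show what the content process has already
    rendered; if the viewport has scrolled outside that area, we have to show
    the checkerboard background until content catches up."""
    return not rect_contains(rendered_area, viewport)

# Content rendered a generous margin around a hypothetical 480x800 viewport.
rendered = (-480, -800, 1440, 2400)
```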

We have various implementations of these layers for different platforms, so we can take advantage of platform-specific features. There’s a GL implementation (GL[X] and EGL), a Direct3D implementation (9 and 10) and a ‘basic’ implementation that uses cairo and runs in software. When the content process changes its layer tree, it sends a transaction representing that change over to the chrome process. Part of this transaction is likely to involve updating visible buffers. If both processes use basic layers (the default case, on Android at least), we use shared memory and page-flipping. That is, the content process renders into one buffer while the chrome process renders out of another, and when the content process updates, they swap around.
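The page-flipping scheme can be modelled like this – a toy Python sketch only (in reality the two buffers live in shared memory and the swap happens as part of the layer transaction):

```python
class SharedDoubleBuffer:
    """Two buffers in (notionally shared) memory: content renders into the
    back buffer while chrome composites from the front; a transaction swaps
    them, so neither process ever touches the buffer the other is using."""
    def __init__(self, size):
        self.front = bytearray(size)  # chrome reads from this one
        self.back = bytearray(size)   # content renders into this one

    def content_render(self, pixels):
        self.back[:] = pixels         # content never writes the front buffer

    def transaction_swap(self):
        self.front, self.back = self.back, self.front

buf = SharedDoubleBuffer(4)
buf.content_render(b"\x01\x02\x03\x04")
buf.transaction_swap()  # chrome now composites the freshly rendered frame
```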

For accelerated layers, the story is slightly different and more complicated. As we can’t share textures across processes and we don’t currently have a remote cairo implementation, the content process always uses basic layers and renders into memory (though there is work going on to allow remote access to acceleration). The chrome process is free to use whatever implementation it likes, though, and not all of these implementations allow for page-flipping. The GL layers implementation only uses a single buffer on the content side, and when this is updated, it is synchronously uploaded to the GPU on the chrome side (and the content process has to wait). Thankfully, on Maemo and X11, there are extensions that make this very fast (EglLockSurface on Maemo, texture-from-pixmap on GL/X11), though it’s still quite a large, synchronous copy. On Android, this copy is very slow – we have no fast path, because the API we need isn’t currently advertised (and possibly isn’t implemented yet).
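In contrast to the page-flipping case above, the single-buffer GL path behaves more like this sketch (again illustrative Python with invented names): every content update implies one big synchronous copy to the GPU, during which the content process can’t render.

```python
class SingleBufferUpload:
    """One shared buffer on the content side; on every update, chrome copies
    the whole thing into (simulated) GPU memory while content is blocked."""
    def __init__(self, size):
        self.shared = bytearray(size)       # the single content-side buffer
        self.gpu_texture = bytearray(size)  # stand-in for the GPU texture
        self.content_blocked = False

    def content_update(self, pixels):
        self.shared[:] = pixels
        self.content_blocked = True   # content must wait...
        self._chrome_upload()         # ...for the synchronous upload

    def _chrome_upload(self):
        self.gpu_texture[:] = self.shared  # the large, synchronous copy
        self.content_blocked = False       # only now can content continue

up = SingleBufferUpload(4)
up.content_update(b"\x09\x09\x09\x09")
```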

There are things that we could do to avoid this speed hit, though. I thought, for example, we could use EGLImage (which, thankfully, is available on Android) and asynchronously update textures in a thread (or even in chunks in the main loop). I still think this is a sound idea, but there are caveats. It would require, for example, that we either double-buffer or make the content process wait for the asynchronous update to complete. The latter would mean adding asynchronous shadow-layer transactions – not an easy task. If we double-buffer, we double the system-memory cost of storing a layer (and bear in mind that the layer is also mirrored in graphics memory, so we’re talking 1.5 times the cost of basic layers). We also have to synchronise updates to the layer’s coordinates with the asynchronous texture update, to avoid what would otherwise be a huge and visible rendering glitch, and if we don’t want a partially complete update to be visible while it’s happening, we have to double-buffer the layer’s texture too. We’d then have twice the memory cost we had before, and these tend to be quite large buffers!
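The memory arithmetic is worth spelling out. This is illustrative Python; the screen size is just an example for a phone of that era, and the buffer accounting follows the reasoning above.

```python
def layer_bytes(width, height, bytes_per_pixel=4):
    """System-memory cost of one RGBA buffer covering a layer of this size."""
    return width * height * bytes_per_pixel

S = layer_bytes(480, 800)  # one buffer: 480*800*4 = 1,536,000 bytes (~1.5 MB)

basic = 2 * S              # basic layers: two page-flipped system buffers
gl_single = S + S          # current GL path: one system buffer + one texture
double_system = 2 * S + S  # double-buffer system memory, single texture
double_both = 2 * S + 2 * S  # double-buffer the texture as well
```

So double-buffering just the system-side copy costs 1.5 times basic layers, and double-buffering the texture as well costs twice as much as basic layers.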

Altogether, not an easy problem to solve, so I’ve given up for now. There are other, easier and less disruptive changes that can be made, which I’ll be trying out next. I’m disappointed that this didn’t pan out as I thought it would, but I’m pleased to have learnt something. I hope this is useful/interesting to someone.


My First Firefox Mobile Bug

One thing I regret about the past couple of years is reducing my blog output. I used to blog fairly regularly, and that seemed to stop when I joined Intel (though I’d love to blame someone other than myself, unfortunately it is entirely my fault). So I’d like to get back into the habit of writing again, by writing about what I’m doing here at Mozilla.

Always good to start with an easy one, so I’m starting with the first bug I fixed: bug #661843, “GeckoSurfaceView may double memory requirement for painting”. Doug Turner assigned this to me as I joined, and I’m still very grateful, as it turned out to be pretty easy to fix and a massive win for Firefox Mobile on Android.

Getting stuff to the screen from a native app on Android was quite difficult up until Gingerbread, and we target Android 2.0 and up, so moving to the new native app SDK isn’t currently an option. It’s a lot easier if you cheat (by using undocumented interfaces), but winners don’t cheat. Or at least they don’t get caught. Or something. To get around the lack of ‘native’ interfaces to the Android app components, Firefox Mobile on Android consists of a small Java shim and the main application. This shim acts as our input/output to the device and interfaces via JNI to the various internal services.

For drawing to the screen, our Java shim builds up a simple Android application and provides a buffer for the native code to draw into. When the native code wants to draw, it calls the Java methods to get the buffer, does its thing and sends a signal back to the Java code to let it know that it’s finished drawing – this can be seen mostly in nsWindow.cpp, in the OnDraw method. Prior to Android 2.2, there was no way for native code to draw straight into an Android Bitmap, and no way to copy a raw data buffer onto the application’s Canvas. The only option in this case is to create a Bitmap based on the data buffer (which ends up copying that buffer), then blit that Bitmap onto the Canvas.

Android 2.2 added native access to the Bitmap class, allowing native code to directly manipulate the memory backing it – exactly what we needed. Unfortunately, requiring it would push our minimum Android version up, which isn’t something we want to do just yet. My fix involved loading the new native graphics access library at runtime and using it only if it’s available. To make things easier, I reshuffled the code on the Java side so that the two paths share most of the code. The slow path backs the browser canvas with a ByteBuffer (which allows direct access via JNI, but can’t be copied directly to the Canvas); the fast path uses a Bitmap and Android’s libjnigraphics. This halved the memory required for updates to the screen and reduced the amount of allocation and copying going on, providing a nice speed boost.
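The runtime-detection pattern is roughly this. It’s a Python/ctypes sketch of the idea only – the real code does the equivalent with dlopen/dlsym from native code, and the path names here are invented for illustration:

```python
import ctypes

def load_native_bitmap_api():
    """Try to load Android's libjnigraphics at runtime. On devices older
    than Android 2.2 (or on a desktop, as here) the library is absent, the
    load fails, and we return None, which selects the slow path instead."""
    try:
        return ctypes.CDLL("libjnigraphics.so")
    except OSError:
        return None

def choose_draw_path():
    """Pick the fast Bitmap path only when the native library is present;
    otherwise fall back to the ByteBuffer-backed slow path."""
    if load_native_bitmap_api() is not None:
        return "bitmap-fast-path"
    return "bytebuffer-slow-path"
```

The important property is that the same binary runs everywhere: the capability check happens once at runtime rather than at build time.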

I believe you should see this if you’re running Firefox Mobile Beta, available on the Android Market, and it’ll be incorporated in Firefox Mobile 6.