WebGL in the real world – a short case study – Part 2

In a recent post I described a WebGL pilot project for a client.  After experimenting with a couple of WebGL frameworks I reverted to basic principles and wrote a purpose-built display app that was able to display 506K textured triangles at interactive rates.  The demo let the user navigate through a pseudo-architectural scene using first-person-shooter style keyboard navigation.

There were some caveats on performance.  One was that the scenes appeared to be fill-rate limited, which meant that performance would vary inversely with the size of the canvas that I was using.  Another was that interactivity would periodically “jump”: you’d miss a frame or two every second or so as you moved through the scene.  Anecdotally, I noticed this more in Chrome than in Safari or Firefox (this was on the Mac; things looked better on Windows).

I attributed the jumping to browser issues, and to the way, perhaps, that WebGL flow control was being implemented.  My experience with real time browser programming in the past had conditioned me to not expect rock solid performance.  I’d seen this talk, which details some of the issues that Google was having with implementing WebGL.  And finally, my clients were not complaining about the jumping.  So I let it be.

But some weeks later when I returned to the project for some cleanup and refactoring I idly ran some profiling in the Chrome Developer Tools.  To my surprise, the profiler showed that a lot of time was being spent in a call to setMatrixUniforms(), which I was calling in the main display loop for every one of 560 objects.  The definition of this function is

  function setMatrixUniforms() {
    gl.uniformMatrix4fv(shaderProgram.pMatrixUniform, false, new Float32Array(pMatrix.flatten()));
    gl.uniformMatrix4fv(shaderProgram.mvMatrixUniform, false, new Float32Array(mvMatrix.flatten()));
  }

and is responsible for setting the perspective and model view matrices in the GLSL shader program.  It turns out that these matrices were not changing on a per-object basis, so I moved this call from the display loop to the beginning of the scene display function. After this simple change, the profiler was no longer showing excessive activity, and the jumpiness in scene navigation went away.
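
The change can be sketched roughly as follows. This is a minimal illustration rather than the project's actual code: drawScene(), drawObject() and sceneObjects are hypothetical names, and gl is a stub that only counts uniform uploads so that the snippet runs outside a browser.

```javascript
// Stub standing in for a WebGL context (assumption: illustration only).
// It counts how many times uniformMatrix4fv is called.
const gl = {
  uniformCalls: 0,
  uniformMatrix4fv(location, transpose, value) { this.uniformCalls++; }
};

// Hypothetical stand-ins for the shader program and matrix objects in the post.
const shaderProgram = { pMatrixUniform: "uPMatrix", mvMatrixUniform: "uMVMatrix" };
const identity = [1, 0, 0, 0,  0, 1, 0, 0,  0, 0, 1, 0,  0, 0, 0, 1];
const pMatrix = { flatten: () => identity.slice() };
const mvMatrix = { flatten: () => identity.slice() };

function setMatrixUniforms() {
  gl.uniformMatrix4fv(shaderProgram.pMatrixUniform, false, new Float32Array(pMatrix.flatten()));
  gl.uniformMatrix4fv(shaderProgram.mvMatrixUniform, false, new Float32Array(mvMatrix.flatten()));
}

// Hypothetical per-object draw: would bind buffers and textures and issue
// the draw call, but no longer touches the matrix uniforms.
function drawObject(obj) {}

function drawScene(objects) {
  setMatrixUniforms();            // once per frame, not once per object
  for (const obj of objects) {
    drawObject(obj);
  }
}

const sceneObjects = new Array(560).fill({});
drawScene(sceneObjects);
console.log(gl.uniformCalls);     // 2 uploads for the frame instead of 1120
```

This only works because the matrices are the same for every object in the scene; a scene with per-object transforms would still need a per-object upload.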

Astute readers may already have guessed what was going on, but I have to admit that not only did I not have any idea, but I didn’t really care at the time because I was busy with other matters.  But I mentioned what had happened to a colleague and fortunately he was a little more interested than I was.  There were two hypotheses:

  1. resetting the matrix uniforms was somehow bringing the whole WebGL pipeline to a halt and degrading performance.
  2. the memory allocation and/or matrix flattening in new Float32Array(pMatrix.flatten()) were taking more time than we would have thought.

It was easy enough to test the hypotheses.  It turned out not to be the pipeline, and it wasn’t just the allocation of Float32Array.  Another look at the profiler showed that a lot of time was being spent on garbage collection.  There had been two memory allocations per object, for 560 objects, for 30 frames every second.  In other words, over 30,000 allocations per second (2 × 560 × 30 = 33,600), which presumably was triggering a garbage collection pass every second or so.

As it turns out, moving the function call was all I had to do.  But if that had not been the case, it would have been straightforward to rewrite the function to use a pre-allocated Float32Array to avoid the overhead of allocation and garbage collection.
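
A minimal sketch of what such a rewrite might look like follows. It rests on several assumptions that are not in the original code: the matrix objects expose their values as an array of rows in an elements property, flattenInto() is a hypothetical helper, and gl is a stub that records what it is given so that the snippet runs outside a browser.

```javascript
// Stub standing in for a WebGL context (assumption: illustration only).
// It records each array passed to uniformMatrix4fv.
const gl = {
  uploads: [],
  uniformMatrix4fv(location, transpose, value) { this.uploads.push(value); }
};
const shaderProgram = { pMatrixUniform: "uPMatrix", mvMatrixUniform: "uMVMatrix" };

// Allocated once, at startup, and reused for every upload.
const pMatrixBuffer = new Float32Array(16);
const mvMatrixBuffer = new Float32Array(16);

// Hypothetical helper: copy a 4x4 matrix stored as an array of rows into
// `out` in column-major order, which is what uniformMatrix4fv expects when
// transpose is false. Nothing is allocated here.
function flattenInto(rows, out) {
  for (let col = 0; col < 4; col++) {
    for (let row = 0; row < 4; row++) {
      out[col * 4 + row] = rows[row][col];
    }
  }
  return out;
}

// Assumed matrix shape: { elements: [[row0], [row1], [row2], [row3]] }.
const pMatrix = { elements: [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1]] };
const mvMatrix = { elements: [[1, 0, 0, 5], [0, 1, 0, 6], [0, 0, 1, 7], [0, 0, 0, 1]] };

function setMatrixUniforms() {
  gl.uniformMatrix4fv(shaderProgram.pMatrixUniform, false,
                      flattenInto(pMatrix.elements, pMatrixBuffer));
  gl.uniformMatrix4fv(shaderProgram.mvMatrixUniform, false,
                      flattenInto(mvMatrix.elements, mvMatrixBuffer));
}

setMatrixUniforms();
// The translation (5, 6, 7) now sits at indices 12 to 14 of mvMatrixBuffer.
```

With this version, calling setMatrixUniforms() once per object would still cost 1120 uniform uploads per frame, but it would no longer create any garbage for the collector to clean up.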

I chalk this up to my relative inexperience with garbage collected languages, and to the relative unimportance of this issue in my previous programming projects.  Many years ago I spent a week doing a mathematical visualization with Java – my one and only experience with Java – and there I encountered a massive performance hit when garbage collection kicked in.  So maybe I should have seen this coming. I didn’t, but lesson learned.  Don’t be so fast to blame inconsistent WebGL frame rates on general browser flakiness.  And have more awareness of what’s going on in the bowels of Javascript.

Acknowledgement: the setMatrixUniforms() snippet comes from the lessons at learningwebgl.com, although I of course take full responsibility for my careless use of it.

Update: learningwebgl.com has updated to a new matrix library.  Not only does it look faster and more appropriate to WebGL, but I don’t think the problem described in this post exists any more.

WebGL in the real world – a short case study – Part 1

I started following WebGL a few months ago when it was in beta in several browsers.  Many creative web folks were already working with it, and some of the experiments were spectacular.  Fast forward to the present, and Google Chrome now officially supports WebGL (although your computer may not be up to it), and Google has a WebGL Experiments website.

So the experiments are fun and impressive, and I’ve even done some myself. But potential employers and clients with visualization needs were properly skeptical.  WebGL was not generally available other than in beta, Internet Explorer was not going to support it, and iOS support was not even on the horizon. Plus it ran on Javascript (considered by many to be a slow toy language not suitable for realtime graphics), and was subject to the inefficiencies of having the browser as an intermediary between the app and the metal.

I was fortunate to be contacted by a client who needed a browser based visualization solution, and was willing to fund a short “show me” pilot project.  The objective was to create a small demo that would run a typical architectural scene of 500K triangles at 30 frames per second.  A secondary objective was to fail fast – there was no point in going on if download times and interactive performance were dismal.  It would be a win for both of us regardless of the outcome – I was going to get some practical experience in real-world WebGL development, and the client would have enough data to navigate the next fork in their visualization roadmap.

Since the client’s files were in Collada DAE format, I was able to try them out in some of the existing WebGL frameworks such as GLGE and SpiderGL, which happened to have Collada import functionality and demos.  Results were mixed: on the one hand, setting up a demo for an architectural scene was relatively easy (for GLGE, I started from the Quake-style demo).  On the other hand, the scene had some quirks (such as multi-texturing with multiple UV maps per surface) that the frameworks didn’t handle.  And performance was disappointing: 3 FPS for a 500K-triangle scene in GLGE. Nevertheless, my clients seemed to be excited by the fact that their scene could be displayed in a browser at all.

GLGE is an impressive framework, and there is much to learn by browsing its source code.  It is a “how to” for a number of techniques such as shadow generation, canvas textures, collision detection, and object selection, and I will refer to it when I want to see examples of advanced techniques.  But I concluded, wrongly or rightly, that its generality was what lay behind the disappointing performance.  If your scenes are not as large as the ones I was using, it may work well for you.  But my next step was to write a display engine that exploited the relative simplicity and predictability of my scene to get maximum performance.

An excellent place to get started in WebGL development is the Learning WebGL website, so that’s where I went, resolving to learn the basics of WebGL.  It seems these days that getting past the “Hello world” stage of learning a new technology is getting harder and harder, and that’s certainly the case for the “Hello triangle” stage of learning WebGL.  The source code for that, not including some supporting libraries, is over 200 lines of not only HTML and Javascript, but also GLSL, which is the low level language for writing shaders on the GPU (Graphics Processing Unit).  You are working on several levels, from the overall HTML page to the canvas element to the geometry and colors in the scene, and the GPU code that makes it all work on a pixel level.  Somehow all this ties together to give you a white triangle (and square) on a black background.  These tutorials are well written and deservedly popular, and probably the best place to get started.

I quickly discovered that Lesson 10 involved downloading a small scene and navigating it Doom-style, so I skipped ahead and used it as scaffolding to build the demo, slowly replacing most of what was there.  And after a day or two I had a pseudo-architectural scene that ran 506K triangles (spread over almost 600 objects, most of them textured, some semi-transparent) at something like 26-28 FPS (on my MacBook Pro).  Success!  This was certainly enough for the client to greenlight a second phase.

I think the lesson learned was that WebGL can be very fast.  After all, the triangles – once they’re in the GPU – are rendering just as fast as they would in OpenGL, modulo some canvas compositing passes in the browser.  The trick is to get them down there and control the draw with minimum overhead.  And there is at least one not-so-obvious gotcha involved with using Javascript that I will discuss in a subsequent post.