A Load of Garbage

Posted in JavaScript by ebenpack
Tags: ,

After I completed a working version of my 3D rendering engine wireframe.js, I started looking for ways to make improvements, both in terms of performance, and usability. While both efforts are still ongoing, there are a few things I have learned with regard to performance that I wanted to get down in writing.

One of the main areas of focus for performance on this project has been memory management. In order to achieve a smooth, stable 60 frames per second, you have just under 17ms to complete all of the operations required to get a frame on the screen. This includes any physics or game logic you might need to execute, as well as the time required to paint the frame itself. I don't know if you've heard, but 17ms is not very much time at all. So when a garbage collection (GC) event occurs, this takes some time away from the already tight 17ms you have to draw your frame. And if you don't have those extra milliseconds to spare for a GC event (and you don't really have much say in when they occur, so you can't exactly plan for them), this can result in skipped frames, which can appear to the user as a stutter or pause. I believe some folks call it jank.

So how do we avoid this? Well, the obvious answer is: if you want to avoid garbage collection, then don't make so much garbage. It may sound an easy thing, but the details of just how to achieve this can be a little more tricky. First, just where does garbage come from in JavaScript? If you aren't experienced with memory management, the answer may not be obvious.

Simply put, creating a new object requires the interpreter to allocate memory for that object. But isn't everything in JavaScript an object? Well... yes and no. Strings, numbers, and booleans (so long as they aren't instantiated by their respective String, Number or Boolean constructors) and null and undefined are all primitive types, meaning they're not, strictly speaking, objects. They do, however, behave like objects on occasion. When a method is called on a primitive type, a wrapper object is quickly created for that value in order to call the method, and then, just as quickly, it is discarded. And, more importantly, while they may not technically be objects, they still need some memory allocated in order to store them.

Garbage is what happens to the memory allocated to objects that are no longer in use (or, more specifically, objects that are no longer referenced). Once an object is no longer referenced in a program, the interpreter can mark this object as garbage that needs to be collected (i.e. memory which can be deallocated). E.g.:

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
// Memory allocated for first array object.
var foo = [1,2,3,4,5,6];

// foo assigned to second array object. Memory allocated
// for second array object. First array object
// no longer referenced, can be marked for GC.
foo = ['a', 'b', 'c'];

// bar assigned to value of foo. Second array object now
// referenced by both foo and bar.
var bar = foo;

// foo assigned to 'bar', but second array still
// referenced by bar, so cannot be marked for GC.
foo = 'bar';

But while virtually anything you do in JavaScript will require the allocation of some amount of memory, the situation is not so hopeless as it may seem. After all, the goal is to reduce garbage, not necessarily to reduce the total amount of memory allocated (although memory use should always be kept as low as possible). As garbage is created when an object is no longer referenced, there are a few strategies to reduce garbage collection.

What it boils down to is that, in order to reduce garbage, it is imperative that, as much as possible, you should not discard objects once they have been created. wireframe.js provides us with a good case study of how this can be achieved in a project that makes use of the canvas, and for JavaScript in general.

In the first, 'high-garbage' iteration, I was using a version of the linearalgea.js math library (which I created especially to perform the matrix and vector math required by a 3D rendering engine) that created and returned a new matrix, or a new vector, for virtually every method call. As a general design decision, this mostly made sense. You want your matrices and vectors to be more or less immutable. You don't want the methods you're calling to change the value of the object itself, as you often need to use these matrices and vectors in multiple operations. E.g.:

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
// Old, high-garbage methods
Vector.prototype.subtract = function(vector){
    return new Vector(
        this.x - vector.x,
        this.y - vector.y,
        this.z - vector.z
    );
};
Vector.prototype.cross = function(vector){
    return new Vector(
        (this.y * vector.z) - (this.z * vector.y),
        (this.z * vector.x) - (this.x * vector.z),
        (this.x * vector.y) - (this.y * vector.x)
    );
};

// Three vertices of a triangle
var vertex1 = new Vector(1,2,3);
var vertex2 = new Vector(4,5,6);
var vertex3 = new Vector(7,8,9);

// Find vectors representing two sides of the triangle
// We may still need to use the vertices later,
// so an in-place operation would not be appropriate
var side1 = vertex2.subtract(vertex1);
var side2 = vertex3.subtract(vertex1);

// Find the normal of the triangle, using the two sides
var normal = side1.cross(side2);

The above code does what we want, but it also creates three new vectors along the way. Assume we run this code just once per frame (in reality it will run many times per frame, potentially once per triangle of every mesh in our scene). This means that for every frame, we are adding three new vectors as garbage that will need to be collected (assuming the interpreter isn't smart enough to make some reuse of them, which... are there any interpreters that do that? I don't know).

The solution to this problem was to add 'low-garbage' versions of all of these methods which do not create a new vector or matrix, but which rather assign the result of the operation to a matrix or vector object that is passed to the method call. As JavaScript uses a call-by-sharing evaluation strategy (which means that mutations made to a mutable argument inside a function will be visible outside of that function), we can pass an object to a function that we will use to store the results of the function call.

This method of returning results might be familiar to those of you who are have some experience with C. In C, functions are limited in what they can return; e.g. a function cannot return an array. Instead, a function can return a pointer to an array, or, alternatively, a pointer to the array can be assigned to a pointer that is passed as an argument to the function.

The above example, rewritten to use the low-garbage methods, would look something like this:

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
// New, low-grabage methods
Vector.prototype.subtractLG = function(vector, result){
    result.x = this.x - vector.x;
    result.y = this.y - vector.y;
    result.z = this.z - vector.z;
};
Vector.prototype.crossLG = function(vector, result){
    result.x = (this.y * vector.z) - (this.z * vector.y);
    result.y = (this.z * vector.x) - (this.x * vector.z);
    result.z = (this.x * vector.y) - (this.y * vector.x);
};

// Results vectors
var side1 = new Vector(0,0,0);
var side12 = new Vector(0,0,0);
var normal = new Vector(0,0,0);

// Three vertices of a triangle
var vertex1 = new Vector(1,2,3);
var vertex2 = new Vector(4,5,6);
var vertex3 = new Vector(7,8,9);

// Find vectors representing two sides of the triangle
vertex2.subtractLG(vertex1, side1);
vertex3.subtractLG(vertex1, side2);

// Find the normal of the triangle, using the two sides
side1.crossLG(side2, normal);

This example may look very similar to the earlier example. And if it was executed as-is, since the results vectors would be created anew for every frame, there would be little or no difference between this and the earlier version in terms of memory use. The key difference comes when these results vectors are created just once, and a reference is kept in order to prevent them from being garbage collected. So now, instead of creating and discarding a new object for every method call, multiple times per frame, we're creating just a handful of objects upon initialization, and using them over and over.

Implementing low-garbage versions of all of the matrix and vector methods had a fairly large impact for wireframe.js. Compare the before and after graphs below.

5.7MB-9.7MB

Memory usage for 'high-garbage' math methods

9.9MB-13.9MB

Memory usage for 'low-garbage' math methods

While overall these memory graphs have a similar shape, the latter has a far more regular GC pattern. It grows steadily, and then there's a GC event, which produces the characteristic sawtooth pattern. The 'high-garbage' version, on the other hand, grows some, then has a GC event, grows, grows, GC, etc. Overall the shape is vaguely sawtooth-like, but there are far more GC events that occur sporadically between the trough and the peak of the sawtooth, and the memory graph appears much more erratic.

Depending on what your aims are, in order to fully make use of this sort of approach it may be necessary to implement object pooling. What this means is that you have a 'pool' of objects that you have instantiated. When you need a new object, you pull one from the pool. When you no longer need that object, you return it to the pool. There is a bit more overhead in this, so if you know ahead of time exactly how many objects you will need, and you are certain that you will never need more than this, object pooling would likely not be worth the effort.

It should be noted, though that care must be taken when using this method, whether with an object pool or without, to initialize the results object to ensure that old values from previous calculations are completely removed from your results object. For example, the following matrix translation static method can cause problems if the results matrix still carries values from previous calculations:

1
2
3
4
5
6
7
// If the result matrix still has values in 0-11 or 15 from
// being used in previous calculations, this will cause problems.
Matrix.translation = function(xtrans, ytrans, ztrans, result){
    result[12] = xtrans;
    result[13] = ytrans;
    result[14] = ztrans;
};

The way I have implemented the low-garbage methods in these examples also makes method chaining impossible, but chaining can be achieved by simply explicitly returning the result. This does not have any negative repercussions in terms of memory usage or performance.

Alright, so far, so good. Implementing the low garbage math methods—and refactoring the main 3D rendering function to make use of these methods—had a very noticeable effect on the memory-use profile of the program. Garbage collection was now much less frequent, and at more regular intervals. But garbage can also sneak in where you don't expect it, and there was still work to be done.

The two graphs above represent a ~2 second time slice. So in the 'good' version, GC events are still occurring approximately every 200ms. This could be better, and I was certain that there were still areas of the program that were still making too much garbage. The most obvious place to look was this line, which occurs at the beginning of the render function:

1
back_buffer_img = back_buffer_ctx.createImageData(width, height);

The back buffer is an ImageData array (which is a Uint8ClampedArray, which is a typed array) where pixel data is written. This image data is then drawn to an offscreen buffer canvas, which is then used to draw to the main canvas. While this may seem like a lot of extra, unnecessary steps, the main advantage of this is that it allows the 3D scene to be initially drawn in one resolution, but displayed in a different resolution. So a scene can be upscaled in order to use fewer resources, or it can be downscaled to provide a better looking image.

As createImageData creates a new ImageData array, in this early version of the program a new back buffer array was being created for every frame. Naturally, this created a lot of unnecessary garbage. In my first attempt to tackle this problem, I was running over the entire array, zeroing out all the values.

Backing up a little, an ImageData array is a one-dimensional array representing the RGBA values of all pixels in a 2D canvas context, with each element representing either a red, green, blue, or alpha value of a pixel. So, for example, a 1x1 pixel ImageData array might look like this [1, 7, 2, 9], where the RGBA values are 1, 7, 2, and 9 respectively.

So it should be clear that for any ImageData array, its length will be described by canvas width * canvas height * 4. As it turned out, looping over the entire array was actually noticeably slower than just creating a new ImageData array for every frame, even with all the extra garbage that brought with it. However, if you consider for a moment what it is we're trying to achieve (clearing the back buffer for the next frame), it may become clear that we can actually get away with doing ¼ of the work. By setting the alpha value to zero, we can ignore the RGB values, just so long as we make sure to set all RGBA values when we draw a new pixel to the array. With this change, each frame took less time to draw (meaning fewer resources were used), and GC events became much less frequent. Here is what memory use looked like after this change (the scale is the same as the previous two graphs):

4.6MB-6.6MB

Memory usage using back buffer zeroing

GC events are now happening about once a second. Which can certainly still be improved upon, but it's miles ahead of where it was to begin with.

As a side note: it occurs to me that in order to clear the back buffer, it would be possible to do potentially much less work than mentioned above. If we kept track of which pixels had been drawn in the previous frame, we could then clear only those pixels which needed clearing. But I suspect that the increased overhead and complexity of such an approach would ultimately not be worthwhile. I will leave it as an exercise to the reader to determine the feasibility of this approach.