Just as a swordsman needs to know the balance, weight, sharpness and grip of the sword they wield, programmers need to know the intricacies and the “fine print” of the APIs they use. As a graphics developer, I deal with linear algebra APIs most of the time. Even so, there have been times when I solved a problem only to find that my solution “worked around” some API property I hadn’t noticed. Had I noticed earlier, I wouldn’t have “lost” that time and the solution would have been “cleaner”. This post is meant to help you have a better time when coding linear algebra transformations.
Study linear algebra
This should be a no-brainer. If you want to apply linear algebra transformations, you should devote time to reading about them. If you are new to the domain, having the theoretical knowledge will transform your work. Instead of trying to figure out the order in which to multiply the matrices and vectors, you will spend your time on the real thing: the problem your program is meant to solve.
Understand the reference system of the API you are working with
This can come out of the study above, but there are “realizations” we only come to while working with something that we didn’t fully understand while reading. For this point, let’s take a transformation matrix M in row major order, as follows. Every such transformation matrix contains a rotation part and a translation part. The rotation part is the upper-left 3×3 block (the green box) and the translation part is the top three entries of the last column (the blue box).
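Written out, the layout is roughly the following: the rotation sits in the upper-left 3×3 block and the translation in the last column. The entry names r_ij and t_x, t_y, t_z are labels chosen here for illustration only.

```latex
M = \begin{pmatrix}
r_{11} & r_{12} & r_{13} & t_x \\
r_{21} & r_{22} & r_{23} & t_y \\
r_{31} & r_{32} & r_{33} & t_z \\
0      & 0      & 0      & 1
\end{pmatrix}
```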

This is well understood, but here comes the practical part. We have an object that has already been rotated and translated, and these transformations form the matrix K. The object has been rotated by 180 degrees about the world X axis and moved by 2 along the world Y axis.
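As a sketch, K can be written out under the assumption that the rotation is applied first and the translation is expressed in world coordinates, so that the object sits at (0, 2, 0). A rotation by 180 degrees about X has cos = -1 and sin = 0, so it simply flips the Y and Z axes.

```latex
K = \begin{pmatrix}
1 & 0  & 0  & 0 \\
0 & -1 & 0  & 2 \\
0 & 0  & -1 & 0 \\
0 & 0  & 0  & 1
\end{pmatrix}
```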

Here comes a small puzzle. We want to move the object by the vector v = (0, 1, 0) in the world coordinate system, so that its final position in the world is (0, 3, 0). With the following transformation matrix N, will N * K give us the desired result?
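For reference, a pure translation by v = (0, 1, 0) is an identity matrix whose last column is (0, 1, 0, 1), so N would look like this:

```latex
N = \begin{pmatrix}
1 & 0 & 0 & 0 \\
0 & 1 & 0 & 1 \\
0 & 0 & 1 & 0 \\
0 & 0 & 0 & 1
\end{pmatrix}
```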

By doing the math, we can see that it indeed produces the desired matrix, with the last column (0, 3, 0, 1).
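The check can be sketched by following only the last column. The last column of N * K is N applied to the last column of K, which is (0, 2, 0, 1) for an object sitting at (0, 2, 0), and N simply adds 1 to the Y entry. Since the rotation part of N is the identity, the rotation part of K is left unchanged.

```latex
(N K)\begin{pmatrix} 0 \\ 0 \\ 0 \\ 1 \end{pmatrix}
= N \begin{pmatrix} 0 \\ 2 \\ 0 \\ 1 \end{pmatrix}
= \begin{pmatrix} 0 \\ 2 + 1 \\ 0 \\ 1 \end{pmatrix}
= \begin{pmatrix} 0 \\ 3 \\ 0 \\ 1 \end{pmatrix}
```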
An API pitfall
To code the above transformation, we would need to:
1. Create a new identity matrix.
2. Set the last column to (0, 1, 0, 1).
3. Perform the multiplication N * K.
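The three steps can be sketched in plain C++ as follows. This is a minimal illustration, not Qt or GLM code: Mat4, identity, translation, multiply and makeK are hypothetical helpers defined here, and K is hard-coded from the description above (a rotation of 180 degrees about X, which is diag(1, -1, -1), plus a translation of 2 along Y).

```cpp
#include <array>

// Minimal 4x4 matrix indexed as m[row][col]; illustrative only, not a Qt or GLM type.
struct Mat4 {
    std::array<std::array<double, 4>, 4> m{};  // value-initialized to all zeros
};

// Step 1: create a new identity matrix.
Mat4 identity() {
    Mat4 r;
    for (int i = 0; i < 4; ++i) r.m[i][i] = 1.0;
    return r;
}

// Step 2: set the last column of an identity matrix to (x, y, z, 1).
Mat4 translation(double x, double y, double z) {
    Mat4 r = identity();
    r.m[0][3] = x;
    r.m[1][3] = y;
    r.m[2][3] = z;
    return r;
}

// Step 3: the ordinary matrix product a * b.
Mat4 multiply(const Mat4& a, const Mat4& b) {
    Mat4 r;
    for (int i = 0; i < 4; ++i)
        for (int j = 0; j < 4; ++j)
            for (int k = 0; k < 4; ++k)
                r.m[i][j] += a.m[i][k] * b.m[k][j];
    return r;
}

// K from the text (assumed layout): 180-degree rotation about X, then 2 along world Y.
Mat4 makeK() {
    Mat4 k = translation(0.0, 2.0, 0.0);
    k.m[1][1] = -1.0;
    k.m[2][2] = -1.0;
    return k;
}
```

Calling multiply(translation(0, 1, 0), makeK()) computes N * K, and the last column of the result is (0, 3, 0, 1).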
Let’s say we are using the Qt API, whose QMatrix4x4 class provides the function translate(QVector3D). Now that’s something that could come in handy! All the above steps, summed up into one! Except that there is a catch. The translate function performs steps 1 and 2 just as we would, but in the third step it reverses the order of the multiplication and computes K * N. The resulting position will be (0, 1, 0) instead of (0, 3, 0).
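This is because Qt’s translate function applies the translation in the object’s coordinate system (its current orientation) and not in the world coordinate system, but the documentation doesn’t spell that out! In our example the 180-degree rotation has flipped the object’s Y axis, so moving by 1 along the object’s own Y axis moves it by -1 along the world Y axis: 2 - 1 = 1.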
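The difference between the two multiplication orders can be reproduced without any library. The sketch below uses a hypothetical Mat4 type and two hypothetical helpers, translatedLocal and translatedWorld, that are not part of Qt or GLM. translatedLocal post-multiplies (K * N), which is the order described above for QMatrix4x4::translate, and translatedWorld pre-multiplies (N * K). K is hard-coded under the assumption that it is a 180-degree rotation about X followed by a move of 2 along the world Y axis.

```cpp
#include <array>

// Minimal 4x4 matrix indexed as m[row][col]; a stand-in for illustration, not QMatrix4x4.
struct Mat4 {
    std::array<std::array<double, 4>, 4> m{};  // all zeros

    // Identity matrix with (x, y, z, 1) in the last column.
    static Mat4 translation(double x, double y, double z) {
        Mat4 t;
        for (int i = 0; i < 4; ++i) t.m[i][i] = 1.0;
        t.m[0][3] = x;
        t.m[1][3] = y;
        t.m[2][3] = z;
        return t;
    }
};

// Ordinary matrix product a * b.
Mat4 operator*(const Mat4& a, const Mat4& b) {
    Mat4 r;
    for (int i = 0; i < 4; ++i)
        for (int j = 0; j < 4; ++j)
            for (int k = 0; k < 4; ++k)
                r.m[i][j] += a.m[i][k] * b.m[k][j];
    return r;
}

// K * N: the offset is interpreted along the object's own (already rotated) axes.
Mat4 translatedLocal(const Mat4& k, double x, double y, double z) {
    return k * Mat4::translation(x, y, z);
}

// N * K: the offset is interpreted along the world axes.
Mat4 translatedWorld(const Mat4& k, double x, double y, double z) {
    return Mat4::translation(x, y, z) * k;
}

// K from the text (assumed layout): rotation diag(1, -1, -1), position (0, 2, 0).
Mat4 exampleK() {
    Mat4 k = Mat4::translation(0.0, 2.0, 0.0);
    k.m[1][1] = -1.0;
    k.m[2][2] = -1.0;
    return k;
}
```

With v = (0, 1, 0), translatedLocal(exampleK(), 0, 1, 0) ends up with Y = 1 in the last column, while translatedWorld(exampleK(), 0, 1, 0) ends up with Y = 3. The only difference between the two helpers is the side on which the translation matrix is multiplied.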
So, if we already had working code and wanted to clean it up using the translate function, we would end up introducing bugs into our program!
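Different APIs provide functions with the same name that might behave differently, so the multiplication order has to be checked for each one. For example, the GLM API also contains a translate function, and its documentation (with careful reading) provides the details of how the transformation is applied: the input matrix is multiplied by the translation matrix. For glm::translate(K, v) that is the same K * N order as Qt, so cleaning up our code with it would introduce the same bug. To get N * K, we would have to build N from an identity matrix and perform the multiplication explicitly.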
This post does not imply that one should prefer GLM over Qt. It tries to make a point about the importance of studying and experimenting with the APIs chosen for each programming environment, and about how these small implementation details can leave people scratching their heads for hours.
Until next time,
Stay healthy and have a Great New Year!