Published on
views

Exploring Quake 3's Fast Inverse Square Root

Authors

Introduction

The following code, which may have been written by John Carmack, was found when the Quake 3 source code was open-sourced:

Fast Inverse Square Root Function

Whilst most code looks complicated, this one even baffled most but the best software developers, whilst this wasn't originally created by John Carmack and was shown as early as in the late 80s, developers around the world struggled to understand how this piece of code functioned and more importantly why it was used.

In the early 90s, unlike our modern-day processors, video games were a relatively new concept and suffered heavily from hardware constraints. Now our modern-day processors are; faster, more optimized, and better balanced with relatively equal speeds for all instruction sets. However, in the early days of computing, this was not the case.

Due to this, developers had to be creative in order to find unique ways to bypass these hardware restrictions, hence the inverse square root "hack" was born.

Whilst with most games, heavy optimization is possible, it is typically not done. Heavily optimizing a game, or even a function may not be cost-effective or worth it in performance vs development time tradeoff. However, in the case of Quake 3, it was beneficial. In video game graphics, specifically 3D engines both; Pathing, lighting & reflection all heavily utilize vector normalization techniques. As these normalization functions may run hundreds if not thousands of times per frame in a game. As lighting conditions update and enemy AIs recalculate their paths, this simple function can become incredibly computationally intensive quite quickly.

Whilst this operation may seem simple, especially for an integer, at the time, calculating the reciprocal of a floating-point integer was incredibly computationally intensive and this fast root bypassed this step. As a result, this floating-point hack managed to shave off the time required to perform this calculation by opting to use an incredibly accurate approximation instead, treating the floating-point as an integer in most places.

History

The origin of the source code for the video game Quake III was officially released in 2006 during the annual QuakeCon convention, which is 7 years after the game's release. QuakeCon is a yearly gaming event held by ZeniMax Media to introduce and celebrate id Software games and other studios. The author of the code is suggested to be Teje Mathisen, who is an x86 assembly language programmer that helped the company id Software with Quake optimization within their video games. He was able to utilize some code which he had written during the late 1990s which was used for 3D computer graphics, which can be seen in id Software’s first-person shooter game in 1999, Quake III Arena. This code was crucial to the creation of the Fast inverse square root function and is thought to have been sourced from the SGI Indigo computer work station, run by Gary Taroll in the 1990s. Although, after years of investigating, the original author of the Fast Inverse Square Root function was found to be Greg Walsh. He has had many monumental impacts upon the computing world and has helped develop the technologies we have today.

The history of how the magic number 0x5f3759df, which is used in the Fast Inverse Square root function, is not precisely known by researchers but has been developed and improved by mathematicians over the years. Chris Lomont was able to improve the number by minimizing the approximation error when choosing the magic number over a given range.

How Does The Code Work?

The FastInvSqrt algorithm was developed to address a common problem in computer graphics: the need to compute the normalized vector of a given vector (x,y,z)(x, y, z). Normalizing involves calculating 1x2+y2+z2\frac{1}{\sqrt{x^2 + y^2 + z^2}}.

To begin, the algorithm starts with an initial approximation for x12x^{-\frac{1}{2}}. It achieves this by halving the input floating-point number and storing it as xhalf.

The line of code i = 0x5f3759df - (i>>1) computes an initial guess y0y_0 by manipulating the exponent bits of xx to approximately equal 12-\frac{1}{2}. The shift operation (i>>1) divides the integer ii by 2 in a bitwise manner, inadvertently modifying the exponent bits of the floating-point representation.

The hexadecimal constant 0x5f3759df (equivalent to 1,597,463,007 in decimal) is a predefined constant used in this calculation. Subtracting our initial guess y0y_0 from this constant adjusts the bits to align the exponent correctly for the inverse square root calculation.

Next, the algorithm reinterprets the integer xx as a floating-point number. This sets the stage for applying one iteration of Newton's method to refine the initial guess. Newton's method iteratively improves an estimate yny_n using the formula:

[yn+1=ynf(yn)f(yn)][ y_{n+1} = y_n - \frac{f(y_n)}{f'(y_n)} ]

For the specific function f(y)=y12xf(y) = y^{-\frac{1}{2}} - x, the iteration simplifies to:

[yn+1=12yn(3xyn2)][ y_{n+1} = \frac{1}{2} y_n \left( 3 - x \cdot y_n^2 \right) ]

This is implemented in the line:

[x=x(1.5fxhalfxx)][ x = x \cdot (1.5f - xhalf \cdot x \cdot x) ]

After iterations, this algorithm yields an approximation very close to x12x^{-\frac{1}{2}}, providing an efficient computation of the inverse square root.

Testing Quake's Fast Inverse Square Root

Most of the lines of code in the function are assignments/arithmetic which cannot be graphed aside from the final line as mentioned in the section ‘How does the code work’. The image above displays the generated graph for the final line of the function, one iteration of Newton’s method where ‘yy’ symbolizes the first estimate and ‘f(y)f(y)’ symbolizes the final returned estimate.

Testing Quake's Fast Inverse Square Root

To test the accuracy of Quake’s fast inverse square root, the code was first translated into Python using NumPy. Then a new function, called true value, was coded to calculate the true inverse square root value. A random float number was generated and its inverse square root was computed using both the Quake function and the true value function. Both values as well as the difference between the two was stored in an array. This was done inside a loop to generate results for five random float numbers.

Python Benchmarks
Value InputQuake EstimateTrue ValueDifference/Error
00.9966121.000011.00170.00169044
10.4418861.502361.504340.00197802
20.1317562.753252.754960.00171016
30.2581411.961661.968210.00204742
40.1114832.991242.994990.00375186

As you can see, the Quake estimate and the true value for the five random numbers have a difference of less than 1e2(0.01)1e-2 (0.01), making Quake's function accurate to 2 decimal points.

Conclusion

I personally found this topic very interesting, great programmers writing even greater code. Even though this wasn't made by solely one person, but instead several through open-source work & collaboration. It can show how even a simple problem can be tackled and optimized to insane levels for the sake of efficiency or convince for programmers. Whilst normal computers nowadays have optimized instruction sets and clock speeds making this performance trick redundant. Many other programmers still continue to follow these same trends of efficiency even to this day, writing innovative and frankly insanely optimized code, that many of us wouldn't know how to read, let alone code.