Fast Inverse Square Root Algorithm Breakdown

When creativity meets computer science: an ingenious way to calculate fast on old computers

Fast Inverse Square Root is a clever algorithm that computers used in the past to render 3D graphics faster! If you are unfamiliar with this algorithm, you should first read the explanation below.


Algorithm Demo

Enter starting value here:
(You can enter numbers in scientific notation; e.g. “6.022e23” for 6.022 × 10²³)

Starting value: \(x\) =

Inverse root: \(\frac{1}{\sqrt{x}}\) =

Algorithm output: ( error from actual inverse root)

Evil Bit Hacking

  1. Reinterpret float value as integer without changing its bits
  2. Set integer value to 0x5F3759DF - (value >> 1), roughly equivalent to 0x5F3759DF - (value / 2)
  3. Reinterpret integer value as float without changing its bits

Actual 32-bit value stored:

Value 1: Binary representation of starting value ()

Click here for details about this floating-point format!

Value 2: Magic number ( = )

Value 3: Bit-shifted number (value >> 1)

Value 4: Magic number minus bit-shifted number ( - (value >> 1))

Pre-Newton result:

Pre-Newton error from actual inverse root:

Newton’s Method (Click here for details!)

Post-Newton result = y(1.5 - 0.5xy²) (x = starting value; y = pre-Newton result)

Post-Newton result:

Post-Newton error from actual inverse root:

Newton’s method accuracy multiplier:

Algorithm Code

This is a modified version of the most famous implementation of this algorithm, used in the first-person shooter game Quake III Arena. It is written in the C programming language.

float InvSqrt(float x) { // Accepts a float x and returns a float
  long i = 0; // Declare i as a long (32-bit integer)
  float y = 0.0F; // Declare y as a 32-bit float

  y = x; // Set y to starting value
  i = * (long *) &y; // Convert float "y" to long "i" without changing bits
  i = 0x5F3759DF - (i >> 1); // 0x5F3759DF is the "magic number"; i >> 1 is the "bit shift"
  y = * (float *) &i; // Convert long "i" to float "y" without changing bits
  y = y * (1.5F - (0.5F * x * y * y)); // Newton's method; "F" declares a 32-bit float

  return y; // Output y as the result
}

What’s Happening?

This is a basic explanation of the Fast Inverse Square Root (FISR) algorithm, intended for all knowledge levels. It still involves a lot of math, though - if you are ever confused by one of the math concepts mentioned, check out the “Refreshers” section below.

Note: All links in these explanations except the Wikipedia ones link to different parts of this page and will not bring you to an external website. External links are colored in a darker blue.

What Is Fast Inverse Square Root? (Wikipedia)

Fast Inverse Square Root (FISR) is an algorithm that was used in the 1990s to calculate the reciprocal of the square root of a number, or \(\frac{1}{\sqrt{x}}\). It is known for its ingenuity and for the mysterious constant 0x5F3759DF that appears in its code.

Here are some examples of the algorithm in action:

\(x\) \(\frac{1}{\sqrt{x}}\) Algorithm Output
0.03 5.773503 5.769970
0.5 1.414214 1.413860
2 0.707107 0.706930
100 0.100000 0.099845

Calculating the reciprocal square root with conventional techniques was very slow, as it involves a division and a square root operation, both of which were slow on 90s computers.

FISR calculates the reciprocal square root in an unconventional way. It avoids using division, which speeds up the algorithm a lot - at the time, it was about 4 times faster than calculating it normally. However, there is a trade-off: FISR gives a less accurate result. (The algorithm can be up to 0.18% off from the true result - the demo shows this error under “Post-Newton error”.)

FISR is not used much anymore: thanks to hardware advancements, there are now even faster and more accurate ways to calculate the reciprocal square root.

Why Use Fast Inverse Square Root? (Wikipedia)

FISR was used in 3D rendering programs, such as first-person shooter games, where precision was less important than speed. These programs often dealt with 3D vectors, primarily to simulate lighting. (From now on, I’ll be explaining 2D vectors, but the same principles apply to 3D vectors.)

A vector is a quantity with magnitude and direction. You can think of it like an arrow in 2D (or 3D) space.

(Coordinate grid image source: Wikimedia Commons)

2D vectors have an x-component and y-component. The x-component is the distance a vector travels horizontally and the y-component is the distance it travels vertically. In this example, the vector \(\mathbf{v}\) has an x-component of 3 and a y-component of 4. We will denote these components as \(\mathbf{v}_x\) and \(\mathbf{v}_y\), and the vector itself as \(\langle\mathbf{v}_x, \mathbf{v}_y\rangle\). In this example, \(\mathbf{v} = \langle3, 4\rangle\).

Vectors also have magnitude, which is the total length of the vector. We will use \(|\mathbf{v}|\) to denote the magnitude of \(\mathbf{v}\). Magnitude can be calculated with the Pythagorean Theorem:

\[ |\mathbf{v}| = \sqrt{\mathbf{v}_x^2 + \mathbf{v}_y^2} \]

In this case, vector \(\mathbf{v}\) has a magnitude of \(\sqrt{3^2 + 4^2} = 5\).

Often, we want to deal with unit vectors (also called normalized vectors), which are vectors with a magnitude of 1. To turn a vector into its corresponding unit vector, we divide the vector’s components by its magnitude. For example, to get the unit vector for \(\mathbf{v}\), we divide its components by 5 (its magnitude).

The vector \(\mathbf{u}\) in the image above is the unit vector for \(\mathbf{v}\). Here are the steps for how we calculated the unit vector:

  1. Calculate the magnitude of \(\mathbf{v}\).
\[ |\mathbf{v}| = \sqrt{\mathbf{v}_x^2 + \mathbf{v}_y^2} = \sqrt{3^2 + 4^2} = 5 \]
  2. Divide the components of \(\mathbf{v}\) by its magnitude.
\[ \mathbf{u} = \frac{\mathbf{v}}{|\mathbf{v}|} = \frac{\langle3, 4\rangle}{5} = \langle\frac{3}{5}, \frac{4}{5}\rangle = \langle0.6, 0.8\rangle \]

As you can see, finding a unit vector involves both division and the square root operation. This is where FISR comes into play - it can be used to calculate unit vectors much faster! We can rewrite our steps to involve the reciprocal square root: instead of dividing each component by the magnitude \(|\mathbf{v}|\), we multiply each component by \(\frac{1}{|\mathbf{v}|} = \frac{1}{\sqrt{\mathbf{v}_x^2 + \mathbf{v}_y^2}}\).

3D rendering programs often had to calculate millions of unit vectors, so it was important that each calculation was as fast as possible.

How This Algorithm Works (Wikipedia)

Here is a simple explanation of how FISR works.

FISR takes advantage of how two types of numbers are stored in computer memory: integers (whole numbers) and floating-point numbers (“floats”), which can also represent fractional values.

These two types of numbers are stored differently in computers. For example, here’s how the number 100 is internally represented in binary as an integer and as a float:

(Note: These are specifically the representations for 32-bit integers and floats, meaning each number has 32 binary digits. Nowadays, 64-bit integers and floats are more common, but the FISR algorithm uses the 32-bit representations.)

Floats are internally represented in computers similarly to the form \(2^x\). Let’s say we want to take the reciprocal square root of a number in this form. Using exponent properties, we can write this as:

\[ \begin{flalign} \frac{1}{\sqrt{2^x}} & = \frac{1}{(2^x)^\frac{1}{2}} = [(2^x)^\frac{1}{2}]^{-1} = (2^x)^{-\frac{1}{2}}\\ & = 2^{-\frac{1}{2}x} \end{flalign} \]

This means that to take the reciprocal square root of a number in the form \(2^x\), all we have to do is take the exponent \(x\) and multiply it by -1/2, which is the same as dividing by -2. For example, the reciprocal square root of 2¹⁰ is 2⁻⁵ because 10 / -2 = -5.

Enter a number to find its reciprocal square root:

1/√(2^?) = 2^(? / -2) = 2^?

FISR takes in a float as input. Floats are internally represented similarly to this \(2^x\) form, so we can take advantage of this. FISR first does three steps: (Remember, we are trying to avoid division operations because they are slow.)

  1. Reinterpret the bits (binary digits) of the float as an integer. This is necessary for the next step as it allows us to manipulate the number as if it was an integer and not a float. Without this conversion, the next step would not be valid code.
  2. Do a bit shift and subtraction on the integer. The “bit shift” used in FISR moves all of the bits of a binary number one place to the right, which is equivalent to removing the last bit. This bit shift is the same as dividing the number by 2 and rounding down, except it is much faster than a normal division operation. (For example, the binary number 1001 (9 in decimal) bit-shifted 1 to the right is 100 (4 in decimal).) This step essentially divides the exponent of the original input in half and makes it negative (i.e. multiplies it by -1/2 or divides it by -2).
  3. Reinterpret this new integer as a float in order to get back a decimal result.

However, doing this calculation on a floating-point number only results in an approximation, because floats are actually represented in the form \(a \times 2^b\) (not \(2^x\)).

Therefore, we use a technique called Newton’s method to get us closer to the true answer. (To learn more about Newton’s method, check out this section.)

The “magic number” 0x5F3759DF in the code is used to refine this approximation and also account for the specific details of how floats are stored in binary. (0x5F3759DF is a hexadecimal (base 16) number, equal to 1,597,463,007 in decimal.)

If you want to learn more, the section “The Floating-Point Format” provides more details on how floats are stored in memory. After reading that, you can also view the advanced explanation below.

The Floating-Point Format (Wikipedia)

(In this section, the key terms sign, exponent, biased exponent, and coefficient are color-coded to help with explanations. The examples of scientific notation and binary are also color-coded the same way.)

The way floating-point numbers, or “floats”, are stored in binary is similar to scientific notation, a way of writing small and large numbers in the form \(a \times 10^b\).

Floating-point numbers use scientific notation because storing a small or large number takes up less space with scientific notation than with regular notation. For example, a 32-bit integer (without a sign bit) can hold numbers up to 4,294,967,295, but a 32-bit float can hold numbers up to 3.403 × 10³⁸, a number with 39 digits!

However, floating-point sacrifices a little bit of precision in exchange for a wider range of numbers it can represent. (When you type a number into the demo above, it converts the number to a 32-bit float in order for the algorithm to work. The demo displays the exact value stored by the 32-bit float, which may be different from the value you typed because of this imprecision.)

There are three parts in a float:
  1. The sign bit (determines if the number is positive or negative - 0 is positive, 1 is negative)
  2. The exponent
  3. The coefficient

These three parts are also the parts of a number in decimal scientific notation. For example, here’s Avogadro’s number (the number of particles in 1 mole of substance) rounded to 4 significant figures:

+6.022 × 10²³

Floating-point works like this, except it uses binary instead of decimal. This means the base is 2 instead of 10. If we convert 6.022 × 10²³ to binary scientific notation, we get:

+1.992... × 2⁷⁸

In floats, the sign comes first, then the exponent, then finally the coefficient.

If we convert this number to binary, we get:

  1. Sign: 0
  2. Exponent: 1001110
  3. Coefficient: 1.11111110000101010100111101...

This is not our final floating-point representation, as we have to apply a few more rules first. (We are specifically talking about 32-bit floats here - 64-bit floats are more commonly used nowadays, but the algorithm uses 32-bit floats.)

  1. Add 127 to the exponent. 127 is called the “exponent bias” in this case. This results in a “biased exponent” and allows us to store positive as well as negative exponents. (For example, an exponent of -100 would be stored as a biased exponent of 27.)
  2. Drop the leading 1 from the coefficient. The coefficient in binary scientific notation will always start with a 1, unless the value is 0, so we can drop it in most cases. (The float 0 is stored in a different way.) This gives us one more bit to work with, and also means that only the decimal part of the coefficient is actually stored.
  3. Extend each part to the right number of bits. The biased exponent takes up 8 bits and the coefficient takes up 23 bits. We add leading zeros to the biased exponent until it is 8 bits long, and we round the coefficient to 23 bits after the decimal point.

This is what we get after applying these rules:

  1. Sign: 0
  2. Biased exponent: 11001101 (78 + 127 = 205)
  3. Coefficient: 11111110000101010101000

Thus, our final floating-point representation is:

01100110111111110000101010101000

which represents:

+1.992... × 2²⁰⁵⁻¹²⁷ =
+1.992... × 2⁷⁸ =
+6.022 × 10²³


Advanced FISR Explanation

This section is for curious readers who want to learn more about the specific details of the Fast Inverse Square Root algorithm.

The Floating-Point Representation Is a Logarithm Approximation (Wikipedia)

(If you don’t have a strong understanding of logarithms yet, you should read over the refresher for logarithms first, otherwise this part won’t make much sense.)

FISR uses the integer representation of a positive float as a good approximation of the base 2 logarithm (abbreviated log₂) of that float. In other words, reinterpreting a float as an integer (then correcting for a few things) is a fast way to approximate the log₂ of that float.

(From now on, we will ignore the sign bit of a float because the reciprocal square root operation is only valid for positive numbers, meaning the sign bit will always be 0.)

This is because the exponent of a float is equal to its log₂ rounded down. The coefficient part of the float also serves as a decent approximation of the fractional part of the logarithm. For example, log₂(1.5 × 2²⁰) ≈ 20.585. (In this case, the fractional part of the logarithm (.585) is only 0.085 off from the fractional part of the coefficient (.5).) The exponent of a float comes before the coefficient, so the entire representation of a float can be used as a log₂ approximation.

To get the actual log₂ approximation, we need to divide the integer representation by 2²³. That’s because we need to move the decimal point to the spot in between the exponent and coefficient. We also need to subtract 127 to compensate for the exponent bias. In other words, the integer representation of a float x is approximately 2²³ · [log₂(x) + 127].

Here’s an example with the number 1,000,000: (Note: Numbers in binary are indicated with a subscript 2.)

Actual log₂ of 1,000,000 ≈ 19.932

Bits of float 1,000,000 reinterpreted as an integer: 01001001011101000010010000000000₂ = 1,232,348,160 ≈ 2²³ · [log₂(1,000,000) + 127]

Dividing by 2²³: 1,232,348,160 / 2²³ = 10010010.1110100001001₂ ≈ 146.907

Subtracting 127: 146.907 - 127 = 10011.1110100001001₂ ≈ 19.907

Notice how the final result (19.907) is very close to the actual logarithm (19.932).

What does this have to do with the reciprocal square root? Well, one way to take the reciprocal square root of a number is to multiply its logarithm in any base by -1/2. Here’s an example starting with 1,000,000:

Starting value: 1,000,000 = 10⁶ (therefore log₁₀(1,000,000) = 6)

Reciprocal square root: \(\frac{1}{\sqrt{1,000,000}}\) = \(\frac{1}{1,000}\) = 0.001 = 10⁻³ (therefore log₁₀(0.001) = -3)

Notice how the reciprocal square root has a logarithm (-3) that is -1/2 times the original number’s logarithm (6). This is the goal of the FISR algorithm: multiply this logarithm by -1/2. (Note: the base of the logarithm does not matter in this case; this still works with base 2 logarithms.)

(By turning on the “Show log2 information” option in the advanced settings of the demo, you can view the log2 of each value and the log2 approximation that FISR uses for each value.)

Evil Bit Hacking (Wikipedia)

Here are the steps FISR takes to multiply the logarithm of the input value by -1/2:

  1. Take the original float’s logarithm by reinterpreting it as an integer.
  2. Multiply the integer (and thus the logarithm of the float) by -1/2.
  3. Reinterpret this integer as a float to undo the logarithm.

In the algorithm’s code, the first step is accomplished by the line i = * (long *) &y;. This essentially tells the C programming language to take the memory address of y and reinterpret the bits at that address as a long (a 32-bit integer) instead of a float, and assign that to the variable i.

Note that this is not a typical float-to-integer conversion: the bits are preserved instead of the actual values they represent, meaning that doing this conversion results in an integer with a significantly different value from the original float.

For example, the float 3.14 would normally be converted to an integer value of 3, but doing this special type of reinterpretation instead changes the value to 1,078,523,331. (Using the formula mentioned previously, this number is approximately 2²³ · [log₂(3.14) + 127].)

The last step is done with the line y = * (float *) &i;. This does the opposite of the first step: it takes the long we calculated and reinterprets its bits as a float, assigning that back to the variable y.

The second step, written as i = 0x5F3759DF - (i >> 1);, is the most complicated. Let’s break this down.

The bit-shift in the code (i >> 1) divides the integer representation by 2 (and rounds it down to the nearest integer). Since the integer representation is an approximation of the original floating-point number’s logarithm, this effectively divides the logarithm of this number by 2. The subtraction (- (i >> 1)) makes this logarithm negative (i.e. multiplies it by -1).

Are these the only two things we need to do to multiply the logarithm by -1/2? No - we’re forgetting about the exponent bias. We can’t manipulate the exponent of the number directly, since it’s not directly stored in the floating-point representation. Instead, we have to manipulate the biased exponent.

The bit shift and subtraction are actually manipulating the biased exponent, not the actual exponent, meaning that simply multiplying by -1/2 will not give us the result we want. Here’s what we want to actually do:

Let’s say that we start with an exponent \(x\). In a float, this exponent is stored as the biased exponent \(127 + x\). We want to change the exponent to \(-\frac{1}{2}x\) in order to take the reciprocal square root. In order to do that, we need to change the biased exponent to \(127 - \frac{1}{2}x\), since the biased exponent is 127 more than the true exponent.

In short, we want the biased exponent, or our logarithm approximation, to go from \(127 + x\) to \(127 - \frac{1}{2}x\). Here’s how we do that:

  1. Multiply the entire expression by -1/2. This is the equivalent of a subtraction and bit shift.
\[ (127 + x) \cdot -\frac{1}{2} = -63.5 - \frac{1}{2}x \]
  2. Add 190.5 to the expression.
\[ \left(-63.5 - \frac{1}{2}x\right) + 190.5 = 127 - \frac{1}{2}x \]

As you can see, simply multiplying by -1/2 is not enough. We also need to add 190.5 to correct for the exponent bias. (Note: 190.5 is exactly 3/2 of the exponent bias of 127.)

This is where the magic number 0x5F3759DF (1,597,463,007 in decimal) comes in! Notice in the demo that the biased exponent of this number is 190. That’s because this number is approximately 190.5 · 2²³.

0x5F3759DF - (i >> 1) can be rewritten as - (i >> 1) + 0x5F3759DF, meaning we’re essentially adding the magic number to the negative of our bit-shifted number. Because the magic number is about 190.5 · 2²³, this adds about 190.5 to our logarithm approximation, which compensates for the exponent bias (exactly what we want!)

(Note: The magic number is not exactly 190.5 · 2²³; it’s actually about 190.432 · 2²³. This is because slightly changing the magic number actually improves the accuracy a little bit (for complicated reasons), since we’re working with an approximation of the logarithm instead of the actual logarithm.)

Newton’s Method (Wikipedia)

The result that the above process gives us is usually not very accurate. The algorithm uses a technique from calculus called Newton’s method to improve the accuracy of its result.

Newton’s method is a method to find the approximate roots (or zeros) of a function. In other words, if we have a function f(x), Newton’s method can help us find an approximate value of x where f(x) = 0. (As a refresher, a function takes in a value and outputs exactly one value. Newton’s method helps us find an input that gives us an output of 0.)

The blue point is the root of the function f(x) = 2x - 6.

For example, the root of the function f(x) = 2x - 6 is 3, because f(3) = 2(3) - 6 = 0. This is equivalent to the solution of 2x - 6 = 0.

The blue points are the roots of f(x) = x² - 4.

Newton’s method takes in a function and gives us an expression that we can use to find the roots of that function.

For example, consider the function f(x) = x² - 4. The roots of this function are 2 and -2, since 2² - 4 = 0 and (-2)² - 4 = 0. Newton’s method can only give us one root at a time, so we will focus on the positive root 2. (By finding the positive root of f(x) = x² - 4, we are essentially finding the positive square root of 4.)

Newton’s method for this function gives us the expression \(\frac{1}{2}\left(x + \frac{4}{x}\right)\).

(In general, for the function f(x) = x2 - a, Newton’s method gives the expression \(\frac{1}{2}\left(x + \frac{a}{x}\right)\). This expression can be used with Newton’s method to find the square root of any number a.)

All we have to do now is to start with an initial guess for the root of the function, then plug it into this expression.

For example, let’s start with an initial guess of 1. Plugging x = 1 into the expression \(\frac{1}{2}\left(x + \frac{4}{x}\right)\), we get:

\[ \begin{flalign} \frac{1}{2}\left(x + \frac{4}{x}\right) & = \frac{1}{2}\left(1 + \frac{4}{1}\right) = \frac{1}{2}(5)\\ & = 2.5 \end{flalign} \]

2.5 is closer to the actual root of 2 than our initial guess of 1. To make our approximation better, we simply plug in this new value (2.5) into the expression again:

\[ \begin{flalign} \frac{1}{2}\left(x + \frac{4}{x}\right) & = \frac{1}{2}\left(2.5 + \frac{4}{2.5}\right) = \frac{1}{2}(4.1)\\ & = 2.05 \end{flalign} \]

This is even closer to the actual root! If we plug in this new value one more time, we get an even better result of about 2.0006. The more times we iterate Newton’s method, the more accurate our approximation becomes.


Newton’s Method Demo

To see Newton’s method in action, enter a number you want to find the square root of and a guess for that number’s square root. You will see how the method described above can be used to find the square root of that number.

Number:

Initial guess:

1st Newton iteration:

2nd Newton iteration:

3rd Newton iteration:

Actual square root of number:


So how can we use Newton’s method to find the reciprocal square root?

Consider the function \(f(y) = \frac{1}{y^2} - x\). The roots of this function are \(y = \pm\frac{1}{\sqrt{x}}\).

This means that finding the positive root of this function gives us \(\frac{1}{\sqrt{x}}\), exactly what we want the FISR algorithm to output!

So let’s use Newton’s method with this function. Newton’s method gives us the expression \(\frac{y(3 - xy^2)}{2}\). However, in order to use this expression, we need to start with an initial (positive) guess for the reciprocal square root. But wait - we already have a good guess: the result of the bit-hacking steps we’ve done before, 0x5F3759DF - (i >> 1)!

All we need to do now is plug our guess into this expression to get a better approximation. This is the final step of the FISR algorithm. Division operations are slow on computers, so instead of calculating \(\frac{y(3 - xy^2)}{2}\), the FISR code uses the equivalent expression y(1.5 - 0.5xy²), written as y = y * (1.5F - (0.5F * x * y * y));.

The algorithm performs 1 iteration of Newton’s method, but more or fewer iterations can be performed to trade speed for accuracy. Performing more iterations gives a more accurate result but slows down the algorithm.


An Even Deeper Dive

Warning: This part of the page is intended for readers who are very interested in the specific details of the FISR algorithm. I’ve tried to make this part as easy as possible to understand without sacrificing accuracy, but it still might be hard for casual readers to understand everything.

More Details on Newton’s Method

(If you don’t have a strong understanding of derivatives yet, you should read over the refresher for derivatives first, otherwise this part won’t make much sense.)

(Image source: Wikimedia Commons)

As a refresher, Newton’s method is used to find a better approximation of a root of a function. Newton’s method takes in an existing approximation of that root and returns a better approximation.

The idea behind Newton’s method is to approximate a function \(f(x)\) as a straight line. In the image above, the red line is a linear approximation of the blue line that represents \(f(x)\). \(x_n\) is our initial guess and \(x_{n+1}\) is the improved approximation that Newton’s method gives us. In other words, Newton’s method brings us from \(x_n\) to \(x_{n+1}\), and \(x_{n+1}\) is closer to the actual root of the function.

Here is how Newton’s method works: First, we start with an initial guess \(x_n\). Then, we find the tangent line at the point \((x_n, f(x_n))\), which is the point on the function that has the \(x\)-value of our initial guess \(x_n\). The tangent line is the line that just barely touches a certain point, as shown by the red line in the diagram above. The slope of this tangent line is the derivative of \(f(x)\) at \(x_n\), or \(f'(x_n)\), since the tangent line represents the instantaneous rate of change, or slope, of \(f(x)\) at \(x_n\).

To get our improved guess \(x_{n+1}\), we need to calculate the distance between \(x_n\) and \(x_{n+1}\) in the diagram above, or \(x_n - x_{n+1}\). We can do this by rearranging the slope formula.

\[ \text{slope} = \frac{\text{rise}}{\text{run}} \] \[ \text{slope} \times \text{run} = \text{rise} \] \[ \text{run} = \frac{\text{rise}}{\text{slope}} \]

In the diagram above,

\[ \text{slope} = f'(x_n) \] \[ \text{rise} = f(x_n) \] \[ \text{run} = x_n - x_{n+1} \]

Therefore, because \(\text{run} = \frac{\text{rise}}{\text{slope}}\),

\[ x_n - x_{n+1} = \frac{f(x_n)}{f'(x_n)}. \]

Now that we’ve found the distance between \(x_n\) and \(x_{n+1}\), all we have to do now is subtract this distance from \(x_n\) to get our final result \(x_{n+1}\).

\[ x_{n+1} = x_n - \frac{f(x_n)}{f'(x_n)} \]

This is the formula behind Newton’s method! To get the Newton’s method expression for any function \(f(x)\), find its derivative \(f'(x)\) and use the formula above.

For example, consider the function \(f(x) = x^2 - 4\). This function has the same derivative as \(f(x) = x^2\), because constant terms (such as the -4) can be ignored when taking a derivative.

Using the power rule, \(f(x)\) has a derivative of \(f'(x) = 2x^1 = 2x\). Plugging \(f(x)\) and \(f'(x)\) into the formula, we get \(x_{n+1} = x_n - \frac{x_n^2 - 4}{2x_n}\), which can be simplified to \(x_{n+1} = \frac{1}{2}\left(x_n + \frac{4}{x_n}\right)\).

FISR uses Newton’s method on the function \(f(y) = \frac{1}{y^2} - x\), where \(x\) is the algorithm’s input value and \(y\) is the equivalent of \(x_n\) in the Newton’s method formula. Let’s use the formula above to find the Newton’s method expression for this function.

First, we need to find the derivative of \(f(y)\). \(f(y) = \frac{1}{y^2} - x\) can be rewritten as \(f(y) = y^{-2} - x\). The derivative of \(f(y) = y^{-2} - x\) is the same as the derivative of \(f(y) = y^{-2}\), because when running the algorithm, \(x\) is a constant that can be ignored. Using the power rule, the derivative of \(f(y) = y^{-2}\) is \(f'(y) = -2y^{-3}\).

Then, we use the Newton’s method formula, giving us \(y_{n+1} = y_n - \frac{y_n^{-2} - x}{-2y_n^{-3}}\). This can be simplified to \(y_{n+1} = \frac{y_n(3 - xy_n^2)}{2}\) \(= y_n(1.5 - 0.5xy_n^2)\), which is written in the code as y = y * (1.5F - (0.5F * x * y * y));.

More Magic Number Details

Coming soon!

(Note from the author: I wrote this “Coming soon!” text back in 2023, and I still haven’t added it yet. So I actually lied that this was coming soon. Maybe one day I’ll work on this?)


Refreshers

Want to learn or refresh your memory of a specific topic mentioned on this page? If so, this is the right place for you!

Scientific Notation

Short summary:

Scientific notation is a shorter way of writing small and large numbers. A number in scientific notation contains two parts: the coefficient, a number between 1 and 10, and the exponent, which specifies the power of 10 to multiply the coefficient by. The coefficient represents the first digits of the number and the exponent gives you a rough idea of how large or small the number is.

Full explanation:

Some numbers are really large or small, and writing them in full would take a lot of time and space. One way we can save time writing these numbers is to use a special format called scientific notation.

Here are some examples:

As you can see, scientific notation uses powers of 10 to achieve this shortened form.

Power of 10 Value Expanded Form
... ... ...
10⁻³ 0.001 = 1 / 10 / 10 / 10
10⁻² 0.01 = 1 / 10 / 10
10⁻¹ 0.1 = 1 / 10
10⁰ 1 = 1
10¹ 10 = 1 × 10
10² 100 = 1 × 10 × 10
10³ 1,000 = 1 × 10 × 10 × 10
... ... ...

We can create any positive number by taking a power of 10 and multiplying it by a number between 1 and 10.

Examples:

Terminology to know:

(The examples above are color-coded to highlight the coefficient and exponent. Future examples will also be color-coded this way.)

One way to think of scientific notation is that the coefficient represents the first digits of the number and the exponent represents how many times you’ve moved the decimal point.

A positive exponent means you’ve moved the decimal point to the left to get the coefficient and a negative exponent means you’ve moved the decimal point to the right to get the coefficient.

Examples:

Another way of thinking of scientific notation is that for integers, the exponent represents how many digits the number has, minus 1. Instead of writing a large number in full, scientific notation specifies the first digits of a number and how many digits it has in total.

Example:

Binary

Short summary:

Computers use binary, a different way of writing numbers that only uses the digits 0 and 1. Any number that we can normally write in decimal, including numbers with a fractional part, can be converted into a binary number.

Full explanation:

We normally write numbers using 10 different digits: 0, 1, 2, 3, 4, 5, 6, 7, 8, and 9. We call this decimal, or base 10, because it uses 10 different digits.

However, computers use binary, or base 2 numbers, because it’s easier to implement in computer hardware. Binary numbers only have the digits 0 and 1. Each digit in a binary number is also called a “bit”, so a term like “32-bit” refers to a binary number with 32 digits.

Each digit in a decimal number has its own place value: the value that each digit represents. The place value of a digit is determined by its position relative to the decimal point - going left 1 digit multiplies the place value by 10, and going right 1 digit divides the place value by 10. Here’s an example with the number 123.45:

To get the value of this number, we multiply every digit by its corresponding place value, then add them all up.

(1 × 100) + (2 × 10) + (3 × 1) + (4 × 0.1) + (5 × 0.01)

= 100 + 20 + 3 + 0.4 + 0.05

= 123.45

In binary, the same is true, except we’re multiplying and dividing by 2 instead of 10. When we do this, we only need 2 digits (0 and 1) instead of 10 to represent any number! For example, here’s 5.25 in binary:

To get the value of this number, we once again multiply each digit by its corresponding place value.

(1 × 4) + (0 × 2) + (1 × 1) + (0 × 0.5) + (1 × 0.25)

= 4 + 0 + 1 + 0 + 0.25

= 5.25

Alternatively, since 0 and 1 are the only digits, we can simply add up the place values of all 1s in the number.

4 + 1 + 0.25 = 5.25

We can break up any number into a sum of powers of 2 to get its binary representation.

Examples:

Note that binary numbers are usually much longer than decimal numbers. For example, the number 100,000 has 6 decimal digits, but the same number in binary is 11000011010100000, which has 17 bits.

Representing fractions and decimals in binary can be a little bit more complicated, as they can sometimes result in infinitely repeating numbers.

Example:

(Sidenote: Floating-point numbers cannot store these repeating numbers exactly, so instead they round to the nearest representable number. For example, 0.1 is actually stored as about 0.1000000015 in a 32-bit floating-point number. This also means that 0.1 + 0.2 does not equal 0.3 in floating-point arithmetic. Programmers must be careful when using floats in code as it can cause problems like this.)

In summary, binary can be used to represent both integers and decimals, and the floating-point format uses binary to represent both of these types of numbers.

Logarithms

Short summary:

A logarithm of a number tells you what power a given base must be raised to in order to get that number. For example, \(\log_2(8) = 3\) because \(2^3 = 8\).

Full explanation:

Repeated multiplication is a very important concept in math. Consider this equation:

2 × 2 × 2 = 8

There are three ways to mathematically describe this relationship:

  1. Exponentiation: \(2^3 = 8\)
  2. Roots: \(\sqrt[3]{8} = 2\)
  3. Logarithms: \(\log_2(8) = 3\)

We’ll be focusing on the third one - logarithms. Logarithms tell you how many of one number (the base) you need to multiply together to get another number. The base is the number indicated by the subscript after “log”. In other words, the base \(b\) logarithm of a number asks, “\(b\) raised to what power equals that number?”

Here’s an example: What is \(\log_{10}(10{,}000)\)? To answer this, ask yourself, “10 raised to what power equals 10,000?” In other words, how many 10s do you need to multiply together to get 10,000? The answer is 4, because \(10^4 = 10 \times 10 \times 10 \times 10 = 10{,}000\).

Exponentiation and logarithms are closely related. In this example, \(\log_{10}(10{,}000) = 4\) because \(10^4 = 10{,}000\). The answer to the logarithm in the first equation is the exponent in the second equation. You can think of logarithms as undoing exponentiation.

FISR uses base 2 logarithms because floating-point numbers are stored in binary. The base 2 logarithm can be used to extract the exponent from a floating-point number. For example, \(\log_2(1 \times 2^{20}) = 20\). This is because the logarithm undoes the exponentiation.

For floats that aren’t exactly a power of 2 (i.e. the coefficient isn’t exactly 1), the base 2 logarithm will be a decimal. For example, \(\log_2(1.5 \times 2^{20})\) is greater than 20 but less than 21 because \(1.5 \times 2^{20}\) is in between \(2^{20}\) and \(2^{21}\). The actual answer is about 20.585, because \(1.5 \times 2^{20} \approx 2^{20.585}\).

Now you’re ready to finish reading The Floating-Point Representation Is a Logarithm Approximation!

Derivatives

What is the slope of the function at the blue point? In general, how can you find slope at just one point? Find out how by reading on!

Derivatives are one of the core concepts behind Newton’s method. A derivative tells you how to find the rate of change, or slope, of a function at a given point. (As a refresher, the slope is the steepness of a line, specifically how many units it moves vertically for every unit it moves horizontally.) In other words, a derivative can tell you how fast a function is increasing at any \(x\)-value, like a speedometer but for functions.

Finding the slope of a line between two points is simple: you just calculate the change in \(y\) between the two points divided by the change in \(x\), or “rise over run”.

\[ \text{slope} = \frac{\text{rise}}{\text{run}} = \frac{\text{change in }y}{\text{change in }x} \]

But what if we want to find the slope of a function at just one point? The rise over run formula won’t work in this case. (The slope at one point is formally known as the “instantaneous rate of change”. From now on, I will refer to instantaneous rate of change as simply “slope”.)

What we can do instead is imagine a second point getting closer and closer to that first point and then find what the slope between the two points approaches. (The value that the slope approaches is known as a “limit”.)

Side Lesson: Limits

(Reading this section is completely optional, but it will help you get a better understanding of derivatives and calculus in general.)

I said that to find the slope at one point, we take a second point, move that second point closer and closer to the first point, and then find what the slope between the two points approaches. But how do we define “approach”? We use something called a limit.

This is the graph of \(f(x) = 2x\). What happens to the \(y\)-value as \(x\) approaches 2? We can use a limit to answer this question.

A limit in mathematics is the value an expression or function approaches as a variable approaches a certain value. For example, the “limit of \(2x\) as \(x\) approaches 2” is 4, because as \(x\) gets closer to 2, the value of \(2x\) gets closer to 4. This is written in mathematical notation as \(\displaystyle\lim_{x\to 2} 2x = 4\). Importantly, we can make \(2x\) as close to 4 as we want (“arbitrarily close” to 4) as we move \(x\) closer to 2. Also, it doesn’t matter if \(x\) is increasing or decreasing to 2, \(2x\) will approach 4 either way. These two conditions are required for a limit to exist.

\(x\) \(2x\)
1.9 3.8
1.99 3.98
1.999 3.998
... ...

\(x\) \(2x\)
2.1 4.2
2.01 4.02
2.001 4.002
... ...

\(\displaystyle\lim_{x\to 2} 2x = 4\) because as shown by the tables, we can make \(2x\) as close to 4 as we want by moving \(x\) closer and closer to 2. It doesn’t matter if \(x\) is increasing (1.9 → 1.99 → 1.999 → ...) or decreasing (2.1 → 2.01 → 2.001 → ...), \(2x\) will approach 4 either way.

You might have noticed that using a limit was unnecessary here because we could have just plugged in \(x = 2\) into the expression \(2x\) to get the same answer 4. However, limits become much more powerful when we can’t simply plug in the \(x\)-value into the expression.

This is the graph of \(f(x) = \frac{x^2}{x}.\) This function is undefined at \(x = 0\), but we can still describe what happens near \(x = 0\) by using a limit!

Limits are especially useful in situations where we would otherwise be dividing by 0. For example, if \(f(x) = \frac{x^2}{x}\), then \(f(0)\) is undefined, but \(\displaystyle\lim_{x\to 0} f(x) = 0\), as shown by the graph. Notice how we were still able to get an answer for the limit as \(x\) approaches 0 even though the function is undefined at \(x = 0\). That’s because, importantly, the value of \(f(0)\) is not relevant when taking the limit as \(x\) approaches 0.

Note that in some cases, a limit will not exist.

How do we relate all of this back to derivatives? Remember, to find the slope of a function at a single point, we move a second point on that function closer and closer to that first point and then we determine what the slope approaches. This means that we are taking the limit of the slope as the distance between the \(x\)-values of the two points approaches 0. Remember our slope formula:

\[ \text{slope} = \frac{\text{rise}}{\text{run}} = \frac{\text{change in }y}{\text{change in }x} \]

Let’s say that our first point has an \(x\)-value of simply \(x\), and that the distance between the \(x\)-values of our two points is \(h\). Thus, our second point has an \(x\)-value of \(x + h\). This also means that the “run”, or change in \(x\), between the two points is \(h\).

By plugging the \(x\)-values of our two points into a function \(f(x)\), we can find the \(y\)-values of the points. Our first point has a \(y\)-value of \(f(x)\) and our second point has a \(y\)-value of \(f(x + h)\). Because of this, the “rise”, or change in \(y\), between the two points is \(f(x + h) - f(x)\).

Plugging the expressions for the rise and run into the slope formula, we get:

\[ \text{slope} = \frac{\text{rise}}{\text{run}} = \frac{f(x + h) - f(x)}{h} \]

We want the second point to approach the first point, so we will be shrinking \(h\) towards 0. We will change the above expression to a limit to get a definition for the derivative of \(f(x)\) at \(x\):

\[ \lim_{h\to 0} \frac{f(x+h) - f(x)}{h} \]

This limit definition is often used to find derivatives of functions. Notice how we need to use a limit here to avoid dividing by 0. One reason limits are so important in math is because without them, we wouldn’t be able to find derivatives. Let’s jump into an example of a derivative now!


For example, consider the function \(f(x) = x^2\), a parabola. What is the slope at \(x = 2\)?

First, we’ll choose two points on the function with \(x\)-values close to 2. We’ll choose the points with \(x\)-values of 2 and 2.1, which are (2, 4) and (2.1, 4.41). (Remember, the \(y\)-values of these points are equal to \(x^2\), because that’s our function!)

Let’s find the slope of the line between these two points.

\[ \begin{flalign} \text{slope} & = \frac{\text{rise}}{\text{run}} = \frac{4.41 - 4}{2.1 - 2} = \frac{0.41}{0.1}\\ & = 4.1 \end{flalign} \]

Now let’s move the second point closer to the first point. We’ll change the \(x\)-value of the second point to 2.01, giving us the point (2.01, 4.0401). Let’s calculate the slope between this point and our other point (2, 4).

\[ \begin{flalign} \text{slope} & = \frac{\text{rise}}{\text{run}} = \frac{4.0401 - 4}{2.01 - 2} = \frac{0.0401}{0.01}\\ & = 4.01 \end{flalign} \]

If we do this slope calculation again with the even closer points (2, 4) and (2.001, 4.004001), we get a slope of 4.001.

Here is a table summarizing our results, as well as testing what happens when the second point is to the left of the first point:

1st Point 2nd Point Change in \(x\) Slope
(2, 4) (2.1, 4.41) 0.1 4.1
(2, 4) (2.01, 4.0401) 0.01 4.01
(2, 4) (2.001, 4.004001) 0.001 4.001
(2, 4) (2, 4) 0 ???
(2, 4) (1.999, 3.996001) -0.001 3.999
(2, 4) (1.99, 3.9601) -0.01 3.99
(2, 4) (1.9, 3.61) -0.1 3.9

The table shows that as the points get closer and closer, the slope gets closer and closer to 4. In fact, we can make the slope as close to 4 as we want (“arbitrarily close” to 4) by moving the points closer and closer to each other. Because of this, 4 is defined as the slope of the function at \(x = 2\).

The slope of \(f(x) = x^2\) at \(x = 2\) is 4. This is shown with the green line, which has a slope of 4. Notice how the green line just barely touches the red line at the blue point (2, 4). The green line is called a “tangent line” because it is tangent to the function.

In this case, the slope (4) is exactly twice the value of \(x\) (2). It turns out that this is true for any \(x\)-value for the function \(f(x) = x^2\): the slope at \(x\) is always equal to \(2x\).

This means that the derivative of \(f(x) = x^2\) is \(f'(x) = 2x\). (\(f'(x)\) means the derivative of \(f(x)\), and is pronounced “f-prime of x”.)

A derivative of a function is an expression that tells you the slope of that function at any particular \(x\)-value. For example, if a function has a derivative of \(2x\), at \(x = 10\), that function has a slope of 2(10) = 20.

There are many rules that can be used to quickly find the derivative of a function. One of the most important derivative rules is the power rule. This is the rule that was used to implement Newton’s method in the FISR algorithm.

The power rule states that the derivative of \(f(x) = x^\textcolor{blue}{n}\) is \(f'(x) = \textcolor{blue}{n}x^{\textcolor{blue}{n}-1}\). In other words, to get the derivative of \(f(x)\), you multiply \(f(x)\) by its exponent then subtract 1 from its exponent.

The power rule works for any real number exponent, including decimals and negative numbers! Unfortunately, proofs of the power rule are complicated, so I will not be providing a proof.

If we use the power rule on \(f(x) = x^\textcolor{blue}{2}\), we get \(f'(x) = \textcolor{blue}{2}x^{\textcolor{blue}{2}-1} = 2x^1 = 2x\), which is the same as the derivative we determined above!

Here are some examples of the power rule:

\(f(x)\) \(f'(x)\) \(f(x)\) Simplified \(f'(x)\) Simplified
\(x^{-2}\) \(-2x^{-3}\) \(\frac{1}{x^2}\) \(-\frac{2}{x^3}\)
\(x^{-1}\) \(-1x^{-2}\) \(\frac{1}{x}\) \(-\frac{1}{x^2}\)
\(x^0\) \(0x^{-1}\) \(1 \: (x \ne 0)\) \(0 \: (x \ne 0)\)
\(x^\frac{1}{2}\) \(\frac{1}{2}x^{-\frac{1}{2}}\) \(\sqrt{x}\) \(\frac{1}{2\sqrt{x}}\)
\(x^1\) \(1x^0\) \(x\) \(1\)
\(x^2\) \(2x^1\) \(x^2\) \(2x\)
\(x^3\) \(3x^2\) \(x^3\) \(3x^2\)

One more important thing about derivatives: When taking the derivative of a function, addition or subtraction of constants can be ignored. For example, the derivative of \(f(x) = x^2 - 4\) is the same as the derivative of \(f(x) = x^2\) because 4 is a constant. Here’s why:

The red \((f(x) = x^2)\) and blue \((f(x) = x^2 - 4)\) functions have the same derivative. The green and purple lines show the slope of each function at \(x = 2\). Notice how the two slopes are the same. The derivative of both functions is \(2x\).

Adding or subtracting a constant to or from an expression does not change its derivative, since it just moves the graph up or down by a constant amount. This does not change the slope of the function at any \(x\)-value.

Now you’re ready to finish reading More Details on Newton’s Method!


Credits / Special Thanks

This page uses the MathJax JavaScript library to display some mathematical formulas and equations.

I used Pixlr, Google Drawings, and Desmos Graphing Calculator to create some of the diagrams on this page.

This font was created by Manfred Klein. (Special thanks to the game Antimatter Dimensions by Hevipelle for introducing me to this font.)

Syntax highlighting in the “Algorithm Code” section was inspired by Visual Studio Code’s light theme syntax highlighting.

Thanks to mathsisfun.com for a great explanation of derivatives that helped me create and helped inspire my own explanation of derivatives.

And finally, special thanks to everyone who has contributed to the development of the FISR algorithm. I love this algorithm because it’s a perfect example of the creativity involved in computer science. Usually, creativity is associated with subjects like art and literature, but we can’t forget that the world of computer science is also filled with creative algorithms that make our world so much better!