How to simulate "tail recursion" in C#?

Asked

Viewed 1,141 times

15

No. Net, I know it is possible to make cause calls because the compiler of F#, when optimizing code, transforms a function with tail recursion into a function with a loop, thus avoiding stack bursts. To make it clearer I will use as an example a factorial function.

In F#:

let rec fatorial_iter n r : int64 =
  if n <= 1L then r else fatorial_iter (n - 1L) (n * r)

let fatorial n = fatorial_iter n 1L

This prevents stack bursts as the stack will always be the same size after optimization. Illustrating a call to fatorial(5):

fatorial(5, 1)   -> fatorial(5 - 1, 5 * 1)  ->
fatorial(4, 5)   -> fatorial(4 - 1, 4 * 5)  ->
fatorial(3, 20)  -> fatorial(3 - 1, 3 * 20) ->
fatorial(2, 60)  -> fatorial(2 - 1, 2 * 60) ->
fatorial(1, 120) -> 120

If we try to do this in C#, we can also make the recursive call on the tail, as in this example:

static long Fatorial(int n, long r = 1) {
    return n <= 1 ? r : Fatorial(n - 1, n * r);
}

But the C# compiler cannot optimize this code (at least not yet, using . Net Framework 4.5).

Is there any way, without losing the immutability of the code, to make this function cannot cause a stack overflow?

  • I found a article that shows this by manipulating the IL, but it is half out of hand to edit the IL for every function. Another solution not to occur stackoverflow is to use a Thread with the parameter maxStackSize in the builder. Depending on the recursion depth this does not help.

4 answers

7


As others have mentioned, C# doesn’t give you a surety that a tail recursion will be optimized. I’m a huge fan of tail recursion but I think you’re focusing a little bit on the wrong direction - I think flexible flow control is a more crucial point of tail recursion than immutability and it changes the way we approach this problem a little bit.

In a simple loop case with your example, using tail recursion turns out to be as changeable and low-level as writing your code using Gotos. At each step you have to update the accumulator and the accountant and say that you will make a jump back to the beginning of the loop.

  int acc = 1;
  int i = N;
loop:
   if (i >= 0) {
     // finge que pode usar atribuição múltipla estilo Python
     i, acc = i-1, acc*i; goto loop;
   } else {
     return acc;
   }

The only advantage of tail recursion compared to Gotos is the assignment of more than one variable in one step and that the compiler will let you know if you forget to tell the new value to one of the variables. Anyway, a for loop ends up being more structured and high level, since you just need to take care of logic to update the product and the counter update and the Gotos comes for free. It is somewhat similar to programming using a fold instead of tail recursion

int acc = 1;
for (int i = N; i >= 0; i--){
   acc *= i;
}

Only tail recursion doesn’t just serve to make simple loops in which a function calls itself. The part where tail recursion makes the most difference is when you have more than one mutually recursive function. A forced example is this state machine defined in Haskell:

par 0 = True
par n = impar (n-1)

impar 0 = False
impar m = par (m-1)

The equivalent of this without recursion of cause is a state machine:

  int n;
  int m;
par:
  if (n == 0) {
    return true;
  } else {
    m = n - 1;
    goto impar;
  }
impar:
  if (n == 0) {
    return false;
  } else {
    n = m - 1;
    goto par;
  }

However in this version all functions have to be combined in a code snippet only, which hurts the encapsulation. For example, we need to declare all parameter variables at the top, which makes it possible for them to be used in the wrong place.

In addition, tail recursion is also present when we call a function that has been passed to us as a parameter. This is equivalent to a computed drop and cannot be translated into a static loop.

In the latter two cases it is when the language support for tail recursion is most needed. If you have to do something equivalent, the closest will be to use a "trampoline" pattern, but it is inefficient and quite boring to use.

  • I liked the answer. In fact, I was looking for cause recursion precisely because of the more concise code when compared to the use of jumps and loops. The concept of the trampoline I found interesting, but apparently it is more a loss of productivity than wearing laces.

4

This type of optimization is present only in 64-bit JIT in Release mode.

Test your x64 code in Release mode and Stackoverflow will no longer occur.

And of course, never trust a JIT optimization. You don’t know which framework, platform, or architecture your code will run on.

Code tested here:

class Program
{
    static void Main(string[] args)
    {
        Console.WriteLine(Fatorial(2000000));
    }

    static long Fatorial(int n, long r = 1)
    {
        return n <= 1 ? r : Fatorial(n - 1, n * r);
    }
}
  • Now I’m confused. If I compile my code for Framework 4.5 as Release, it can run on older versions of Framework?

  • Depending on what you are doing, yes... This code is . net 1.0 code, so the IL is the same. I tested this on . net 3.5 (it was possible to run in 2.0 without problems some things that were not new, and even some new ones like Extension Methods, since no new IL was generated for this). The fact is that there is Mono, x86, x64, etc...

  • 2

    And by the way, I said it only works on Release x64 because every architecture (x64/x86/Arm) has its own JIT. There are things that are only optimized in a specific JIT, like this case. Do exactly the same code and only change x64 to Anycpu and will give exception. Switch to DEBUG and give exception.

  • Got it. Unfortunately I’m not x64 here to test, and we can’t always count on it. Isn’t there a method that doesn’t depend on x64? That I know other languages don’t depend on x64 (Which wouldn’t make much sense, either)...

  • As I said, NEVER trust optimizations, as they may not be present. Here the only way is to remove recursion. This would ensure that the method would not be called and then the return address would not be placed on the stack and the same would not pop =)

1

In this case the only guaranteed way not to cause a StackOverflow would be to use a non-rerecursive algorithm. For example:

static long Fatorial(int n) 
{
    long r = 1;

    for (; n > 1; n--)
    {
        r *= n;
    }

    return r;
}
  • I made an algorithm similar to these, but my idea was to use immutability even. for (long i = n, j = 1; ; j *= i--) if (i <= 1) return j;

0

I may be talking nonsense right now, but I believe Jcködel’s answer is incorrect.

I believe that you only managed to compute such a large factorial number precisely because you are using the 64-bit architecture and not because of any optimization.

NOTE: I had to post as an answer, I can’t "comment" on other answers yet, I don’t have enough reputation.

  • You are giving pertinent information to the question that is not contained in another answer, so I believe there is no problem. I don’t have an x64 system to test, but if you do, you can compare the two generated Ils to see if there’s a difference.

  • 1

    Are you sure? The stack overflow has to do with the size of the input, not the size of the response obtained (and the factorial will overflow in a 64-bit int in no time). And this article here indicates that the 32-bit and 64-bit Jits are different: http://blogs.msdn.com/b/jomo_fisher/archive/2007/09/19/adventures-in-f-tail-recursion-in-three-languages.aspx?Redirected=true

Browser other questions tagged

You are not signed in. Login or sign up in order to post.