What’s the use of the reserved word "Yield"?

48

What is the purpose of the (reserved) keyword yield?

When and where is it used?

  • http://msdn.microsoft.com/pt-br/library/9k7k7cf0.aspx

  • From what I could understand, it would be used in cases where I need more than one return, but I confess I could not see a good application in the examples at the link I gave as a reference

3 answers

52


TL;DR

We can say it is syntactic sugar, available since C# 2, for controlling a program's execution flow while maintaining state.

It is widely used to give better efficiency and abstraction when working through a sequence of data (technically the sequence could have a single element, but the fewer the elements, the less interesting it becomes).

For backward-compatibility reasons its syntax is always yield return, the most commonly used form, or yield break.

The first suspends the method and returns a value to the caller, as you would expect from a return. But in this case the value is wrapped in a data structure that has an iterator, creating a generator that records where it stopped so that it can resume from there.

The second ends the method for good, terminating the iterator.

In fact, yield is a limited form of continuation.

It controls execution through hidden state that records which point of the enumeration the program has reached, so the next call can continue from where it left off. Note that the method returns an enumerable type and not the element type itself (if it is meant to produce an int, it actually returns an IEnumerable<int>). This enumerable structure is what allows execution to continue from where it stopped.

Put another way, a method with yield returns a value without leaving the method. Of course it does leave, but it leaves knowing where it stopped and knows it has to go back there when it is called again, which gives the impression that it never left.

It generates what we might call a temporary virtual data collection that is materialized later when the data is actually needed.
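
The deferred behaviour described above can be observed with a small sketch. The class LazyDemo and its Log list are hypothetical names made up for this illustration; the log simply records the order in which things happen.

```csharp
using System;
using System.Collections.Generic;

static class LazyDemo
{
    // Records what happens, so the order of execution can be inspected.
    public static readonly List<string> Log = new List<string>();

    public static IEnumerable<int> Numbers()
    {
        Log.Add("started");
        yield return 1;
        Log.Add("resumed after 1");
        yield return 2;
        Log.Add("finished");
    }

    static void Main()
    {
        IEnumerable<int> sequence = Numbers(); // nothing in Numbers() has run yet
        Console.WriteLine(Log.Count);          // 0

        foreach (int n in sequence)
        {
            Log.Add("got " + n);
        }

        // started, got 1, resumed after 1, got 2, finished
        Console.WriteLine(string.Join(", ", Log));
    }
}
```

Calling Numbers() only creates the iterator object; the body starts running on the first step of the foreach and resumes after each yield return.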

How it works

This Jon Skeet article shows how it works:

using System;
using System.Collections;

class Test {
    static IEnumerator GetCounter() {
        for (int count = 0; count < 10; count++) {
            yield return count;
        }
    }
}

I put it on GitHub for future reference.

It compiles to this:

internal class Test {
    // Note how this doesn't execute any of our original code
    private static IEnumerator GetCounter() {
        return new <GetCounter>d__0(0);
    }

    // Nested type automatically created by the compiler to implement the iterator
    [CompilerGenerated]
    private sealed class <GetCounter>d__0 : IEnumerator<object>, IEnumerator, IDisposable {
        // Fields: there'll always be a "state" and "current", but the "count"
        // comes from the local variable in our iterator block.
        private int <>1__state;
        private object <>2__current;
        public int <count>5__1;

        [DebuggerHidden]
        public <GetCounter>d__0(int <>1__state) {
            this.<>1__state = <>1__state;
        }

        // Almost all of the real work happens here
        private bool MoveNext() {
            switch (this.<>1__state) {
                case 0:
                    this.<>1__state = -1;
                    this.<count>5__1 = 0;
                    while (this.<count>5__1 < 10) {
                        this.<>2__current = this.<count>5__1;
                        this.<>1__state = 1;
                        return true;
                    Label_004B:
                        this.<>1__state = -1;
                        this.<count>5__1++;
                    }
                    break;

                case 1:
                    goto Label_004B;
            }
            return false;
        }

        [DebuggerHidden]
        void IEnumerator.Reset() {
            throw new NotSupportedException();
        }

        void IDisposable.Dispose() {
        }

        object IEnumerator<object>.Current {
            [DebuggerHidden]
            get {
                return this.<>2__current;
            }
        }

        object IEnumerator.Current {
            [DebuggerHidden]
            get {
                return this.<>2__current;
            }
        }
    }
}

There is no magic here; it is just another design pattern, so useful that it was built into the language as a convenience. The code is turned into a state machine.

In C# it is always implemented through IEnumerable or IEnumerator (as in the example above), or through their generic versions.

Although it can be used on its own, it is most often used to abstract a loop. Each call into the method performs one step of that loop. A developer who sees a return intuitively thinks the method has finished completely. It does exit, but the iterator knows which point of the loop it had reached. When the method is resumed, it continues from that point instead of starting the loop from scratch. This repeats on every call to the method with yield until the whole loop has finished.

This is why it is very common for a method that contains a yield to be called as part of another loop.

This process is called lazy evaluation.
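
To make the "resume where it left off" idea concrete, here is a minimal sketch that drives an iterator by hand, which is roughly what foreach does behind the scenes. The names ManualIteration and Counter are made up for this illustration.

```csharp
using System;
using System.Collections.Generic;

static class ManualIteration
{
    public static IEnumerable<int> Counter()
    {
        for (int count = 0; count < 3; count++)
        {
            yield return count;
        }
    }

    static void Main()
    {
        using (IEnumerator<int> e = Counter().GetEnumerator())
        {
            // Each MoveNext() runs the method body up to the next yield return.
            while (e.MoveNext())
            {
                Console.WriteLine(e.Current); // 0, then 1, then 2
            }
        }
    }
}
```

When the for loop inside Counter() ends, MoveNext() returns false and the while loop in the caller stops.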

Another good example of working in a very simple sequence can be found in that article.

Usage

It is often used to defer the execution of code. This improves performance in several scenarios because, instead of running a whole loop over a sequence of data up front, it goes step by step only as far as it needs to. And it stops for good if the caller finds what it needs without going through all the values. In some cases the whole iteration can be avoided. In addition, the time spent iterating is paid only at the moment of actual use.

It is a powerful feature that few understand how to use, especially when it comes to writing a loop this way. People find concrete things easier to understand, and this is an abstraction that can seem unnecessary. Using a method that contains yield is simple, but you have to understand that not all the data arrives at once. Writing a method that creates a virtual iteration with yield is a little more complicated and needs to be well thought out.

An example:

// Display powers of 2 up to the exponent 8:
foreach (int i in Power(2, 8)) {
    Console.Write("{0} ", i);
}

public static IEnumerable<int> Power(int number, int exponent) {
    int counter = 0;
    int result = 1;
    while (counter++ < exponent) {
        result = result * number;
        yield return result;
    }
}

Run this code with the debugger attached and step through the program to understand how it executes. In the debugger you can see that after the first call execution always resumes inside the while. I strongly advise doing this to follow along. It is a desk check, something programmers no longer seem to learn to do.

Practical use

A good example can be found in that answer on Programmers.SE:

// returns a million items that can be iterated over
List<object> GetAllItems() {
    List<object> millionCustomers = new List<object>();
    database.LoadMillionCustomerRecords(millionCustomers);
    return millionCustomers;
}

// caller
int num = 0;
foreach(var itm in GetAllItems())  {
    num++;
    if (num == 5)
        break;
}
// Note: a million items are returned, but only 5 are used.

Now see with yield:

// returns one item at a time, on demand
IEnumerable<object> IterateOverItems() {
    for (int i = 0; i < database.Customers.Count(); ++i)
        yield return database.Customers[i];
}

// caller
int num = 0;
foreach(var itm in IterateOverItems())  {
    num++;
    if (num == 5)
        break;
}
// only 5 of the million existing items are processed

Notice that the code runs only until it reaches the count it is looking for, 5 in this case. The first version loads and returns all the data, a million records. The second goes one item at a time.

Note that the calling method does not know whether the called method contains a yield or not. This is an implementation detail. Still, it is good for the developer to be aware of how it will run.

On the other hand, the called method does not know when it will no longer be needed. It is prepared to produce everything if necessary. The calling method is the one that determines when to stop.
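
Since the database object in the snippets above is not defined, here is a self-contained sketch of the same idea. The class EarlyExit, the method Items and the counter Produced are hypothetical stand-ins used only to count how many elements the iterator really produces.

```csharp
using System;
using System.Collections.Generic;

static class EarlyExit
{
    // Counts how many items the iterator actually produced.
    public static int Produced;

    // Hypothetical stand-in for a large data source.
    public static IEnumerable<int> Items(int total)
    {
        for (int i = 0; i < total; i++)
        {
            Produced++;
            yield return i;
        }
    }

    static void Main()
    {
        int num = 0;
        foreach (int item in Items(1000000))
        {
            num++;
            if (num == 5)
                break; // the caller decides when to stop
        }
        Console.WriteLine(Produced); // 5
    }
}
```

The iterator was ready to produce a million items, but because the caller broke out of the loop, only five were ever generated.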

Imagine that this list is being returned to a program with a user interface. Without yield, all the data has to be returned before it can be shown on the interface. This may take a few seconds, or in extreme cases even minutes. The user will think the program has frozen.

With yield it is possible to take the data element by element and start showing it to the user without blocking the interface.

As this is not always simple to do, C# created a new abstraction, a new piece of syntactic sugar, async and await, which helps to assemble a more complex state machine.

Another example

In the same answer there is another interesting example, showing that you can combine two lists into a single sequence more efficiently, since each loop hands over one element at a time and no new list is built (the method yields all of list1, then all of list2):

IEnumerable<object> EfficientMerge(List<object> list1, List<object> list2) {
    foreach(var o in list1) 
        yield return o; 
    foreach(var o in list2) 
        yield return o;
}

In the answer there are other examples that can be observed.
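
The method above yields every element of list1 before any element of list2. If the goal is to really alternate between the two sequences, a variant could be sketched as follows. The name Interleave and its behaviour when one sequence is longer are assumptions of this sketch, not something taken from the referenced answer.

```csharp
using System;
using System.Collections.Generic;

static class Interleaving
{
    // Alternates between the two sequences; when one runs out,
    // the remaining elements of the other are yielded in order.
    public static IEnumerable<T> Interleave<T>(IEnumerable<T> first, IEnumerable<T> second)
    {
        using (IEnumerator<T> a = first.GetEnumerator())
        using (IEnumerator<T> b = second.GetEnumerator())
        {
            bool hasA = a.MoveNext();
            bool hasB = b.MoveNext();
            while (hasA || hasB)
            {
                if (hasA)
                {
                    yield return a.Current;
                    hasA = a.MoveNext();
                }
                if (hasB)
                {
                    yield return b.Current;
                    hasB = b.MoveNext();
                }
            }
        }
    }

    static void Main()
    {
        var odds = new List<int> { 1, 3, 5 };
        var evens = new List<int> { 2, 4 };
        Console.WriteLine(string.Join(" ", Interleave(odds, evens))); // 1 2 3 4 5
    }
}
```

As with the concatenating version, no intermediate list is created; each element is handed to the caller as soon as it is reached.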

Performance comparison

I made an example that generates a Fibonacci sequence using both approaches.

In it you can see that yield is great for fetching a limited amount of data. But fork the code and make it read the entire sequence. You will see that the performance gets worse. All this flow control and internal state has a cost. If you are going to read the whole sequence, yield is more expensive.

Still, it can be used to improve abstraction. Performance is not always the most important thing.

See it running on dotNetFiddle.
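
The linked example is not reproduced here, so the following is only a sketch of what the two approaches to a Fibonacci sequence might look like; it is not the author's code, and the names Eager and Lazy are made up for this illustration. It shows the shape of each version and does not measure performance.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

static class Fibonacci
{
    // Eager version: builds the whole list before returning.
    public static List<long> Eager(int count)
    {
        var result = new List<long>(count);
        long a = 0, b = 1;
        for (int i = 0; i < count; i++)
        {
            result.Add(a);
            long next = a + b;
            a = b;
            b = next;
        }
        return result;
    }

    // Lazy version: computes each term only when the caller asks for it.
    public static IEnumerable<long> Lazy()
    {
        long a = 0, b = 1;
        while (true)
        {
            yield return a;
            long next = a + b;
            a = b;
            b = next;
        }
    }

    static void Main()
    {
        Console.WriteLine(string.Join(" ", Eager(10)));       // 0 1 1 2 3 5 8 13 21 34
        Console.WriteLine(string.Join(" ", Lazy().Take(10))); // 0 1 1 2 3 5 8 13 21 34
    }
}
```

The lazy version has no upper bound of its own; the caller decides how many terms to take, which is where it pays off when only a few are needed.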

Abstraction

Let’s say you have to fetch customers from a database. The logic to fetch that data can be a little complex. For code that only wants to have the customers in hand, it does not matter how the data is fetched, only that it is available.

So the solution every good programmer uses is to encapsulate this complex logic in a method, and the code that needs the customers just calls this method to get the list without worrying about the implementation, which can even change without the consumer of the list having to care.

This is an abstraction. The problem is that it always returns the whole list. What if you want to apply a filter? You would have to fetch all the data and then walk through it to get what matters.

With yield you get the abstraction efficiently: the filter can be applied outside the implementation that actually fetches the data from the database, and only the necessary data is traversed.
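
A minimal sketch of that idea follows. The class CustomerSource, the hard-coded names and the counter Loaded are hypothetical; the array stands in for whatever complex logic reads the database.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

static class CustomerSource
{
    public static int Loaded; // how many records were actually read

    // Hypothetical stand-in for the complex logic that reads from a database.
    public static IEnumerable<string> GetCustomers()
    {
        string[] rows = { "Ana", "Bruno", "Carla", "Daniel", "Eva" };
        foreach (string row in rows)
        {
            Loaded++;
            yield return row;
        }
    }

    static void Main()
    {
        // The filter lives outside the method; reading stops at the first match.
        string firstWithC = GetCustomers().First(name => name.StartsWith("C"));
        Console.WriteLine(firstWithC); // Carla
        Console.WriteLine(Loaded);     // 3
    }
}
```

The consumer applies its own filter, yet only the first three records are ever read, because the iterator stops as soon as the caller has what it needs.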

yield break

See the code taken from that answer on SO:

int i = 0;
while (true) {
    if (i < 5) {
        yield return i;
    } else {
        // when it gets here it leaves the loop without returning a value
        yield break;
    }
    i++;
}
Console.Out.WriteLine("never runs");

In this case yield is needed alongside break to indicate that the iterator's control state must be terminated. It does not just leave the while; it terminates the iterator completely, without returning any value. It is a way of ending the loop that does not allow the same iteration to continue.

This is how the method that controls the iteration with yield says that it has no more elements to be read. As a result, the loop in the calling method ends as well.

References

Official Microsoft documentation.

I have answered something about this before, and there is already some explanation in that reply, which even has a link showing the operation in detail.

That article starts a series that demonstrates the inner workings, by one of the best developers on the face of the Earth.

That question talks about a specific use of it.

I also answered about another use of it in that reply.

And there is one more of the most important uses in the language: all of LINQ is based on it, so expressions are evaluated not when they are declared but when the data is used, as demonstrated in that reply.

A good way to learn some uses is to look at the LINQ source code in .NET. You can start with Where(). Notice the use of yield. This is why several chained LINQ methods are executed in sequence on the same item, and only after all the methods of the LINQ expression have run does it move on to the next element of the collection.
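
To illustrate that item-by-item behaviour without reading the real source, here is a simplified sketch. MyWhere and MySelect are hypothetical names written for this illustration; the real LINQ methods include argument checks and optimizations that are left out here.

```csharp
using System;
using System.Collections.Generic;

static class MyLinq
{
    // Simplified illustration of a filtering operator built on yield.
    public static IEnumerable<T> MyWhere<T>(this IEnumerable<T> source, Func<T, bool> predicate)
    {
        foreach (T item in source)
        {
            if (predicate(item))
                yield return item;
        }
    }

    // Simplified illustration of a projection operator built on yield.
    public static IEnumerable<TResult> MySelect<T, TResult>(this IEnumerable<T> source, Func<T, TResult> selector)
    {
        foreach (T item in source)
            yield return selector(item);
    }
}

static class Program
{
    static void Main()
    {
        var numbers = new[] { 1, 2, 3, 4 };
        var query = numbers
            .MyWhere(n => { Console.WriteLine("filter " + n); return n % 2 == 0; })
            .MySelect(n => { Console.WriteLine("project " + n); return n * 10; });

        foreach (int n in query)
            Console.WriteLine("result " + n);
        // filter 1, filter 2, project 2, result 20, filter 3, filter 4, project 4, result 40
    }
}
```

Nothing is printed when the query is declared; each element travels through the whole chain before the next one is even looked at.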

Another interesting article to read.

Several articles on the subject written by those who made it all work in the compiler.

  • Good explanation.

12

The keyword yield was added to the language to help build iterators of type IEnumerable.

In other words, there is a "hidden" state machine there that "remembers" the last position you were in within the context of your iterator. Therefore, when you run the code below, you will get the next value until you finish the iterator:

public IEnumerable<int> IdadesImparesJovens()
{
    yield return 1;
    yield return 3;
    yield return 5;
    yield return 7;
    yield return 9;
    yield return 11;
    yield break;  // not necessary, but interesting
}

public void VerificarIdades()
{
    foreach(int i in IdadesImparesJovens())
    {
        Console.WriteLine(i.ToString());
    }

}

Of course this is a very simple example, meant only for debugging. Write a console program and try debugging it to see what happens on each iteration.

In this case, the break was not necessary. But let’s assume (as is usually the case when we use yield) that you have more complex logic. Then it would be useful to use yield break to end the sequence early.

Advantages:

  • yield gives you the ability to read as you go (lazy loading). Older constructs required an entire array in memory to iterate over. With yield you can iterate and read the information as needed (LINQ uses this construct a lot under the hood);

  • Helps preserve state during iterations

11

yield return produces an object that implements the IEnumerable interface, namely an iterator.

Here is an example where yield return is used to return a list of validation messages:

using System.IO;
using System;
using System.Collections.Generic;

class Program
{
    static void Main()
    {
        Entidade novaEntidade = new Entidade();
        var mensagens = novaEntidade.ObterMensagensValidacao();
        foreach(var msg in mensagens)
        {
            Console.WriteLine(msg);
        }
    }

}

public class Entidade
{
    public string Nome { get; set; }

    public string Telefone { get; set; }

    public IEnumerable<string> ObterMensagensValidacao()
    {
        if(String.IsNullOrWhiteSpace(this.Nome))
        {
            yield return "The name must be filled in.";
        }

        if(String.IsNullOrWhiteSpace(this.Telefone))
        {
            yield return "The phone number must be filled in.";
        }
    }
}

yield break, in turn, can be used to exit an iteration. It specifies that an iterator has come to an end.

Example:

public IEnumerable<int> ObterNumeros()
{
    int i = 0;
    while (true) {
        if (i < 5) {
            yield return i;
        } else {
            yield break;
        }
        i++;
    }
}

When iterating over the result of the method above, the numbers from 0 to 4 will be printed, because of the yield break.

    var numeros = novaEntidade.ObterNumeros();
    foreach(var num in numeros)
    {
        Console.Write(num + " ");
        // Prints: 0 1 2 3 4
    }
