Yield does not return data

Calling the method should return an enumerable of HTML nodes.

I’m using the HTML Agility Pack to read an HTML file. The same logic works as expected when I remove the yield and add the nodes to a list manually:

    HtmlNode slideCineAll = GetNodeById(cinema, "slide-cine-all");
    HtmlNode section = GetNodeByName(slideCineAll, "section");
    IEnumerable<HtmlNode> articles = GetNodesByName(section, "article");

    private static IEnumerable<HtmlNode> GetNodesByName(HtmlNode root, string node)
    {
        foreach (HtmlNode link in root.ChildNodes)
        {
            if (link.Name.Equals(node))
            {
                yield return link;
            }
        }
    }

    private static List<HtmlNode> GetNodesByNameList(HtmlNode root, string node)
    {
        List<HtmlNode> nodes = new List<HtmlNode>();
        foreach (HtmlNode link in root.ChildNodes)
        {
            if (link.Name.Equals(node))
            {
                nodes.Add(link);
            }
        }
        return nodes;
    }

This is what the variable holds after the method is called:

{ConsoleApplication1.Program.GetNodesByName}
node: null
root: null
System.Collections.Generic.IEnumerator<HtmlAgilityPack.HtmlNode>.Current: null
System.Collections.IEnumerator.Current: null

Expected result

values
Count = 20
[0]: {Name: "article"}
.
.
.
values[0]
_attributes: {HtmlAgilityPack.HtmlAttributeCollection}
_childnodes: {HtmlAgilityPack.HtmlNodeCollection}
_endnode: {Name: "article"}
.
.
.

This is the structure I’m traversing. With GetNodesByName or GetNodesByNameList I can retrieve a list of any kind of HTML node in the structure:

<div id="slide-cine-all">
<section>
    <article>
        <!--more elements-->
    </article>
    <article>
        <!--more elements-->
    </article>
    <article>
        <!--more elements-->
    </article>
    <article>
        <!--more elements-->
    </article>
    <article>
        <!--more elements-->
    </article>
    <article>
        <!--more elements-->
    </article>
</section>
</div>

As described at the beginning, the GetNodesByNameList method returns all the items (in this case the article elements found in the file structure), but the same doesn’t happen when I use yield.

  • Since we don’t know what the expected result is, it’s difficult to help. We’re not even seeing the code that produces the variable you’re referencing. Give us more information so we understand the problem.

  • I added some more information.

  • Much improved, but there is still not enough information to identify what is wrong. I don’t know, for example, what role GetNodesByNameList plays in the problem. And information is certainly missing about how you expect to arrive at the expected result and what data is being used to produce it. Alternatively, perhaps you can describe the problem you are encountering more clearly so that someone can offer a solution.

  • I added some more information and the data used.

  • Does the problem occur only when searching for "article"? Or with "Section" too? Are there other calls before it? What does GetNodeById look like? It is difficult to see the whole picture. The problem may be caused by earlier problems or by parts you are not showing. I think, for example, that the method with yield works but does not return what you expect. But that’s still just a guess; I don’t know if I understand the whole problem.

  • The first two lines are working correctly. Note that GetNodesByNameList has the same structure as GetNodesByName, except that the latter uses yield. With the list, all the articles are returned; with yield, that doesn’t happen. Is it clearer now?

  • I think I understand the problem. It seems to me that you don’t understand how yield works. It doesn’t work the way you seem to expect. Do you know that it is a return, and that it hands control back to the caller in the first iteration of the foreach? Of course, the iteration can be picked up where it left off when the next element is requested. I’m not sure what your goal is, but it seems to me that in this case you can’t use yield.

  • The first call to the method returns "Section" and nothing else. Then you search for "article" and there is nothing to find. It would probably be found if the search for "Section" were advanced further, since then the whole node would be read. Without yield the whole node comes back, so it works. You have to know that in a method that yields inside a loop, each request for the next element runs one step of that loop. If you want 10 links, the next element has to be requested 10 times.


1 answer



See the documentation for yield return. It doesn’t do what you seem to expect. The return part is important: when execution reaches that line, the method hands a single element back to the caller and stops there until the next element is requested. So at that point your code has produced only one of the nodes in your HTML document.

You can get more, but the consumer has to keep asking for the next element. Each request resumes execution from where it stopped. The yield creates what is called a generator. It controls execution through hidden state that records which point of the enumeration the program has reached, so the next request can continue from where it left off. Note that the method returns an enumerable type and not the type of the thing you actually want. This enumerable structure is what controls the continuity of execution.
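To make that hidden state visible, here is a minimal sketch that drives an iterator by hand instead of through foreach. The names GeneratorDemo and Numbers are hypothetical, and the log list exists only to record when each part of the method body actually runs.

```csharp
using System;
using System.Collections.Generic;

public static class GeneratorDemo
{
    // Hypothetical iterator: it writes to 'log' so we can see when each part of the body runs.
    public static IEnumerable<int> Numbers(List<string> log)
    {
        log.Add("start");
        yield return 1;          // execution is suspended here until the next MoveNext()
        log.Add("after 1");
        yield return 2;          // suspended again
        log.Add("end");
    }

    public static void Main()
    {
        var log = new List<string>();

        // Calling the method only creates the generator object; no body code has run yet.
        IEnumerable<int> sequence = Numbers(log);
        Console.WriteLine(log.Count);              // 0

        using (IEnumerator<int> e = sequence.GetEnumerator())
        {
            e.MoveNext();                          // runs the body up to the first yield return
            Console.WriteLine(e.Current);          // 1
            e.MoveNext();                          // resumes after the first yield return
            Console.WriteLine(e.Current);          // 2
            Console.WriteLine(e.MoveNext());       // False: the body ran to its end
        }

        Console.WriteLine(string.Join(", ", log)); // start, after 1, end
    }
}
```

A foreach loop performs the same GetEnumerator/MoveNext/Current steps for you, which is why the consumer normally never sees them.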

So your problem is that you get a single node and then try to search the result as if the other elements were already there. Of course they are not: you haven’t read them yet. That’s where the problem is.

In the other method, which works because it has no yield, the loop runs to completion, scans all the nodes, and returns the complete list, which can then be searched without any problem. Everything you need is there.

I didn’t understand the problem as a whole, but I think in this case the yield is getting in the way. I advise using it only when you fully understand how it works. It is excellent, but it is not a solution to every problem. I am not saying this problem cannot benefit from it (you gain efficiency by not sweeping the whole structure, only what is needed at the moment), but some things would need to change in the code that consumes this method. In practice, when you use a yield you will, roughly speaking, have another, external loop that scans the entire structure you are searching (of course, you can also drive it manually).
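As a rough illustration of that external loop, the sketch below consumes a yield-based method shaped like GetNodesByName from the question. The Node class is a hypothetical stand-in for HtmlAgilityPack’s HtmlNode so the example runs without the library, and the use of LINQ’s ToList() to materialize the results is one possible choice, not something the question’s code does.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for HtmlNode, reduced to the two members the question uses.
public class Node
{
    public string Name;
    public List<Node> ChildNodes = new List<Node>();
    public Node(string name) { Name = name; }
}

public static class ConsumerDemo
{
    // Same shape as GetNodesByName in the question.
    public static IEnumerable<Node> GetNodesByName(Node root, string name)
    {
        foreach (Node child in root.ChildNodes)
        {
            if (child.Name.Equals(name))
            {
                yield return child;
            }
        }
    }

    public static void Main()
    {
        var section = new Node("section");
        section.ChildNodes.Add(new Node("article"));
        section.ChildNodes.Add(new Node("#text"));
        section.ChildNodes.Add(new Node("article"));

        // The external loop: each pass asks the generator for its next match.
        int seen = 0;
        foreach (Node article in GetNodesByName(section, "article"))
        {
            seen++;
        }
        Console.WriteLine(seen);           // 2

        // ToList() runs that loop internally and stores the results,
        // giving the same kind of object the list-based method returns.
        List<Node> articles = GetNodesByName(section, "article").ToList();
        Console.WriteLine(articles.Count); // 2
    }
}
```

Once the results are in a list, inspecting the variable in the debugger shows a Count and the elements, as in the expected result from the question.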

I suggest inspecting the data and stepping through the execution in the debugger to better understand what is happening in the code. It can help you learn how yield works and see your problem clearly, and who knows, you may even find a better solution. Or explain the problem in another way, since I couldn’t understand it better than this.

To help you understand, run the following code, taken from this answer on SO:

public void Consumer() {
    foreach(int i in Integers()) {
        Console.WriteLine(i.ToString());
    }
}

public IEnumerable<int> Integers() {
    yield return 1;
    yield return 2;
    yield return 4;
    yield return 8;
    yield return 16;
    yield return 16777216;
}

Or this one, from another answer on SO:

// Display powers of 2 up to the exponent 8:
foreach (int i in Power(2, 8)) {
    Console.Write("{0} ", i);
}

public static IEnumerable<int> Power(int number, int exponent) {
    int counter = 0;
    int result = 1;
    while (counter++ < exponent) {
        result = result * number;
        yield return result;
    }
}

I put it on GitHub for future reference.

Did you notice that the consumer code always ends up repeating the requests, in a way mirroring the repetition that exists in the generator code (the one with yield)? The great advantage of yield in most situations is that it lets you create better abstractions.

See a quick explanation of the internals of the command, and a more complete explanation.

  • I get it. This is similar to LINQ, where the query is stored in the variable and the data itself is retrieved only when the variable containing the query is used. It crossed my mind when I saw that the variable held a reference to the method GetNodesByName instead of an IEnumerable<HtmlNode>. Is that what is being described?

  • LINQ is all based on yield. I’m not sure I understand your question. It would be difficult for me to test here, but you can change the code and do the internal search over all the elements of "Section" inside the method itself when searching for "article". It is not so simple, especially in a generic way, but it is possible.

  • That’s exactly it, @bigown. I tested it now and it worked. Before posting the question, I stopped the execution when I saw that the object did not contain what I expected, but it is as you described: I had not yet read the elements. If I had let the code keep running, I would not have noticed anything. But it served as a learning experience. :)
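The behaviour described in these comments can be reproduced with a small sketch. It assumes a hypothetical OnlyArticles method over a plain list of strings rather than the HTML Agility Pack, and shows that the variable holds a generator, not data: the source is read only when the result is enumerated.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class DeferredDemo
{
    // Hypothetical filter with the same yield pattern as GetNodesByName.
    public static IEnumerable<string> OnlyArticles(List<string> names)
    {
        foreach (string name in names)
        {
            if (name == "article")
            {
                yield return name;
            }
        }
    }

    public static void Main()
    {
        var names = new List<string> { "article", "div" };

        // Nothing has been read from 'names' yet; 'query' is a compiler-generated
        // iterator object, which is what the debugger displayed in the question.
        IEnumerable<string> query = OnlyArticles(names);

        // This element is added after the call but before the first enumeration.
        names.Add("article");

        // Count() enumerates the sequence now, so it sees the element added above.
        Console.WriteLine(query.Count());                            // 2

        // The variable is not a List<string>, even though it yields strings.
        Console.WriteLine(query.GetType() == typeof(List<string>));  // False
    }
}
```

Pausing in the debugger right after the assignment to query corresponds to the situation in the question: the iterator exists, but none of its elements have been produced yet.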
