How does the interpretation of HTML code work?

9

I’m building a parse tree for HTML code and handling some special cases. One of those cases is when someone opens a tag and doesn’t close it:

<html>
  <body>
    <div>
      <div>
        Hello World!!
      </div>
      Hello People!!
  </body>
</html>

I studied compilers at university, but I don’t understand how HTML interpretation works. In the case shown, a </div> needs to be inserted before the </body>. Opening the page and inspecting the code, I saw that the browser inserts a </div> before the </body>. But what made it add the </div> there and not before the text "Hello People!!"? Was it the fact that </body> was reached while the first <div> was still open?

Code of the generated page:

<html><head></head><body>
    <div>
      <div>
        Hello World!!
      </div>
      Hello People!!
  
</div></body></html>

Is there a specific name for the way HTML code is interpreted?

  • 2

    Interestingly, I had already noticed that the browser itself closes some tags. In fact, according to the W3C documentation, several tags do not need to be closed, such as <p>. But how the browser "guessed" that the div should be closed there, after the text, I really can’t explain...

  • 1

    I’ve read about this somewhere. I think modern browsers have built-in logic that automatically fixes several of these things.

  • 1

    But I think the div is closed before the </body> because the browser disregards text nodes. It makes more sense to close the element after all the text nodes, since the text is presumably part of the div.

  • 1

    Or rather, it closes the tag at the point where its parent closes, in this case the body.

  • 2

    I can’t say exactly what the internal implementation is either, but it presumably keeps track of the open tags and follows the sequence of openings and closings. On reaching the </body>, the browser knows there is still a div to close and that it must be closed before the body, so it closes it there, because past that point the nesting could no longer be valid.
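The "keep track of open tags" idea from the comments can be sketched in a few lines. This is only a simplified model, not the algorithm from the HTML standard and not any browser's real code: it uses Python's standard `html.parser` as a tokenizer, and the names `AutoCloser` and `auto_close` are made up for illustration. The assumption is that an end tag closes every element opened after its matching start tag.

```python
from html.parser import HTMLParser

# Elements that never take an end tag (a partial list, enough for a sketch).
VOID = {"br", "hr", "img", "input", "link", "meta"}


class AutoCloser(HTMLParser):
    """Re-serializes markup, closing elements left open when an ancestor closes."""

    def __init__(self):
        super().__init__()
        self.open_elements = []  # stack of currently open tag names
        self.out = []            # pieces of the re-serialized document

    def handle_starttag(self, tag, attrs):
        self.out.append(f"<{tag}>")  # attributes are dropped to keep this short
        if tag not in VOID:
            self.open_elements.append(tag)

    def handle_endtag(self, tag):
        if tag not in self.open_elements:
            return  # stray end tag: ignore it
        # Close everything opened after `tag`, innermost first, then `tag` itself.
        while True:
            top = self.open_elements.pop()
            self.out.append(f"</{top}>")
            if top == tag:
                break

    def handle_data(self, data):
        self.out.append(data)


def auto_close(markup: str) -> str:
    parser = AutoCloser()
    parser.feed(markup)
    parser.close()
    # Anything still open at the end of the input is closed as well.
    while parser.open_elements:
        parser.out.append(f"</{parser.open_elements.pop()}>")
    return "".join(parser.out)


print(auto_close("<body><div><div>Hello World!!</div>Hello People!!</body>"))
# prints <body><div><div>Hello World!!</div>Hello People!!</div></body>
```

In this model the outer div is closed right before </body> simply because </body> is the first end tag that forces the stack to unwind past it; the text "Hello People!!" never triggers any closing.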

2 answers

2

Just because the browser closed the tags doesn’t mean it closed them in the right place. I had serious problems with forgotten closing tags when I was starting out, especially with divs. The truth is that if you don’t close your tags correctly, your code probably won’t behave as you’d like; even if it "seems" correct, it won’t work properly.

Note: for simple markup it may even work, but with large, complex pages it surely won’t.

To answer your question: I believe the way the browser closes tags is similar to the way development IDEs make suggestions to the programmer. The browser is programmed to cope with this, but it does not work miracles.

This is called browser error tolerance.

You never get an "Invalid Syntax" error in an HTML page. Browsers fix any invalid content and proceed with their functions.

Take this HTML as an example:

<html>
  <mytag>
  </mytag>
  <div>
  <p>
  </div>
    Really lousy HTML
  </p>
</html>

I must have broken a million rules ("mytag" is not a standard tag, the "p" and "div" elements are incorrectly nested, and more), but the browser still displays the content correctly. Much of the parser code therefore exists to fix HTML authors’ errors. You can learn more about how browsers work in this link
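The "never an Invalid Syntax error" behavior can be observed outside a browser too. The sketch below feeds the same lousy HTML to Python's standard `html.parser`, which is not a browser parser and does not repair the tree, but is similarly lenient: it reports the tags in the order it sees them and does not raise. The `EventLogger` class is a hypothetical helper written for this example.

```python
from html.parser import HTMLParser

LOUSY_HTML = """<html>
  <mytag>
  </mytag>
  <div>
  <p>
  </div>
    Really lousy HTML
  </p>
</html>"""


class EventLogger(HTMLParser):
    """Records every tag and non-blank text chunk the tokenizer reports."""

    def __init__(self):
        super().__init__()
        self.events = []

    def handle_starttag(self, tag, attrs):
        self.events.append(("start", tag))

    def handle_endtag(self, tag):
        self.events.append(("end", tag))

    def handle_data(self, data):
        if data.strip():  # skip whitespace-only text between tags
            self.events.append(("text", data.strip()))


logger = EventLogger()
logger.feed(LOUSY_HTML)  # no exception, despite the unknown tag and bad nesting
logger.close()
for event in logger.events:
    print(event)
```

The unknown "mytag" and the crossed "div"/"p" pair come through as ordinary events; deciding what tree to build from them is the job of the error-recovery rules, which this tokenizer-level sketch does not attempt.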

1


You mixed up several things in your question:

  • "HTML code": the correct term is HTML document.

  • Compilers interpret lexemes, not documents!

As for the question itself: does the tag need to be closed or not?

You always need to close your tags, not just because good practice says so, but because otherwise the result depends on the browser you’re using, which counts the tags and adds whatever is missing.

  • I get your point; I know HTML is not a programming language. But I would like to understand how the browser "understands" this language.

  • 2

    Hudson, then why is it that if I write <div> lalala <p>fsdfsd the <div> is closed in the right place, but if I write <p>fsdfsd <div>lalala the browser first closes the <p> and then closes the <div>? I expected the <p> to close after the </div>, and that doesn’t happen; the result is <p></p><div></div>

  • 1

    @hugocsl It depends on the browser; according to the official documentation you always need to close it (see the link above).

  • 1

    @Viniciusmorais Every browser has its own engine. You can write your own engine in C++, C#, Java...

  • 3

    Firefox didn’t even close the tags, and Edge behaved like Chrome: <div> lalala <p>fsdfsd becomes <div> lalala <p>fsdfsd</p> </div>, but <p>fsdfsd <div>lalala becomes <p>fsdfsd</p> <div>lalala</div>. That is in both Chrome and Edge; in Firefox, as I said, the tags were not closed...

  • 1

    @hugocsl By the W3C rules, the missing end tag is implied just before the next tag opens. There are examples here: https://www.w3.org/TR/html51/syntax.html#optional-tags

  • 2

    That part was already clear; my question is why the <p> is closed one way and the <div> another. The <p> seems to close at the end of its content, whereas the <div> seems to close from the bottom up. That is the question: just test with the simple code I posted in my comment and you will see that the two behave differently...

  • 1

    As I said, according to the W3C the <p> doesn’t even need an end tag; the rule is that it gets closed when the next tag is one of these: address, article, aside, blockquote, details, div, dl, fieldset, figcaption, figure, footer, form, h1, h2, h3, h4, h5, h6, header, hr, main, menu, nav, ol, p, pre, section, table, or ul. What happens is that the browser does that job for you.

  • 1

    And even then, nothing guarantees that third-party browsers implement these rules in full.

  • 1

    @hugocsl I believe it is because the div cannot stay inside a <p>, so the <p> is closed before the opening of the div.

  • As I said above, the <p> will be closed first because div is included in the list cited above.

  • Likewise, when a start tag is one of "address", "article", "aside", "blockquote", "center", "details", "dir", "div", "dl", "fieldset", "figcaption", "figure", "footer", "header", "main", "menu", "nav", "ol", "p", "section", "summary", "ul" and a p is open, that p is closed first. Those are the W3C rules.
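The difference between <p> and <div> discussed in these comments can be reproduced with a small model. This is a sketch under stated assumptions, not the standard's algorithm: the set `CLOSES_P` is copied from the list quoted in the comments, "a p is open" is approximated as "p is anywhere on the stack" (the real rule is more restrictive), void elements are not treated specially, and `ParagraphCloser` and `normalize` are made-up names.

```python
from html.parser import HTMLParser

# Start tags that imply </p>, copied from the list quoted in the comments.
CLOSES_P = {
    "address", "article", "aside", "blockquote", "details", "div", "dl",
    "fieldset", "figcaption", "figure", "footer", "form", "h1", "h2", "h3",
    "h4", "h5", "h6", "header", "hr", "main", "menu", "nav", "ol", "p",
    "pre", "section", "table", "ul",
}


class ParagraphCloser(HTMLParser):
    """Closes an open <p> before any start tag listed in CLOSES_P."""

    def __init__(self):
        super().__init__()
        self.stack = []  # open elements, outermost first
        self.out = []    # re-serialized output pieces

    def _pop_through(self, tag):
        # Close elements from the top of the stack down to and including `tag`.
        while True:
            top = self.stack.pop()
            self.out.append(f"</{top}>")
            if top == tag:
                return

    def handle_starttag(self, tag, attrs):
        if tag in CLOSES_P and "p" in self.stack:
            self._pop_through("p")  # the implied </p>
        self.stack.append(tag)
        self.out.append(f"<{tag}>")

    def handle_endtag(self, tag):
        if tag in self.stack:
            self._pop_through(tag)

    def handle_data(self, data):
        self.out.append(data)


def normalize(markup: str) -> str:
    parser = ParagraphCloser()
    parser.feed(markup)
    parser.close()
    while parser.stack:  # close whatever is still open at end of input
        parser.out.append(f"</{parser.stack.pop()}>")
    return "".join(parser.out)


print(normalize("<p>fsdfsd <div>lalala"))   # <p>fsdfsd </p><div>lalala</div>
print(normalize("<div> lalala <p>fsdfsd"))  # <div> lalala <p>fsdfsd</p></div>
```

Under these assumptions the two inputs give the results reported for Chrome and Edge: a <div> start tag forces the open <p> to close first, while a <p> inside a <div> forces nothing, so both are only closed at the end, innermost first.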
