A request for a web page or app starts with an HTTP request for the HTML document. The server returns the HTML as a response made up of headers and data.
The browser then begins parsing the HTML, converting the received bytes to the DOM tree.
The browser initiates requests every time it finds links to external resources, be they stylesheets, scripts, or embedded image references.
Some requests are blocking, which means the parsing of the rest of the HTML is halted until the imported asset is handled.
Render-blocking resources are static files, such as fonts, HTML, CSS, and JavaScript files.
The browser continues to parse the HTML, making requests and building the DOM, until it gets to the end, at which point it constructs the CSS object model (CSSOM).
With the DOM and CSSOM complete, the browser builds the render tree, computing the styles for all the visible content.
After the render tree is complete, layout occurs, defining the location and size of all the render tree elements.
Once complete, the page is rendered, or 'painted' on the screen.
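The resource discovery step described above can be sketched with Python's standard `html.parser` module. This is only an illustration, not how a browser is implemented: the class name `ResourceFinder`, the helper `find_resources`, and the sample page with its file names are all hypothetical, and only three kinds of references (stylesheets, scripts, images) are recognized.

```python
from html.parser import HTMLParser


class ResourceFinder(HTMLParser):
    """Collects the external resources a document refers to (toy version)."""

    def __init__(self):
        super().__init__()
        self.resources = []  # (kind, url) pairs in document order

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        if tag == "link" and attributes.get("rel") == "stylesheet":
            self.resources.append(("stylesheet", attributes.get("href")))
        elif tag == "script" and "src" in attributes:
            self.resources.append(("script", attributes["src"]))
        elif tag == "img" and "src" in attributes:
            self.resources.append(("image", attributes["src"]))


def find_resources(html):
    """Return every (kind, url) reference found while parsing `html`."""
    finder = ResourceFinder()
    finder.feed(html)
    finder.close()
    return finder.resources


# Hypothetical page used only for this example.
page = """<!DOCTYPE html>
<html>
  <head>
    <link rel="stylesheet" href="styles.css">
    <script src="app.js"></script>
  </head>
  <body>
    <img src="logo.png" alt="Logo">
  </body>
</html>"""

for kind, url in find_resources(page):
    print(kind, url)
```

Each entry corresponds to a request the browser would issue as soon as the parser reaches that tag. Whether a given request blocks parsing or rendering depends on the resource type and its attributes, which this sketch does not model.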
Not a context free grammar
As we have seen in the parsing introduction, grammar syntax can be defined formally using formats like BNF.
Unfortunately, none of the conventional parser topics apply to HTML.
HTML cannot easily be defined by a context free grammar that parsers need.
There is a formal format for defining HTML - DTD (Document Type Definition) - but it is not a context free grammar.
This appears strange at first sight; HTML is rather close to XML. There are lots of available XML parsers. There is an XML variation of HTML - XHTML - so what's the big difference?
The difference is that the HTML approach is more "forgiving": it lets you omit certain tags (which are then added implicitly), or sometimes omit start or end tags, and so on. On the whole it's a "soft" syntax, as opposed to XML's stiff and demanding syntax.
This seemingly small detail makes a world of difference. On one hand this is the main reason why HTML is so popular: it forgives your mistakes and makes life easy for the web author. On the other hand, it makes it difficult to write a formal grammar. So to summarize, HTML cannot be parsed easily by conventional parsers, since its grammar is not context free. HTML cannot be parsed by XML parsers.
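The contrast between "forgiving" HTML and strict XML can be seen with two parsers from the Python standard library. This is a small sketch: the snippet and the helper names `parses_as_xml`, `TagLogger`, and `html_events` are made up for the example. Note that `html.parser` only reports the tokens it sees and does not add the implied end tags the way a browser's tree builder would.

```python
import xml.etree.ElementTree as ET
from html.parser import HTMLParser

# Valid HTML (the </li> end tags may be omitted), but not well-formed XML.
snippet = "<ul><li>one<li>two</ul>"


def parses_as_xml(text):
    """Return True if a strict XML parser accepts `text`."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


class TagLogger(HTMLParser):
    """Records the tokens the lenient HTML parser reports."""

    def __init__(self):
        super().__init__()
        self.events = []

    def handle_starttag(self, tag, attrs):
        self.events.append(("start", tag))

    def handle_endtag(self, tag):
        self.events.append(("end", tag))

    def handle_data(self, data):
        self.events.append(("text", data))


def html_events(text):
    """Return the list of tokens produced for `text`."""
    logger = TagLogger()
    logger.feed(text)
    logger.close()
    return logger.events


print(parses_as_xml(snippet))  # False: the XML parser rejects the input
print(html_events(snippet))    # the HTML parser keeps going regardless
```

The XML parser stops at the first mismatched tag, while the HTML parser simply reports two `li` start tags with no matching end tags and leaves it to later stages to decide what that means.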
HTML DTD
The HTML 4 definition is in DTD format. This format is used to define languages of the SGML family. The format contains definitions for all allowed elements, their attributes and hierarchy. As we saw earlier, the HTML DTD doesn't form a context free grammar.
There are a few variations of the DTD. The strict mode conforms solely to the specifications but other modes contain support for markup used by browsers in the past. The purpose is backwards compatibility with older content. The HTML 4 strict DTD is here: www.w3.org/TR/html4/strict.dtd
The parsing algorithm
As we saw in the previous sections, HTML cannot be parsed using the regular top down or bottom up parsers.
The reasons are:
1- The forgiving nature of the language.
2- The fact that browsers have traditional error tolerance to support well known cases of invalid HTML.
3- The parsing process is reentrant. For other languages, the source doesn't change during parsing, but in HTML, dynamic code (such as script elements containing document.write() calls) can add extra tokens, so the parsing process actually modifies the input.
Unable to use the regular parsing techniques, browsers create custom parsers for parsing HTML.
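The reentrancy described above can be illustrated with a toy parse loop. Everything here is an assumption made for the example: the input is pre-split into string tokens, the `<script:greet>` marker is invented notation, and the callback stands in for a script that calls document.write(). Real parsers track an insertion point in a character stream, which this sketch does not attempt to reproduce.

```python
from collections import deque


def parse_with_scripts(chunks, scripts):
    """Toy reentrant parse loop.

    `chunks` is the initial input, already split into tokens.
    `scripts` maps a script token to a function returning the extra
    tokens that script "writes" into the document.
    """
    pending = deque(chunks)
    seen = []
    while pending:
        token = pending.popleft()
        seen.append(token)
        if token in scripts:
            # The script runs now. Whatever it writes is inserted right
            # after the script, ahead of the input that was still waiting.
            written = scripts[token]()
            pending.extendleft(reversed(written))
    return seen


source = ["<p>", "before", "</p>", "<script:greet>", "<p>", "after", "</p>"]
scripts = {"<script:greet>": lambda: ["<b>", "hello", "</b>"]}

print(parse_with_scripts(source, scripts))
```

The tokens the parser ends up seeing are not the tokens it was originally given: the written markup appears between the script and the rest of the document, which is why the input cannot be treated as fixed.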
The parsing algorithm is described in detail by the HTML5 specification. The algorithm consists of two stages: tokenization and tree construction.
Tokenization is the lexical analysis, parsing the input into tokens. Among HTML tokens are start tags, end tags, attribute names and attribute values.
The tokenizer recognizes the token, gives it to the tree constructor, and consumes the next character for recognizing the next token, and so on until the end of the input.
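The two stages can be sketched as a small state machine feeding a stack-based tree builder. This is a heavily simplified illustration and not the algorithm from the specification: it assumes tags without attributes, ignores comments, doctypes and character references, and the state names, token shapes, and the dictionary-based node format are choices made for this example.

```python
def tokenize(html):
    """Yield ("start", name), ("end", name) and ("char", c) tokens."""
    state = "data"
    name = ""
    is_end_tag = False
    for ch in html:
        if state == "data":
            if ch == "<":
                state = "tag_open"
            else:
                yield ("char", ch)
        elif state == "tag_open":
            # A slash right after "<" marks an end tag.
            is_end_tag = ch == "/"
            name = "" if is_end_tag else ch
            state = "tag_name"
        elif state == "tag_name":
            if ch == ">":
                yield ("end" if is_end_tag else "start", name.lower())
                state = "data"
            else:
                name += ch


def build_tree(tokens):
    """Build a nested dict tree from tokens using a stack of open elements."""
    root = {"name": "#document", "children": []}
    open_elements = [root]
    for kind, value in tokens:
        current = open_elements[-1]
        if kind == "start":
            node = {"name": value, "children": []}
            current["children"].append(node)
            open_elements.append(node)
        elif kind == "end":
            # Close the nearest matching open element, implicitly closing
            # anything opened inside it; ignore stray end tags.
            names = [element["name"] for element in open_elements]
            if value in names[1:]:
                while open_elements.pop()["name"] != value:
                    pass
        else:
            # Merge consecutive characters into a single text child.
            children = current["children"]
            if children and isinstance(children[-1], str):
                children[-1] += value
            else:
                children.append(value)
    return root


tree = build_tree(tokenize("<html><body>Hello world</body></html>"))
print(tree)
```

The tokenizer hands each token to the tree builder as soon as it is recognized, so the tree grows while the input is still being read. The end-tag handling shows one simple form of error tolerance: an unclosed inner element is closed when its parent's end tag arrives.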
Render tree construction
Rendering steps include style, layout, paint and, in some cases, compositing.
The CSSOM and DOM trees created in the parsing step are combined into a render tree which is then used to compute the layout of every visible element, which is then painted to the screen.
In some cases, content can be promoted to its own layer and composited, improving performance by painting portions of the screen on the GPU instead of the CPU, freeing up the main thread.
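Combining the DOM and CSSOM into a render tree can be sketched as a recursive walk that attaches a computed style to each node and skips anything that is not displayed. This is a minimal model under stated assumptions: the CSSOM is reduced to a dictionary keyed by tag name, only two properties are treated as inherited, and the class and function names are invented for the example.

```python
from dataclasses import dataclass, field

# Assumption for this sketch: only these properties pass from parent to child.
INHERITED_PROPERTIES = {"color", "font-size"}


@dataclass
class DomNode:
    tag: str
    children: list = field(default_factory=list)


@dataclass
class RenderNode:
    tag: str
    style: dict
    children: list = field(default_factory=list)


def build_render_tree(node, cssom, parent_style=None):
    """Return a RenderNode for `node`, or None if it is not rendered."""
    # Start from the inherited part of the parent's style...
    style = {
        prop: value
        for prop, value in (parent_style or {}).items()
        if prop in INHERITED_PROPERTIES
    }
    # ...then apply the rules that match this node.
    style.update(cssom.get(node.tag, {}))
    if style.get("display") == "none":
        return None  # the node and all of its descendants are left out
    render_node = RenderNode(node.tag, style)
    for child in node.children:
        rendered = build_render_tree(child, cssom, style)
        if rendered is not None:
            render_node.children.append(rendered)
    return render_node


dom = DomNode("html", [
    DomNode("head", [DomNode("title")]),
    DomNode("body", [DomNode("p"), DomNode("aside")]),
])
cssom = {
    "head": {"display": "none"},
    "body": {"color": "black"},
    "p": {"font-size": "16px"},
    "aside": {"display": "none"},
}

render_tree = build_render_tree(dom, cssom)
print(render_tree)
```

The resulting tree contains only `html`, `body`, and `p`. Layout would then assign a position and size to each of these nodes, which is outside the scope of this sketch.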
Interactivity
Once the main thread is done painting the page, you would think we would be "all set."
That isn't necessarily the case.
If the load includes JavaScript that was correctly deferred and executed only after the onload event fires, the main thread might still be busy and not available for scrolling, touch, and other interactions.
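The effect of a busy main thread on interactions can be shown with a small simulation. It assumes a single thread that runs each task to completion in arrival order; the task names and the millisecond figures are hypothetical and chosen only to make the arithmetic easy to follow.

```python
def handle_times(tasks):
    """Simulate one thread running tasks to completion in arrival order.

    `tasks` is a list of (name, arrival_ms, duration_ms) sorted by arrival.
    Returns {name: (start_ms, delay_ms)}.
    """
    clock = 0
    result = {}
    for name, arrival, duration in tasks:
        # A task cannot start before it arrives or before the thread is free.
        start = max(clock, arrival)
        result[name] = (start, start - arrival)
        clock = start + duration
    return result


# A deferred script runs for 300 ms; the user taps 50 ms after it starts.
timeline = handle_times([
    ("deferred script", 0, 300),
    ("tap handler", 50, 5),
])

for name, (start, delay) in timeline.items():
    print(f"{name}: starts at {start} ms, waited {delay} ms")
```

In this run the tap is not handled until the script finishes at 300 ms, so the user waits 250 ms even though the page has already been painted.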