Tidy Tag Table

Preamble

This started with issue (PR) #848, which in essence adds the slot tag to 'libTidy', a good and important aim.

Some of this discussion concerns how to define this NEW tag, slot, in the 'libTidy' def_tags table. The current PR choice of 'ParseInline', with only a 'CM_INLINE' flag, is open to question, or at least other parsers and flags should be tested and considered. In the end, this may indeed be the best choice for 'libTidy'.

The choice seems related to the definitions of Phrasing content and Flow content. The slot tag is present in BOTH. For a quick comparison, rough copies of these online lists have been added below, under Flow content and Phrasing content.

However, 'libTidy' has NO definition of 'Phrasing content' or 'Flow content'. Instead it gives each tag a 'parser' and a set of 'flags', perhaps to achieve the same aim.
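
To make the parser-plus-flags idea concrete, here is a minimal Python sketch. It is not libTidy code. The flag and parser names come from the tables in this note, but the bit values, the TagDef type and the has_flag helper are assumptions made purely for illustration.

```python
from enum import IntFlag
from typing import NamedTuple


class CM(IntFlag):
    # Names follow Table 2; the bit values are illustrative only,
    # NOT the values defined in libTidy.
    BLOCK = 1
    INLINE = 2
    MIXED = 4


class TagDef(NamedTuple):
    # Hypothetical stand-in for one row of a tag definition table.
    name: str
    parser: str  # parser name kept as a plain string
    model: CM    # content-model flags OR-ed together


# Two candidate definitions of slot discussed in this note.
slot_pr = TagDef("slot", "ParseInline", CM.INLINE)
slot_table = TagDef("slot", "ParseBlock", CM.BLOCK | CM.INLINE | CM.MIXED)


def has_flag(tag: TagDef, flag: CM) -> bool:
    """Return True if the tag's content model includes the given flag."""
    return bool(tag.model & flag)
```

With these two candidates, has_flag(slot_pr, CM.BLOCK) is False while has_flag(slot_table, CM.BLOCK) is True, which is the difference the discussion turns on.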

Maybe a study of the current C table in 'libTidy' will yield some clues...

Input file: D:\UTILS\tidy\tidy-html5\src\tags.c

index: | Table 1 - Parsers | Table 2 - Bit Flags | Table 3 - Inline | Table 4 - Block | Table 5 - HTML Versions |


Table 1: Tidy uses 22 different parsers for HTML. Each of the 152 tags 'libTidy' supports is associated with one of them.

22 parsers used, with tag list
Parser Count TAGS
ParseInline 58 [abbr acronym b bdi bdo big blink button cite code comment data del dfn dt em font h1 h2 h3 h4 h5 h6 i ilayer ins kbd label legend mark marquee menuitem meter nobr noembed output p picture progress q rb rbc rp rt rtc ruby s samp small span strike strong sub sup time tt u var ]
ParseBlock 44 [a address align applet article aside audio blockquote canvas caption center dd details dialog div fieldset figcaption figure footer form header hgroup iframe layer li main map menu multicol nav nolayer nosave noscript object section servlet slot source summary td template th track video ]
ParseEmpty 20 [area base basefont bgsound br col command embed frame hr img input isindex keygen link meta nextid param spacer wbr ]
ParsePre 4 [listing plaintext pre xmp ]
ParseList 3 [dir ol ul ]
ParseRowGroup 3 [tbody tfoot thead ]
ParseScript 3 [script server style ]
ParseNamespace 2 [math svg ]
ParseText 2 [option textarea ]
ParseBody 1 [body ]
ParseColGroup 1 [colgroup ]
ParseDatalist 1 [datalist ]
ParseDefList 1 [dl ]
ParseFrameSet 1 [frameset ]
ParseHTML 1 [html ]
ParseHead 1 [head ]
ParseNoFrames 1 [noframes ]
ParseOptGroup 1 [optgroup ]
ParseRow 1 [tr ]
ParseSelect 1 [select ]
ParseTableTag 1 [table ]
ParseTitle 1 [title ]

There is 1 slot tag in Table 1 above, under ParseBlock.
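
The stated counts in a table like this can be cross-checked mechanically. The sketch below assumes the row layout shown above ("Parser Count [tags ]") and uses three rows copied from Table 1; parse_rows and parser_of are hypothetical helper names, not part of tidy-tags.pl.

```python
import re

# Three rows copied from Table 1, in the report's own layout.
ROWS = """\
ParsePre 4 [listing plaintext pre xmp ]
ParseList 3 [dir ol ul ]
ParseNamespace 2 [math svg ]
"""

ROW_RE = re.compile(r"^(\w+) (\d+) \[(.*)\]$")


def parse_rows(text):
    """Return {parser: [tags]}, checking each stated count against its list."""
    table = {}
    for line in text.splitlines():
        m = ROW_RE.match(line)
        if not m:
            continue  # skip headings and blank lines
        name, count, tags = m.group(1), int(m.group(2)), m.group(3).split()
        if len(tags) != count:
            raise ValueError(f"{name}: stated {count}, found {len(tags)}")
        table[name] = tags
    return table


def parser_of(table, tag):
    """Return the first parser whose tag list contains tag, or None."""
    for name, tags in table.items():
        if tag in tags:
            return name
    return None
```

Fed all 22 rows of Table 1, the same lookup would report ParseBlock for slot, since slot is listed in that row.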



Table 2: Tidy uses 21 different bit flags for HTML parsing. A tag can carry several flags, so it may appear in more than one row.

21 flags used, with tag list
Flag Count TAGS
CM_INLINE 79 [a abbr acronym applet audio b basefont bdi bdo big blink br button cite code command comment data datalist del dfn em embed font i iframe ilayer img input ins kbd keygen label legend map mark marquee math menuitem meter nobr noembed nolayer noscript object output param picture progress q rb rbc rp rt rtc ruby s samp script select server servlet slot small source spacer span strike strong sub sup svg textarea time tt u var video wbr ]
CM_BLOCK 64 [a address align area article aside audio blockquote canvas center del details dialog dir div dl fieldset figcaption figure footer form h1 h2 h3 h4 h5 h6 header hgroup hr ins isindex layer link listing main math menu menuitem meta multicol nav noframes nolayer nosave noscript ol p plaintext pre script section server slot source style summary svg table template track ul video xmp ]
CM_EMPTY 22 [area base basefont bgsound br col command embed frame hr img input isindex keygen link meta nextid param source spacer track wbr ]
CM_OPT 17 [body colgroup dd dt head html li marquee optgroup option p tbody td tfoot th thead tr ]
CM_HEAD 11 [base bgsound command link meta nextid noscript script server style title ]
CM_MIXED 11 [a del ins math menuitem nolayer noscript script server slot svg ]
CM_TABLE 7 [caption col colgroup tbody tfoot thead tr ]
CM_HEADING 6 [h1 h2 h3 h4 h5 h6 ]
CM_IMG 6 [applet embed img input object servlet ]
CM_FIELD 5 [datalist optgroup option select textarea ]
CM_NO_INDENT 5 [dd dt li td th ]
CM_HTML 4 [body frameset head html ]
CM_OBSOLETE 4 [dir listing plaintext xmp ]
CM_FRAMES 3 [frame frameset noframes ]
CM_OBJECT 3 [applet object servlet ]
CM_OMITST 3 [body head html ]
CM_PARAM 3 [applet object servlet ]
CM_ROWGRP 3 [tbody tfoot thead ]
CM_DEFLIST 2 [dd dt ]
CM_ROW 2 [td th ]
CM_LIST 1 [li ]

The slot tag appears in 3 rows of Table 2 above: CM_INLINE, CM_BLOCK and CM_MIXED.
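
One way to double-check which flags a given tag carries is to invert the flag rows into a per-tag view. The sketch below does this for five rows copied from Table 2; flags_by_tag is a hypothetical helper, and the result is only complete for tags that appear in no other row.

```python
from collections import defaultdict

# Rows copied from Table 2 (flag -> tags); only a few rows are included.
FLAG_ROWS = {
    "CM_OPT": "body colgroup dd dt head html li marquee optgroup option p "
              "tbody td tfoot th thead tr",
    "CM_NO_INDENT": "dd dt li td th",
    "CM_DEFLIST": "dd dt",
    "CM_ROW": "td th",
    "CM_LIST": "li",
}


def flags_by_tag(rows):
    """Invert {flag: 'tag tag ...'} into {tag: 'FLAG|FLAG|...'}, flags sorted."""
    by_tag = defaultdict(set)
    for flag, tags in rows.items():
        for tag in tags.split():
            by_tag[tag].add(flag)
    return {tag: "|".join(sorted(flags)) for tag, flags in by_tag.items()}
```

For td, li and dd this reproduces the combined flag strings listed in Table 4. Run over all 21 rows, the same inversion gives the full flag set for any tag, slot included.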



Table 3: The 58 tags that use the ParseInline parser fall into 6 flag combinations.

ParseInline flags, with tag list
Flags Count TAGS
CM_INLINE 46 [abbr acronym b bdi bdo big blink button cite code comment data dfn em font i ilayer kbd label legend mark meter nobr noembed output picture progress q rb rbc rp rt rtc ruby s samp small span strike strong sub sup time tt u var ]
CM_BLOCK|CM_HEADING 6 [h1 h2 h3 h4 h5 h6 ]
CM_BLOCK|CM_INLINE|CM_MIXED 3 [del ins menuitem ]
CM_BLOCK|CM_OPT 1 [p ]
CM_DEFLIST|CM_NO_INDENT|CM_OPT 1 [dt ]
CM_INLINE|CM_OPT 1 [marquee ]

There are 0 slot tags in Table 3 above.



Table 4: The 44 tags that use the ParseBlock parser fall into 12 flag combinations.

ParseBlock flags, with tag list
Flags Count TAGS
CM_BLOCK 26 [address align article aside blockquote canvas center details dialog div fieldset figcaption figure footer form header hgroup layer main menu multicol nav nosave section summary template ]
CM_BLOCK|CM_INLINE|CM_MIXED 3 [a nolayer slot ]
CM_IMG|CM_INLINE|CM_OBJECT|CM_PARAM 3 [applet object servlet ]
CM_BLOCK|CM_INLINE 2 [audio video ]
CM_INLINE 2 [iframe map ]
CM_NO_INDENT|CM_OPT|CM_ROW 2 [td th ]
CM_BLOCK|CM_EMPTY 1 [track ]
CM_BLOCK|CM_EMPTY|CM_INLINE 1 [source ]
CM_BLOCK|CM_HEAD|CM_INLINE|CM_MIXED 1 [noscript ]
CM_DEFLIST|CM_NO_INDENT|CM_OPT 1 [dd ]
CM_LIST|CM_NO_INDENT|CM_OPT 1 [li ]
CM_TABLE 1 [caption ]

There is 1 slot tag in Table 4 above, under CM_BLOCK|CM_INLINE|CM_MIXED.



Tidy's def_tags table list, 153 entries, alphabetic (the list includes an UNKNOWN entry)
A ABBR ACRONYM ADDRESS ALIGN APPLET AREA ARTICLE ASIDE AUDIO B BASE BASEFONT BDI BDO BGSOUND BIG BLINK BLOCKQUOTE BODY BR BUTTON CANVAS CAPTION CENTER CITE CODE COL COLGROUP COMMAND COMMENT DATA DATALIST DD DEL DETAILS DFN DIALOG DIR DIV DL DT EM EMBED FIELDSET FIGCAPTION FIGURE FONT FOOTER FORM FRAME FRAMESET H1 H2 H3 H4 H5 H6 HEAD HEADER HGROUP HR HTML I IFRAME ILAYER IMG INPUT INS ISINDEX KBD KEYGEN LABEL LAYER LEGEND LI LINK LISTING MAIN MAP MARK MARQUEE MATHML MENU MENUITEM META METER MULTICOL NAV NEXTID NOBR NOEMBED NOFRAMES NOLAYER NOSAVE NOSCRIPT OBJECT OL OPTGROUP OPTION OUTPUT P PARAM PICTURE PLAINTEXT PRE PROGRESS Q RB RBC RP RT RTC RUBY S SAMP SCRIPT SECTION SELECT SERVER SERVLET SLOT SMALL SOURCE SPACER SPAN STRIKE STRONG STYLE SUB SUMMARY SUP SVG TABLE TBODY TD TEMPLATE TEXTAREA TFOOT TH THEAD TIME TITLE TR TRACK TT U UL UNKNOWN VAR VIDEO WBR XMP


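Table 5: HTML versions supported, for 137 tags grouped by identical version set. Each row gives the version set, the tag count, and the tags; 'xxxx' marks a version that is not in the set.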

HT20|HT32|H40T|H41T|X10T|H40F|H41F|X10F|H40S|H41S|X10S|XH11|XB10|HT50|XH50 38 A ADDRESS BASE BLOCKQUOTE BODY BR CITE CODE DD DL DT EM FORM H1 H2 H3 H4 H5 H6 HEAD HTML IMG INPUT KBD LI LINK META OL OPTION P PRE SAMP SELECT STRONG TEXTAREA TITLE UL VAR
HT20|HT32|H40T|H41T|X10T|H40F|H41F|X10F|H40S|H41S|X10S|XH11|xxxx|HT50|XH50 3 B HR I
HT20|HT32|H40T|H41T|X10T|H40F|H41F|X10F|H40S|H41S|X10S|XH11|xxxx|xxxx|xxxx 1 TT
HT20|HT32|H40T|H41T|X10T|H40F|H41F|X10F|xxxx|xxxx|xxxx|xxxx|xxxx|HT50|XH50 1 MENU
HT20|HT32|H40T|H41T|X10T|H40F|H41F|X10F|xxxx|xxxx|xxxx|xxxx|xxxx|xxxx|xxxx 2 DIR ISINDEX
HT20|HT32|xxxx|xxxx|xxxx|xxxx|xxxx|xxxx|xxxx|xxxx|xxxx|xxxx|xxxx|xxxx|xxxx 3 LISTING PLAINTEXT XMP
HT20|xxxx|xxxx|xxxx|xxxx|xxxx|xxxx|xxxx|xxxx|xxxx|xxxx|xxxx|xxxx|xxxx|xxxx 1 NEXTID
xxxx|HT32|H40T|H41T|X10T|H40F|H41F|X10F|H40S|H41S|X10S|XH11|XB10|HT50|XH50 8 CAPTION DFN DIV PARAM TABLE TD TH TR
xxxx|HT32|H40T|H41T|X10T|H40F|H41F|X10F|H40S|H41S|X10S|XH11|xxxx|HT50|XH50 7 AREA MAP SCRIPT SMALL STYLE SUB SUP
xxxx|HT32|H40T|H41T|X10T|H40F|H41F|X10F|H40S|H41S|X10S|XH11|xxxx|xxxx|xxxx 1 BIG
xxxx|HT32|H40T|H41T|X10T|H40F|H41F|X10F|xxxx|xxxx|xxxx|xxxx|xxxx|HT50|XH50 1 U
xxxx|HT32|H40T|H41T|X10T|H40F|H41F|X10F|xxxx|xxxx|xxxx|xxxx|xxxx|xxxx|xxxx 5 APPLET BASEFONT CENTER FONT STRIKE
xxxx|xxxx|H40T|H41T|X10T|H40F|H41F|X10F|H40S|H41S|X10S|XH11|XB10|HT50|XH50 5 ABBR LABEL OBJECT Q SPAN
xxxx|xxxx|H40T|H41T|X10T|H40F|H41F|X10F|H40S|H41S|X10S|XH11|XB10|xxxx|xxxx 1 ACRONYM
xxxx|xxxx|H40T|H41T|X10T|H40F|H41F|X10F|H40S|H41S|X10S|XH11|xxxx|HT50|XH50 13 BDO BUTTON COL COLGROUP DEL FIELDSET INS LEGEND NOSCRIPT OPTGROUP TBODY TFOOT THEAD
xxxx|xxxx|H40T|H41T|X10T|H40F|H41F|X10F|xxxx|xxxx|xxxx|xxxx|xxxx|HT50|XH50 2 IFRAME S
xxxx|xxxx|H40T|H41T|X10T|H40F|H41F|X10F|xxxx|xxxx|xxxx|xxxx|xxxx|xxxx|xxxx 1 NOFRAMES
xxxx|xxxx|xxxx|H41T|X10T|xxxx|H41F|X10F|xxxx|H41S|X10S|XH11|xxxx|HT50|XH50 2 MATHML SVG
xxxx|xxxx|xxxx|xxxx|xxxx|H40F|H41F|X10F|xxxx|xxxx|xxxx|xxxx|xxxx|xxxx|xxxx 2 FRAME FRAMESET
xxxx|xxxx|xxxx|xxxx|xxxx|xxxx|xxxx|xxxx|xxxx|xxxx|xxxx|XH11|xxxx|HT50|XH50 3 RP RT RUBY
xxxx|xxxx|xxxx|xxxx|xxxx|xxxx|xxxx|xxxx|xxxx|xxxx|xxxx|XH11|xxxx|xxxx|xxxx 3 RB RBC RTC
xxxx|xxxx|xxxx|xxxx|xxxx|xxxx|xxxx|xxxx|xxxx|xxxx|xxxx|xxxx|xxxx|HT50|XH50 34 ARTICLE ASIDE AUDIO BDI CANVAS COMMAND DATA DATALIST DETAILS DIALOG EMBED FIGCAPTION FIGURE FOOTER HEADER HGROUP KEYGEN MAIN MARK MENUITEM METER NAV OUTPUT PICTURE PROGRESS SECTION SLOT SOURCE SUMMARY TEMPLATE TIME TRACK VIDEO WBR
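
The version strings above read as a fixed-width rendering of a set of versions, with one column per version. A minimal sketch of that layout follows; the column order is taken from the first row of the table, and render_mask and parse_mask are hypothetical helpers, not libTidy functions.

```python
# Column order as printed in the first row of the version table.
COLUMNS = [
    "HT20", "HT32", "H40T", "H41T", "X10T", "H40F", "H41F", "X10F",
    "H40S", "H41S", "X10S", "XH11", "XB10", "HT50", "XH50",
]


def render_mask(versions):
    """Render a set of version names in the report's 'xxxx' column layout."""
    return "|".join(c if c in versions else "xxxx" for c in COLUMNS)


def parse_mask(mask):
    """Recover the set of version names from a rendered mask string."""
    return {c for c in mask.split("|") if c != "xxxx"}
```

For example, render_mask({"HT50", "XH50"}) yields the string on the last row, the one that contains SLOT.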



Hopefully more libTidy tag/element analysis to follow...


Extracts from online documentation

NOTE: Copyright © 2019 WHATWG (Apple, Google, Mozilla, Microsoft). This work is licensed under a Creative Commons Attribution 4.0 International License. Also see Acknowledgments

It is obviously better to view these lists online, where each element links to further detail, and to read the latest text, since this is a Living Standard and thus subject to change. What follows is a quick, rough, flat list for comparison purposes only.

3.2.5.2.2 Flow content

Most elements that are used in the body of documents and applications are categorized as flow content.

a, abbr, address, area (if it is a descendant of a map element), article, aside, audio, b, bdi, bdo, blockquote, br, button, canvas, cite, code, data, datalist, del, details, dfn, dialog, div, dl, em, embed, fieldset, figure, footer, form, h1, h2, h3, h4, h5, h6, header, hgroup, hr, i, iframe, img, input, ins, kbd, label, link (if it is allowed in the body), main (if it is a hierarchically correct main element), map, mark, MathML math, menu, meta (if the itemprop attribute is present), meter, nav, noscript, object, ol, output, p, picture, pre, progress, q, ruby, s, samp, script, section, select, slot, small, span, strong, sub, sup, SVG svg, table, template, textarea, time, u, ul, var, video, wbr, autonomous custom elements, text

3.2.5.2.5 Phrasing content

Phrasing content is the text of the document, as well as elements that mark up that text at the intra-paragraph level. Runs of phrasing content form paragraphs.

a, abbr, area (if it is a descendant of a map element), audio, b, bdi, bdo, br, button, canvas, cite, code, data, datalist, del, dfn, em, embed, i, iframe, img, input, ins, kbd, label, link (if it is allowed in the body), map, mark, MathML math, meta (if the itemprop attribute is present), meter, noscript, object, output, picture, progress, q, ruby, s, samp, script, select, slot, small, span, strong, sub, sup, SVG svg, template, textarea, time, u, var, video, wbr, autonomous custom elements, text
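
Since slot is listed in both categories, a plain set comparison of the two extracts shows what actually separates them. The sketch below uses the element names from the two lists above, dropping the parenthetical conditions and the non-element entries (autonomous custom elements, text). It reflects this 2019 extract only, not the current Living Standard.

```python
# Element names from the Flow content extract (conditions dropped).
FLOW = set("""
a abbr address area article aside audio b bdi bdo blockquote br button canvas
cite code data datalist del details dfn dialog div dl em embed fieldset figure
footer form h1 h2 h3 h4 h5 h6 header hgroup hr i iframe img input ins kbd label
link main map mark math menu meta meter nav noscript object ol output p picture
pre progress q ruby s samp script section select slot small span strong sub sup
svg table template textarea time u ul var video wbr
""".split())

# Element names from the Phrasing content extract (conditions dropped).
PHRASING = set("""
a abbr area audio b bdi bdo br button canvas cite code data datalist del dfn em
embed i iframe img input ins kbd label link map mark math meta meter noscript
object output picture progress q ruby s samp script select slot small span
strong sub sup svg template textarea time u var video wbr
""".split())

# Elements that are flow content but not phrasing content in this extract.
FLOW_ONLY = sorted(FLOW - PHRASING)
```

In this extract every phrasing element is also a flow element, and FLOW_ONLY holds the block-like elements such as div, p, table and the headings. slot is in neither difference, which is why the category lists alone do not settle the choice of parser.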


Conclusions - to be done

So far, it seems there are no conclusions to be drawn.


Generated 2020/12/02 17:57:23 by perl script tidy-tags.pl
