Update of Accessibility Checks


This page is under HEAVY construction ;=)) take care ...

Preamble:

The following are very helpful for some background decisions and thoughts ...

- CVS Logs:
access.c - http://tidy.cvs.sourceforge.net/tidy/tidy/src/access.c?view=log
accesscases.txt - http://tidy.cvs.sourceforge.net/tidy/tidy/test/accesscases.txt?view=log 

- Features and Requests:
obsolete items - http://tidy.sf.net/feature/1169854 

Using the CVS source as at 5 October, 2006, the 'failed' tests are -

As is: from CVS - 25 FAILED [
1-1-1-11 1-1-1-12 1-5-1-1 3-2-1-1 3-3-1-1 3-6-1-1 3-6-1-2 3-6-1-4 4-3-1-2 5-1-2-3 5-5-1-3 5-6-1-3 6-3-1-4 6-5-1-2 6-5-1-4 10-2-1-1 10-2-1-2 10-4-1-1 10-4-1-2 10-4-1-3 11-2-1-4 11-2-1-7 13-1-1-5 13-1-1-6 13-10-1-1] ...

Applying one simple change in the code, and that is the location at which the 'accessibility tests' are done, will reduce that by four (4).

--- F:\FGCVS\Tidy\src\parser.c  Sat Sep 16 11:39:25 2006
+++ F:\GTools\tidyproj\tidycvs6-3\src\parser.c  Thu Oct 05 17:32:13 2006
@@ -4101,7 +4101,13 @@
         break;
     }

-    if (!TY_(FindHTML)(doc))
+#if SUPPORT_ACCESSIBILITY_CHECKS
+    /* do this BEFORE any MORE document fixing */
+     if ( cfg( doc, TidyAccessibilityCheckLevel ) > 0 )
+         TY_(AccessibilityChecks)( doc );
+#endif
+
+     if (!TY_(FindHTML)(doc))
     {
         /* a later check should complain if <body> is empty */
         html = TY_(InferredTag)(doc, TidyTag_HTML);
--- F:\FGCVS\Tidy\src\tidylib.c Thu Oct 05 12:39:49 2006
+++ F:\GTools\tidyproj\tidycvs6-3\src\tidylib.c Thu Oct 05 17:41:24 2006
@@ -1175,7 +1175,6 @@

 int         tidyDocRunDiagnostics( TidyDocImpl* doc )
 {
-    uint acclvl = cfg( doc, TidyAccessibilityCheckLevel );
     Bool quiet = cfgBool( doc, TidyQuiet );
     Bool force = cfgBool( doc, TidyForceOutput );

@@ -1188,11 +1187,6 @@

     if ( doc->errors > 0 && !force )
         TY_(NeedsAuthorIntervention)( doc );
-
-#if SUPPORT_ACCESSIBILITY_CHECKS
-     if ( acclvl > 0 )
-         TY_(AccessibilityChecks)( doc );
-#endif

      return tidyDocStatus( doc );
 }

After the code modification -

Change in 'test' location - 21 FAILED [
1-1-1-11 1-1-1-12 1-5-1-1 3-3-1-1 3-6-1-1 3-6-1-2 3-6-1-4 4-3-1-2 5-1-2-3 5-5-1-3 5-6-1-3 6-3-1-4 6-5-1-4 10-2-1-1 10-2-1-2 10-4-1-1 10-4-1-2 10-4-1-3 13-1-1-5 13-1-1-6 13-10-1-1]

Fixed, by moving the location of the test: - 4
3-2-1-1 6-5-1-2 11-2-1-4 11-2-1-7
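The 'fixed' list above is simply the tests that appear in the first FAILED list but not in the second. A minimal sketch of that comparison in C follows; the function names `contains_id` and `fixed_tests` are made up for this illustration and are not part of Tidy or its test scripts.

```c
#include <stddef.h>
#include <string.h>

/* Return 1 if id appears in list[0..n-1], else 0. */
int contains_id(const char *const *list, size_t n, const char *id)
{
    size_t i;
    for (i = 0; i < n; ++i)
        if (strcmp(list[i], id) == 0)
            return 1;
    return 0;
}

/* Copy into out[] every id that is in before[] but not in after[].
   out[] must have room for nbefore entries; returns the count written. */
size_t fixed_tests(const char *const *before, size_t nbefore,
                   const char *const *after, size_t nafter,
                   const char **out)
{
    size_t i, count = 0;
    for (i = 0; i < nbefore; ++i)
        if (!contains_id(after, nafter, before[i]))
            out[count++] = before[i];
    return count;
}
```

Fed the 25 'before' ids and the 21 'after' ids, this should return the 4 ids listed as fixed.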

Of the 21 remaining, it should be noted that 8 -
1-1-1-11 1-1-1-12 10-2-1-2 10-4-1-1 10-4-1-2 10-4-1-3 13-1-1-5 13-1-1-6
are mentioned in the 'obsolete' feature - http://tidy.sf.net/feature/1169854

That leaves 9 outstanding -
3-3-1-1 3-6-1-2 3-6-1-4 4-3-1-2 5-1-2-3 5-5-1-3 5-6-1-3 6-3-1-4 6-5-1-4
and some questions on another 4 -
1-5-1-1 3-6-1-1 10-2-1-1 13-10-1-1

Three of the remaining failures are due to the 'space-eating-tidy' - it has swallowed the spaces, making no record of having done so -
TABLE_MAY_REQUIRE_HEADER_ABBR_SPACES 5.6.1.3
TABLE_SUMMARY_INVALID_SPACES 5.5.1.3
FORM_CONTROL_DEFAULT_TEXT_INVALID_SPACES 10.4.1.3
so these can only be 'fixed' by 'knowing' that the original was all spaces, information that the lexer has already removed ...

There is at least 1, 6-3-1-4, that needs a change to the source test file, from
-<script><!-- the script --></script>
+<applet><!-- the applet --></applet>

No more time to explore further for now ...

questions - perl_output - new_testcases_txt - diff_file_compare - first_differences - index

Original Questions (update: 5/10/2006)

The long term goal is to have Tidy 'pass' all the accessibility test suite, but the first part of this is understanding the access test suite as it presently stands in the CVS source ...

From a comparison of the CVS accessibility suite, in the test/accessTest folder, with those on the site -
http://www.aprompt.ca/Tidy/accessibilitychecks.html
it seems those presently in the Tidy source came from this, now 'older', online set. There are a considerable number of (relatively small) differences, which I have noted below.

But first, some questions ...

1. onsite download zip fails

While the zip file can be downloaded, my WinZip reports that it is 'corrupted', and cannot be opened ;=(( I have tried this several times, over several days ... Does anyone else have this problem?

So, for checking, I downloaded each test file, thanks to a small perl script I wrote, and compared that to those in CVS. Of course, this required changing the line endings since those on site end with a CR (MAC?), while the CVS end with CR/LF (DOS) ...

I understand that those in the CVS server probably end with LF (Unix), and are translated to the DOS endings during the CVS download.
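For anyone repeating the comparison, the line endings need to be brought to one form first. The following is a minimal sketch in C, not the Perl script mentioned above; the function name `normalize_newlines` is invented here, and it simply maps CR, LF and CR/LF all to a single LF.

```c
#include <stddef.h>

/* Rewrite buf[0..len-1] in place so every CR, LF or CR/LF line ending
   becomes a single LF; returns the new length (never more than len). */
size_t normalize_newlines(char *buf, size_t len)
{
    size_t in = 0, out = 0;
    while (in < len) {
        char c = buf[in++];
        if (c == '\r') {
            /* CR/LF counts as one line ending: skip the LF. */
            if (in < len && buf[in] == '\n')
                ++in;
            c = '\n';
        }
        buf[out++] = c;
    }
    return out;
}
```

With both sets of files passed through the same function, any remaining differences are real content changes rather than line-ending noise.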

2. The Test File Download

(a) Two (2) files appear 'missing' on the site
[1.1.1.12] Test file 1-1-1-f12.html
[5.5.2.1] Test file 5-5-2-f1.html

Is there any particular reason for this?

(b) Nine (9) onsite test files appear MISSING
from CVS download ...
[1.1.1.11][1] [1.1.1.12][1] [10.2.1.1][2]
[10.2.1.2][2] [10.4.1.1][3] [10.4.1.2][3]
[10.4.1.3][3] [13.1.1.5][2] [13.1.1.6][2]

Again, is there any particular reason for NOT including these in the CVS test suite? These need to be checked against those termed 'obsolete' ... maybe some have been deliberately excluded, in that they are no longer applicable.

3. Priority Level

Comparing the 'priority' level given in the CVS test\accesscases.txt with the 'priority' shown in the accessibilitychecks.html page showed some fifty-two (52) differences. A subsequent cvs update has removed all these differences.

Many I have 'checked' against the tidy code, and in every case checked, the 'onsite' priority matches that indicated by the code. That is, accesscases.txt appeared wrong, but it has been subsequently updated ;=))

An example was test 1.1.1.1: accesscases.txt set this as [2]; onsite it is [1], and in the code, in access.c, tidy does -
    if (Level1_Enabled( doc ))
for a missing 'alt' warning ... so the onsite indication appears 'correct' ...

That example has an elevated priority, so would still work as is, but cases like 2.2.1.1 had a [1] in accesscases.txt, but a [3] online ... and in access.c, tidy does, in CheckColorContrast() -
    if (Level3_Enabled( doc ))
so it failed the test just because of using 1 instead of 3 ... Again the onsite priority appeared correct, though I have still to check the cvs update in exact detail.
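The two examples suggest a simple rule: a check coded at priority N only fires when the requested level is at least N. A minimal sketch of that rule, as inferred from the behaviour described here, is below. `check_enabled` is a hypothetical helper written for this page, not the Level1_Enabled / Level3_Enabled code in access.c.

```c
/* Hypothetical helper, not Tidy's own code: a check coded at priority
   'coded' (1..3) runs when the requested level is at least that high.
   A requested level of 0 means accessibility checks are off. */
int check_enabled(int requested, int coded)
{
    return requested > 0 && coded >= 1 && coded <= 3 && requested >= coded;
}
```

Under this rule, running 1.1.1.1 at level 2 still reports the level 1 'alt' warning, while running 2.2.1.1 at level 1 never reaches the level 3 colour contrast check.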

Naturally I have also checked some with the WCAG guidelines, on sites like -

http://www.w3.org/TR/WCAG10-TECHS/, and
http://www.w3.org/WAI/WCAG20/quickref/

but have yet to find another page that neatly lists each 'checkpoint' with a 'priority' next to it ... Perhaps someone knows another neat W3C/WCAG URL to go to for more checking on this ...

cvs accesscases.txt has been updated to match the onsite, and/or as indicated by the current internal coded priority, but I am still checking every case ...

In fact, the new testcases.txt I have prepared also gives the test 'description', like -
1-1-1-1 1.1.1.1 1 Error <img> missing 'alt' text
1-1-1-2 1.1.1.2 1 Warning suspicious 'alt' text (filename)
... etc ...
just to be more informative ...
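A line in this form can be split into its parts quite easily. The sketch below assumes the layout shown in the two sample lines (test file id, checkpoint number, priority, then free text to the end of the line); the struct, its field sizes and the name `parse_access_case` are all invented for this illustration.

```c
#include <stdio.h>
#include <string.h>

/* Field sizes here are assumptions chosen for the sketch. */
struct access_case {
    char file_id[32];      /* e.g. "1-1-1-1" */
    char checkpoint[32];   /* e.g. "1.1.1.1" */
    int  priority;         /* 1, 2 or 3 */
    char description[128]; /* rest of the line, may be empty */
};

/* Parse one line; returns 1 on success, 0 if the line does not match. */
int parse_access_case(const char *line, struct access_case *out)
{
    int used = 0;

    if (sscanf(line, "%31s %31s %d %n",
               out->file_id, out->checkpoint, &out->priority, &used) != 3)
        return 0;

    out->description[0] = '\0';
    if (used > 0) {
        /* Copy the remainder, truncating if needed, and drop the newline. */
        strncpy(out->description, line + used, sizeof out->description - 1);
        out->description[sizeof out->description - 1] = '\0';
        out->description[strcspn(out->description, "\r\n")] = '\0';
    }
    return out->priority >= 1 && out->priority <= 3;
}
```

Lines that do not start with two words and a number, such as a '... etc ...' filler line, are rejected rather than guessed at.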

4. Comparison of test files with CVS

Some thirty-seven (37) test files have changes, other than the line ending mentioned ...

(a) First, it seems all the 'single' tags, like 'img', 'area', 'meta', etc, have had a trailing '/' added. So, like in test 1-1-1-2.html, the line -

<img src="gifimage.gif" alt="gifimage.gif">
has been changed to -
<img src="gifimage.gif" alt="gifimage.gif"/>

Is the online file the desired form? That is, adding the XML-style close to tags that have no end tag ... There are 15 or more test cases like this ... it seems not?

(b) There are a number of other, relatively minor changes ... like test 1-1-3-1 has had the <form ...> tag removed, or test 1-1-8-1 has had the DOCTYPE removed, etc, etc ...

Would it be correct to update to using the ONLINE files?

5. Final Stage

Once the access cases in CVS match the online files, and all is checked and agreed, then it would be the time to also 'fix' tidy code, such that it would pass all tests when this test suite is run ...

I have already done this once, just to prove it was 'possible' ... and had a 99% success rate ... but am doing it again, in a careful, step-by-step way, making the minimum of changes, and 'neater' code, I hope ;=)).

There seems only 1 case where tidy would always fail, and that is case 4-1-1-1 (change in languages) since tidy does not have 'dictionaries' built in to test this, so I would suggest this case be left out ...

That is keep the test file, but exclude it from accesscases.txt ...

Another like this is 4-3-1-2 which tests -
<html lang="blah">
This will fail any test, since tidy does not contain an ISO 639 table, but this particular test case will work if modified to -
<html lang="">
That is, if this is made the test case instead ... or an ISO 639 table added ... which I have also prepared ...
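To show what a table lookup for the 'lang' attribute might look like, here is a minimal sketch in C. It is not the table prepared for Tidy: the list holds only a handful of two-letter ISO 639-1 codes for illustration, and the name `lang_is_known` is made up. Only the primary subtag (the part before any '-') is checked.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Tiny illustrative subset of ISO 639-1 codes; a real table is far longer. */
static const char *const known_langs[] = {
    "de", "en", "es", "fr", "it", "ja", "zh"
};

/* Return 1 if the primary subtag of 'lang' (the part before any '-')
   is in the table above, compared case-insensitively; else 0. */
int lang_is_known(const char *lang)
{
    char primary[4];
    size_t i, n = strcspn(lang, "-");

    if (n != 2)   /* this sketch only knows two-letter codes */
        return 0;
    for (i = 0; i < n; ++i)
        primary[i] = (char)tolower((unsigned char)lang[i]);
    primary[n] = '\0';

    for (i = 0; i < sizeof known_langs / sizeof known_langs[0]; ++i)
        if (strcmp(primary, known_langs[i]) == 0)
            return 1;
    return 0;
}
```

Both lang="blah" and lang="" are rejected by this lookup, so either form of the 4-3-1-2 test file would be caught once such a table exists.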

There are a few others where the 'test file' needs minor adjustment ...

Four other cases can be fixed just by doing the accessibility test BEFORE tidy has done certain fixes and changes to the 'tree', after parsing is completed. That is doing the access test earlier in the code ...

There is at least one case, NOFRAMES_INVALID_LINK, 6.5.1.4, where a note would have to be made during the tidy 'parsing', since subsequent 'access testing' would now never find this item ... the access warning/error has already been removed from the tree before the 'access tests' can be reasonably done ...

And there are a small number of other cases, which can only function if Tidy is able to refer back to the original file data. Like 5-5-1-3 -
<table summary="       ">
since tidy presently 'eats' spaces without any record of having done so ...
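The test itself is trivial once the raw attribute text is available; the hard part is keeping that text, or a flag, before the spaces are eaten. A minimal sketch of the check follows, assuming it is handed the untrimmed value. The name `is_all_spaces` is hypothetical, and where such a call would sit in Tidy's lexer is not shown here.

```c
#include <ctype.h>

/* Return 1 if 'value' is non-empty and consists only of white space. */
int is_all_spaces(const char *value)
{
    if (*value == '\0')
        return 0;
    for (; *value != '\0'; ++value)
        if (!isspace((unsigned char)*value))
            return 0;
    return 1;
}
```

An empty value returns 0 on purpose, so that 'all spaces' can be told apart from a missing or empty attribute.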

But as stated, in general tidy can pass all the access test suite ... in some cases the code is already there, but no 'message' has been coded ... almost as if it was 'deliberately' left out? Like say IMG_MISSING_ALT_BULLET and IMG_MISSING_ALT_H_RULE, which are listed as obsolete. The code is all there, but no messages generated ... Should they be removed from accesscases.txt?

Perhaps the person, or people, like Mike Lam, Chris Ridpath, Terry Teague, et al, who initiated this code can fill me in, or point me to earlier discussion(s) where it was 'decided' to remove this or that particular test ... then we can drop it from accesscases.txt ... maybe some answers here lie in the cvs logs ...

There is a small mention in the log of updating to the WCAG 2.0 which this page does not address yet. This page is all about getting the current code working as advertised ...

In general, I would say the code changes to get it all working are relatively minor, and in no way affect tidy parsing or output ... and are ONLY done if requested from the command line ... and compiled with SUPPORT_ACCESSIBILITY_CHECKS ... I would also check that this can be 'turned off', and Tidy still compile cleanly ...

Looking forward to further feedback ...

Regards,
Geoff.

PS: You can download the windows EXE, in a ZIP file, if you want to give my 'first-attempt' binary a run. It works, but is full of 'discovery' coding ;=)) The zip file also contains the adjusted windows command files, and a new accesscases.txt list, to run these tests. The second ZIP file contains the modified test suite - that is, the html test input files ...

download             md5 digest
tidycvs6a.zip        5c3828f1d430da84de74eb2987812d4f
tidycvs6asuite.zip   b1000a34f077fcb464a4709c9c3d0f98


Below are several addendums.

1. Perl Output - perl_output

This is the output from my Perl script, as it downloads, and writes out the test files ... and then does a compare of the 'tests' given in the online page, and those in accesscases.txt, and writes a new accesscases text file, with descriptions added.

2. New Test Cases - new_testcases_txt

This is the contents of the new accesscases text file, with descriptions added. And the adjusted windows command files to run this new test cases file are below that ...

3. Diff File compare - diff_file_compare

This is a DIFF comparison between the downloaded test file, in this case in the tmp6 folder, and those from my last CVS update, circa 3 October, 2006.

4. First Differences - first_differences

This is a DIFF of my 'test' coding to get all access tests to pass. This is NOT the final cut, but shows the ideas used to achieve the 'test' success ... If I get the time, and the Tidy 'community' agrees, I would redo this work, making the minimum of changes possible.


1. Perl Output

Output from Perl script, as it downloads, and writes the test files ... then a compare of the 'tests' given in the online page, and those in accesscases.txt, and writes a new accesscases text file, with descriptions added.

<perl_output> see perl_out.txt </perl_output>


2. New Test Cases

Contents of new accesscases text file, with descriptions.
EXCEPTIONS:
1.1.12.1 - replace ASCII art - is listed as priority 1 onsite, but is coded in Tidy as 2
4.1.1.1 - changes in language - has been removed, since Tidy has no dictionaries to check this.
And the windows command files to run this new test cases file are also below ... there is a new output from acctest.cmd, if and only if (IFF) the tests run successfully -
Appears ALL test suites ran completely successfully ... happy days ...
- just some encouragement ;=))

This is now almost exactly the same as the one in cvs, except here I have ADDED a message giving a brief description of the test being performed ...

<new_testcases_txt> see newtestcases.txt </new_testcases_txt>

To run the above NEW testcases.txt in WINDOWS, some minor changes are required in the ACCTEST.CMD and ONETESTA.CMD files.

Note that there are THREE (3) user adjustable variables in acctest.cmd -
i. Where you copied the Tidy EXE file
ii. The location of the test suite files, and
iii. An EXISTING folder for the results output ...

The *nix shell command files would likewise have to be adjusted accordingly -

<acctest_cmd> see acctest.txt </acctest_cmd>

<onetesta_cmd> see onetest.txt </onetesta_cmd>


3. Diff File compare

This is a DIFF comparison between the downloaded test file, in this case in the tmp6 folder, and those from my last CVS update, circa 3 October, 2006. Note, the online files were the original source of the cvs source, with some corrections ... so I should have done this comparison 'in reverse' ... in your mind just swap the '-' and the '+' ;=))

<diff_file_compare> - see accessd01.txt </diff_file_compare>


4. First Differences - BETTER changes are a WORK IN PROGRESS

This is a DIFF of my 'test' coding to get all access tests to pass. This is NOT the final cut, but shows the ideas used to achieve the 'test' success ... If I get the time, and the Tidy 'community' agrees, I would redo this work, making the minimum of changes possible. It also contains some DEBUG ONLY modules, used to understand the code of Tidy. It also includes a NEW sio619.c file, to be able to truly test a 'lang' attribute, and the relevant changes to my MSVC6 build file set ...

<first_differences> see accessd02.txt </first_differences>


EOF - Tidy-48.doc
