> We definitely care! The parser should never produce invalid HTML
> output, if it does that's a bug.
Well, if you're serious about this, I'd like to propose a
mutually-beneficial swap: something small that will help you, in
exchange for something small that will help me.
What I would like is a fix for
http://bugzilla.wikimedia.org/show_bug.cgi?id=3693 to be checked into
the tree (and in particular, to be made available on the English
Wikipedia). This is a small bug/feature request:
* It already has a patch attached.
* The patch is one line long.
* The patch restores previously existing (albeit unintentional)
behaviour.
* The previous behaviour was useful, and was used (by me at least).
* I am not aware of any security problems introduced by restoring this
behaviour.
What I am offering in return is a bit of testing software that:
* Is written in PHP.
* Is about 200 lines long.
* Will find invalid HTML output bugs in the Parser code.
To demonstrate that I am serious and that I am not wasting your time,
I have included below five small examples, all found by the test
software, of wiki text that produces invalid XHTML output.
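For readers wondering what such a check might look like: the following
is only an illustrative sketch in Python, not the PHP test software
offered above, and its names (VOID, BalanceChecker, nesting_errors) are
invented for this example. It assumes the parser output can be tokenised
by Python's standard html.parser module, and it reports tags that are
closed out of order or never closed. It is not a full XHTML validator.

```python
from html.parser import HTMLParser

# Elements with no closing tag (a hand-picked subset, assumed for this sketch).
VOID = {"br", "hr", "img", "input", "meta", "link"}


class BalanceChecker(HTMLParser):
    """Tracks open tags on a stack and records nesting errors."""

    def __init__(self):
        super().__init__()
        self.stack = []   # currently open tags, outermost first
        self.errors = []  # human-readable problem descriptions

    def handle_starttag(self, tag, attrs):
        if tag not in VOID:
            self.stack.append(tag)

    def handle_endtag(self, tag):
        if tag in VOID:
            return
        if tag not in self.stack:
            self.errors.append(f"stray </{tag}>")
            return
        # Pop back to the matching open tag; anything popped on the way
        # was left unclosed inside it.
        while self.stack:
            top = self.stack.pop()
            if top == tag:
                break
            self.errors.append(f"<{top}> not closed before </{tag}>")


def nesting_errors(html):
    """Return a list of nesting problems found in an HTML fragment."""
    checker = BalanceChecker()
    checker.feed(html)
    checker.close()
    # Whatever is still open at the end was never closed at all.
    return checker.errors + [f"<{t}> never closed" for t in checker.stack]


print(nesting_errors("<p><s>a\n</p>\n<table>\n</table>"))
# prints ['<s> not closed before </p>']
```

Fed the invalid output from Example 1 below, this flags the unclosed
<s>; fed the suggested output, it returns an empty list.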
The output was generated using MediaWiki 1.5.6 (which I believe is the
latest version tarball you can get) *without* tidy enabled, using PHP
4.1.2 and MySQL 3.23. I also reproduced the results with PHP 4.4.2 and
MySQL 5.0.18. Any browser behaviour I describe was observed in Firefox
1.5.0.1.
* Example 1)
Wiki Text input:
=====================================
<s>a
{|
=====================================
Current invalid XHTML output:
=====================================
<p><s>a
</p>
<table>
</table>
=====================================
Suggested valid XHTML output:
=====================================
<p><s>a</s></p>
=====================================
Result: Invalid XHTML. When rendered, it strikes through all content on
the page (footers, headers, basically everything), because the <s> tag
is never closed.
View output:
http://nickj.org/index.php?title=MediaWiki/Parser1
For a fun variant of this, which hides almost all of the text on the
page, try the following:
=====================================
<font style="visibility:hidden">a
{|
=====================================
This has the potential to seriously confuse less technical people
running small wikis, especially if applied to the main page.
View output:
http://nickj.org/index.php?title=MediaWiki/Parser1-hidden
* Example 2)
Wiki Text input:
=====================================
a<center>
c
=====================================
Current invalid XHTML output:
=====================================
<p>a<center>
c</center>
</p>
=====================================
Suggested valid XHTML output:
=====================================
<p>a</p>
<center>c</center>
<br />
=====================================
Result: Renders fine, but page is invalid XHTML.
View output:
http://nickj.org/index.php?title=MediaWiki/Parser2
* Example 3)
Wiki Text input:
=====================================
<code>b
;
=====================================
Current invalid XHTML output:
=====================================
<p><code>b
</p>
<dl><dt></code>
=====================================
Suggested valid XHTML output:
=====================================
<p><code>b</code></p>
=====================================
Result: Renders fine, but page is invalid XHTML.
View output:
http://nickj.org/index.php?title=MediaWiki/Parser3
* Example 4)
Wiki Text input:
=====================================
;<div>a
=====================================
Current invalid XHTML output:
=====================================
<dl><dt><div>a</div>
</dt></dl>
=====================================
Suggested valid XHTML output:
=====================================
<dl>
<dd>
<div>a</div>
</dd>
</dl>
=====================================
Result: Renders fine, but page is invalid XHTML.
View output:
http://nickj.org/index.php?title=MediaWiki/Parser4
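A note on why Examples 2 and 4 are invalid even though every tag is
closed in order: XHTML 1.0 does not allow block-level elements such as
<center> or <div> inside <p> or <dt>, which may only hold inline
content. The Python sketch below illustrates that kind of containment
check. It is an assumption-laden illustration, not the PHP test
software: the BLOCK and INLINE_ONLY sets are a small hand-picked subset
rather than the full DTD, and the names (ContainmentChecker,
containment_errors) are invented for this example.

```python
from html.parser import HTMLParser

# Hand-picked subsets, assumed for this sketch (not the full XHTML DTD).
BLOCK = {"div", "center", "blockquote", "table", "p", "dl", "ul", "ol", "pre"}
INLINE_ONLY = {"p", "dt"}  # containers that may hold inline content only
VOID = {"br", "hr", "img"}  # elements with no closing tag


class ContainmentChecker(HTMLParser):
    """Flags block-level elements opened in an inline-only context."""

    def __init__(self):
        super().__init__()
        self.stack = []
        self.errors = []

    def handle_starttag(self, tag, attrs):
        if tag in BLOCK:
            # Walk outwards from the innermost open element.
            for parent in reversed(self.stack):
                if parent in INLINE_ONLY:
                    self.errors.append(f"<{tag}> inside <{parent}>")
                    break
                if parent in BLOCK:
                    break  # a block container such as <div> resets the context
        if tag not in VOID:
            self.stack.append(tag)

    def handle_endtag(self, tag):
        # Tolerate bad nesting: pop back to the matching open tag, if any.
        if tag in self.stack:
            while self.stack.pop() != tag:
                pass


def containment_errors(html):
    """Return a list of block-inside-inline problems in an HTML fragment."""
    checker = ContainmentChecker()
    checker.feed(html)
    checker.close()
    return checker.errors


print(containment_errors("<dl><dt><div>a</div>\n</dt></dl>"))
# prints ['<div> inside <dt>']
```

On the suggested outputs for Examples 2 and 4 it returns an empty list,
since the <center> and <div> no longer sit inside a <p> or <dt>.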
* Example 5)
Wiki Text input:
=====================================
;<div><blockquote>
=====================================
Current invalid XHTML output:
=====================================
<dl><dt><div><blockquote></blockquote>
</dt></dl>
</div>
=====================================
Suggested valid XHTML output:
=====================================
<!-- Nothing. -->
=====================================
Result: Invalid XHTML, and results in the page appearing to be a bit
stuffed up, with the page's text font several sizes smaller than
usual, and the article/discussion/edit/history links being shifted
upwards and leftwards.
View output:
http://nickj.org/index.php?title=MediaWiki/Parser5
All the best,
Nick.