On Sat, Dec 3, 2011 at 5:37 PM, Jeremy Baron <jeremy(a)tuxmachine.com> wrote:
My first guess is that it's related to the use of HTML Tidy. Seems to
be enabled currently on the WMF cluster: see $wgUseTidy @
http://noc.wikimedia.org/conf/highlight.php?file=CommonSettings.php
-Jeremy
Thanks! That's it. Adding $wgUseTidy = true; in LocalSettings.php
addresses the first issue: </sup> is now converted to </sub>.
It doesn't quite match the other two behaviors (detailed below).
However, I think this is the right track. All of the behavior I've
seen so far involves changing HTML elements.
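For anyone else hitting this, the change amounts to the fragment below.
It is only a sketch of my LocalSettings.php addition; where it sits in
the file is my own choice, and I'm assuming nothing later in the file
overrides it.

```php
# LocalSettings.php (fragment)
# Run parser output through HTML Tidy, as the WMF cluster appears to do.
$wgUseTidy = true;
```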
One other follow-up question. Is MediaWiki using the php_tidy.dll that
is part of my PHP 5.2 environment?
My environment is a Windows XP box with Apache 2.2 and PHP 5.2. It's a
relatively clean install.
I've done some debugging and see that MediaWiki is calling
execExternalTidy in parser\Tidy.php.
It seems to spawn a process called "tidy". However, I can't find any
tidy exe/dll on my box. I'm fairly certain I have not downloaded it
manually.
I'm hoping it's php_tidy.dll and that I can swap out other versions of
php_tidy to get (2) and (3) (as well as the other dozen or so)
working.
If not, I'll muck around with the tidy source, and/or look at the
tidy.conf options later.
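For reference, here is a sketch of the settings that, as far as I can
tell from DefaultSettings.php, decide which tidy gets used. I'm assuming
these names and defaults apply to my MediaWiki version, and the
$wgTidyBin path is a made-up example rather than a file that exists on
my box.

```php
# LocalSettings.php (fragment, untested sketch)

# In-process tidy is used only when the PHP tidy extension is loaded;
# otherwise MediaWiki runs an external binary.
$wgTidyInternal = extension_loaded( 'tidy' );

# External binary to spawn. The default is just 'tidy', which would
# explain the process name I saw. Hypothetical example path:
$wgTidyBin = 'C:/tools/tidy/tidy.exe';

# Configuration file passed to tidy (the default location, I believe).
$wgTidyConf = "$IP/includes/tidy.conf";
```

If that reading is right, seeing execExternalTidy being called would
mean the extension is not loaded, so enabling php_tidy.dll in php.ini
might be the first thing to try.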
Thanks!
== differences ==
(2) <span /> is preserved (it used to be escaped). Wikipedia splits it
into <span></span>.
(3) No change. Empty <i> </i> tags are still preserved.