If someone is looking for a better implementation of string functions
and the like, my WikiCode <http://wiki-tools.com/wiki/WikiCode>
extension is GPL, so it's fine to extract some of it to build a new
string functions extension or to expand parser functions.
Just remember to credit me for the parts of the code you copy, and keep
it under the GPL.
You can see some code in use on the demo page:
If you take note of a few of my string functions there, you'll also
see that the ones I've written handle nowiki tags and multibyte strings
quite nicely. If you cross-check against the actual string functions
extension and a bug in Bugzilla, you'll find they are free of the known
issues that string functions has with multibyte characters in certain
places.
The only string function there that isn't completed is #pad. My
implementation of that isn't as simple, because one of my aims was also
to handle nowiki tags in a fairly logical way. Because of that, native
functions can't really be used.
Unfortunately for the extension itself, I haven't really been doing any
MediaWiki related development lately.
~Daniel Friesen (Dantman, Nadir-Seen-Fire)
~Profile/Portfolio:
On Thu, Jan 8, 2009 at 1:31 PM, Greg L
<Greg_L_at_Wikipedia(a)comcast.net> wrote:
I'm not a developer so it would be great if either of you (Aryeh or
Mr.Z-man) could explain whether a character-counting parser function
(or similar tool) is currently available (or could be made) for
template authors to use.
Such tools are available, but none has been written well enough that
it could be used on Wikimedia sites.
As for "we currently have no plans to enable StringFunctions or any
similar functionality on Wikimedia sites", why would that be a good
plan?
Andrew was mistaken in that statement. Variables are currently off
the table, but string functions aren't. They need someone to write a
good version of them that isn't DOSable and handles things like strip
markers acceptably.
On Thu, Jan 8, 2009 at 1:43 PM, Chad <innocentkiller(a)gmail.com> wrote:
Brion outlined his concerns with StringFunctions--when it was merged
with ParserFunctions and he reverted it--back in r39653. Mainly, the
overall package is too memory intensive as currently written.
His exact comment might be more elucidating: "o_O These look like the
least CPU- and memory-efficient implementations of strlen(), strpos()
etc that could possibly be created..." For example, the {{#len:}}
function was implemented as the return value of this:
/**
 * Splits the string into its component parts using preg_match_all().
 * $chars is set to the resulting array of multibyte characters.
 * Returns count($chars).
 */
function mwSplit ( &$parser, $str, &$chars ) {
    # Get marker prefix & suffix
    $prefix = preg_quote( $parser->mUniqPrefix );
    if( isset($parser->mMarkerSuffix) )
        $suffix = preg_quote( $parser->mMarkerSuffix );
    else if ( strcmp( MW_PARSER_VERSION, '1.6.1' ) > 0 )
        $suffix = 'QINU\x07';
    else $suffix = 'QINU';

    # Treat strip markers as single multibyte characters
    $count = preg_match_all('/' . $prefix . '.*?' . $suffix . '|./su',
        $str, $arr);
    $chars = $arr[0];
    return $count;
}
Rather than, say, replacing strip markers using the appropriate Parser
method, and then returning mb_strlen(). Or whatever would be
appropriate. I'm not sure what would be, but I'm pretty sure it
doesn't involve exploding the string into an array to calculate its
length.
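
As a rough illustration of that idea (not the Parser's own method, and
not code from either extension), here is a minimal sketch that counts
each strip marker as one character and then lets mb_strlen() count the
rest, without building a per-character array. The names
markerAwareLength, MARKER_PREFIX and MARKER_SUFFIX are hypothetical, and
the marker delimiters are made-up stand-ins for whatever the parser
generates at run time. It assumes UTF-8 input and the mbstring
extension.

```php
<?php
// Hypothetical marker delimiters, for illustration only. A real parser
// generates its own prefix and suffix, so pass those in instead.
const MARKER_PREFIX = "\x7fUNIQ";
const MARKER_SUFFIX = "-QINU\x7f";

/**
 * Count the characters in a UTF-8 string, treating every strip marker
 * as a single character. Returns false if the regex engine fails.
 */
function markerAwareLength( $str, $prefix = MARKER_PREFIX, $suffix = MARKER_SUFFIX ) {
    $pattern = '/' . preg_quote( $prefix, '/' ) . '.*?'
        . preg_quote( $suffix, '/' ) . '/s';

    // Remove the markers; preg_replace() reports how many it removed.
    $markers = 0;
    $stripped = preg_replace( $pattern, '', $str, -1, $markers );
    if ( $stripped === null ) {
        return false;
    }

    // Remaining text is counted by mb_strlen(); add one per marker.
    return mb_strlen( $stripped, 'UTF-8' ) + $markers;
}

// Usage: five characters of text plus one marker.
$text = "h\xc3\xa9llo" . MARKER_PREFIX . "-nowiki-00000001" . MARKER_SUFFIX;
echo markerAwareLength( $text ), "\n"; // prints 6
```

This still makes one regex pass over the input, so it would presumably
want a length cap on $str before it could be called non-DOSable.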