Jay R. Ashworth wrote:
> On Mon, Aug 28, 2006 at 12:01:00AM +0100, Buttay cyril wrote:
> > not mention
> > wikitext2docbook, and says that flexbisonparse is
> > "Intended as an
> > eventual replacement to the parsing code inside MediaWiki itself",
> > which is rather promising!
> I don't know that that is what Magnus is calling it, but
> that's what it does. I forget what language he's doing it
> in. Check the list archives; he's mentioned it here in the
> last couple of months (and may well chime in here).
As someone who's been playing with alternative parsers (though I'm not Magnus),
I'm pretty sure the flexbisonparse project is currently dead. Magnus moved
to his wiki2xml project (also available in the MediaWiki repository), which
is actually coded in PHP. As far as I know, though, it's the single most
feature-complete alternative parser we have. Not claiming it's perfect, but
it's good... I haven't worked with flexbisonparse, though, so maybe it's
better than I know.
I've actually been working on a Python-based wikitext parser, using some
techniques that should make the system a bit faster and cleaner... With a
lot of luck, I should start making progress on that again in the next month
or so.
For anyone who cares, I'll probably be trying to implement a PEG-based
parser using mxTextTools, since I think that should be able to parse all of
MediaWiki's wikitext, and should be about twice as fast as the current
Parser.php (which is about as fast as wiki2xml)... Or I might just end up
using ANTLR, if I can bully my current semi-grammar into working in that
framework... If anyone knows of a decent PEG parser with a Python API (a
packrat parser might be ideal), that'd be great too. *shrugs*
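
To make the packrat idea concrete, here is a rough sketch in plain Python (not mxTextTools or ANTLR, and not my actual semi-grammar). Everything in it is an assumption for illustration: the names PackratParser and parse_inline are hypothetical, and the toy grammar only covers non-nested bold, italic, and [[links]], which is nowhere near all of MediaWiki's wikitext. The point is just the shape: ordered choice between rules, plus a memo table keyed on (rule, position) so no rule is ever evaluated twice at the same offset.

```python
# Toy PEG, tried in this order at each position:
#   inline <- (bold / italic / link / plain / any_char)*
#   bold   <- "'''" run "'''"
#   italic <- "''" run "''"
#   link   <- "[[" run "]]"
#   plain  <- run
#   run    <- (!"''" !"[[" !"]]" .)+
STOPS = ("''", "[[", "]]")  # markup tokens that end a run of plain text


class PackratParser:
    """Hypothetical sketch of a memoising (packrat) PEG parser."""

    def __init__(self, text):
        self.text = text
        self.memo = {}  # (rule name, position) -> cached result

    def apply(self, rule, pos):
        # The packrat part: each rule runs at most once per position.
        key = (rule.__name__, pos)
        if key not in self.memo:
            self.memo[key] = rule(pos)
        return self.memo[key]

    def _run(self, pos):
        # run <- (!stop .)+ ; returns the end offset, or None if empty.
        end = pos
        while end < len(self.text) and not self.text.startswith(STOPS, end):
            end += 1
        return end if end > pos else None

    def _wrapped(self, kind, opener, closer, pos):
        # opener run closer ; fails (None) if any of the three parts is missing.
        if not self.text.startswith(opener, pos):
            return None
        start = pos + len(opener)
        end = self._run(start)
        if end is None or not self.text.startswith(closer, end):
            return None
        return (kind, self.text[start:end]), end + len(closer)

    def bold(self, pos):
        return self._wrapped("bold", "'''", "'''", pos)

    def italic(self, pos):
        return self._wrapped("italic", "''", "''", pos)

    def link(self, pos):
        return self._wrapped("link", "[[", "]]", pos)

    def plain(self, pos):
        end = self._run(pos)
        return None if end is None else (("text", self.text[pos:end]), end)

    def any_char(self, pos):
        # Fallback so unmatched markup is kept as literal text.
        if pos >= len(self.text):
            return None
        return ("text", self.text[pos]), pos + 1

    def inline(self):
        nodes, pos = [], 0
        rules = (self.bold, self.italic, self.link, self.plain, self.any_char)
        while pos < len(self.text):
            for rule in rules:  # ordered choice: first match wins
                result = self.apply(rule, pos)
                if result is not None:
                    node, pos = result
                    nodes.append(node)
                    break
        return nodes


def parse_inline(text):
    return PackratParser(text).inline()
```

Calling parse_inline("'''Bold''' and ''italic''") gives a flat list of (kind, value) pairs. With a grammar this small there's almost no backtracking, so the memo table buys very little here; it only starts to matter once rules nest and the same sub-rule gets retried from several alternatives.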
- Eric Astor