Platonides <platonides(a)gmail.com> wrote:
> Any pointers to things that I overlooked? Thoughts on interfaces
> & Co.? Volunteers? :-)
> It's a bit hard for me to understand what your tool does, since it
> gives a blank page when English is selected, and it takes the HTML
> source instead of the wiki source.
Ah! Didn't notice that. It works (solely) on the wiki source, though.
> I get that you look for two kinds of bugs: "wiki text errors" (like
> an unclosed tag) and "wikipedia errors" (the date doesn't conform to
> the manual of style).
> [...]
It does mostly the latter, but I'm not looking for some grammar to
define an article complying with a manual of style, but for a parser
to parse wikitext.
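To make the distinction concrete: a check of the first kind (a "wiki
text error" such as an unclosed tag) could be sketched roughly like
this. The tag list, names, and heuristics are invented for
illustration; this is not wikilint's actual code:

```python
# Hypothetical sketch: count opening vs. closing occurrences of a few
# paired wikitext/HTML-ish tags in raw wiki source. Self-closing tags
# like <ref name="x"/> would be miscounted; a real check needs more care.
import re

PAIRED_TAGS = ("ref", "nowiki", "gallery")

def find_unclosed_tags(wikitext):
    """Return the tag names that are opened more often than closed."""
    problems = []
    for tag in PAIRED_TAGS:
        opened = len(re.findall(r"<%s[ >]" % tag, wikitext))
        closed = len(re.findall(r"</%s>" % tag, wikitext))
        if opened > closed:
            problems.append(tag)
    return problems

print(find_unclosed_tags("Text<ref>a source</ref> more<ref>oops"))
# prints ['ref']
```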
> [...]
> I have dealt with the parser a bit (see bug 18765) and I don't think
> we could make some things remotely sane, as they are handled at
> completely different steps. But linting completely insane ones
> shouldn't be too hard. :)
> On the other hand, going into the Parser is probably quite far from
> what you expected when wanting to leave your ugly mess of regexes.
> Also, I may have misunderstood your position, and it may not be
> appropriate for your lint expectations.
I think so :-). My use case with wikilint and some other tools is:
- Are there more than one and fewer than x images per article?
- Is there more than one link to another article?
- Are there links in a "See also" section that have already appeared
  in the article?
- If there are "Main article:" links, do they appear directly
  following a section heading, indented and italic?
- Does the {{Personendaten}} data have a fuzzy relationship to the
  introductory line of the article?
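One of those checks, sketched in the regex style such tools use today
(hypothetical names and section handling, not the real implementation):

```python
# Hypothetical sketch: find links in a "See also" section that already
# appear earlier in the article. Purely regex-based; link targets with
# pipes or anchors are only handled in the simplest way.
import re

LINK_RE = re.compile(r"\[\[([^\]|#]+)")  # capture the link target only

def duplicate_see_also_links(wikitext):
    """Return "See also" link targets that already occur in the body."""
    parts = re.split(r"==+\s*See also\s*==+", wikitext, maxsplit=1)
    if len(parts) < 2:          # no "See also" section at all
        return []
    body, see_also = parts
    body_links = {m.strip() for m in LINK_RE.findall(body)}
    return [m.strip() for m in LINK_RE.findall(see_also)
            if m.strip() in body_links]

text = "A [[Foo]] and [[Bar]].\n== See also ==\n* [[Foo]]\n* [[Baz]]"
print(duplicate_see_also_links(text))  # prints ['Foo']
```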
To address these, I'd like to parse the wiki source from a
concatenation of characters into a logical structure. The MediaWiki
parser does not seem to care about that, so I have not looked further
into it (and don't plan to do so).
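By "logical structure" I mean something roughly like the following
sketch: raw wiki source turned into a flat list of typed nodes that
checks can walk. The node shapes and names are invented here; nothing
like this is provided by MediaWiki itself:

```python
# Hypothetical sketch: tokenize wiki source into (heading, link, text)
# nodes. Only two constructs are recognized; everything else is "text".
import re

TOKEN_RE = re.compile(
    r"(?P<heading>^(?P<eq>=+)\s*(?P<title>[^=\n]+?)\s*(?P=eq)\s*$)"
    r"|(?P<link>\[\[(?P<target>[^\]|]+)(?:\|[^\]]*)?\]\])",
    re.MULTILINE)

def parse(wikitext):
    """Return a flat list of typed nodes for headings, links, and text."""
    nodes, pos = [], 0
    for m in TOKEN_RE.finditer(wikitext):
        if m.start() > pos:                      # plain text before the match
            nodes.append(("text", wikitext[pos:m.start()]))
        if m.group("heading"):
            # heading level = number of '=' signs
            nodes.append(("heading", len(m.group("eq")), m.group("title")))
        else:
            nodes.append(("link", m.group("target")))
        pos = m.end()
    if pos < len(wikitext):                      # trailing plain text
        nodes.append(("text", wikitext[pos:]))
    return nodes

print(parse("== History ==\nSee [[Foo|foo]]."))
```

A check like "links in 'See also' already used in the article" then
becomes a walk over these nodes instead of another regex.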
So, to emphasize: I'm looking for *a* parser; that's a lowercase "p".
Tim