Re: [Wikitech-l] EBNF grammar project status?

13 Nov 2007

On 11/14/07, Virgil Ierubino <virgil.ierubino(a)gmail.com> wrote:
...

 I'm assuming our problem is this: currently we "parse" wikitext by
 immediately converting, via regex, into XHTML. This is not really
 "parsing", because parsing usually means the creation of an abstract
 Document Object Model which is then iterated through to generate XHTML,
 XML, FooBar or whatever (or so I have learnt). Because we're missing this
 DOM, Wikitext can't expand beyond being used by the current parser (so we
 can't do WYSIWYG, etc.). However, there appears to be no way of creating a
 DOM from Wikitext because this would be to standardise the way syntax
 converts to output, but any kind of standardisation will cause backwards
 incompatibility.

Your "DOM" is usually called an AST ("abstract syntax tree"). But yes.
However, "backwards incompatibility" is not so much the issue as "sudden,
drastic misrendering of existing wikitext".

I do think it's impossible to produce a meaningful traditional parser that
could replicate exactly the behaviour of the current parser. I think it's
very possible to produce a good parser that will cover all the most useful
cases.

...
 So our problem is the dilemma: either we standardise, and lose backwards
 compatibility, or we don't, and lose extensibility. And in the long run, I
 think the first option is better. However, in standardising the language
 we'd lose the feature of it that all syntax is valid (useful, as then
 people can't ever be presented with error messages on their pages) so
 ideally the

The "all syntax is valid" thing really arises because of the nature of
browsers rather than because of the parser itself. I don't think we're in
danger of losing that - the parser will just have to fail gracefully when
it comes up against malformed wikitext.
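
To make "build a tree, then fail gracefully" a bit more concrete, here is a
rough Python sketch. It is not the MediaWiki parser and not a proposal for
the grammar: it handles only a toy subset ('''bold''' and [[link]]), and
the names Node, parse and render, as well as the /wiki/ link prefix, are
made up for illustration. Markup that never closes is kept as literal text
instead of raising an error or swallowing the rest of the page.

```python
from dataclasses import dataclass, field
from html import escape


@dataclass
class Node:
    kind: str                      # "root", "text", "bold" or "link"
    text: str = ""
    children: list = field(default_factory=list)


# (opener, closer, node kind) for the toy subset handled here
CONSTRUCTS = (("'''", "'''", "bold"), ("[[", "]]", "link"))


def parse(src: str) -> Node:
    """Turn toy wikitext into a flat tree; never raises on bad markup."""
    root = Node("root")
    i = 0
    while i < len(src):
        for opener, closer, kind in CONSTRUCTS:
            if src.startswith(opener, i):
                end = src.find(closer, i + len(opener))
                if end != -1:
                    # Well-formed construct: record it as its own node.
                    root.children.append(Node(kind, src[i + len(opener):end]))
                    i = end + len(closer)
                    break
        else:
            # Nothing well-formed starts here (including an opener with no
            # closer): keep the character as literal text.
            if root.children and root.children[-1].kind == "text":
                root.children[-1].text += src[i]
            else:
                root.children.append(Node("text", src[i]))
            i += 1
    return root


def render(node: Node) -> str:
    """Walk the tree and emit HTML; other back ends could walk it too."""
    if node.kind == "root":
        return "".join(render(child) for child in node.children)
    if node.kind == "bold":
        return f"<b>{escape(node.text)}</b>"
    if node.kind == "link":
        # The /wiki/ prefix is a hypothetical URL scheme for this sketch.
        return f'<a href="/wiki/{escape(node.text)}">{escape(node.text)}</a>'
    return escape(node.text, quote=False)
```

With this, render(parse("an '''unclosed bold")) gives back the input text
unchanged, so the reader still sees the stray quotes rather than losing a
sentence.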

...
 On the point of immutable validity, it is perhaps less useful for all text
 to be valid than for there to be "invalid markup" error messages. Whilst
 the former ensures users can never really "go wrong", it is still true
 that bad markup will lead to results they very much didn't intend - and it
 seems to me more useful to give them an error message than a wildly
 unintended result.

Wildly unintended is fine, at least they see that (or someone else does).
What's more dangerous is when stuff silently breaks, making a sentence or
two just disappear off the page.

Steve
