I already got it in your reply here:
Maybe you are underestimating the vast differences in implementation
between the current not-really-a-parser and what I am working on.
There is nothing wrong with using a group of templates together, but
there *is* something majorly wrong with patching together one object (a
table, in this case) using pieces from different places. It works with
the current not-really-a-parser because it takes the wiki source texts
from the templates, sticks them together somehow, and then converts them
to HTML. This kind of practice is exactly what leads to all the problems
with our current not-really-a-parser. A proper parser should parse each
template individually, and then use its parse tree in the processing of
the page that uses it.
It's great that you're working on a different way to do it that's not
just dumb-text-includes.
On Mon, 30 Aug 2004 16:55:01 +0100, Timwi <timwi(a)gmx.net> wrote:
Ævar Arnfjörð Bjarmason wrote:
Why would it ever break? I can see it getting slow because it cannot
be optimized, but not breaking; all it's doing is just including one
thing after the other:
{{a}} gets Template:A which contains "foo" and {{b}} gets Template:B
which contains "bar" hence
{{a}}{{b}} = foobar
Of course, this simple example would still work. But picture this:
Template:A contains: I ''li
Template:B contains: ke'' hamburgers
Currently, {{a}}{{b}} would yield "I <em>like</em> hamburgers", but only
because it sticks the pieces together and then tries to make sense of it.
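As a rough illustration of the difference, here is a minimal Python
sketch. It is not the actual MediaWiki code: the `convert` function, its
single regular expression and the `templates` dictionary are assumptions
made up for this example. It compares converting the spliced source text
with converting each template on its own.

```python
import re

def convert(wikitext: str) -> str:
    """Hypothetical text-to-HTML converter that only knows ''italics''."""
    return re.sub(r"''(.+?)''", r"<em>\1</em>", wikitext)

# Stand-ins for Template:A and Template:B from the example above.
templates = {"a": "I ''li", "b": "ke'' hamburgers"}

# Text splicing: concatenate the sources first, then convert the result.
spliced = convert(templates["a"] + templates["b"])

# Per-template processing: each template is handled on its own, so
# neither piece contains a complete ''...'' pair.
separate = convert(templates["a"]) + convert(templates["b"])

print(spliced)   # I <em>like</em> hamburgers
print(separate)  # I ''like'' hamburgers
```

The italics only appear in the spliced version because the two halves of
the markup happen to meet after the text has been glued together.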
Why is this bad? Picture this:
Template:A contains:
{|
| nowrap
Template:B contains:
| Text
|}
Is the "nowrap" a table cell attribute or text in a separate cell? Does
this change depending on whether there is a newline after "nowrap"? ...
And this is just a simple example.
Why would this break in whatever parser you plan to implement?
Because a parser is not a converter. The current not-really-a-parser is
actually a converter: It looks out for particular syntax elements like
''these'' and turns them into <em>HTML tags</em>. This is bad because it
means that several of these conversions can interfere with each other:
I ''like [[hamburger|hamburgers'']]
produces invalid HTML. It gets even worse when it tries to locate
{{template inclusions}} and replaces them with some other text, not
knowing what it is or how it fits into the document structure.
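The interference can be reproduced with a small Python sketch. This is
only a toy under stated assumptions: `naive_convert` and its two regular
expressions are invented for illustration and are far simpler than what
the current code really does. Each pass rewrites the text without knowing
about the other.

```python
import re

def naive_convert(wikitext: str) -> str:
    """Hypothetical converter: two independent search-and-replace passes."""
    # Pass 1: ''text'' becomes <em>text</em>.
    html = re.sub(r"''(.+?)''", r"<em>\1</em>", wikitext)
    # Pass 2: [[target|label]] becomes <a href="target">label</a>.
    html = re.sub(r"\[\[([^|\]]+)\|(.+?)\]\]", r'<a href="\1">\2</a>', html)
    return html

result = naive_convert("I ''like [[hamburger|hamburgers'']]")
print(result)
# I <em>like <a href="hamburger">hamburgers</em></a>
```

The <em> is closed inside the <a> that was opened after it, so the tags
are improperly nested, and neither pass has any way of noticing.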
A real parser analyses the document's structure. It turns the wiki text
into a data structure in memory that actually bears resemblance to the
structure of the document. It creates a "heading" element where there is
a heading, instead of turning some strategically-placed equals signs
into <h#> tags.
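A minimal Python sketch of that idea follows. The `Heading` and
`Paragraph` classes and the `parse` and `render` functions are
hypothetical names chosen for this example, not the data structures of
the parser being discussed. The point is that parsing produces elements
first, and HTML is only generated from them in a separate step.

```python
import re
from dataclasses import dataclass

@dataclass
class Heading:
    level: int
    text: str

@dataclass
class Paragraph:
    text: str

def parse(wikitext: str) -> list:
    """Build a flat list of structural elements, one per non-empty line."""
    nodes = []
    for line in wikitext.splitlines():
        line = line.strip()
        if not line:
            continue
        # A heading is a line wrapped in the same number of equals signs.
        m = re.fullmatch(r"(={1,6})\s*(.+?)\s*\1", line)
        if m:
            nodes.append(Heading(level=len(m.group(1)), text=m.group(2)))
        else:
            nodes.append(Paragraph(text=line))
    return nodes

def render(nodes: list) -> str:
    """Generate HTML from the elements, as a step separate from parsing."""
    out = []
    for node in nodes:
        if isinstance(node, Heading):
            out.append(f"<h{node.level}>{node.text}</h{node.level}>")
        else:
            out.append(f"<p>{node.text}</p>")
    return "\n".join(out)

tree = parse("== History ==\nSome text")
print(tree)
print(render(tree))
```

Because the heading exists as an element before any HTML is produced,
other code can inspect or rearrange the tree without re-reading the text.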
The only reason I can see why that would happen is if you were to
implement some auto-completion of the table syntax, sort of like
tidy(html) for wikisyntax, and do it before things get fetched from
Template: rather than after everything has been included.
Your terminology "auto-completion" reveals that you are thinking in
terms of conversion. Don't think of it as auto-completion; for example,
if a '' has no matching '', I can tell the parser what to do
independently of what it does when there *is* a matching ''. There are
several possibilities: make an italics element (what you would probably
call auto-completion); make a text element (i.e. pretend the "''" was
actually text); or bail out saying "syntax error". Of course, we don't
want the latter. My parser currently does the second: It turns the ''
into text. I did that because this is also how the current
not-really-a-parser functions. However, I can easily change that.
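The three possibilities can be sketched in Python as a single policy
switch. Everything here is an assumption for illustration: the
`parse_inline` function, the `on_unmatched` parameter, and the `Text`,
`Italic` and `WikiSyntaxError` classes are hypothetical and do not
reflect how the parser under discussion is actually written.

```python
from dataclasses import dataclass

@dataclass
class Text:
    value: str

@dataclass
class Italic:
    value: str

class WikiSyntaxError(Exception):
    pass

def parse_inline(src: str, on_unmatched: str = "text") -> list:
    """Parse ''italics''; on_unmatched is 'italic', 'text' or 'error'."""
    nodes = []
    pos = 0
    while pos < len(src):
        start = src.find("''", pos)
        if start == -1:
            nodes.append(Text(src[pos:]))  # no more markup: plain text
            break
        if start > pos:
            nodes.append(Text(src[pos:start]))
        end = src.find("''", start + 2)
        if end != -1:
            # The matched case is handled independently of the policy.
            nodes.append(Italic(src[start + 2:end]))
            pos = end + 2
            continue
        rest = src[start + 2:]
        if on_unmatched == "italic":
            nodes.append(Italic(rest))       # italics run to the end
        elif on_unmatched == "text":
            nodes.append(Text("''" + rest))  # keep the '' as literal text
        else:
            raise WikiSyntaxError(f"unmatched '' at offset {start}")
        break
    return nodes

print(parse_inline("I ''like", on_unmatched="italic"))
print(parse_inline("I ''like", on_unmatched="text"))
```

Switching behaviour only means changing the branch taken for the
unmatched case; the handling of a properly matched pair is untouched.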
In our specific case, there would be a document (a template) that has a
{| with no matching |}. What should it do? Unfortunately, none of the
three options make it work the way you have come to expect from the
current not-really-a-parser.
Timwi
_______________________________________________
WikiEN-l mailing list
WikiEN-l(a)Wikipedia.org
http://mail.wikipedia.org/mailman/listinfo/wikien-l