On Thu, Jul 25, 2002 at 12:17:26PM +0200, Lars Aronsson wrote:
On Thu, 25 Jul 2002 lcrocker(a)nupedia.com wrote:
This made
me think: Would it make sense to make a formal BNF
grammar for the Wikipedia text format, so a LALR(1) parser could
be made for it? Would that make any sense at all with PHP, or
just be too hard to code and inflexible?
I'd love to have a formal grammar of some kind (I think regexps
would be fine), and I agree with Jan that a totally wiki-specific
syntax would be far better than our current mish-mash of HTML and
wiki markup. But I'm not sure if it's not already too late to
revisit those decisions.
But if it isn't, I'll be happy to discuss what a syntax might
look like.
Wiki is still a new concept. Think how HTML was based on SGML, then
evolved into HTML 2, 3, 4, 5, and then XML came along, because people
understood from the HTML experience that SGML was overly complex.
There is a big world of PhpWikis out there with [single bracket] link
syntax. There are other wiki implementations with different ideas about
syntax. But no wiki is as big as Wikipedia, so this is the most
concentrated amount of experience. This is where a format standard should
or at least could start to form.
I tried to write a formal grammar for Wikipedia markup (LALR, regexps,
or whatever), and I can tell you that it's next to impossible if almost
arbitrary HTML markup is allowed.
HTML table syntax is especially difficult to parse,
so maybe we should make our own?
Without HTML tables I think that we could limit which HTML is
allowed and define a sane formal syntax.
It's not easy to design simple table markup that:
* allows multicolumn and multirow cells
* allows cell attributes
* can nest tables
* allows all constructs that HTML allows inside cells, e.g. multiple
paragraphs, lists etc.
* is readable
* is easy to write
So I suggest that you check
http://sf.net/projects/freetable
I made this a while ago to allow simpler HTML tables.
It seems to be working and is used by WebMake and WebsiteMetaLanguage.
Syntax looks something like this:
<wwwtable border=1>
(1,1)
column 1, row 1
(+,)
the same column, next row
(*,2)
column 2 in any row
(*,3) align=center
columns 3 should be centered
(1,3)
Some centered text
(3,3)
Other centered text
</wwwtable>
Which is converted to:
<table border=1>
<tr>
<td>column 1, row 1</td>
<td>column 2 in any row</td>
<td align=center>columns 3 should be centered Some centered text</td>
</tr>
<tr>
<td>the same column, next row</td>
<td>column 2 in any row</td>
<td align=center>columns 3 should be centered</td>
</tr>
<tr>
<td> </td>
<td>column 2 in any row</td>
<td align=center>columns 3 should be centered Other centered text</td>
</tr>
</table>
I'd say it's much better than what Wikipedia currently uses.
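
For anyone who wants to experiment with the idea, here is a rough Python
sketch of the addressing rules that the example above seems to imply. It
is not freetable's actual code. The function name convert_wwwtable is
made up for illustration, and the rules are assumptions read off the
example: addresses are (row,column), "+" means previous value plus one,
an empty field means the previous value, "*" matches every row or column,
and matching entries are merged in source order. Nesting, spans and
anything else freetable may support are left out.

```python
import re

# Address line such as "(1,1)", "(+,)" or "(*,3) align=center".
ADDRESS = re.compile(r"^\((\d+|\+|\*|),(\d+|\+|\*|)\)\s*(.*)$")


def resolve(token, previous):
    """Resolve one coordinate: a number, '+', '' (same as before) or '*'."""
    if token == "*":
        return "*"
    if token == "+":
        return previous + 1
    if token == "":
        return previous
    return int(token)


def convert_wwwtable(body, table_attrs=""):
    """Convert the lines between <wwwtable> and </wwwtable> to HTML.

    Hypothetical helper: only the subset of the syntax shown in the
    example is handled.
    """
    entries = []      # (row, col, attrs, text_lines) in source order
    row, col = 1, 1   # last numeric position, used by '+' and ''
    for line in body.strip().splitlines():
        line = line.strip()
        match = ADDRESS.match(line)
        if match:
            r = resolve(match.group(1), row)
            c = resolve(match.group(2), col)
            if r != "*":
                row = r
            if c != "*":
                col = c
            entries.append((r, c, match.group(3), []))
        elif entries and line:
            # Any other line is content for the most recent address.
            entries[-1][3].append(line)

    n_rows = max((e[0] for e in entries if e[0] != "*"), default=0)
    n_cols = max((e[1] for e in entries if e[1] != "*"), default=0)

    out = ["<table" + (" " + table_attrs if table_attrs else "") + ">"]
    for r in range(1, n_rows + 1):
        out.append("<tr>")
        for c in range(1, n_cols + 1):
            attrs, texts = [], []
            for er, ec, ea, et in entries:
                if er in ("*", r) and ec in ("*", c):
                    if ea:
                        attrs.append(ea)
                    texts.extend(et)
            attr_str = "".join(" " + a for a in attrs)
            # Empty cells get a single space, as in the output above.
            out.append("<td" + attr_str + ">" + (" ".join(texts) or " ") + "</td>")
        out.append("</tr>")
    out.append("</table>")
    return "\n".join(out)
```

Calling convert_wwwtable on the twelve lines between the wwwtable tags in
the example, with table_attrs set to "border=1", should give the same
three-row table as shown above.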