Hi, someone said "show me the code" so I'm diving in. I'm trying to
make parserTests work without the database (without installing
anything). I'm basically just going through the code running it,
fixing an error, then running it again until the next error.
Now, I'm brand new to MediaWiki, but I've done a lot of programming in
lots of languages, including a bit of PHP. It seems to me like there's
an awful lot of global functions, which is a bit worrying. Maybe this
is just the normal way to do things in PHP, but it worries me because
it could mean spaghetti code. Umm... the documentation isn't great,
but I've seen worse (like Mozilla).
But... the biggest thing seems to be that the database kind of
permeates the whole codebase... I can see why parserTests' author
listed separating it from the database as a @todo... it's a bit
tricky. I'm trying to do that now, writing fake classes and stubs in
there. We'll see if it works out. A good long-term project might be
to refactor so that the database is well isolated from the rest of
the code, so that, for example, you could rip out the database and
insert something else, or, as for parserTests, run some parts of the
code without a database at all. That's especially useful for unit
tests :-)
(For example, I tried to include "commandLine.inc" and it wouldn't
run because there's no database. I made a "commandLineSimple.inc" and
ripped out all the database code.)
I've hit a point now where the Parser accesses User functions. With a
command-line test there's not going to be a real user, so I could
either fake one, or I could temporarily modify the parser.
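If I fake one, it would probably look something like this (the method
names below are just my guesses at what the Parser happens to call,
not the real User interface):

    <?php
    # FakeUser: a do-nothing stand-in so the Parser can run without a
    # database-backed User object. The methods here are assumptions;
    # adjust them to whatever the Parser actually calls.
    class FakeUser {
        function getName() { return 'ParserTestUser'; }
        function getId() { return 0; }
        function getOption( $name, $default = '' ) { return $default; }
        function isLoggedIn() { return false; }
    }

    $wgUser = new FakeUser();
    ?>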
It seems like this might be a good place to insert an intermediate
parsing format (something between the article's wikitext and the
final HTML).
--simon
--
http://simonwoodside.com
I noticed while validating the HTML output of an extension that some
URLs are being generated with a plain & instead of being encoded as
&amp;, and the W3C validator complains about this as an error. This
patch fixes the line of code that was the source of the unencoded
ampersand (and another line I noticed), if anyone with CVS access
chooses to apply it.
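(For anyone curious, the general shape of the fix is simply to run the
URL through htmlspecialchars() before it is written into the
attribute; the snippet below is illustrative, not the actual patched
line:)

    <?php
    # Hypothetical example: escape the query URL before emitting it
    # into an href attribute so the page validates.
    $url = '/index.php?title=Foo&action=edit';
    $html = '<a href="' . htmlspecialchars( $url ) . '">edit</a>';
    # htmlspecialchars() turns & into &amp; (and also escapes < > ").
    ?>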
--
http://members.dodo.com.au/~netocrat
Nikola Smolenski wrote:
> On Monday 28 November 2005 03:31, Edward Z. Yang wrote:
>>> Hmmm... I think I'll rename it CoLocus, short for Community
>>> Contingency Locus. What do you think?
>
> CoCoLoco? ;)
Timwi wrote:
> Not wanting to smash every idea here, but "Lokus" in German means loo
> (toilet), and "CoCoLoco" sounds suspiciously like "Kokolores", which
> means nonsense. :-)
No, no, that's absolutely fine. The worst thing that could happen is
that no one replies to this extremely un-wikitech-l conversation and I
end up settling for a subpar name. We probably want a meaningful name,
I think.
Unfortunately, I registered a SourceForge project under colocus, and
it was approved, and now I can't change it. I shouldn't have been so
trigger-happy. So, I'll either have to justify my name, or request
that my project be deleted and register a new one. >_<
Locus means "the scene of any event or action (especially the place of
a meeting)". Co does double duty for community and contingency (the
community was an afterthought).
Is having a good name really that important?
--
Edward Z. Yang Personal: edwardzyang(a)thewritingpot.com
SN:Ambush Commander Website: http://www.thewritingpot.com/
GPGKey:0x869C48DA http://www.thewritingpot.com/gpgpubkey.asc
3FA8 E9A9 7385 B691 A6FC B3CB A933 BE7D 869C 48DA
I was pointed to this section of the HTML 4.0 spec ("Notes on helping
search engines index your Web site") at W3C recently:
http://www.w3.org/TR/REC-html40/appendix/notes.html#h-B.4
I was specifically interested in this section, which I quote in full:
Specify language variants of this document
If you have prepared translations of this document into
other languages, you should use the LINK element to
reference these. This allows an indexing engine to offer
users search results in the user's preferred language,
regardless of how the query was written. For instance,
the following links offer French and German alternatives
to a search engine:
<LINK rel="alternate"
type="text/html"
href="mydoc-fr.html" hreflang="fr"
lang="fr" title="La vie souterraine">
<LINK rel="alternate"
type="text/html"
href="mydoc-de.html" hreflang="de"
lang="de" title="Das Leben im Untergrund">
I think it'd be useful for most multilingual MediaWiki installations
that use interlanguage links to have such hidden <link> elements. <link>
elements aren't rendered in most browsers (Mozilla 1.5+, I think, will
show links in a menu on the toolbar), but as mentioned they do provide
some guidance to bots and spiders.
The downside is that they still take up network bandwidth, and for
oft-interwikied pages on big sites (e.g. Wikipedia) this section could
run to the 5-10kB range. (That's an off-the-cuff estimate, assuming
each <link> is about 100B and an article has 50-100 interwiki links.)
Barring objections, I'm going to add a feature to HEAD to render these
<link> elements, controlled by a variable $wgInterLanguageLinkElement
that defaults to false.
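Roughly what I have in mind (a sketch only; apart from
$wgInterLanguageLinkElement, the function and variable names below are
placeholders rather than the actual MediaWiki internals):

    <?php
    # Sketch: emit one <link rel="alternate"> per interlanguage link.
    # $languageLinks is assumed to be an array of language-code => URL
    # pairs for the current article.
    function buildInterLanguageLinkElements( $languageLinks ) {
        global $wgInterLanguageLinkElement;
        if ( !$wgInterLanguageLinkElement ) {
            return '';
        }
        $out = '';
        foreach ( $languageLinks as $code => $url ) {
            $out .= '<link rel="alternate" type="text/html" hreflang="' .
                htmlspecialchars( $code ) . '" href="' .
                htmlspecialchars( $url ) . "\" />\n";
        }
        return $out;
    }
    ?>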
~Evan
--
Evan Prodromou <evan(a)wikitravel.org>
Wikitravel (http://wikitravel.org/) -- the free, complete, up-to-date
and reliable world-wide travel guide
Hmmm... I think I'll rename it CoLocus, short for Community Contingency
Locus. What do you think?
--
Edward Z. Yang Personal: edwardzyang(a)thewritingpot.com
SN:Ambush Commander Website: http://www.thewritingpot.com/
GPGKey:0x869C48DA http://www.thewritingpot.com/gpgpubkey.asc
3FA8 E9A9 7385 B691 A6FC B3CB A933 BE7D 869C 48DA
I am trying to import the English Wikipedia XML dump from 20051105. I
have MySQL 5, MediaWiki 1.6 (phase 3), and a Mac with OS X 10.3.
I started the import six days ago using importDump.php, and the
process list shows me that php, bzip2, and mysql are still working.
However, when I go into MySQL, the row counts of the tables do not go
up. For example, the user table has one row, and the text table has
approx. 3500 rows. If I run SHOW PROCESSLIST in MySQL, I can see that
queries are running (the admin tool tells me that approx. 200
queries/second are executed), but they seem to have no effect on the
row counts.
My questions: is it normal for the Wikipedia dump to take six days to
import? Is it normal that the row count does not go up?
If not, I would highly appreciate any help. Unfortunately, I cannot
use mwdumper, because OS X 10.3 only supports Java 1.4, and with Java
1.4 I get exception errors.
Thank you!
The Roadsend compiler (a non-Free commercial product) appears to be a
PHP-in-Scheme implementation that then uses the Bigloo Scheme compiler
to generate binary executables. Its authors' benchmarks appear to show a
10% to 150% improvement compared to standard PHP bytecode execution.
Since I'm just about to set to work on a largish Scheme-driven project,
and once wrote a (horrible) compiler using Lisp as a target language
some time ago, I find this quite interesting. This kind of improvement
in PHP execution performance across the Wikipedia cluster, if available
using Free software, would be rather useful.
Does anyone know of any free-as-in-freedom work on anything similar?
I wonder if this would make an interesting MSc project for someone?
-- Neil
Hoi,
We have discussed the subject of single login many times. There are
many scenarios we could follow to get to a solution. There is also the
potential to do some "future proofing". At this moment in time all our
security for users is pretty minimal; it relies on knowing a password
or having a cookie on your system. For read-only access we do not
require any authentication at all. There are several scenarios where
(technically available) additional authentication possibilities would
help us:
* When a range of IP addresses is blocked because of frequent
vandalism, we want to allow access for authenticated editors. These
can be schools or proxies.
* When we host educational content, we want to ensure that only the
student can access his material.
* When we host educational content, we want to give a student's
teacher access to a subset of the data.
* When we collaborate with other web services like Kennisnet, we want
to allow users authenticated by such an organisation to use our
resources as authenticated editors.
The point that I am trying to make is that future proofing makes
sense. When we have the opportunity to do this using proven open
source technology, we should consider it as an option instead of
"rolling our own". A-Select (http://a-select.surfnet.nl/) is a project
run by "Surfnet" and is available under a BSD license. Scalability has
been very much a part of their existing projects, and it is used as
the engine for many big ones; DigiD (http://www.digid.nl/) is a
project that gives people living in the Netherlands access to their
personal information. Strong authentication, like that used by banks
for online transactions, is also supported. The Dutch library system
and Dutch education use it as well.
I will make sure that material about all this becomes available on
Meta. I am starting by posting here because there is a need to discuss
the issues that come up when you introduce the potential for more
authentication to our growing list of services.
Thanks,
GerardM
Hello
I am just reading the handbook and would like to have a link to an
arbitrary anchor within a page. It is not clear to me how to set that
anchor.
The handbook says to use <span id=".." />
so I understood
<span id="test" />
[[#test]] would be the trick.
But it does not seem to work. Can anybody please give me a full
example?
Thanks
Uwe Brauer
Would it be possible to write a bot which simply goes through all the
pages, following all the interwiki links that currently exist and
adding them where they don't exist? This seems like it would be easy
for someone to write, easy to deploy, and very effective.
For example, if an article on EN links to an article on ES, but that one
links to an article on DE, then the bot will go back through and add
es/de to en, en/de to es, and es/en to de, etc.
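Concretely, the bot would just be computing the transitive closure of
the existing links. A rough sketch (the helpers for reading and
writing the links are made up, obviously):

    <?php
    # Sketch of the propagation step: collect every article reachable
    # through the chain of interlanguage links, then link each one to
    # all the others. getLanguageLinks() and addLanguageLink() are
    # hypothetical helpers, not a real API.
    function propagateLinks( $startWiki, $startTitle ) {
        $queue = array( array( $startWiki, $startTitle ) );
        $seen = array();
        # Breadth-first walk of the chain of interlanguage links.
        while ( count( $queue ) ) {
            list( $wiki, $title ) = array_shift( $queue );
            if ( isset( $seen["$wiki:$title"] ) ) {
                continue;
            }
            $seen["$wiki:$title"] = true;
            # Assumed to return an array of (wiki, title) pairs.
            foreach ( getLanguageLinks( $wiki, $title ) as $link ) {
                $queue[] = $link;
            }
        }
        # Every article in the chain gets a link to every other one.
        foreach ( $seen as $from => $unused ) {
            foreach ( $seen as $to => $unused2 ) {
                if ( $from !== $to ) {
                    addLanguageLink( $from, $to );
                }
            }
        }
    }
    ?>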
This would eliminate a lot of the problems encountered by the Interwiki
Link Checker, for example, which can't deal with comparing English and
Chinese (provided that at least one page somewhere in the chain links to
the Chinese version).
Of course, it isn't a complete solution, but it seems like it would be
a very effective one. For any chains where there are discrepancies
(en -> de -> a different en), it could simply skip that chain.
brian0918