On Fri, Aug 15, 2008 at 6:18 PM, Lars Aronsson <lars(a)aronsson.se> wrote:
> Is any similar innovation going on in disambiguation pages and
> list articles?
For a few months now I have been thinking about writing an email with
my thoughts about disambiguation, so I guess that now is the time.
I've been doing a lot of work improving interwiki links, mostly
manually. There's a grave problem with adding interwiki links to
disambig pages - very often a word that may be a homonym in one language
is not a homonym in another or has a completely different set of
meanings. Examples of different kinds are abundant:
Simple cases would be:
* [[Grossmann]] vs. [[Grossman]] - In English these are separate
disambig pages, but in Hebrew they would be one.
* [[Kirov]] - In English this would be the Soviet politician and a
bunch of things named after him, but in Hebrew (קירוב) this would also
be "approximation".
* Due to the peculiarities of Hebrew spelling, דאון (pron. "daon") can
be interwiki-linked to the various meanings of "down" and "daun" and
also to [[Glider]] and [[Flying fish]].
Harder examples:
* In general, any such link between languages that are written in
different character sets is a wild approximation. One extreme example,
which is beyond my comprehension, is the Japanese [[バーンスタイン]] and
[[ベルンシュタイン]], which seem to be different spellings of "Bernstein", but
I can only guess.
* All the variations of John, Johan, Juan, Ivan etc. - how to
interwiki-link them? Different spellings are one problem, and cultural
implications make it harder (think about saints, fairy tale heroes,
Catalan Joan vs. Spanish Juan, etc.)
* In Russian, Коса (pron. "kosa") is a disambiguation between [[Queue
(hairstyle)]], [[Scythe]], [[Spit (landform)]], [[Xhosa]], [[Braid
theory]] and a few other things. Should it be linked in any way to the
English "kosa" or "cosa"? Probably not, as it would be completely
arbitrary. Should it be linked to the Ukrainian Коса? The spelling of
Ukrainian is reasonably close to that of Russian and so are its word
meanings and disambiguations ... but where does it stop?
The interim solution that was more or less agreed upon in the Hebrew
Wikipedia is to mark disambig pages which are too specific to the
Hebrew language with an invisible template that would tell the bots
not to add interwiki links to them. The technical details of the
implementation of this solution are still in flux, but you can see a
preliminary list of such pages here:
http://tinyurl.com/6h2wzb
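
As a rough illustration of how a bot might honor such a marker, here is
a minimal Python sketch. The template name "NoInterwiki" is made up for
the example (the actual template on the Hebrew Wikipedia may be named
and implemented differently), and the check is a plain regular
expression over the page's wikitext rather than a real wikitext parser.

```python
import re

# Hypothetical marker template name; the real one may differ.
MARKER_TEMPLATE = "NoInterwiki"

# Matches {{NoInterwiki}} or {{NoInterwiki|...}}, ignoring case and
# whitespace around the template name.
_MARKER_RE = re.compile(
    r"\{\{\s*" + re.escape(MARKER_TEMPLATE) + r"\s*(\|[^{}]*)?\}\}",
    re.IGNORECASE,
)


def bot_may_add_interwiki(wikitext: str) -> bool:
    """Return False if the page carries the opt-out marker template."""
    return _MARKER_RE.search(wikitext) is None
```

A bot would call bot_may_add_interwiki() on the page text before
editing and skip the page when it returns False.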
Personally I would go further. Since most often disambiguation has
little encyclopedic meaning and is essentially a feature of each
language, I would put all disambig pages into a new separate namespace
and prevent the adding of interwiki links to them.
The only disadvantage that I can think of is that there are a lot of
links from the article space to disambig pages. This can be solved by
making the "Disambiguation:" space the second option for searching; in
pseudo-code it would be something like:
if (exists(article) or exists("Disambiguation:" + article)) {
    // the article itself or its disambiguation page exists
    output(blue_link(article));
}
else {
    // neither exists
    output(red_link(article));
}
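
The same fallback can be written as a small runnable Python sketch.
Everything here is an assumption for illustration: the
"Disambiguation:" namespace is only the proposal made in this email, and
the set of existing page titles stands in for whatever lookup the real
software would use. Unlike the pseudo-code, the function also reports
which kind of page was found, so that the caller could render links to
disambig pages differently.

```python
from __future__ import annotations

# Proposed namespace prefix; it does not exist today.
DISAMBIG_PREFIX = "Disambiguation:"


def resolve_link(title: str, existing_pages: set[str]) -> tuple[str, str]:
    """Return (target, kind); kind is 'article', 'disambig' or 'missing'."""
    # First choice: a regular article with exactly this title.
    if title in existing_pages:
        return title, "article"
    # Second choice: a page with the same title in the disambig namespace.
    fallback = DISAMBIG_PREFIX + title
    if fallback in existing_pages:
        return fallback, "disambig"
    # Neither exists, so the link would be shown as a red link.
    return title, "missing"
```

For example, with the pages {"Cancer", "Disambiguation:Kirov"}, the
title "Kirov" resolves to the disambig page, "Cancer" to the article,
and "Kosa" is reported as missing.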
There are several other advantages:
* It will make the work of the scripts that prepare the lists for
[[WP:DPL]] much easier. (There are similar projects in several other
Wikipedias.)
* A link to a disambig page can be made in a different color, and thus
help the editors to fix it.
* It will clearly separate purely technical and homonymic
disambiguations from those that have some encyclopedic meaning. The
latter can go to the article space. ([[Cancer]] is a possible
example.)
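
To show why a separate namespace would simplify the list-building
scripts, here is a minimal Python sketch. It assumes the proposed
"Disambiguation:" prefix and a hypothetical input format (a dict from
page title to the titles it links to); the real [[WP:DPL]] scripts
surely work differently. Skipping the disambig pages themselves as link
sources is also just an assumption of this sketch.

```python
from __future__ import annotations

# Proposed namespace prefix; it does not exist today.
DISAMBIG_PREFIX = "Disambiguation:"


def links_to_disambig(outgoing: dict[str, list[str]]) -> list[tuple[str, str]]:
    """List (source article, disambig target) pairs for editors to fix."""
    pairs = []
    for source, targets in sorted(outgoing.items()):
        if source.startswith(DISAMBIG_PREFIX):
            continue  # assumption: disambig pages are not reported as sources
        for target in targets:
            # With a dedicated namespace, a prefix check is all that is needed.
            if target.startswith(DISAMBIG_PREFIX):
                pairs.append((source, target))
    return pairs
```

The same prefix check could decide which links get the different
color.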
Any other thoughts are welcome.
--
Amir Elisha Aharoni
heb: http://haharoni.wordpress.com | eng: http://aharoni.wordpress.com
cat: http://aprenent.wordpress.com | rus: http://amire80.livejournal.com
"We're living in pieces,
I want to live in peace." - T. Moore