On 6/6/06, Roger Luethi <collector(a)hellgate.ch> wrote:
On Tue, 06 Jun 2006 07:54:27 -0400, Anthony DiPierro wrote:
That brings up another, longer-term to-do for categories: they should
be language independent. For instance [[Marie Curie]] is in de: and
en: (they happen to have the same title, but even if they don't they
are linked via interwiki links). [[Kategorie:Pole]] is linked to
[[Category:Polish people]]. So there should be no need to categorize
Marie Curie twice (multiply by the actual number of languages which
have a Polish people category and an article on Marie Curie).
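
A minimal sketch of the idea in Python, assuming membership only needs to
be recorded once and can be derived for other languages by following the
existing interwiki links. The dictionaries and the propagate() helper are
hypothetical toy structures for illustration, not MediaWiki's actual
schema or API.

```python
# Toy data (hypothetical): pages are (language, title) pairs.
article_links = {
    ("en", "Marie Curie"): {("de", "Marie Curie")},
}
category_links = {
    ("en", "Category:Polish people"): {("de", "Kategorie:Pole")},
}
# Membership recorded once, on en: only.
memberships = {
    ("en", "Marie Curie"): {("en", "Category:Polish people")},
}


def propagate(memberships, article_links, category_links):
    """Derive category membership in other languages via interwiki links."""
    derived = {}
    for article, categories in memberships.items():
        for other_article in article_links.get(article, ()):
            other_lang = other_article[0]
            for category in categories:
                for other_category in category_links.get(category, ()):
                    # Only pair an article with a category in its own language.
                    if other_category[0] == other_lang:
                        derived.setdefault(other_article, set()).add(other_category)
    return derived


derived = propagate(memberships, article_links, category_links)
print(derived)
```

This only works where the linked categories really define the same set,
which is exactly the open question.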
http://meta.wikimedia.org/wiki/Wikidata
From what I know of that project, it's a lot more extravagant than what
I'm talking about. Infoboxes are a lot more complicated than
categories. Categories are just sets of articles, and the interwiki
links are already there. Infoboxes have multiple types of fields,
with various constraints on each of them, and various translation
issues, most of which nobody has even begun to tackle.
This is pretty simple theoretically. The only real problem is getting
the multiple category schemes in sync. Considering your point about
how the German categorization scheme differs from the English one,
this might be a lot harder in practice than it is in theory.
Right. You _could_ make it work on some categories, say "Women" or
"Nobel Prize winners".
It cannot possibly work for all categories, though, because different
languages know different categories. For instance, security and safety is
the same word in German: Sicherheit. Thus, [[de:Lifebelt]] is in the German
category that is linked to the English category Security.
I'd say then that either the German [[Kategorie:Sicherheit]] should be
disambiguated into two different categories, or that
[[Kategorie:Sicherheit]] shouldn't be linked to [[Category:Security]],
because they don't define the same set. Maybe an English
[[Category:Security and Safety]] could be made, with "Security" and
"Safety" as subcats - then [[Kategorie:Sicherheit]] could link to
[[Category:Security and Safety]].
But maybe this is a common enough thing that that's not going to be
reasonable. At some point someone should look at how the en
categories differ from the de ones. I figure there will be 5 major
points of difference:
1) Things being categorized at different levels ([[Category:Polish
women]] vs. [[Kategorie:Frau]]).
2) Interwiki links between articles on different things.
3) Interwiki links between categories defining different sets.
4) Articles categorized where they shouldn't be.
5) Articles missing from categories where they should be.
1) is the reason why I call this a "longer-term solution". 2 and/or 3
are what you describe above. 4 and 5 are the reason why it would be
useful to coordinate things.
It'd be interesting to get a decent size sample and sort the
differences into those 5 categories. If 2 and/or 3 were significant,
then I suppose this idea fails, at least initially. If 1 is
significant, and I think it might be, then the idea rests upon
reaching a consensus among the different Wikipedias as to what level
to put things. And that's probably going to be dependent on having
on-the-fly intersection categories.
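
As a rough sketch of how such a sample could be collected, assuming the
member lists of two linked categories and the article interwiki links are
available as plain sets and a dict: the function below, written in Python,
only finds the differences. The names and the sample titles are invented
for illustration, and deciding which of the 5 points explains each
difference would still need a human.

```python
def compare_categories(en_members, de_members, en_to_de):
    """Compare two linked categories after mapping en: titles to de: titles."""
    mapped = {en_to_de[title] for title in en_members if title in en_to_de}
    return {
        "both": mapped & de_members,
        "only_en": mapped - de_members,   # candidates for points 1, 3, 4, 5
        "only_de": de_members - mapped,   # same, seen from the other side
        "no_interwiki": {t for t in en_members if t not in en_to_de},
    }


# Invented toy sample, not real category contents.
en_members = {"Marie Curie", "Frédéric Chopin", "Some unlinked article"}
de_members = {"Marie Curie", "Nikolaus Kopernikus"}
en_to_de = {
    "Marie Curie": "Marie Curie",
    "Frédéric Chopin": "Frédéric Chopin",
}

report = compare_categories(en_members, de_members, en_to_de)
for kind, titles in report.items():
    print(kind, sorted(titles))
```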
Numbers are much easier to share. Population of a country. Weight of a
molecule.
Personally I don't see the difference between taxonomies and
attributes, as described. But I suppose one (taxonomies?) could be
described as partitioning (an article can only be in one taxonomy
category) whereas attributes can be mixed. Under that definition
I suspect this is impossible. But it's hard to tell if you're not even sure
what counts as a taxonomy.
Roger
Yeah, I don't know. I'm not the one who made the initial distinction
between taxonomies and attributes. I was just trying to make sense of
it, rather poorly I guess. :)
Anthony