On 6/6/06, Roger Luethi <collector(a)hellgate.ch> wrote:
You ask some good questions here!
Right. But more importantly, the way categories are used is thoroughly
entrenched and I suspect your chances to change it are close to zero even
assuming you could make it a policy.
It depends whether it can be done in an evolutionary rather than
revolutionary way. Slowly migrating hardcoded "Polish scientist"
categories to some more advanced method should work - once critical
mass is achieved (assuming that people can be convinced that it's a
better method), it could become the dominant method. Suddenly wading
in and reorganising category hierarchies is probably doomed, otoh.
It would be easier to create clean trees in a parallel namespace. Say,
leave [[Category:Bridges]] alone and create [[is a:Bridge]] (or, without
software changes, [[Category:is a Bridge]]).
Or even {{isa|bridge}} ? Can anything useful be done with templates'
"what links here"?
That said, it seems that you are overestimating the importance of one type
of relationships at the expense of others.
I think I agree. Let me try something: say we make a shallow
"taxonomy" tree (or not even tree?) and allow attributes instead to be
hierarchical:
Paris "isa city" +in France ("isa city" is the taxonomy, +in France
is the attribute)
Now, +in France can be a subattribute of +in Europe (and it could have
been made +in Ile de France or whatever)
Britney Spears "isa person" +female +singer +alive
+singer could be a subattribute of +entertainer
Pont du Gard "isa aqueduct" +in France, +Roman-built
"isa aqueduct" can be a subcategory of "isa bridge" and "isa
construction"
This seems to be relatively clean, despite the fact that the attribute
hierarchies have different meanings: "in" as opposed to "is a
specialisation of".
Basically what I'm proposing now is keeping taxonomies quite strict,
and allowing greater flexibility in attributes. So we'll always know
whether an item is a soccer team or a city, but we may lose
information on the finer details if the attributes aren't managed
carefully. Still a better situation than currently not being able to
distinguish between a rock band and a person.
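
To make that concrete, here is a rough Python sketch of the idea: a
strict "isa" kind per article plus free-form attributes, each with its
own parent links. The dictionaries and function names are made up for
illustration (this is not MediaWiki code), and the data is just the
examples above.

```python
# Hypothetical data model: every name here is illustrative only.
KIND_PARENTS = {
    "aqueduct": {"bridge", "construction"},
}
ATTRIBUTE_PARENTS = {
    "in France": {"in Europe"},
    "singer": {"entertainer"},
}
ARTICLES = {
    "Paris": ("city", {"in France"}),
    "Britney Spears": ("person", {"female", "singer", "alive"}),
    "Pont du Gard": ("aqueduct", {"in France", "Roman-built"}),
}


def closure(name, parents):
    """Return name plus every ancestor reachable through parents."""
    seen = set()
    stack = [name]
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(parents.get(current, ()))
    return seen


def matches(title, kind=None, attributes=()):
    """True if the article is of the given kind and has all the attributes,
    counting superkinds and superattributes as matches."""
    article_kind, article_attrs = ARTICLES[title]
    if kind is not None and kind not in closure(article_kind, KIND_PARENTS):
        return False
    expanded = set()
    for attr in article_attrs:
        expanded |= closure(attr, ATTRIBUTE_PARENTS)
    return set(attributes) <= expanded
```

With this, matches("Pont du Gard", kind="bridge", attributes=["in Europe"])
holds even though neither "bridge" nor "in Europe" was entered on the
article itself.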
More examples to try and break things?
Some hierarchies are perfectly natural and useful but are not "is a"
relationships (Europe - France - Paris, Family - Genus - Species).
I don't quite understand your second example. "Rattus rattus" "is a"
"Rattus" is ok, isn't it?
Many attributes are perfectly natural and useful but they tend not to fit
hierarchies well. You started using them as soon as you sketched out the
supposedly taxonomic category women. There simply is no natural taxonomic
hierarchy for women, just a bunch of attributes.
Yeah, I see that. So, we stop the taxonomy at "person", and instead
have hierarchical attributes? Is this actually far from the current
situation? Hmm.
Now _if_ you want to draw a natural hierarchy with women in it, try
genealogy. But guess what? That's another type of relationship that we
can't deal with (X is ancestor of Y, Y is ancestor of Z).
I think overall, having objects in a hierarchy is not the goal in
itself - the goal is organising information, being able to group
related information, and being able to make meaningful statements such
as "43% of our articles are about people".
Agreed. But the long-term goal should be for "Made in US" to be dynamically
generated. It's just a bunch of relationships ("made in", "died in") and a
list of attributes -- hierarchical even, in this case ("New York", "US",
"North America"). You can have all kinds of fun with that, until someone
adds a relationship like "was named after" and your software concludes
that people named after London are also named after Great Britain :-).
Heh, yeah attention has to be paid to what meaning can be extrapolated
from a supercategory/superattribute relationship.
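
One way to pay that attention, sketched in Python: mark each
relationship with whether it is allowed to propagate up the place
hierarchy. The flag table and function names are my own invention for
illustration, not anything the software has today.

```python
# Hypothetical place hierarchy, using the examples from the quoted text.
LOCATED_IN = {
    "New York": "US",
    "US": "North America",
    "London": "Great Britain",
}

# Assumption: each relationship declares whether it may be extrapolated
# to the enclosing places. "was named after" must not be.
PROPAGATES_UPWARD = {
    "made in": True,
    "died in": True,
    "was named after": False,
}


def places_above(place):
    """Return the place followed by every enclosing place."""
    chain = [place]
    while chain[-1] in LOCATED_IN:
        chain.append(LOCATED_IN[chain[-1]])
    return chain


def implied_targets(relation, place):
    """Return every place for which the statement may be inferred."""
    if PROPAGATES_UPWARD[relation]:
        return places_above(place)
    return [place]
```

So "made in New York" yields New York, US and North America, while
"was named after London" stays at London only.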
But you have to deal with them anyway. You suggested something like this:
women
*real women
*-living women
*-dead women
*fictional women
You _are_ using attributes here. So what if I'm looking for the biography
of a female Polish chemist but don't know whether that woman is still
alive? Do I have to check both categories, or do we maintain trees for
every possible order of attributes (which is pretty much what we are
doing right now, manually)?
Yeah, that doesn't work well. Better to use semantic attributes,
possibly with antonym relationships built in (not sure of the
immediate use, but it's probably helpful to distinguish between
living/not living/unknown). So, to look for your female Polish chemist,
you simply look for person (or possibly, chemist), +female +Polish.
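
A small Python sketch of such a lookup, assuming attributes are stored
as plain sets per article. The article titles are placeholders and the
antonym table is a guess at how living/not living/unknown might be
told apart.

```python
# Placeholder articles; the titles are hypothetical.
ARTICLES = {
    "Chemist A": ("person", {"chemist", "female", "Polish", "dead"}),
    "Chemist B": ("person", {"chemist", "female", "Polish"}),  # status never recorded
    "Chemist C": ("person", {"chemist", "male", "Polish", "alive"}),
}

# Assumption: antonym pairs are declared once, in both directions.
ANTONYMS = {"alive": "dead", "dead": "alive"}


def find(kind, required):
    """Titles of the given kind that carry all required attributes."""
    wanted = set(required)
    return sorted(
        title
        for title, (article_kind, attrs) in ARTICLES.items()
        if article_kind == kind and wanted <= attrs
    )


def status(title, attribute):
    """Return 'yes', 'no' or 'unknown' for an attribute with an antonym."""
    attrs = ARTICLES[title][1]
    if attribute in attrs:
        return "yes"
    if ANTONYMS.get(attribute) in attrs:
        return "no"
    return "unknown"
```

find("person", ["female", "Polish", "chemist"]) returns both Chemist A
and Chemist B without the searcher having to know who is still alive.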
Where's {{Category:Polish chemists}} coming from?
Defined on a separate
Defined on a separate page by someone who thought it was a meaningful
and useful category, and worth spending 2 minutes making.
page? And do we also add {{Category:Female chemists}} and
{{Category:Polish physicists}} and {{Category:Polish women}} to the same
article?
You could, and the software (small matter of programming) would be
smart enough to take the union of all these things:
Person +chemist +Polish
Person +female +chemist
Person +physicist +Polish
Person +female +Polish
Net result: Person +chemist +physicist +female +Polish
Alternatively if you knew the attributes directly you could just do
{{Polish chemists}} +female +physicist
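
The "small matter of programming" might look roughly like this in
Python. The definition table stands in for the separate category pages
mentioned above; its layout and the function name are assumptions.

```python
# Hypothetical stand-in for category pages that define a kind plus attributes.
CATEGORY_DEFINITIONS = {
    "Polish chemists": ("person", {"chemist", "Polish"}),
    "Female chemists": ("person", {"female", "chemist"}),
    "Polish physicists": ("person", {"physicist", "Polish"}),
    "Polish women": ("person", {"female", "Polish"}),
}


def resolve(categories, extra_attributes=()):
    """Merge category definitions and direct attributes into one description."""
    kinds = set()
    attributes = set(extra_attributes)
    for name in categories:
        kind, attrs = CATEGORY_DEFINITIONS[name]
        kinds.add(kind)
        attributes |= attrs
    if len(kinds) > 1:
        # The taxonomy is meant to stay strict: one kind per article.
        raise ValueError("conflicting kinds: %s" % sorted(kinds))
    kind = kinds.pop() if kinds else None
    return kind, attributes
```

Both routes end up in the same place: resolving all four categories
gives the same result as resolve(["Polish chemists"], ["female",
"physicist"]).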
Steve