[WikiEN-l] Types of categories

Steve Bennett stevagewp at gmail.com
Wed Jun 7 11:56:35 UTC 2006


On 6/7/06, Roger Luethi <collector at hellgate.ch> wrote:
> Because it does. Where it's different is that the category names make it
> much clearer what does and what does not belong into the category, the
> catch-all "thematic" categories are eliminated (or made explicit by saying
> "is related to").

Maybe the way to go is have only attributes, but then make certain
attributes themselves belong to thematic categories. Something like:

The Outsider
+Novel
+By Albert Camus
+Existentialist

Then make a thematic category "Existentialism" into which the
constructed (that is, automatically built by an intersection) category
"Existentialist novels" is grouped.

Not a good example come to think of it.
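
Still, the mechanics can be sketched. The Python below is only an
illustration under assumptions: the article data, the
ATTRIBUTE_THEMES mapping and both function names are hypothetical,
not anything MediaWiki provides. It shows a constructed category
being built as an intersection of attributes, and then grouped under
a thematic category because one of its attributes belongs there.

```python
# Hypothetical data: article -> set of attributes (illustrative only).
ARTICLE_ATTRIBUTES = {
    "The Outsider": {"Novel", "By Albert Camus", "Existentialist"},
    "Nausea": {"Novel", "By Jean-Paul Sartre", "Existentialist"},
    "The Myth of Sisyphus": {"Essay", "By Albert Camus", "Existentialist"},
}

# Hypothetical mapping: some attributes belong to a thematic category.
ATTRIBUTE_THEMES = {"Existentialist": "Existentialism"}


def intersection(*attributes):
    """Articles carrying every one of the given attributes."""
    wanted = set(attributes)
    return sorted(
        article
        for article, attrs in ARTICLE_ATTRIBUTES.items()
        if wanted <= attrs
    )


def themes_for(*attributes):
    """Thematic categories that a constructed category is grouped into."""
    return sorted(
        {ATTRIBUTE_THEMES[a] for a in attributes if a in ATTRIBUTE_THEMES}
    )


existentialist_novels = intersection("Existentialist", "Novel")
grouped_under = themes_for("Existentialist", "Novel")
```

Here "Existentialist novels" is never stored anywhere; it exists only
as the result of the intersection.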

Should we try to work out what the basic point of thematic
categories is? The problem is that the articles in them are related in
such different ways that it would seem much better to have a page
that actually explains the links and fits them together in a way that
makes sense: existentialist authors, existentialist works, history of
existentialism etc. Even a structured way of displaying items in
subcategories on the page itself would be handy.

Just to continue this example, look at [[Category:Existentialism]].
There are subcategorical conceptual articles ("Existential desire"),
related conceptual articles (Absurdism), works (Exile and the
Kingdom), and miscellaneous (Kierkegaard and Nietzsche comparisons).
Then there's the subcategory "Existentialists" where we find Sartre and
Camus. And even better, a perfect example of a thematic subcategory to
a taxonomic category to a thematic category: "Søren Kierkegaard",
which mostly contains his works, but also mentions a research centre
and the man himself.

From a navigational point of view it would be so much more helpful
having a single page that quickly defined the topic, linked to the key
articles about the topic, then presented sections "Existentialist
authors" that transcluded the relevant category, "Existentialist
works" that did the same, and finally ended with "related categories".
If we had that, then Wikipedia would be really starting to get
somewhere as a structured encyclopaedia, as opposed to simply a
massive number of articles on various topics.

> > Good, good - why is that cat hard to maintain manually?
>
> Because for every place someone was born in, you have to recreate the
> hierarchy of geography, which is large. An automatic system could infer
> from existing information that [[Category:born in McComb, Mississippi]]
> also means, say, [[Category:born in Mississippi]], or if you want to be
> really fancy, that being born in Prague implies a different country
> depending on the year of birth.

Hmm, well apart from the Prague bit, it would certainly be cool to be
able to hijack a geographical structure that had been built once to
create other attributes/categories. You would need something fancy
like categorical operators though, which is definitely getting more
advanced and complicated.

Category:Bridges in France would become: Category: (is a bridge) (in)
(place:France). Since (place:Lyon) would be defined once and for all
as being a subcategory (subplace?) of (place:France), any bridge in
Lyon would automatically be a bridge in France.
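
A minimal sketch of that inference, assuming a place hierarchy stored
as a simple child-to-parent table. PARENT_PLACE, TAGS, is_within and
members are hypothetical names made up for the illustration:

```python
# Hypothetical place hierarchy, defined once: child place -> parent place.
PARENT_PLACE = {
    "Lyon": "France",
    "Paris": "France",
    "London": "United Kingdom",
}

# Hypothetical articles tagged as (kind, operator, place).
TAGS = {
    "Pont Lafayette": ("bridge", "in", "Lyon"),
    "Pont Neuf": ("bridge", "in", "Paris"),
    "Tower Bridge": ("bridge", "in", "London"),
}


def is_within(place, region):
    """True if place is region itself or lies somewhere inside it."""
    while place is not None:
        if place == region:
            return True
        place = PARENT_PLACE.get(place)  # step up one level, None at the top
    return False


def members(kind, operator, region):
    """Evaluate a category such as (is a bridge) (in) (place:France)."""
    return sorted(
        article
        for article, (k, op, place) in TAGS.items()
        if k == kind and op == operator and is_within(place, region)
    )


bridges_in_france = members("bridge", "in", "France")
```

The Lyon bridge is only ever tagged with Lyon; its membership of
"Bridges in France" falls out of the hierarchy.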

To solve your example, a different "operator" (born in) would be used,
combined with a place. The end result is to replace a number of
parallel hierarchies with a single one combined with several different
operators:

in (place)
born in (place)
died in (place)
famous in (place) ?
buried in (place) ?
banned in (place) ?

One obvious advantage is that people could immediately start assigning
articles to categories like "Born in New York" or even "Born in
Queens", without losing them from the broader categories "American
musicians" etc.
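
To illustrate, here is a rough sketch in which one shared place
hierarchy serves several operators. The hierarchy contents and the
implied_categories helper are assumptions for the example only:

```python
# Hypothetical shared hierarchy: child place -> parent place.
PARENT_PLACE = {
    "Queens": "New York City",
    "New York City": "New York",
    "New York": "United States",
}


def implied_categories(operator, place):
    """List the category for place and for every place that contains it."""
    categories = []
    while place is not None:
        categories.append(f"{operator} {place}")
        place = PARENT_PLACE.get(place)  # None once we pass the top level
    return categories


# The same hierarchy is reused unchanged by different operators.
born = implied_categories("Born in", "Queens")
buried = implied_categories("Buried in", "New York City")
```

An editor only assigns the most specific place; the broader
categories are derived rather than maintained by hand.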


> > > American child actors: ...? "is a child actor" or "was a child actor?"
> >
> > Yes. The fact that these are hard to express clearly is very telling.
>
> What does it tell you?

That the categories are badly designed and ambiguous.

> > Erm, I mean, people will probably end up being "casual" with
> > attributes...but if we could make the taxonomic classifications a bit
> > more firm...not sure what I'm getting at (it's late).
>
> The easiest way to make them more firm is by making them self-explanatory.
> Otherwise it's not unreasonable for people to assume that a category is
> thematic.

Yep. If the category doesn't tell you exactly what it's about, it must
be "random stuff with some tangential connection to X" :)

> > Like [[Category:Lasers]]? ;)
>
> Exactly. If it was called [[Category:is a Laser]], there would be less
> confusion.

Actually [[Category:Types of lasers]] would be better - "blue laser"
is not really "a laser". Very few individual lasers would deserve
entries.

Hard to see how "laser" would be a good name for a thematic category
though. But where else would laser eye surgery or light sabers go?

> Of course. The question is not if, but when. Evaluating all relations and
> attributes on-the-fly may be way out there, but you could use it for
> offline-processing today, and that's what _you_ and many other people seem
> to be interested in.

Depends what "on the fly" means. "Each time an attribute is updated"
would be one thing. "Each time a user requests it" would be another.
The former would fit the MediaWiki model a lot better.
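
A small sketch of the "each time an attribute is updated" variant,
assuming an in-memory index. AttributeIndex and its methods are
invented for the illustration and do not correspond to any MediaWiki
API. The per-attribute membership lists are rebuilt when an article
is saved, so a reader's request only combines ready-made sets.

```python
from collections import defaultdict


class AttributeIndex:
    """Keeps attribute -> articles up to date whenever an article is saved."""

    def __init__(self):
        self._attrs = {}                  # article -> frozenset of attributes
        self._members = defaultdict(set)  # attribute -> set of articles

    def update(self, article, attributes):
        """Called on save: drop the old entries, then add the new ones."""
        for old in self._attrs.get(article, ()):
            self._members[old].discard(article)
        self._attrs[article] = frozenset(attributes)
        for attr in self._attrs[article]:
            self._members[attr].add(article)

    def members(self, *attributes):
        """Called on read: intersect the precomputed membership sets."""
        sets = [self._members.get(a, set()) for a in attributes]
        return set.intersection(*sets) if sets else set()


index = AttributeIndex()
index.update("Bridge A", {"Bridges", "in France"})
index.update("Bridge B", {"Bridges", "in Italy"})
index.update("Bridge B", {"Bridges", "in France"})  # a later edit
french_bridges = index.members("Bridges", "in France")
```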

> Semantic Wiki does have challenges:
>
> * The use needs to extend beyond nice statistics. Editors must see an
> immediate benefit. There are major concerns about hidden metadata that is
> invisible in the Wikipedia proper (unless you check the source, that is).

Clearer groupings, better portals, automatic navigational boxes, more
powerful categories (like "German-born Polish scientists buried in
France (2 articles)")

> * It must try to prevent an ontology mess that we have with categories.
> It can't ever be the same mess by virtue of its very own nature, but you
> can for instance create confusion with "Relation:Is located in",
> "Relation:Located in", and "Relation:Has location".

Redirects solve so many of these problems.
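
For instance, a sketch assuming relation names are resolved through a
redirect table before use. The table and the canonical_relation
function are hypothetical, and treating "Has location" as a plain
synonym is an assumption of the example:

```python
# Hypothetical redirect table: variant relation name -> target name.
RELATION_REDIRECTS = {
    "Located in": "Is located in",
    "Has location": "Is located in",
}


def canonical_relation(name):
    """Follow redirects until a name with no redirect is reached."""
    seen = set()
    while name in RELATION_REDIRECTS:
        if name in seen:  # guard against accidental redirect loops
            raise ValueError(f"redirect loop at {name!r}")
        seen.add(name)
        name = RELATION_REDIRECTS[name]
    return name
```

Whichever variant an editor types, the stored relation is the same
one.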

> > Categories: Bridges, in France, built by Romans
> > See other: Bridges in France (200 articles), Bridges built by Romans
> > (137 articles), Bridges in France built by Romans (15 articles).
>
> It does sound cool, but if you have a dozen categories on an article rather
> than three, you have more intersection categories than you want to put at
> the end of the article.

Some way for editors to guide the system would be valuable. Some
way for power users to ignore the editors' recommendations and see
all the possible intersections (ok, within reason) would be nice.
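
Roughly, and purely as a sketch: the data, the see_other function and
its max_size and recommended parameters are all assumptions made for
this example. Editors supply a set of recommended attributes; passing
None shows every intersection up to a size limit.

```python
from itertools import combinations

# Hypothetical data: article -> set of attributes.
ARTICLE_ATTRIBUTES = {
    "Bridge A": {"Bridges", "in France", "built by Romans"},
    "Bridge B": {"Bridges", "in France"},
    "Bridge C": {"Bridges", "built by Romans"},
}


def count_articles(attributes):
    """Number of articles carrying every attribute in the given set."""
    return sum(1 for attrs in ARTICLE_ATTRIBUTES.values() if attributes <= attrs)


def see_other(article, max_size=3, recommended=None):
    """List (attribute combination, article count) pairs for one article.

    recommended: attributes editors flagged as worth intersecting;
    None means show every combination (the power-user view).
    """
    attrs = sorted(ARTICLE_ATTRIBUTES[article])
    results = []
    for size in range(2, max_size + 1):  # "within reason": cap the size
        for combo in combinations(attrs, size):
            if recommended is not None and not set(combo) <= recommended:
                continue
            results.append((combo, count_articles(set(combo))))
    return results


everything = see_other("Bridge A")
guided = see_other("Bridge A", recommended={"Bridges", "in France"})
```

The number of combinations grows quickly with the number of
attributes, which is why the sketch caps the combination size.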

> > Is that feasible?
>
> Not with the current mess, but other than that, I can't see why not.

We seem to have a couple of different proposals bubbling away, each
requiring different amounts of work to implement. I'll see if I can
summarise somewhere and synthesise something palatable, to really work
out feasibility.

> (who would have guessed that the German WP has an article about an American
> girl (born 1996) who played in a bunch of TV shows, including Startrek,
> while the English WP has no article? I am shocked, shocked I tell you!)

And the English Paris article is better than the French one, and the
German New York article is better than the English one.

Steve


