On Sat, Jan 16, 2010 at 10:07 PM, Jesse (Pathoschild)
<pathoschild(a)gmail.com> wrote:
> Unfortunately, categories and database queries are inadequate for our
> needs. Someone can indeed navigate to Categories::Works::Works by
> genre::Non-fiction::Governmental::Biographies::Ancient biographies,
> and they'll find all 5 pages that someone thought to categorize to
> this depth. But if someone hopes to find our 1872 American
> biographies, they are going to be sorely disappointed.
You can do this with database queries fine -- there are already
several different toolserver tools that will do category intersections
for you, and a couple extensions. In fact, bog-standard search will
do it for you, although AFAIK only for categories added literally (not
by templates):
http://en.wikipedia.org/w/index.php?title=Special:Search&redirs=1&s…
It wouldn't be that hard to allow template-added categories too. I
assume you have categories like "books published in America", "books
published in 1872", and "biographies" -- if not, you can easily add
them via your templates (although that wouldn't work right now with
standard search AFAIK, it would work with things like CatScan).
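To make the category-intersection idea concrete, here is a minimal sketch in Python. It assumes the category memberships have already been loaded into memory; the `MEMBERS` index, the page titles in it, and the `intersect_categories` helper are all made up for illustration and are not how CatScan or any existing tool is actually implemented.

```python
# Hypothetical index: category name -> set of page titles in that category.
# A real tool would fill this from the wiki's category data instead.
MEMBERS = {
    "Books published in America": {"Example Life A", "Example Life B", "Example Novel"},
    "Books published in 1872": {"Example Life A", "Example Novel", "Example Atlas"},
    "Biographies": {"Example Life A", "Example Life B"},
}


def intersect_categories(members, categories):
    """Return the sorted titles that belong to every listed category."""
    if not categories:
        return []
    # An unknown category contributes an empty set, so the result is empty.
    sets = [members.get(name, set()) for name in categories]
    return sorted(set.intersection(*sets))


wanted = ["Books published in America", "Books published in 1872", "Biographies"]
print(intersect_categories(MEMBERS, wanted))
```

The point is only that "1872 American biographies" needs no deep category tree: three flat categories and a set intersection are enough.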
> If we simply extend MediaWiki to support metadata for works or
> authors, the metadata is limited to these types and fields. Public
> metadata can be extended and parsed in any way the local community or
> our content users feel useful.
Sure, but this is not internal use, so not relevant to my last post.
> This is also not possible with database queries, since the metadata is
> not provided to the software except as part of the wiki text.
It is if you use categories. It would also be possible to hack up
some tool to store all template parameter-value pairs, which are
strikingly similar to the idea of RDFa triples: (article,
template+parameter name, parameter value).
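As a rough illustration of that triple idea, the sketch below pulls (article, template+parameter name, parameter value) tuples out of wikitext. It is an assumption-laden toy, not an existing MediaWiki feature: the `template_triples` function and the `header` template in the example are hypothetical, and the regex only handles flat templates with no nested templates and no pipes inside links.

```python
import re

# Matches a flat {{...}} call; nested templates are deliberately not handled.
TEMPLATE_RE = re.compile(r"\{\{([^{}]*)\}\}")


def template_triples(article, wikitext):
    """Return (article, "template.parameter", value) tuples for flat templates."""
    triples = []
    for match in TEMPLATE_RE.finditer(wikitext):
        parts = [part.strip() for part in match.group(1).split("|")]
        name = parts[0]
        position = 0
        for part in parts[1:]:
            if "=" in part:
                # Named parameter: split on the first "=" only.
                key, value = part.split("=", 1)
                key, value = key.strip(), value.strip()
            else:
                # Unnamed parameter: number it by position, starting at 1.
                position += 1
                key, value = str(position), part
            triples.append((article, f"{name}.{key}", value))
    return triples


text = "{{header|title=Example Life|year=1872|country=United States}}"
for triple in template_triples("Example Life", text):
    print(triple)
```

Stored in a table, tuples like these could be queried directly (for instance, every article whose `header.year` is 1872) without going through categories at all.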
> There is very little difference between internal and external use;
> it's no easier for a Wikisource editor to find those 1872 American
> biographies. Editors are also users.
By "internal use" I mean "use by software designed only to work with
MediaWiki", not "use by Wikimedia users". Standards are only needed
if we want to be useful to software that's also meant to work with
other sites. That way, the software can use the same code to process
both our site and the other sites, since they all output the same standard
markup. If the software is only processing MediaWiki sites to begin
with, then standard markup is useless. (Unless it happens to expose
convenient libraries, like with XML or such -- but that's probably not
the case here.)
> So, these metadata formats are definitely *not* useless for internal
> community use.
No, they really are. It's almost certainly more work for us to use a
standard of any kind than to make up our own internal format, so if we
only care about internal use, bothering with standards is
counterproductive. The real use-cases are for external users only.