On 12/29/05, Ben Emmel <bratsche1(a)gmail.com> wrote:
On 12/29/05, Bryan Derksen <bryan.derksen(a)shaw.ca> wrote:
But then there'd have to be a new separate database dump that _did_
include the user/talk pages. The purpose of the database dumps is not
just to allow someone to toss up a mirror of the current article
versions and make a few bucks from banner ads, it's to allow Wikipedia
as a whole to be researched or recreated or otherwise manipulated in
ways that can't be done just from the existing website. The user and
talk pages are important parts of how Wikipedia functions, they should
be available for historical reasons if nothing else.
I think that is very true: someone who wished to study the project
(community and encyclopedia) as a whole would need those user, talk, and
Wikipedia namespaces. However, having two separate dumps would defeat the
attempt to conceal some of the personal information of Wikipedia editors.
Joe.Wikipedian could simply download the second dump including the
userspace, and then stick some ads with the content on his own server.
Even if there wasn't a dump at all, someone could still scrape all the
user pages and mirror them. You're not going to stop someone who is
intentionally doing these types of things. But you can make it less
likely that someone is going to do so accidentally, and you can make
things easier for those who want to include the articles but nothing
else.
The only way I could see a scheme like this working is if the Foundation
somehow controlled who had access to the second dump. I believe that this
would be too unwieldy, and probably defeat the spirit of the GFDL, if not
the actual letter of the law.
I really don't see what this has to do with the GFDL. I guess you
could argue that everything on the website is a single GFDL document,
that Wikipedia must therefore distribute a transparent copy of the
entire document, and that HTML isn't a suitable format for that
transparent copy (even though the GFDL specifically says that it is).
But if you're going to get that technical (and even if you aren't),
Wikipedia's database dumps already are horribly non-compliant with the
letter and spirit of the GFDL. As was pointed out in the original
post, they don't even contain a complete list of contributors. And
the distributed copies don't point to the URL for the dump anyway.
Anthony