[WikiEN-l] Please remove user data and talk pages from database dumps!

Jake Nelson duskwave at gmail.com
Fri Dec 30 03:39:21 UTC 2005


I just realized that two replies I sent on 12/24 about this topic went only to 
Philip, and not to the list. Figured I might as well resend them:

 >1

Philip Sandifer wrote:

 > Yep. Sucks, doesn't it? Unfortunately, if it's a consequence you're
 > unwilling to accept in any circumstances, you should probably think
 > twice about releasing your contributions under the GFDL.


That's not really much of a point: all the GFDL does is say that as long 
as the downstream users of the information comply with the GFDL, they 
won't be violating copyright. Even if my contributions were 100% public 
domain, editing them in such a way as to say something about me that was 
untrue and damaging to my reputation would still be libel, and still be 
illegal.

The GFDL does not make libel legal. It just makes the libelous edit not a 
violation of copyright. So the GFDL is irrelevant to this issue.

What -is- an issue is which portions of the database are made available 
for download for sites wanting a copy of Wikipedia's article base to 
start from. And the User space and pages containing user discussion are 
of little use to non-Wikipedia sites, while being potentially 
problematic for Wikipedia users.

-- Jake Nelson

 >2

Philip Sandifer wrote:
 > Except that the GFDL mandates that we distribute transparent copies
 > of everything - if we did not distribute transparent copies of the
 > userpages, we would violate the GFDL.


The GFDL only requires you to either distribute Transparent copies, or 
provide a network location for them, if you distribute more than 100 
Opaque copies. So besides legal quibbling about how many "copies" of 
Wikipedia we're distributing, there's the matter of whether or not 
Wikipedia user pages on the site are Opaque...

The GFDL says that standard-conforming simple HTML, and XML using a 
publicly available DTD, are considered Transparent. So it could 
legitimately be said that the Wikipedia User pages, /on the website/, 
not in a DB dump, are available in a Transparent format.

Besides, I never said that we had to make it impossible to download the 
user space, just that it needn't be in the default dump for downstream 
users/content mirrors. If it's argued that the user pages as they are on 
the site aren't Transparent (which they are), providing a link to where 
the full dump can be had would keep us in compliance with the GFDL.

-- Jake Nelson



