[Foundation-l] Re: Hosting scans of the 1911 Britannica on Wikimedia

Tim Starling t.starling at physics.unimelb.edu.au
Wed Nov 9 02:37:17 UTC 2005


Brian wrote:
> For those who don't know, the 1911 Encyclopaedia Britannica is a famous
> public domain encyclopedia, advertised as the "sum of all human
> knowledge" in 1911.
> 
> I recently (today) acquired a DVD containing scans of every page of the
> 1911 Britannica, along with index files for it all, organized by letter
> and page number. I've already talked with avar, TimStarling, and brion
> on IRC, and TimStarling specifically asked me to tell you all that he is
> "confident that the server requirements will be minimal." They would set
> up a domain name, generate some web pages automatically using the index
> files, and host the entire set of 29,700 files totaling about 4 GB.
> 
> One more thing: these are black-and-white TIFs, and there is discussion
> about whether they should be mass-converted to PNGs to be easily viewable.

A few notes on this: firstly, it seems that the guy who made the scans has no
intention of claiming any rights to them. He seems to be interested in
disseminating the material widely, for religious reasons. His webpage is here:

http://freierscientologe.netfirms.com/booksbritannica.htm

The CD/DVD sets are apparently quite rare; Brian was lucky to get his hands
on one at a fairly cheap price.

There's the trademark issue -- Britannica may attempt to scare us with legal
threats over this. A disclaimer on every HTML page declaring non-affiliation
with Britannica would probably put us on sound legal footing, although I'd
be willing to hear advice about this from people who are more knowledgeable.
If the "LoveToKnow Free Online Encyclopedia" (1911encyclopedia.org) can host
this content, then we should be able to find a way too. And we can do it
without the abominable license restrictions and "copyright traps" scattered
throughout the work to enforce them.
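
To make the disclaimer idea concrete, here is a minimal sketch of how each
automatically generated page could carry it. Everything in it is an
assumption for illustration: the disclaimer wording (which would need
review by someone with legal knowledge), the function name, and the image
file name. The only thing taken from the DVD description is that pages are
organized by letter and page number.

```python
from html import escape

# Hypothetical wording only; the real text would need legal review.
DISCLAIMER = (
    "This site is not affiliated with or endorsed by "
    "Encyclopaedia Britannica, Inc."
)


def render_page(letter: str, page_number: int, image_name: str) -> str:
    """Return a minimal HTML page for one scan, with the disclaimer in the footer.

    `letter` and `page_number` mirror how the index is said to be organized;
    `image_name` is a made-up file name for the scanned page image.
    """
    title = f"1911 Encyclopaedia Britannica, {letter}, page {page_number}"
    return (
        "<!DOCTYPE html>\n"
        f"<html><head><title>{escape(title)}</title></head>\n"
        "<body>\n"
        f"<h1>{escape(title)}</h1>\n"
        f'<img src="{escape(image_name, quote=True)}" alt="{escape(title)}">\n'
        f"<p><small>{escape(DISCLAIMER)}</small></p>\n"
        "</body></html>\n"
    )
```

Because the disclaimer is emitted by the one function that builds every
page, no generated page can be published without it.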

Wikipedia owes a lot to the 1911 edition -- we've copied many of its
articles. A public, canonical copy will be a valuable tool to deal with
LoveToKnow's frequent OCR errors, its incompleteness, and its specious legal
threats against us based on our use of unspecified copyright material hidden
in their doctored online copy. Hopefully the availability of page images
will spur development of a complete and accurate OCR copy.

The only question in my mind is the domain: should this be under
eb1911.wikipedia.org? We could make it visually distinct, to avoid confusion
with Wikipedia itself. Or would eb1911.wikimedia.org be better? Or
eb1911.wikisource.org?

-- Tim Starling



