[Foundation-l] Info/Law blog: Using Wikisource as an Alternative Open Access Repository for Legal Scholarship

Anthony wikimail at inbox.org
Sun Jun 21 12:07:44 UTC 2009


On Sun, Jun 21, 2009 at 7:54 AM, John Vandenberg <jayvdb at gmail.com> wrote:

> Whether Google is good or evil is off-topic, and irrelevant to boot.
>

Whether or not they have a right to exclude bots isn't.

> Also worth noting, Project Gutenberg has digitised less than 30,000
> books since 1971.  Distributed Proofreaders has done 15,000 of those
> since 2000, so throughput is picking up.  But, there are more than
> enough to keep everyone busy for a very long time.


The interesting thing is, even if you don't use a bot, it's still faster to
copy/paste from Google manually than it is to get the book and scan it in
yourself (assuming you don't want to destroy the original, anyway).

If you're going to make a project out of OCRing books that Google has already
OCRed, I don't see any point in reinventing the scanning or first-pass
OCRing part.

