[WikiEN-l] Copyright Violation Bot
Neil Harris
usenet at tonal.clara.co.uk
Fri Dec 22 01:09:42 UTC 2006
Neil Harris wrote:
> geni wrote:
>
> [snip]
>
>> Certain searches of existing content would be useful, the most obvious
>> being running a copy of the database against a copy of Britannica.
>>
>>
>
> And other databases of copyrighted texts, such as InfoTrac
> (http://www.gale.com/onefile/) or similar, and things like Google Book
> Search.
>
> -- Neil
>
Just a thought: the en: Wikipedia gets about 3 edits a second. I wonder
if it would be possible for us to use special pleading through the
Foundation to get a dedicated search pipe into Google that would allow
us to do, say, 30 searches a second 24 hours a day (which would only be
a tiny, tiny fraction of their overall capacity), in recognition of the
_very_ substantial benefit in advertising revenue they must surely
currently be receiving as a side effect of having Wikipedia's content
online to draw in search queries.
(Think about it: even if only 20% of Wikimedia's 4000 or so page loads a
second come from Google users who are expecting something like Wikipedia
content, and Google only make $0.25 CPM on serving page ads on searches
for those pages, that comes to an income stream of $0.20 per _second_
from Wikipedia searches, or a total of about $6.3M a year...)
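
For anyone who wants to check the back-of-envelope sum above, here is a rough sketch in Python. Every input is a guess taken from the paragraph (4000 page loads a second, a 20% Google share, $0.25 CPM), not a measured value, and the constant and function names are made up for illustration only.

```python
# Back-of-envelope check of the estimate in the paragraph above.
# All inputs are guesses from the post, not measured figures.
PAGE_LOADS_PER_SECOND = 4000   # Wikimedia page loads per second ("4000 or so")
GOOGLE_SHARE = 0.20            # assumed fraction of loads arriving via Google
CPM_DOLLARS = 0.25             # assumed ad income per 1000 search pages served
SECONDS_PER_YEAR = 60 * 60 * 24 * 365


def income_per_second(loads: float, share: float, cpm: float) -> float:
    """Ad income per second: CPM is dollars per 1000 pages, hence the /1000."""
    return loads * share * cpm / 1000


per_second = income_per_second(PAGE_LOADS_PER_SECOND, GOOGLE_SHARE, CPM_DOLLARS)
per_year = per_second * SECONDS_PER_YEAR

print(f"${per_second:.2f} per second, about ${per_year / 1e6:.1f}M per year")
```

With these inputs the per-second figure is $0.20 and the yearly figure works out to roughly $6.3M. Changing any one guess scales the result linearly, so the estimate is only as good as the assumed share and CPM.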
If so, we could integrate the copyright violation bot into the
toolserver, or into the MW server cluster itself.
-- Neil