[Labs-l] Google bot

Magnus Manske magnusmanske at googlemail.com
Mon Oct 20 11:33:27 UTC 2014


So what I imagine happens:
* Someone finds a tool that is useful to run e.g. a query on any category
with a certain template.
* This is added as an auto-filled link in the template, for common
maintenance tasks.
* Googlebot follows all of these pre-filled links from Wikipedia and/or
mirrors.
* This does not just return the empty tool form, but actually runs the tool.
* Horrible things ensue on Tool Labs.
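
The chain above can be made concrete with a minimal sketch. It assumes a hypothetical tool whose handler runs the expensive query whenever URL parameters are present; the function names and the crawler list are illustrative only, not actual Tool Labs code. The point is the difference between returning the empty form and running the tool, and how a User-Agent check could keep a crawler on the cheap path.

```python
# Hypothetical sketch: handler names and the crawler list are assumptions,
# not code from any real tool on Tool Labs.
from urllib.parse import parse_qs

# Illustrative substrings only; real crawlers are far more numerous.
KNOWN_CRAWLER_TOKENS = ("googlebot", "bingbot")


def is_crawler(user_agent):
    """Return True if the User-Agent header looks like a known crawler."""
    ua = (user_agent or "").lower()
    return any(token in ua for token in KNOWN_CRAWLER_TOKENS)


def handle_request(query_string, user_agent):
    """Decide what a pre-filled link should trigger.

    Returns "form" for the cheap empty form and "run" when the
    expensive query would actually be executed.
    """
    params = parse_qs(query_string)
    if not params:
        # No pre-filled parameters: just show the empty tool form.
        return "form"
    if is_crawler(user_agent):
        # Pre-filled link followed by a crawler: do not run the tool.
        return "form"
    return "run"
```

Without the `is_crawler` check, every pre-filled link that a crawler follows would take the "run" path, which is the scenario described in the list.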

AFAIK, all tools use a default lighttpd configuration. Is that
replaced or extended by a local config file?
If it's replaced, the default config could exclude Googlebot, and even a
blank local config file would re-enable Googlebot for those who want
it.
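
To show why the replaced-versus-extended question matters, here is a small model of the two possible semantics. It is only a sketch under stated assumptions: the rule dictionary, the `effective_rules` function, and the mode names are hypothetical and do not describe how lighttpd or Tool Labs actually merges configuration.

```python
# Hypothetical model of the two merge semantics; not lighttpd's real behaviour.
DEFAULT_RULES = {"block_user_agents": ["Googlebot"]}


def effective_rules(local_rules, mode):
    """Compute the rules a tool ends up with.

    local_rules is None when the tool has no local config file and
    {} when the file exists but is blank. mode is "replace" or "extend".
    """
    if local_rules is None:
        # No local file at all: the default applies unchanged.
        return dict(DEFAULT_RULES)
    if mode == "replace":
        # The local file wins entirely, so a blank file drops the block.
        return dict(local_rules)
    if mode == "extend":
        # The local file is layered on top of the default.
        merged = dict(DEFAULT_RULES)
        merged.update(local_rules)
        return merged
    raise ValueError("mode must be 'replace' or 'extend'")
```

Under "replace", a blank local file yields no block list, so Googlebot is allowed again. Under "extend", a blank file changes nothing, and a tool would have to override the block list explicitly.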



On Mon, Oct 20, 2014 at 2:33 AM, Maximilian Doerr <
maximilian.doerr at gmail.com> wrote:

> I disagree.  Google bot has been nothing but a nuisance, continuously
> probing my tool with different queries.  It has drained resources
> needlessly, and I was quite glad that Tool Labs had it blocked.
>
> Cyberpower678
> English Wikipedia Account Creation Team
> Mailing List Moderator
>
>
>
> On Oct 19, 2014, at 21:30, Nuria Ruiz <nuria at wikimedia.org> wrote:
>
> Why would we want to restrict Google indexing of the whole cluster? There
> are tools of many different natures deployed there; it seems that indexing
> should be configured on a per-instance basis.
>
>
>
> On Sun, Oct 19, 2014 at 4:41 PM, Maximilian Doerr <
> maximilian.doerr at gmail.com> wrote:
>
>> Who protested against that, and why would that be a problem?
>>
>> Cyberpower678
>> English Wikipedia Account Creation Team
>> Mailing List Moderator
>>
>> -----Original Message-----
>> From: labs-l-bounces at lists.wikimedia.org [mailto:
>> labs-l-bounces at lists.wikimedia.org] On Behalf Of Marc A. Pelletier
>> Sent: Sunday, October 19, 2014 7:29 PM
>> To: labs-l at lists.wikimedia.org
>> Subject: Re: [Labs-l] Google bot
>>
>> On 10/19/2014 03:50 PM, Magnus Manske wrote:
>> > I vaguely remember that indexing bots (like the Google one) were
>> > filtered out by Labs already?
>>
>> They were, for some time, but then I got some fairly vehement
>> protestations that tools being unindexed by Google was a problem.
>>
>> -- Marc
>>
>>
>> _______________________________________________
>> Labs-l mailing list
>> Labs-l at lists.wikimedia.org
>> https://lists.wikimedia.org/mailman/listinfo/labs-l