My general view on this is that if somebody sends an e-mail, that is
their problem. If they don't want their employer to find it, they
should have thought of that before they sent it.
Mark
2008/4/27 Brion Vibber <brion(a)wikimedia.org>:
Michael Bimmler wrote:
To put it bluntly, I dare suggest from a non-technical POV that the "htdig"
(that's the name, isn't it?) experiment has failed. If we can only update
our search index every 6 months or so, it is pointless to have it.
Yeah, it doesn't work as well as advertised.
Instead, I suggest that http://lists.wikimedia.org/robots.txt be modified so
as to allow Google (and other search engines) to crawl /pipermail/ again. I do
not really see the privacy issues with this; Nabble, Gmane, etc. are
Google-searchable as well, and I really don't see the point in barring Google
from our own archive.
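
For illustration, the effect of such a robots.txt change can be checked with Python's standard urllib.robotparser. This is only a sketch: the two robots.txt bodies below are hypothetical (the actual contents of the file are not quoted in this thread), and can_crawl is a helper name made up for this example.

```python
from urllib.robotparser import RobotFileParser


def can_crawl(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the given robots.txt text lets user_agent fetch url."""
    parser = RobotFileParser()
    # parse() takes an iterable of lines, so no network access is needed.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# Hypothetical "before" state: every crawler is barred from /pipermail/.
BLOCKED = """\
User-agent: *
Disallow: /pipermail/
"""

# Hypothetical "after" state: an empty Disallow means nothing is barred.
ALLOWED = """\
User-agent: *
Disallow:
"""

ARCHIVE_URL = "http://lists.wikimedia.org/pipermail/wikitech-l/"

if __name__ == "__main__":
    print("blocked file:", can_crawl(BLOCKED, "Googlebot", ARCHIVE_URL))
    print("allowed file:", can_crawl(ALLOWED, "Googlebot", ARCHIVE_URL))
```

Note that robots.txt only affects what well-behaved crawlers fetch from this host; it has no bearing on copies of the same mails held by third-party archives.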
For the meantime, I'm going to have to recommend not doing this (see my
notes below for why).
As you note, it's already possible to search via third-party archives.
It would probably not be difficult to replace the broken htdig search
form with a link to a nice offsite archive, though.
If I am very honest, I do not even remember anymore why we decided to bar
Google from http://lists.wikimedia.org/pipermail.
Because:
a) The current mailman/pipermail system makes it a *huge* pain in the
butt to remove mails from archives on request
b) I got tired of the volume of requests to remove mails from archives,
with the consequent time required in handling them
c) With the wildly popular wikimedia.org domain out of the running,
third-party list archives aren't as visible in general search engine results
d) Therefore, the volume of requests goes down
e) and I don't feel bad turning down most of the remaining requests.
If and when mailman's archiving system is fixed up to make it quick &
easy to take a mail out of archives (eg, *not* involving shutting down
all mail processing, rebuilding an entire list's archives since 2001,
and discovering that all the links are now broken because mailman's
internal behavior has changed in the intervening years and it splits up
messages differently), then I'll be happy to pop us back into general
search engine indexes.
Was it due to privacy concerns? If so, which, and why is
lists.wikimedia.org as an archive different from Nabble/Gmane?
That'd be c) above.
-- brion vibber (brion @ wikimedia.org)
_______________________________________________
Wikitech-l mailing list
Wikitech-l(a)lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l