For those not spending 24hrs/day in #mediawiki, here are some updates:
The file cache has been turned off. It was largely redundant, since the
squids handle upstream caching, and it caused congestion on the NFS
server. Moving the file cache onto each server's local disk would be
possible but so far doesn't look like it'd be worth the effort.
Output compression has been enabled globally for browsers that support
it. (Formerly this was special-cased in the file cache as a side effect
of compressing the cached pages for disk savings.) Some browsers may be
briefly confused during the switchover if they get a 304 response for a
page that is _now_ compressed but wasn't before. Mozilla is known to
have such problems. Close the browser and restart; that will probably
clear it up. If not, give a shout.
The additional compression doesn't seem to hurt load on the servers,
and it'll cut bandwidth usage further.
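If anyone wants to verify the compression from outside, here's a quick
sketch (Python; the page URL is just an example, and the output depends
on the server's current configuration):

    # Ask for gzip explicitly and report what Content-Encoding comes back.
    import urllib.request

    req = urllib.request.Request(
        "http://en.wikipedia.org/wiki/Main_Page",
        headers={"Accept-Encoding": "gzip", "User-Agent": "compression-check/0.1"},
    )
    with urllib.request.urlopen(req) as resp:
        # "gzip" means the output compression kicked in for this request.
        print("Content-Encoding:", resp.headers.get("Content-Encoding", "(none)"))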
-- brion vibber (brion @ pobox.com)
So far, just today, he has reverted [[DNA]] (again) to his version of the
article and then protected his version without adding {{msg:protected}} or
listing it at [[Wikipedia:Protected page]]. When I updated the summary at
[[Wikipedia:Requests for comment/168]] to reflect this, 168 protected that
page, and when the protection was lifted by another admin he DELETED the page
(10 times so far) and blanked it a couple of times.
Please see the summary at
http://en.wikipedia.org/wiki/Wikipedia:Requests_for_comment/168 (if it still
exists).
I hereby request that 168...'s sysop status be removed and for the matter to
go directly to the arbitration committee. Since I am on the arbitration
committee and a party to this dispute I will recuse myself.
-- Daniel Mayer (aka mav)
Almost all traffic is hitting browne now (coronelli isn't in the DNS table
for the main domains); things work quite well so far.
Live stats of browne & coronelli, set up by jeronmim via SNMP, are online
at http://wmperf.mine.nu:8043/wmperf/index.org.wikimedia.all-squids.html.
You can see the cache grow...
Cheers
--
Gabriel Wicke
I'm heading to San Diego tomorrow (Friday the 13th) to remove the
servers that are now out of use (most notably Gunther and Geoffrin).
I imagine I'll be in contact with Brion before I pull the machines, but
feel free to pipe up if you have concerns...
--
"Jason C. Richey" <jasonr(a)bomis.com>
> While trying to retrieve the URL: http://de.wikipedia.org/wiki/Hauptseite
>
> The following error was encountered:
> Access Denied.
> Access control configuration prevents your request from being allowed at
> this time. Please contact your service provider if you feel this is incorrect.
> Your cache administrator is webmaster.
<span style="AYBABTU">
What happen?
</span>
Nils.
--
Created by 100 monkeys with 100 typewriters.
Jason, could you update the DNS entries for the various *.wikipedia.org
web servers to point at the new cache servers?
Ideally we should have two records so it'll hit both squids round-robin:
207.142.131.235
207.142.131.236
These are virtual addresses aliased by browne and coronelli. For
testing or failover, one machine can take on both addresses.
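Once the records are changed, a quick sanity check could look like this
sketch (Python; the hostname is a placeholder for whichever domain is
switched first):

    # List the A records returned for a hostname to confirm both squids show up.
    import socket

    infos = socket.getaddrinfo("en.wikipedia.org", 80, proto=socket.IPPROTO_TCP)
    addresses = sorted({info[4][0] for info in infos})
    print(addresses)  # expect 207.142.131.235 and 207.142.131.236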
-- brion vibber (brion @ pobox.com)
I've asked Jason to setup a wiktionary-l. I don't know if he has yet,
but when he has, I intend for there to be a big notice posted on
wiktionary.org, on the wikipedia-l and wikien-l mailing lists, as
well as on Wikipedia itself, inviting a sort of "global summit
meeting" to discuss some of the things that I outline below.
Wiktionary has been cranking along happily in a state of technical
neglect for quite some time.
There are currently 32,246 entries. That's enough that we must
preserve the work that's already done. It also precludes any change
of license; whether that's fortunate or unfortunate, I don't know.
There is an active community there, with a lot of overlap with the
broader Wikipedia community. They need to be consulted on any changes
that we implement.
They have an existing schema whereby they are doing in freeform text
just what we ought to try to help them formalize with actual database
functionality. One can only assume that their scheme is sometimes
followed inconsistently because human editing is inevitably
inconsistent. However, there appears to me to be enough consistency
that a semi-automated conversion process should be possible.
Anything that we do should favor the needs of editors over abstract a
priori desires for the end product. That is to say, if some fancy and
clever thing requires a lot of work from editors, we just skip it.
The editors are primary, or any wiki community will be destroyed.
At the same time, we should design a "structured wiki" with one eye on
compatibility with re-use. If there are existing XML schemas that
have prominence in the wider community, we should look to them as a
part of our design, even if we deliberately choose not to implement
every possible aspect in order to favor ease-of-use for editors.
Consider this for an example:
http://wiktionary.org/wiki/Vision
As a rank amateur database designer, I see several immediate
possibilities which would make an instant and easy improvement. Even
if we had a simple and less-than-ideal design, we could lay the
groundwork now for something better in the future.
I'm a huge fan of incremental change in cases like this. We'd like to
improve the software for the wiktionarians in a way that conforms to
how they like to edit, while laying the groundwork for further
revisions down the line.
--------
Consider a really bad database design, a 'flat file' design, or nearly
so.
word
AHD pronunciation
IPA pronunciation
SAMPA pronunciation
definition
synonym list
related terms list
translation list
This is a horrible design, with multi-valued fields, etc. It can be
improved in just a few minutes of work. But even this horrible design
would be better than freeform text.
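To make that concrete, here's a rough sketch of such a record in Python
(the field names mirror the list above; the example values are invented):

    # The flat design above as a single record type, multi-valued fields as lists.
    from dataclasses import dataclass, field

    @dataclass
    class Entry:
        word: str
        ahd_pronunciation: str = ""
        ipa_pronunciation: str = ""
        sampa_pronunciation: str = ""
        definition: str = ""
        synonyms: list = field(default_factory=list)
        related_terms: list = field(default_factory=list)
        translations: list = field(default_factory=list)

    # Invented example entry, just to show the shape of the data.
    vision = Entry(word="vision",
                   definition="the faculty of sight",
                   synonyms=["sight", "eyesight"],
                   translations=["de: Sehvermoegen", "fr: vision"])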
Developer time and energy is at a premium (at least, until some clever
developer really takes this up as a cause!) and so simplicity is a
huge virtue. A little bit of fixing done soon is better than an
imagined hypothetical perfect system that's too intimidating and never
gets off the ground.
--Jimbo
Here's the current rundown:
In California:
* larousse is running a squid proxy to the web servers in Florida
* gunther is making a backup dump
* ursula is not doing much of anything
* geoffrin, pliny and susan are very sad, wishing someone would take
away their pain
In Florida:
* suda is running a master database
* zwinger is serving mail and is the main fileserver for the other
machines
* bart, bayle, moreri, and vincent are running identically-configured
apaches. They share their work directories by NFS, so uploads etc.
should stay in sync. The squid picks between them round-robin.
Right now only bart is running a memcached, but it should work to split
the cache over them all, and they'll ask each other for the bits they
don't have (see the sketch after this list).
* browne and coronelli have test squid installations that Gabriel is
setting up, testing failover systems etc.
* isidore isn't doing anything just yet but should soon start syncing
over a backup of zwinger's stuff.
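As referenced in the apache item above, the usual way to split one cache
over several memcached instances is for every client to hash each key to
the same server, so each apache knows exactly which machine to ask for a
given bit. A rough sketch of that mapping (the port and hash choice are
assumptions, not the live configuration):

    # Every apache maps a given key to the same memcached host, so the cache
    # is split across the machines rather than duplicated on each one.
    import hashlib

    servers = ["bart:11211", "bayle:11211", "moreri:11211", "vincent:11211"]

    def server_for(key: str) -> str:
        digest = hashlib.md5(key.encode("utf-8")).hexdigest()
        return servers[int(digest, 16) % len(servers)]

    # Example key, invented for illustration.
    print(server_for("enwiki:parsercache:DNA"))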
So far things seem mostly in order. Zwinger is occasionally freezing up
for a couple of seconds; it may be something that does too much IO. Trying
to track this down.
Eventually we'll want to move the squids over to florida, which means
changing DNS. This'll cut out the 80ms round-trip across the continent
for every hit, as well as give us bigger beefier caches.
-- brion vibber (brion @ pobox.com)
"Axel Boldt" <axelboldt(a)yahoo.com> schrieb:
> Do we forbid certain spiders access to the site based on User-Agent? A
> user in a German forum reported recently that he couldn't access
> Wikipedia at all, always receiving a "Forbidden" message. It turned out
> that his webwasher proxy was to blame (an ad banner blocker). The proxy
> sends the User-Agent
>
> "Mozilla/4.0 (compatible; MSIE 5.0; Windows 98; DigExt) WebWasher 3.0"
>
> Webwasher cannot be used to spider and download sites.
We forbid spiders based on User-Agent, but WebWasher seems not to be
in the list. According to http://www.wikipedia.org/robots.txt, the
following User-Agents are disallowed:
UbiCrawler
DOC
Zao
sitecheck.internetseer.com
Zealbot
MSIECrawler
SiteSnagger
WebStripper
WebCopier
Fetch
Offline Explorer
Teleport
TeleportPro
WebZIP
linko
HTTrack
Microsoft.URL.Control
Xenu
larbin
libwww
ZyBORG
Download Ninja
wget
grub-client
k2spider
NPBot
HTTrack
Furthermore, I know that any request without a User-Agent is refused.
There might be others, but someone who knows more about it than I do
should check that.
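robots.txt is only advisory, so it can't itself produce a "Forbidden"
(that would come from the server or squid configuration), but to check
which User-Agents it disallows, something like this sketch works (the
agent strings are just examples):

    # Check whether robots.txt disallows a few User-Agent tokens.
    import urllib.robotparser

    rp = urllib.robotparser.RobotFileParser("http://www.wikipedia.org/robots.txt")
    rp.read()
    for agent in ("WebWasher", "wget", "MSIECrawler"):
        # True means robots.txt does not forbid this agent from fetching the front page.
        print(agent, rp.can_fetch(agent, "http://www.wikipedia.org/"))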
Andre Engels
brion wrote:
>I can access pages on de.wikipedia.org using the above user-agent
>string, but don't have WebWasher to test with.
I'm using WebWasher and haven't had any problems with it yet. Maybe
the user misconfigured his custom WebWasher filter rules, but that
wouldn't give a "Forbidden" message; it would give something like
"WebWasher is configured to block the requested page
'http://de.wikipedia.org/'".
Daniel