Jan said that I have caching turned off, which surprised me because I thought
it was on. Now I've looked at the code and I still think it is on.
wikiSettings.php:
$useCachedPages = true ;
wikiLocalSettings.php:
# $useCachedPages = false; # Disable page cache
(This is commented out, right?)
-----------------
Playing with benchmarking, grabbing a normal article 100 times:
(I know that this type of benchmarking is not very scientific, since conditions
may change on the live server due to someone else doing something big at the same
time, etc. But I think it gives an indication.)
As the site is running:
/apache/bin/ab -n100 -c1 http://www.wikipedia.com/wiki/Alabama
Requests per second: 0.95 [#/sec] (mean)
Now I will set $useCachedPages to false by uncommenting the line in wikiLocalSettings.php.
/apache/bin/ab -n100 -c1 http://www.wikipedia.com/wiki/Alabama
Requests per second: 0.97 [#/sec] (mean)
So I see no material difference.
How can I easily tell if caching is actually on or off? Am I doing something wrong
here?
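For repeatability, runs like the ab commands above can be scripted. A minimal
Python harness, just as a sketch (the URL is the same placeholder article;
any page works):

```python
import time
import urllib.request

def benchmark(fetch, n=100):
    """Call fetch() n times and return the mean requests per second,
    roughly what `ab -n100 -c1` reports."""
    start = time.perf_counter()
    for _ in range(n):
        fetch()
    elapsed = time.perf_counter() - start
    return n / elapsed

def fetch_article(url="http://www.wikipedia.com/wiki/Alabama"):
    # Fetch and discard one page; network conditions will add noise,
    # so compare several runs rather than single numbers.
    with urllib.request.urlopen(url) as resp:
        resp.read()

# Example: print(benchmark(fetch_article, n=100))
```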
From: "Jimmy Wales" <jwales(a)bomis.com>
> Jan Hidders wrote:
> > Can I suggest we simply stop with the whole caching thing? It
> > complicates things unnecessarily. Keeping the code simple should be one of our top
> > priorities. Jimbo doesn't have it turned on at the moment anyway,
>
> I have no strong opinion about this, but I wanted to say that I
> thought I did have it turned on. If it's off, that's a mistake.
My mistake. I saw that the language links worked, but hadn't realized I was
looking at the page just after I edited it. I see now that you do have it on:
upon reloading the page the language links are missing, so apparently I am
getting the cached page.
> Tell you what, I'll benchmark with and without, on the live server,
> and report the numbers.
Yes, that would be very very welcome.
-- Jan Hidders
From: "Magnus Manske" <Magnus.Manske(a)epost.de>
>
> The parser has to be brought up to speed. I'll also have a look into
> connecting the PHP script with the C++ parser I wrote (did I mention 0.05
> secs for rendering "Signal transduction", with fetching it from the
> database, searching the database for existing topics, and adding the
> "framework"?;)
As I said, I'd rather keep it in PHP, but it's your project of course. Does
your parser place any requirements on the syntax? Should it be LL(1) or
LALR(1)? Are you going to use yacc, or is it just a simple recursive descent
parser?
What we could improve in PHP, for example, is that the current parser parses
the string paragraph by paragraph. (But please don't use the function
explode() for that, because it is a memory killer.) Most replace functions
could be limited to a single paragraph, and the rest can be dealt with by
making the parser a little context-sensitive. Standard Wiki markup is
supposed to be limited to one paragraph anyway. The HTML markup is a bit
harder, but there you can remember the nesting depth and type of nesting,
and once you see that the tags are not balanced you go back in the string
and replace the < and > with entities. This will be expensive, but it is an
exception, so it won't hurt.
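The balance check described above might look like this. A Python sketch for
illustration only (the real code is PHP); it ignores void elements such as
<br>, which a real implementation would have to whitelist:

```python
import re

TAG = re.compile(r'</?([a-zA-Z]+)[^>]*>')

def escape_if_unbalanced(paragraph):
    """Track open tags on a stack; if the paragraph's tags don't
    balance, escape every < and > so the text is shown literally."""
    stack = []
    for m in TAG.finditer(paragraph):
        tag = m.group(1).lower()
        if m.group(0).startswith('</'):
            if not stack or stack[-1] != tag:
                # Close tag with no matching open: give up and escape.
                return paragraph.replace('<', '&lt;').replace('>', '&gt;')
            stack.pop()
        else:
            stack.append(tag)
    if stack:  # unclosed tags left over at paragraph end
        return paragraph.replace('<', '&lt;').replace('>', '&gt;')
    return paragraph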
-- Jan Hidders
I noticed that the other-language links (links in the form [[fr:Japon]]
[[en:Japan]] [[eo:Japanio]] etc which are hidden in the article body but
listed by language name in the header bar, pointing to the article on
the current subject in the other-language wikis) are vanishing on cached
pages, because they're scanned and listed during the wiki->html link
parsing which of course doesn't occur when loading a cached page.
I've a hackish fix for that which explicitly seeks out the other-
language links for cached pages, but I don't like it very much. It's
inelegant, and two sets of code have to be maintained to do the same
thing in different contexts.
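For reference, the links in question can be pulled out with a simple pattern.
A Python sketch (the actual scanner lives in the PHP link parser, and a real
implementation would check the prefix against the list of known language
codes rather than accepting any two- or three-letter prefix):

```python
import re

# Matches [[fr:Japon]], [[eo:Japanio]], etc.
INTERLANG = re.compile(r'\[\[([a-z]{2,3}):([^\]\n]+)\]\]')

def extract_language_links(wikitext):
    """Return (language, title) pairs for inter-language links."""
    return INTERLANG.findall(wikitext)
```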
What I'd like to do is add a column to the cur table, something like
cur_links_languages, which would be analogous to cur_links_linked and
cur_links_unlinked. The list of inter-language links for a page would be
stored when the page is saved, then easily loaded up again along with
the cache. This would also make it easy to provide statistics on the
degree of linkage between language wikis. (No change in current
user-visible behavior except in fixing the obvious bug of vanishing
links, and potentially providing more information in special:Statistics
etc.)
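The save/load side of that idea could be as simple as serializing the link
list next to the cached HTML. A hypothetical Python sketch, with a dict
standing in for the cur row (column and function names are illustrative, not
from the actual schema):

```python
import json

def save_page(db, title, rendered_html, lang_links):
    """Store the rendered page together with the inter-language links
    found at save time (standing in for a cur_links_languages column),
    so the cached copy can still show its language bar."""
    db[title] = {"cache": rendered_html,
                 "lang_links": json.dumps(lang_links)}

def load_cached(db, title):
    """Load the cached HTML and the stored language links in one go."""
    row = db[title]
    return row["cache"], json.loads(row["lang_links"])
```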
Alternatively, we might have a separate database which contains nothing
but lists of connected articles. This could facilitate keeping the
other-language links consistent; if somebody adds an article "Japón" to
the Spanish wikipedia, it shouldn't be necessary to separately add
[[es:Jap%f3n]] to the English, French, Esperanto, etc. articles. Keeping
a central repository would mean that it only needs to be linked in with
the others once, and all linked articles will immediately benefit by
being able to list it without manual editing. Upside: added simplicity
for article writers, who don't have to maintain as many links. Downside:
added complexity for site maintainers, who have to run a second database
or not get all the other-language links. Also might be more difficult to
remove incorrectly linked articles.
An alternative to the separate link database might be a robot/automatic
process that occasionally looks through all the wikipedias checking for
consistency in the other-language links and automatically adding (or
alerting a human that one ought to add) new other-language links where
needed.
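The core of such a consistency pass is just connected components over the
link graph: any articles joined by a chain of inter-language links should
all list each other. A Python sketch under that assumption:

```python
from collections import defaultdict, deque

def language_link_groups(links):
    """links: iterable of ((lang, title), (lang, title)) pairs.
    Returns the connected groups, so each article can list every
    other-language counterpart without an editor adding each link
    by hand."""
    graph = defaultdict(set)
    for a, b in links:
        graph[a].add(b)
        graph[b].add(a)
    seen, groups = set(), []
    for node in graph:
        if node in seen:
            continue
        group, queue = set(), deque([node])
        while queue:  # breadth-first walk of one component
            cur = queue.popleft()
            if cur in group:
                continue
            group.add(cur)
            queue.extend(graph[cur] - group)
        seen |= group
        groups.append(group)
    return groups
```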
So what do people think? Should we try one of these, or should I just
check in my hackish fix for the meantime?
-- brion vibber (brion @ pobox.com)
I just tried to refresh the page and got this:
(The page was last refreshed just -161981 minutes ago; please wait another
161986 minutes and try again.)
That's about 4 months...
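A negative "last refreshed" value means the stored timestamp is ahead of the
server clock (note the two numbers differ by exactly 5, suggesting a 5-minute
interval). Whatever the root cause, the message code should treat a
future timestamp as invalid rather than computing a huge wait. A hypothetical
sketch (names are illustrative, not from the actual code):

```python
def refresh_wait_minutes(last_refresh, now, min_interval=5):
    """Minutes left before another refresh is allowed."""
    elapsed = now - last_refresh
    if elapsed < 0:
        # last_refresh is in the future relative to `now` (clock skew
        # or a timestamp bug) -- exactly what produces messages like
        # "refreshed just -161981 minutes ago; please wait another
        # 161986 minutes". Treat the timestamp as invalid and allow
        # the refresh instead of returning min_interval - elapsed.
        return 0
    return max(0, min_interval - elapsed)
```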
I've rewritten wikiPage::removeHTMLtags again. (Checked into CVS, diff:
http://cvs.sourceforge.net/cgi-bin/viewcvs.cgi/wikipedia/phpwiki/fpw/wikiPa…)
Exciting new features:
* Removes unwanted tag attributes, such as the scripting attributes
(onmouseclick, onmouseout, etc) which can be used to create fake links
or automatically redirect the browser to another web site (see the
previous version of the [[Goatse.cx]] article for an example)
* Makes a more serious attempt to fix mismatched open/close tag pairs.
Relatedly, makes some attempts at normalizing tables, e.g. <tr> is not
allowed outside of <table>, etc.
* Nested tables now work.
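One way the scripting-attribute removal might work is a pass that drops any
on* event-handler attribute. A Python sketch for illustration (Magnus's
actual implementation is the PHP in CVS; a production filter would whitelist
allowed attributes rather than blacklist handlers):

```python
import re

# Matches attributes like onclick="...", onmouseout='...', onload=foo
EVENT_ATTR = re.compile(r'\son[a-z]+\s*=\s*("[^"]*"|\'[^\']*\'|[^\s>]+)',
                        re.IGNORECASE)

def strip_event_attributes(html):
    """Remove on* event-handler attributes so wiki text can't fake
    links or silently redirect the browser."""
    return EVENT_ATTR.sub('', html)
```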
The function feels more weighty than it ought to be, but it works on
everything I've tried throwing at it so far, which is an improvement
over the previous versions.
I also threw in fixes for:
* Character entities in <pre> sections
* ISBN numbers with letters in them
* == Section headers == at the edges of HTML tags
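On the ISBN fix: ISBN-10 check digits can legitimately be the letter X
(standing for the value 10), which is presumably what "ISBN numbers with
letters" refers to. For reference, a sketch of the checksum:

```python
def is_valid_isbn10(isbn):
    """ISBN-10 checksum: weights 10 down to 1, sum divisible by 11;
    the final check digit may be 'X', meaning 10."""
    s = isbn.replace('-', '').replace(' ', '').upper()
    if len(s) != 10 or not s[:9].isdigit():
        return False
    if s[9] != 'X' and not s[9].isdigit():
        return False
    total = sum((10 - i) * (10 if c == 'X' else int(c))
                for i, c in enumerate(s))
    return total % 11 == 0
```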
-- brion vibber (brion @ pobox.com)
----- Forwarded message from Neil Harris <neil.harris(a)tonal.clara.co.uk> -----
From: Neil Harris <neil.harris(a)tonal.clara.co.uk>
Date: Tue, 12 Feb 2002 09:38:03 +0000
To: jwales(a)bomis.com
Subject: Walone2.ico - a re-send of the missing favicon.ico file
Dear Jimbo,
This is the favicon file which got lost during the software upgrade.
It should be the most recent 16x16 version, and should be installed at
http://www.wikipedia.com/favicon.ico
In an ideal standards-compliant world, generated pages would have
<link rel="shortcut icon" href="http://www.wikipedia.com/favicon.ico">
added to their HEAD sections.
* M$ browsers are hard-wired to find the favicon at the location given
_unless_ the 'link rel' stuff is in the page, in which case they will
use that
* pure W3C standards-compliant browsers won't find the icon unless the
link rel stuff is in the page
* Mozilla changes its policy from release to release...
I hope this is useful.
-- Neil
----- End forwarded message -----
I'm not sure if the live site uses caching already (or an intermediate
proxy), but I have repeatedly noticed that the served version of a
page does not always completely agree with the current wiki version.
Right now, it is happening at the main page: all previous versions and
diffs show "Winter Olympics" under Current Events, but the served
version of the page omits it. I'm pretty sure that nobody deleted that
text.
Axel
I was trying today to sort out the [[Swedish monarchs]], when I
stumbled on this page:
http://www.wikipedia.com/wiki/Bgustav/bus+Adolphus+of+Sweden
It looks like this URL is the result of a bug during the conversion.
Perhaps this is interesting evidence for Magnus? (BTW, we have a few
old kings of Sweden named Magnus...)
There is a king of Sweden whose name is sometimes spelled Gustavus
Adolphus, but sometimes Gustav Adolf. It seems this URL is the result
of someone trying to link to [[<b>Gustav</b>us Adolphus of Sweden]],
but this is just my guess.
--
Lars Aronsson (lars(a)aronsson.se)
Aronsson Datateknik
Teknikringen 1e, SE-583 30 Linköping, Sweden
tel +46-70-7891609
http://aronsson.se/  http://elektrosmog.nu/  http://susning.nu/
From: "Magnus Manske" <Magnus.Manske(a)epost.de>
> I fixed it already. Look at the user preferences.
No, that's not it. It happens when your preference is "ignore minor edits".
There you might not get all the changes during a day if your maxCnt is not
very high. I know already what the problem is (I should not select the first
maxCnt rows there, because I am sorting by title_name, not by timestamp). I hadn't
noticed this because my testing database wasn't that big. I know I can solve
this, but it takes some thinking.
-- Jan Hidders