I am running the current CVS version locally and updated my database
schema. Search, RecentChanges, edit, and preview all work fine, but if I
try to save my changes, they are all lost and the file is back to the
previous version. It doesn't happen for all files, and I don't know
what the difference is (two example problem files are the main page
and "British Queen"). It's not a browser cache issue, since when I try
to edit the page again, the wiki text is also reverted to the old
version. I can write to the database, since the page counters change
correctly. Does anybody else see this problem?
Axel
With my new checkin permission, I went to work happy as a clam.
I added a README file and the GPL license in COPYING.
The file wikiTextEn.php uses the $THESCRIPT variable, and it therefore
has to be included at the end of wikiSettings.php, after
wikiLocalSettings.php has been read. I added a new variable,
$wikiLanguage, which can be set in wikiLocalSettings.php and which
determines which wikiText file is included later.
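In outline, the end of wikiSettings.php now does something like this
(quoting from memory rather than the actual checkin, and assuming
English as the default):

    # wikiLocalSettings.php has already been read at this point, so a
    # local override of $wikiLanguage takes effect here.
    if ( !isset( $wikiLanguage ) ) { $wikiLanguage = "En" ; }
    include_once( "wikiText" . $wikiLanguage . ".php" ) ;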
Axel
On Wed, 2002-02-13 at 17:24, Jimmy Wales wrote:
> Brion Vibber wrote:
> > I'm not sure what advantage would be gotten out of storing a version
> > that has had HTML tags worked over, but still needs the wiki code
> > converted into HTML every time we load it. We get more speed by caching
> > the completely parsed version, or more storage savings by reparsing it
> > every time and not storing anything but the editable text.
>
> It's worth noting that on the live server, I see no material difference when
> I turn caching on or off.
Interesting. I have to wonder whether this means caching is for some
reason not working at all... It seems to be disabled and/or broken at
the moment, unless someone sneaked in and fixed the other-languages bug
while I wasn't looking.
I ran "ab -n 10" on a couple pages running on my test server with
various states: caching on, caching off w/ no removeHTMLtags() call,
caching off with the old removeHTMLtags() code, and caching off with my
new as-yet unoptimized but more secure version of removeHTMLtags(). The
pages per second figures from three trials each:
Beryllium (large HTML table, various other tags):
* cached 2.06 2.06 2.16
* none 0.94 0.95 0.95
* old 0.90 0.90 0.89
* new 0.47 0.48 0.48
Esperanto-wiki main page (a few <b>, <i>, and <font> tags):
* cached 3.26 3.13 3.47
* none 1.84 1.83 1.76
* old 1.82 1.80 1.80
* new 1.58 1.62 1.58
> Also, space is really cheap these days. And we're not in any immediate danger
> of running out of it.
Very true.
-- brion vibber (brion @ pobox.com)
Just as the announcements page says, Wikipedia is a lot faster now.
Thanks to everyone who made it possible. A quick-loading Recent Changes
page is a beautiful thing.
Larry
I just ran "ab n=10" for an atricle (with cache turned off) and deactivated
some functions to see where the slow parts are.
Full rendering:            4.99 sec
removeHTMLtags turned off: 3.319 sec
It seems removeHTMLtags is responsible for a third of the *total* runtime
((4.99 - 3.319) / 4.99 ≈ 33%), which includes Apache, PHP invocation, and a
thousand other things that can't be avoided.
So, if these HTML tags are *never* used anyway, why can't we replace their
< and > with &lt; and &gt; just prior to saving an edited article?
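Something like this could do it at save time (just a sketch: the function
name and the allowed-tag list are invented for illustration, and closures
like this need a newer PHP than the codebase targets):

    # Hypothetical filter for the save path: any tag not on the allowed
    # list gets its angle brackets turned into entities, so it shows up
    # as literal text instead of being interpreted as HTML.
    function escapeDisallowedTags( $text )
    {
        $allowed = array( "b", "i", "u", "em", "strong", "tt", "pre",
                          "p", "br", "hr", "ul", "ol", "li",
                          "table", "tr", "td", "th", "font" ) ;
        return preg_replace_callback(
            "/<(\/?)([A-Za-z]+)([^>]*)>/",
            function ( $m ) use ( $allowed ) {
                if ( in_array( strtolower( $m[2] ), $allowed ) ) {
                    return $m[0] ;  # allowed tag, keep as-is
                }
                return "&lt;" . $m[1] . $m[2] . $m[3] . "&gt;" ;
            },
            $text ) ;
    }

Run over "<b>bold</b> and <blink>gone</blink>", this would keep the <b>
pair and turn the <blink> pair into entities.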
I'll be gone tomorrow until Saturday, and I doubt I can hack it today, so
it's up to you...
Magnus
Jan said that I have caching turned off, which surprised me because I thought
it was on. Now I've looked at the code and I still think it is on.
wikiSettings.php:
$useCachedPages = true ;
wikiLocalSettings.php:
# $useCachedPages = false; # Disable page cache
(This is commented out, right?)
-----------------
Playing with benchmarking, grabbing a normal article 100 times:
(I know that this type of benchmarking is not very scientific, since conditions
may change on the live server due to someone else doing something big at the same
time, etc. But I think it gives an indication.)
As the site is running:
/apache/bin/ab -n100 -c1 http://www.wikipedia.com/wiki/Alabama
Requests per second: 0.95 [#/sec] (mean)
Now I will set $useCachedPages to false by uncommenting the line in wikiLocalSettings.php.
/apache/bin/ab -n100 -c1 http://www.wikipedia.com/wiki/Alabama
Requests per second: 0.97 [#/sec] (mean)
So I see no material difference.
How can I easily tell if caching is actually on or off? Am I doing something wrong
here?
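One crude check I could try, assuming the branch that serves a cached copy
is easy to find (the variable names below are guesses at what's in the
code): print a marker there and look for it with the browser's "view
source".

    # Hypothetical marker on the code path that serves a cached copy:
    if ( $useCachedPages && $cachedHTML != "" ) {
        print "<!-- served from page cache -->\n" ;
        print $cachedHTML ;
    }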
From: "Jimmy Wales" <jwales(a)bomis.com>
> Jan Hidders wrote:
> > Can I suggest we simply stop with the whole caching thing? It
> > complicates things unnecessarily. Keeping the code simple should be one
> > of our top priorities. Jimbo doesn't have it turned on at the moment
> > anyway,
>
> I have no strong opinion about this, but I wanted to say that I
> thought I did have it turned on. If it's off, that's a mistake.
My mistake. I saw that the language links worked, but hadn't realized I was
looking at the page just after I had edited it. I see now that you do have
it on, because upon reloading the page the language links are missing, so I
am apparently getting the cached page.
> Tell you what, I'll benchmark with and without, on the live server,
> and report the numbers.
Yes, that would be very very welcome.
-- Jan Hidders
From: "Magnus Manske" <Magnus.Manske(a)epost.de>
>
> The parser has to be brought up to speed. I'll also have a look into
> connecting the PHP script with the C++ parser I wrote (did I mention 0.05
> secs for rendering "Signal transduction", with fetching it from the
> database, searching the database for existing topics, and adding the
> "framework"?;)
As I said, I'd rather keep it in PHP, but it's your project of course. Does
your parser put any requirements on the syntax? Should it be LL(1) or
LALR(1)? Are you going to use yacc, or is it just a simple recursive descent
parser?
What we could improve in PHP, for example, is to have the parser work on
the string paragraph by paragraph. (But please don't use the function
explode() for that, because it is a memory killer.) Most replace functions
could be limited to only one paragraph, and the rest can be dealt with by
making the parser a little context-sensitive. Standard wiki markup is
supposed to be limited to one paragraph anyway. The HTML markup is a bit
harder, but there you can remember the nesting depth and type of nesting,
and once you see that the tags are not balanced you go back in the string
and replace the < and > with entities. This will be expensive, but it is an
exception, so it won't hurt.
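A sketch of that paragraph walk, avoiding explode() by moving a cursor
with strpos() so only one paragraph is copied at a time (parseParagraph
stands in for whatever the real per-paragraph renderer would be, and
blank-line separation is assumed):

    function parseByParagraph( $text )
    {
        $out = "" ;
        $pos = 0 ;
        $len = strlen( $text ) ;
        while ( $pos < $len ) {
            # Find the next blank-line separator; the last paragraph
            # runs to the end of the string.
            $next = strpos( $text, "\n\n", $pos ) ;
            if ( $next === false ) { $next = $len ; }
            $para = substr( $text, $pos, $next - $pos ) ;
            $out .= parseParagraph( $para ) . "\n\n" ;
            $pos = $next + 2 ;
        }
        return $out ;
    }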
-- Jan Hidders
I noticed that the other-language links (links in the form [[fr:Japon]]
[[en:Japan]] [[eo:Japanio]] etc which are hidden in the article body but
listed by language name in the header bar, pointing to the article on
the current subject in the other-language wikis) are vanishing on cached
pages, because they're scanned and listed during the wiki->html link
parsing, which of course doesn't occur when loading a cached page.
I've a hackish fix for that which explicitly seeks out the other-
language links for cached pages, but I don't like it very much. It's
inelegant, and two sets of code have to be maintained to do the same
thing in different contexts.
What I'd like to do is add a column to the cur table, something like
cur_links_languages, which would be analogous to cur_links_linked and
cur_links_unlinked. The list of inter-language links for a page would be
stored when the page is saved, then easily loaded up again along with
the cache. This would also make it easy to provide statistics on the
degree of linkage between language wikis. (No change in current
user-visible behavior except in fixing the obvious bug of vanishing
links, and potentially providing more information in special:Statistics
etc.)
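The save-side half might look roughly like this (the helper name, the
regex, and the newline-separated storage format are all made up for
illustration; I haven't checked how the other cur_links_* columns store
their lists):

    # On save: collect [[xx:Title]] links for a cur_links_languages
    # column, one per line.
    function extractLanguageLinks( $text )
    {
        preg_match_all( "/\[\[([a-z]{2,3}):([^\]\n]+)\]\]/",
                        $text, $m, PREG_SET_ORDER ) ;
        $links = array() ;
        foreach ( $m as $match ) {
            $links[] = $match[1] . ":" . $match[2] ;
        }
        return implode( "\n", $links ) ;  # e.g. "fr:Japon\neo:Japanio"
    }

Loading the list back alongside the cached page would then replace the
scan that currently happens only during link parsing.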
Alternatively, we might have a separate database which contains nothing
but lists of connected articles. This could facilitate keeping the
other-language links consistent; if somebody adds an article "Japón" to
the Spanish wikipedia, it shouldn't be necessary to separately add
[[es:Jap%f3n]] to the English, French, Esperanto, etc. articles. Keeping
a central repository would mean that it only needs to be linked in with
the others once, and all linked articles will immediately benefit by
being able to list it without manual editing. Upside: added simplicity
for article writers, who don't have to maintain as many links. Downside:
added complexity for site maintainers, who have to run a second database
or not get all the other-language links. Also might be more difficult to
remove incorrectly linked articles.
An alternative to the separate link database might be a robot/automatic
process that occasionally looks through all the wikipedias checking for
consistency in the other-language links and automatically adding (or
alerting a human that one ought to add) new other-language links where
needed.
So what do people think? Should we try one of these, or should I just
check in my hackish fix for the meantime?
-- brion vibber (brion @ pobox.com)