The servers will be rebooted in a couple of hours, circa 12:00 UTC, to
install new operating system kernels. These include routine security
upgrades, and hopefully the updated 64-bit kernel will help with the
database server's Mysterious Occasional Data Corruption problems.
This will probably mean downtime of 30 minutes to an hour; if some of
the machines fail to come back up it could take longer, but hopefully
not.
Those of you in Europe -- take a long lunch. :)
-- brion vibber (brion @ pobox.com)
This would mean either that the input is incomplete or corrupt, or that my regular expression is not as sound as it seems. It has parsed millions of records so far without falling over, but, as Popper told us, seeing only millions of black ravens is no proof that no white raven will ever be found.
In any case I think I will add some extra checks.
1. When the regexp completes, it should have reached nearly the end of the file.
2. When the number of records read is less than on the previous run, clearly something is wrong.
Is there any way I can make the script alert you and me when something goes wrong, e.g. via sendmail?
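The two checks and the mail alert could be sketched as below. This is a hypothetical illustration, not the actual WikiCounts code: the function names, the 99% threshold, and the addresses are all my own placeholders, and it assumes a local MTA (e.g. sendmail) is listening on port 25.

```python
import smtplib
from email.message import EmailMessage

def sanity_problems(records_read, bytes_consumed, dump_size, previous_records):
    """Return a list of descriptions of anything that looks wrong."""
    problems = []
    # Check 1: parsing should have reached (nearly) the end of the file.
    if bytes_consumed < 0.99 * dump_size:
        problems.append(f"parser stopped at byte {bytes_consumed} of {dump_size}")
    # Check 2: fewer records than on the previous run clearly means trouble.
    if previous_records is not None and records_read < previous_records:
        problems.append(f"read {records_read} records, previous run read {previous_records}")
    return problems

def mail_alert(problems, recipients, smtp_host="localhost"):
    # Deliver via the local mail relay; addresses are placeholders.
    msg = EmailMessage()
    msg["Subject"] = "WikiCounts: dump parse looks wrong"
    msg["From"] = "wikicounts@localhost"
    msg["To"] = ", ".join(recipients)
    msg.set_content("\n".join(problems))
    with smtplib.SMTP(smtp_host) as s:
        s.send_message(msg)
```

The previous run's record count would need to be persisted somewhere (a small state file, say) between runs.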
Erik Zachte
>
> van: Brion Vibber <brion(a)pobox.com>
> datum: 2003/12/22 ma AM 03:07:57 CET
> aan: Wikimedia developers <wikitech-l(a)Wikipedia.org>,
> epzachte(a)chello.nl
> onderwerp: Re: [Wikitech-l] Stats weirdness
>
> On Dec 21, 2003, at 10:24, <epzachte(a)chello.nl> wrote:
>
> > The stats job failed halfway through on the en: 'old' SQL dump.
> > This happened before, since huge en: has been split into 2 files.
> > Second part probably corrupt/incomplete.
>
> ===== WikiCounts / 6:12 Sunday, December 21, 2003 / Wikipedia: EN =====
>
>
> Read sql dump file
> '/home/wikipedia/backups/public/en/cur_table.sql.bz2' (150.2 Mb)
> Extract names and timestamps.
> Data read (Mb):
>
> 06:12 - 10
> 06:13 - 20 30 40 50 60 70 80 90 100 110
> 06:14 - 120 130 140 150 160 170 180 190 200 210 220
> 06:15 - 230 240 250 260 270 280 290 300 310 320
> 06:16 - 330 340 350 360 370 380 390 400 410 420 430 440
> 06:17 - 450 460 470 480 490 500 510 520 530 540 550 560 570
> 06:18 - 580 590 600 610
> Read sql dump file
> '/home/wikipedia/backups/public/en/old_table.sql.bz2' (2665.3 Mb)
> Extract names and timestamps.
> Data read (Mb):
> 10 20 30 40 50 60 70
> 06:19 - 80 90 100 110 120 130 140 150 160 170 180
>
> Parsing SQL files took 6 min, 59 sec.
>
> That's rather mysterious.
>
> Re-running with the new version, let's see what happens...
>
> -- brion vibber (brion @ pobox.com)
>
>
The stats job failed halfway through on the en: 'old' SQL dump.
This has happened before, ever since the huge en: dump was split into 2 files.
The second part is probably corrupt or incomplete.
Brion,
before you rerun: did you receive my mail about the new version of the perl files,
available as usual at http://members.chello.nl/epzachte/Wikipedia/Statistics/Perl.zip ?
A. minor fixes + Catalan version
B. new js file (fixes bar heights for small Wp's; rounding happened before scaling)
C. please add codes for the new phase 3 Wp's
Erik Zachte
Daniel wrote:
> Do you mean to say that the script completely starts over for every
> update?
I have considered incremental counts, of course. But given the occasional algorithm updates in the counting scripts, the possibility that the database structure changes, and, last but not least, the fact that actions need to be synchronized between Brion and me (which works well most of the time, but sometimes some confusion or delay creeps in), I decided to play it safe.
Erik Zachte
We still seem to have to write œ in article text and links.
Is the letter œ (oe digraph) working properly on fr: yet? If not, will
it work at some time in the future?
(I will write up answers to this in the fr: FAQ! :)
A further problem is that the system does not know that Œ is the
upper case of œ.
For example, [[œil (typographie)]] shows as an empty link, even
though the page [[Œil (typographie)]] exists.
Could this second problem be fixed at least?
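The missing step is just Unicode-aware first-letter uppercasing of the link target: a byte-oriented ucfirst leaves the multibyte œ (U+0153) alone, while any Unicode-aware routine maps it to Œ (U+0152). A small illustrative check, not MediaWiki's actual code:

```python
def ucfirst(title):
    # Uppercase only the first character, Unicode-aware.
    return title[:1].upper() + title[1:] if title else title

print(ucfirst("œil (typographie)"))  # Œil (typographie)
```

With such a routine, [[œil (typographie)]] would normalize to the existing page title [[Œil (typographie)]].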
Is [[User talk:Evan/Somesubpage]] a subpage of [[User talk:Evan]], or
the talk page for [[User:Evan/Somesubpage]]?
~ESP
--
Evan Prodromou <evan(a)wikitravel.org>
Wikitravel - http://www.wikitravel.org/
The free, complete, up-to-date and reliable world-wide travel guide
Erik wrote:
>The stats job failed halfway through on the en: 'old'
>SQL dump. This happened before, since huge en: has
>been split into 2 files. Second part probably
>corrupt/incomplete.
Yikes! Do you mean to say that the script completely starts over for every
update? Would it not be better for the script to know the last time and place
in the 'old' dump it did its last update and only compute the data that has
been added since that update?
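The resume-where-you-left-off idea could look something like the sketch below: persist the byte offset and record count from the previous run, and only process what was appended since. All names here are hypothetical, and the sketch assumes the 'old' dump only ever grows by appending rows, which is exactly the assumption that schema or algorithm changes would break.

```python
import json
import os

STATE = "wikicounts.offset.json"  # hypothetical checkpoint file

def load_checkpoint():
    # Start from scratch when no checkpoint exists yet.
    if os.path.exists(STATE):
        with open(STATE) as f:
            return json.load(f)
    return {"offset": 0, "records": 0}

def process_incrementally(dump_path, handle_record):
    state = load_checkpoint()
    with open(dump_path, "rb") as f:
        f.seek(state["offset"])          # skip what was already counted
        for line in f:
            # In the real script each INSERT row would go through the
            # regexp; here handle_record just receives the raw line.
            handle_record(line)
            state["records"] += 1
        state["offset"] = f.tell()       # remember where we stopped
    with open(STATE, "w") as f:
        json.dump(state, f)
    return state
```

A second run over the same file would then be a no-op until new rows appear at the end.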
-- Daniel Mayer (aka mav)
Hi,
On the en stats page http://download.wikimedia.org/wikistats/EN/ChartsWikipediaEN.htm,
the current rate of article creation (under "Articles - New articles per day")
is almost 2000 per day. That doesn't sound right. (60,000 new articles per month?)
I think it's a tenth of that. Some others also sound fishy: mean edits per
article is nearly 1?!
Arvind
--
It's all GNU to me