Hi all!
Since https://gerrit.wikimedia.org/r/#/c/21584/ got merged, people have been
complaining that they get tons of warnings. A great number of them seem to be
caused by the fact that MediaWiki will, if the DBO_TRX flag is set,
automatically start a transaction on the first call to Database::query().
See e.g. https://bugzilla.wikimedia.org/show_bug.cgi?id=40378
The DBO_TRX flag appears to be set by default in sapi (mod_php) mode. According
to the (very limited) documentation, it's intended to wrap the entire web
request in a single database transaction.
However, since we do not have support for nested transactions, this doesn't
work: the "wrapping" transaction gets implicitly committed when begin() is
called to start a "proper" transaction, which is often the case when saving new
revisions, etc.
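To illustrate the failure mode (a rough sketch, not an actual MediaWiki code
path; the queries and the $row variable are made up):

  // DBO_TRX is set: the first query() implicitly starts the wrapping transaction
  $dbw->query( 'SELECT ...' );

  // later, some code wants its own atomic section, e.g. while saving a revision:
  $dbw->begin();   // no nesting support, so the still-open wrapping transaction
                   // is implicitly committed here (this is what now warns)
  $dbw->insert( 'revision', $row );
  $dbw->commit();  // only the inner work was atomic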
So, DBO_TRX seems to be misguided, or at least broken, to me. Can someone please
explain why it was introduced? It seems the current situation is this:
* every view-only request is wrapped in a transaction, for no good reason that I
can see.
* any write operation that uses an explicit transaction, like page editing,
watching pages, etc., will break the wrapping transaction (and cause a warning in
the process). As far as I understand, this really defeats the purpose of the
automatic wrapping transaction.
So, how do we solve this? We could:
* suppress warnings if the DBO_TRX flag is set. That would prevent the logs from
being swamped by transaction warnings, but it would not fix the current broken
(?!) behavior.
* get rid of DBO_TRX (or at least not use it per default). This seems to be the
Right Thing to me, but I suppose there is some point to the automatic
transactions that I am missing.
* Implement support for nested transactions, either using a counter (this would
at least make DBO_TRX work as I guess it was intended) or using savepoints (which
would give us support for actual nested transactions). That would be the Real
Solution, IMHO; see the sketch below.
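For illustration, a minimal sketch of the counter approach, with savepoints for
the inner levels (plain PDO rather than our Database class; all names below are
made up):

  <?php
  // Counter-based transaction nesting: only the outermost begin()/commit()
  // pair touches the real transaction; inner levels use savepoints.
  class NestingDb {
      private $pdo;
      private $level = 0;

      public function __construct( PDO $pdo ) {
          $this->pdo = $pdo;
      }

      public function begin() {
          if ( $this->level === 0 ) {
              $this->pdo->beginTransaction();          // real BEGIN
          } else {
              $this->pdo->exec( 'SAVEPOINT sp' . $this->level );
          }
          $this->level++;
      }

      public function commit() {
          $this->level--;
          if ( $this->level === 0 ) {
              $this->pdo->commit();                    // real COMMIT
          } else {
              $this->pdo->exec( 'RELEASE SAVEPOINT sp' . $this->level );
          }
      }

      // a rollback() counterpart could use ROLLBACK TO SAVEPOINT spN,
      // which is what would give us actual nested transactions
  }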
So, can someone shed light on what DBO_TRX is intended to do, and how it is
supposed to work?
-- daniel
Are you going to FOSDEM? If so (or if you are considering going), please
add yourself to
http://www.mediawiki.org/wiki/Events/FOSDEM
I still don't know. Depends on whether we have a MediaWiki EU critical mass.
--
Quim Gil
Technical Contributor Coordinator
Wikimedia Foundation
For a few years now we have had several query [special] pages, also
called "maintenance reports" in the list of special pages, which are
never updated for performance reasons: 6 on all wikis and 6 more only on
en.wiki. <https://bugzilla.wikimedia.org/show_bug.cgi?id=39667#c6>
A proposal is to run them again, and quite liberally, on all "small wikis"
(to start with); another is to update them everywhere, but one at a time
and with proper breathing time for the servers.[1]
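If I remember correctly, the per-page run would look something like the line
below (Deadendpages is just an example, and the exact option name should be
double-checked):

  php maintenance/updateSpecialPages.php --only=Deadendpages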
The problem is: which pages are safe to run an update on, even on
en.wiki, and how frequently; and which would kill it? Or, at what point
is a wiki too big to run such updates carelessly?[2]
Can someone estimate it by looking at the queries, or maybe by running
them on some DB where it's not a problem to test?
We only know that, originally, pages were disabled if they took "more than
about 15 minutes to update". If such a page now took, say, four times
that, i.e. 60 minutes, would it be a problem to update one such page per
day/week/month? Etc.
Most updates seem to rely on slave DBs already, but maybe this should be
confirmed; on the other hand, writing huge result sets to the DB
shouldn't be a problem, because those are limited as well.[3]
Nemo
[1] In (reviewed) puppet terms: <https://gerrit.wikimedia.org/r/#/c/33713/>
[2] Below that limit, a wiki would count as "small" for
<https://gerrit.wikimedia.org/r/#/c/33694> and be frequently updated, for
the benefit of the editors' engagement.
[3] 'wgQueryCacheLimit' => array(
        'default' => 5000,
        'enwiki' => 1000, // safe to raise?
        'dewiki' => 2000, // safe to raise?
    ),
I'd like to use HTML comments in raw wikitext as cheap, server-friendly
"data containers" that could be read and parsed by a JS script in view
mode. But I see that HTML comments written into the raw wikitext are
stripped away by the parsing routines. I can access the raw code of the
current page in view mode from JS with an index.php or api.php call, and I
do, but this is much more expensive for the server IMHO.
Is there any sound reason to strip HTML comments away? If there is no sound
reason, could the stripping be avoided?
Alex brollo
Hi everyone!
I want to move my wiki from one server (A) to another (B). On server A I
have a lot of files. I don't need them on server B, but I need all my
wikipages.
What I've done is: I removed the 'images' directory and ran:
php maintenance/rebuildall.php
Unfortunately, the wiki still thinks that all the files are where they
need to be. This is clearly not true, since the images directory is
empty.
Questions:
1) how do I properly delete the files permanently?
2) is there any script to make the files and their descriptions in the MediaWiki
DB consistent?
Cheers,
-----
Yury Katkov, WikiVote
---------- Forwarded message ----------
From: Erik Moeller <erik(a)wikimedia.org>
Date: Tue, Nov 27, 2012 at 6:49 PM
Subject: Wikimedia/mapping event in Europe early next year?
To: maps-l(a)lists.wikimedia.org
Hi folks,
it's been a long time coming, but we're finally gearing up for putting
some development effort into an OSM tileservice running in production
to serve Wikimedia sites. This is being driven by the mobile team but
obviously has lots of non-mobile use cases as well, including the
recent Wikivoyage addition to the Wikimedia family. This work will
probably not kick off before January/February 2013; before then, the
mobile team is working to finish up the GeoData extension (
https://www.mediawiki.org/wiki/Extension:Geodata ).
To get broader community involvement and sync up with existing
volunteer efforts in this area, it'd IMO be useful to plan a
face-to-face meetup/hackfest just focused on geodata/mapping related
development work sometime around Feb/March 2013.
WMF is not going to organize this, but we can help sponsor travel and
bring the key developers from our side who will work on this. Are
there any takers for supporting a 20-30 people development event in
Europe focused on mapping/geodata? I'm suggesting Europe because I
know quite a few of the relevant folks are there, but am open to other
options as well.
Cheers,
Erik
--
Erik Möller
VP of Engineering and Product Development, Wikimedia Foundation
Support Free Knowledge: https://wikimediafoundation.org/wiki/Donate
Hi,
Someone once suggested we create a control panel for bots. I think the
first step would be to create a page where we could see an overview of all
the bots we are running on the projects. If we created some protocol for
querying bot status, we could set up a central monitoring server which would
either:
* Query each bot actively for its status (on some address and IP)
* Have each bot contact this server and deliver the information to it
I would support the second, as it's easier to manage - in the first case we
would need to configure the "master server" with a list of bots to query.
The system could simply be a daemon written in any language plus a PHP
script. Bots would contact the server through the PHP script (they would just
pass information on whether they are running or having trouble, using some
POST data); the daemon would periodically flag all bots that didn't report in
for a certain period as having trouble / needing repair.
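Just to make the idea concrete, a minimal sketch of what the receiving PHP
script could look like (the file name, field names and storage are all made up;
the daemon part is reduced here to a staleness flag computed on read):

  <?php
  // bot-status.php: bots POST their name and state, we record a timestamp;
  // on GET we return the overview, flagging bots silent for over an hour.
  $file = '/var/lib/botstatus/status.json';
  $status = file_exists( $file )
      ? json_decode( file_get_contents( $file ), true )
      : array();

  if ( $_SERVER['REQUEST_METHOD'] === 'POST' && isset( $_POST['bot'] ) ) {
      $status[ $_POST['bot'] ] = array(
          'state'    => isset( $_POST['state'] ) ? $_POST['state'] : 'ok',
          'lastSeen' => time(),
      );
      file_put_contents( $file, json_encode( $status ) );
      exit;
  }

  header( 'Content-Type: application/json' );
  foreach ( $status as $name => &$info ) {
      $info['stale'] = ( time() - $info['lastSeen'] ) > 3600;
  }
  echo json_encode( $status );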
Thanks to this we would have an overview of all active bots on all projects
and their status. What do you think? Is anyone interested in working on
that?