Hi!
I am starting this thread because Brion's revision r94541 reverted
r94289 [0], stating "core schema change with no discussion" [1].
Bugs 21860 [2] and 25312 [3] advocate for the inclusion of a hash
column (either md5 or sha1) in the revision table. The primary use
case of this column will be to assist detecting reverts. I don't think
that data integrity is the primary reason for adding this column. The
huge advantage of having such a column is that it will no longer be
necessary to analyze full dumps to detect reverts; instead, you can
look for reverts in the stub dump file by looking for the same hash
within a single page. The fact that there is a theoretical chance of a
collision is not very important IMHO; it would just mean that in very
rare cases in our research we would flag an edit as reverted when
it's not. The two bug reports contain quite long discussions, and this
feature has also been discussed internally quite extensively, but oddly
enough that discussion hasn't happened yet on the mailing list.
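To make the use case concrete, here is a minimal sketch (in Python, with illustrative names; this is not MediaWiki code) of how reverts could be flagged once every revision row carries a content hash:

```python
import hashlib

def text_sha1(text):
    """Compute the hash that the proposed revision-hash column would store."""
    return hashlib.sha1(text.encode('utf-8')).hexdigest()

def find_reverts(revisions):
    """Given one page's revisions in chronological order as
    (rev_id, content_hash) pairs, flag every revision whose hash
    matches an earlier revision of the same page: it is a candidate
    revert back to that earlier state."""
    first_seen = {}   # hash -> rev_id that first produced this content
    reverts = []
    for rev_id, content_hash in revisions:
        if content_hash in first_seen:
            reverts.append((rev_id, first_seen[content_hash]))
        else:
            first_seen[content_hash] = rev_id
    return reverts
```

With the column in place, the (rev_id, hash) pairs could be read straight from a stub dump instead of hashing full revision texts; a collision would merely produce one spurious revert pair, which is exactly the rare-false-positive trade-off described above.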
So let's have a discussion!
[0] http://www.mediawiki.org/wiki/Special:Code/MediaWiki/94289
[1] http://www.mediawiki.org/wiki/Special:Code/MediaWiki/94541
[2] https://bugzilla.wikimedia.org/show_bug.cgi?id=21860
[3] https://bugzilla.wikimedia.org/show_bug.cgi?id=25312
Best,
Diederik
The Article class was a problem we've been trying to fix.
Article basically was 3 things rolled all into one:
1) A page generation class that when run would fill up OutputPage
2) A context that references a Title, of course using the global $wg
context for the rest at the time
3) A glorified Title, containing helpers for doing article related things
Taking the place of 1) we have things like SpecialPage, Action, and some
things that are still left inside Article. (I'm hoping to eliminate Action
and Article with the pageoutput branch though.)
Taking the place of 2) we have a RequestContext system. Instead of
hoarding information to itself the classes that fill up OutputPage make
use of a context.
Taking the place of 3) we have WikiPage.
However, I'm starting to see a new pattern in the quest to eliminate
Article that's making a brand new mistake.
function __construct( Page $page, IContextSource $context = null ) { ...
}
People seem to be changing constructors for (1)-type things that fill up
OutputPage, which once took an Article, into ones that take a WikiPage and a
RequestContext, in situations where a Context should be enough information.
WikiPage is basically a glorified title. It takes a single title, and
provides access to a number of helpers for that title, such as fetching
content, and other information.
Anyone with a title can get a WikiPage perfectly usable for their purposes
by calling WikiPage::factory( $title );
WikiPage has ABSOLUTELY no user state to it. There is no setXYZ on
WikiPage that would make one WikiPage unique from another WikiPage
obtained from the same title and warrant keeping WikiPage around. The only
/state/ WikiPage has is a few member variables that cache information it
has fetched from the database about the title, such as the contents of the
revision, so that it doesn't keep re-fetching them.
The ONLY reason I see to pass a WikiPage when you already have a Title is
so that the WikiPage instance you have doesn't lose the cached stuff like
revision that it already fetched from the database.
----
I'd like to propose a different pattern.
Because WikiPage is basically a glorified Title we move our pattern for
getting ahold of a WikiPage to where we get our title. In other words, we
add RequestContext::getWikiPage which will run WikiPage::factory(
$this->getTitle() ); to get the WikiPage relevant to a Title. If setTitle
is used on that RequestContext we unset the WikiPage we have on
RequestContext since the title has changed.
((Unless someone thinks that it would be a better idea for
WikiPage::factory or whatever we call it to only create a singleton
instance per title, i.e.: `WikiPage::factory( Title::newFromText( 'Asdf' ) )
=== WikiPage::factory( Title::newFromText( 'Asdf' ) );`))
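The singleton-per-title variant in that parenthetical is just a memoized factory. A Python sketch of the idea (purely illustrative, since the real code is PHP, and all names here are stand-ins):

```python
class WikiPage:
    """Stand-in for MediaWiki's WikiPage: no user state, only a
    per-title cache of data already fetched from the database."""
    _instances = {}   # title -> shared instance

    def __init__(self, title):
        self.title = title
        self._revision_cache = None   # lazily filled; the only "state"

    @classmethod
    def factory(cls, title):
        # Singleton per title: callers holding equal titles share one
        # instance, so cached fetches are never thrown away.
        if title not in cls._instances:
            cls._instances[title] = cls(title)
        return cls._instances[title]
```

Under this scheme `WikiPage.factory('Asdf') is WikiPage.factory('Asdf')` holds, so passing a WikiPage alongside a Title buys nothing that the factory call doesn't already provide.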
Our page generation classes like Action, etc... accept a Context. If they
need a WikiPage to access the Title helpers they use getWikiPage on the
context they are working in.
In the case where we need to tell an output-page-generating class to use a
different title than the one we have in our context, we make use of a
proper DerivativeContext instead of handing it a WikiPage for a different
title than the context's and utterly confusing it.
For example, if we had Special:History/Foo as our title and wanted to run
HistoryPage with a context of "Foo" as a title:
$pageContext = new DerivativeContext( $context );
$pageContext->setTitle( Title::newFromText( 'Foo' ) );
$... = new HistoryPage( $pageContext );
--
~Daniel Friesen (Dantman, Nadir-Seen-Fire) [http://daniel.friesen.name]
For my research I need to download 3 files:
- [LANGCODE]wiki-[DATE]-pages-articles.xml.bz2 *OR*
[LANGCODE]wiki-[DATE]-pages-meta-current.xml.bz2
- [LANGCODE]wiki-[DATE]-pagelinks.sql.gz
- [LANGCODE]wiki-[DATE]-categorylinks.sql.gz
I downloaded the first two. Now I can no longer access the download
pages (e.g. http://dumps.wikimedia.org/enwiki/). The browser shows this
message: 403 Forbidden.
I need to know why the access is now forbidden and how to solve this
problem.
Hi folks,
I got some quick review from Bawolff for the OnlineStatusBar extension I've
been working on over the past weeks. The extension allows users to display
their online status on their user pages, and it has some support on the
English Wikipedia for deployment there. Since it hasn't been reviewed for
deployment yet, that is unlikely to happen soon, but there is a chance it
could be installed in the future once it's ready and all issues are fixed.
(Many users on Wikipedia are already maintaining custom statuses by editing
their user page every time they log in or out, so it could be useful as a
replacement for that.)
He noticed that it would be cool if there were a JS menu that let users
pick their status (away, etc.) instead of opening preferences. I like the
idea, but unfortunately I don't understand JS even a bit. So if there is
someone who could help with that, it would be very nice! The extension is in
/trunk/extensions/OnlineStatusBar, so if you have an idea of how to make it
valid and secure, so that it wouldn't cause problems if it were eventually
deployed to WMF wiki(s), please go ahead :) Feel free to add your name to
the about section in that case.
Thanks, Peter
The hook I introduced in r94373 ('WebRequestGetPathInfoRequestURI') for
extending path routing has been dropped in r104274 and replaced with some
new code.
We now have a path routing class. This class is instantiated inside
WebRequest, and the hook WebRequestPathInfoRouter is called with it as the
argument. It is then used to parse the REQUEST_URI.
Instead of the previous verbose method an extension can now extend our
REQUEST_URI parsing by adding patterns to the PathRouter in that hook like
so:
$router->add( "/wiki/$1" );
$router->add( array( 'edit' => "/edit/$1" ), array( 'action' => '$key' ) );
$router->add( "/$2/$1", array( 'variant' => '$2' ), array( '$2' =>
array( 'zh-hant', 'zh-hans', ... ) ) );
$router->addStrict( "/foo/Bar", array( 'title' => 'Baz' ) );
$router->add( "/help/$1", array( 'title' => 'Help:$1' ) );
Don't worry about the order that you specify patterns. This new path
router parses based on how specific the pattern is, so "/wiki/$1" will
always dominate "/$1", and a "/$2/$1" where $2 is restricted to 'foo' and
'bar' will always dominate a "/$2/$1" with no restrictions.
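The "most specific pattern wins" rule can be illustrated with a toy router (a Python sketch, not the real PathRouter API; the specificity scoring here is simplified to literal-character count plus a bonus for restricted parameters):

```python
import re

class ToyRouter:
    def __init__(self):
        self.routes = []   # list of (specificity, compiled_regex)

    def add(self, pattern, restrict=None):
        # More literal characters -> more specific; a restricted
        # parameter also beats an unrestricted one.
        specificity = len(pattern.replace('$1', '')) + (1 if restrict else 0)
        param = '(?P<p1>%s)' % (restrict or '.+')
        regex = re.compile('^%s$' % re.escape(pattern).replace(re.escape('$1'), param))
        self.routes.append((specificity, regex))

    def parse(self, path):
        # Try patterns from most to least specific, regardless of the
        # order in which they were added.
        for _, regex in sorted(self.routes, key=lambda r: -r[0]):
            m = regex.match(path)
            if m:
                return m.group('p1')
        return None
```

Adding "/$1" first and "/wiki/$1" second still resolves "/wiki/Foo" to "Foo" rather than "wiki/Foo", mirroring the dominance behaviour described above.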
[r94373]: https://www.mediawiki.org/wiki/Special:Code/MediaWiki/94373
[r104274]: https://www.mediawiki.org/wiki/Special:Code/MediaWiki/104274
--
~Daniel Friesen (Dantman, Nadir-Seen-Fire) [http://daniel.friesen.name]
Hi everyone,
I just posted a note on the blog
[http://blog.wikimedia.org/2011/11/18/nobody-notices-when-its-not-broken-new…]
about our new external store, but wanted to add a few details
here. The deploy went smoothly, and I'm very happy with how the project
progressed overall. There are plenty more details on the project itself on
the project wiki page
[http://wikitech.wikimedia.org/view/External_storage/Update_2011-08] and
hiding in RT. There were a few follow-up things to come out of it, and
I want to talk through those in hopes that someone either picks them up or
has suggestions on what to do.
The project originally included recompressing all of the object types in
the external store databases, continuing the work that was started in
2010. I spent some time doing verification that things were behaving as
expected and it turns out they weren't. Upon examining the count of
different data types in the external store content, I found that some types
that are no longer supposed to be used were still getting created. I've
filed https://bugzilla.wikimedia.org/show_bug.cgi?id=32478 to track the
investigation and resolution of those differences.
During the deploy there was a brief (about 10 minute) period during which
article saves failed due to the external store databases being in read-only
mode. As expected, some folks showed up in IRC telling us of the
'problem'. After the migration was complete we brainstormed a bit in IRC
about good ways of informing editors of planned maintenance such as this
migration. The regular databases (s3, etc.) have a read-only mode flag so
that the affected wikis show a reasonable error, but the external store
databases are a little different. Because of the way they're spread out,
the outage of a specific database cluster does not affect specific language
projects, but instead affects a specific time range for all wikis.
Additionally, the currently writable external store database affects
article edits on all wikis.
There were a few suggestions thrown around:
1) use central notice. This would certainly have the effect of alerting
all wikis that there was some maintenance, but it has the disadvantage of
telling all *readers* about the outage, rather than only the people that
would actually be interested (those editing pages).
2) make mediawiki cache the change to conceal the outage from editors. The
idea here is that mediawiki would notice that the backend database is
currently in read-only mode and would cache the change and write it to the
DB when it returns to read-write mode. There are a number of technical
challenges here, as well as the introduction of another system (the change
cache), but it's an interesting way around the problem, since rather than
addressing how to inform editors of impending maintenance it simply
eliminates the necessity for that communication.
3) throw up a banner on the edit page itself. The time when we want to
inform someone that there is going to be maintenance that will impede
editing is when the user begins an edit. (at the moment we inform them
when they try to save the edit in the form of an error message.) If there
was a banner on all edit pages that informed the user not to save their
document during a specific time period, they could choose to postpone the
edit or finish quickly. The text would be something like "There will be
planned maintenance starting in 23 minutes and lasting for 30 minutes. You
will be unable to save edits during the maintenance period. Please save
your work before maintenance begins." During the maintenance, we could
change the message to be more visible, or we could take more drastic action
such as disabling the edit or save buttons.
4) don't make any change from what we do now. The external store databases
rarely fail or undergo maintenance. Increasing the complexity of the
system to protect against their outages would be more likely to cause harm
than the outages themselves. Instead, just announce it on the blog beforehand
and apologize to anybody affected afterwards.
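For what it's worth, suggestion 2 above (the change cache) could be as simple as a write-behind queue in front of the store. A hypothetical sketch, with all names invented here and none of the hard parts solved:

```python
import collections

class BufferedStore:
    """Hypothetical write buffer: accept saves while the backend is
    read-only and replay them, in order, once it is writable again.
    A sketch of the idea only, not MediaWiki code."""

    def __init__(self, backend):
        self.backend = backend
        self.pending = collections.deque()

    def save(self, change):
        if self.backend.read_only():
            self.pending.append(change)   # defer the write, accept the edit
            return 'queued'
        self.flush()                      # replay anything queued earlier
        self.backend.write(change)
        return 'written'

    def flush(self):
        while self.pending and not self.backend.read_only():
            self.backend.write(self.pending.popleft())
```

The things this glosses over are exactly the "technical challenges" mentioned above: detecting conflicts between a queued edit and a later one, and honestly reporting a save that is accepted but not yet durable.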
I'm sure there are some more ideas on what we should do, as well as
opinions about these various options out there. Discuss! :) I haven't
filed a bug yet, but will do so if this conversation comes to some
consensus about a specific thing that should be done.
Thanks,
-ben
People who get mail from Bugzilla may have noticed some unusual
activity at around midnight UTC today. Someone created an account with
email address "tim.starling(a)rocketmail.com", using my name, and
proceeded to vandalise 88 bugs in 4 minutes. The vandalism consisted
of changes to random fields, such as status, component, CC, keywords,
dependencies, etc.
The scale of the problem was not immediately apparent since outgoing
email (and hence wikibugs IRC) was backlogged for about half an hour.
The vandalism has now been reverted. Some statuses and resolutions
were reverted by direct SQL queries. These reversions did not result
in a log entry or email.
The user's IP address was a Tor exit node. I blocked the IP address in
iptables, but when I found out it was an exit node, I also disabled
account creation entirely, so that we could stop the vandalism by user
account locks. It remains disabled for now.
About an hour before this incident, a user from a different Tor exit
node (jacob-craddy(a)mail.com) filed 21 new bugs in 40 seconds. All were
closed as "invalid".
Reverting hundreds of bug property changes was labour intensive. It
points to the need for better tools to deal with malicious behaviour
in Bugzilla. I looked into the possibility of writing an automated
revert tool as a command-line perl script integrated with Bugzilla,
but it looked like it would be fairly complicated:
* The bug history log has an irregular free form text format which is
not designed for automated reversions.
* The update interface (Bugzilla::Bug->set_*()) is not ideal for this
application; it mixes backend access methods, such as database field
name maps, with business logic, such as agreement between product and
component.
-- Tim Starling
We currently have a problem with host "singer", which did not come
back after an upgrade. :(
The following services are not accessible :/
secure.wikimedia.org (SSL proxy), ocs.wikimania2009.wikimedia.org,
contacts, outreachcivi, planet, racktables, secure, survey, wm09schols,
wm10reg, wm10schols.
http://wikitech.wikimedia.org/view/Singer
Will send another update... we are still looking at it.
--
Daniel Zahn <dzahn(a)wikimedia.org>
TO: All Wikimedia Project Administrators
As a follow-up to this original blog post about the new mobile site:
http://blog.wikimedia.org/2011/09/14/new-mobile-site-launched-on-wikipedia-…
Please note that the conversion to the new Mobile Frontend extension will
be completed in the week of November 28.
This means that by default any user of a mobile device will see the mobile
interface rather than the desktop version. Users will no longer have to add
the extra .m by hand.
This also means any home pages not yet designed for mobile viewing will
appear with a search bar only - unless you create a home page in the next
three weeks, which is very easy to do!
Instructions for creating a home page are here:
http://meta.wikimedia.org/wiki/Mobile_Projects/Mobile_Gateway#Mobile_homepa…
Please forward this email as necessary.
As always, the mobile-l and mobile-feedback-l mailing lists are available.
There will be many more announcements in the coming months on mobile-l, and
always feel free to send comments to mobile-feedback-l.
Thank you.
Phil
--
Phil Inje Chang
Product Manager, Mobile
Wikimedia Foundation
415-812-0854 m
415-882-7982 x 6810