As of r73971, ResourceLoader no longer works like a global static
object. This affects a bunch of internals, but more to the point of this
message, this affects the hook that some of you have started making use
of: "ResourceLoaderRegisterModules".
Instead of calling ResourceLoader::register() statically within your
hook, you will now need to accept &$resourceLoader as an argument and
call $resourceLoader->register().
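For illustration, an updated hook handler might look roughly like the
sketch below (the function name, module name and module definition are
just placeholders, not part of the actual change):

  // Hypothetical example; module name and definition are placeholders.
  $wgHooks['ResourceLoaderRegisterModules'][] = 'efMyExtRegisterModules';

  function efMyExtRegisterModules( &$resourceLoader ) {
      // Register against the instance passed in by the hook, instead of
      // calling ResourceLoader::register() statically.
      $resourceLoader->register( 'ext.myExt', array(
          'scripts' => array( 'ext.myExt.js' ),
          'localBasePath' => dirname( __FILE__ ),
      ) );
      return true;
  }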
The documentation in docs/hooks.txt has been updated in r73972 to
reflect this change.
- Trevor
Greetings All,
My name is Zak Greant. For the last few months, I've been working as a
contractor for the Wikimedia Foundation.
My focus is on improving the MediaWiki developer documentation and the
processes around it.
Later today, I will be running an IRC office hours session [0] to talk
about what I've been working on and to find out what people would like
to see from me.
The session will be quite informal – I'll provide a bit of background
on what I've been doing and why I've been doing it, and I'm hoping
that participants will share the issues and ideas that they have about
the developer documentation with me.
The session is scheduled for 04:00 to 05:00 UTC on Thursday. For some
of us (myself included) the session will be on Wednesday night. See
the list below to find the time of the session relative to a city near
you.
If you'd like to prepare for the session, reading the log from the
past session (which I neglected to promote) will help get you up to
speed:
http://meta.wikimedia.org/wiki/IRC_office_hours/Office_hours_2010-09-29
I hope that many of you can make it!
==Local Times==
San Fran. Wed 21:00 - 22:00
New York Thu 00:00 - 01:00
London Thu 05:00 - 06:00
Bern Thu 06:00 - 07:00
New Delhi Thu 09:30 - 10:30
Beijing Thu 12:00 - 13:00
Tokyo Thu 13:00 - 14:00
Canberra Thu 14:00 - 15:00
[0] As per http://meta.wikimedia.org/wiki/IRC_office_hours
--
Zak Greant (Wikimedia Foundation Contractor)
Plans, reports + logs at http://mediawiki.org/wiki/User:Zakgreant
Want to talk about the MediaWiki developer docs?
Catch me on irc://irc.freenode.net#wikimedia-office Wed. from
16:00-18:00 UTC & Thu. from 04:00-06:00 UTC
Dear all,
Emmanuel has organised our 3rd openZIM Developers Meeting in Strasbourg:
We will meet on October 15th in the evening at:
Hotel PAX
24-36 rue du Faubourg National
67000 Strasbourg, Alsace
OpenStreetMap:
http://www.openstreetmap.org/export/embed.html?bbox=7.73142,48.58123,7.7396…
The Hotel provides us with accommodation, breakfast, coffee breaks,
meeting room, video projector, flipchart and lunch until Sunday evening.
Dinner for Friday and Saturday will be organised spontaneously on-site.
The costs for all of this are covered by our openZIM budget, sponsored
by Wikimedia CH; only travel costs cannot be covered.
Please register on the wiki if you have not yet done so. Emmanuel has
already booked rooms for those who have registered, and he needs to know
soon how many additional rooms to book.
http://openzim.org/Developer_Meetings/2010-2
Please also have a look at the agenda and add anything you find missing.
See you soon in Strasbourg!
Manuel
--
Regards
Manuel Schneider
Wikimedia CH - Verein zur Förderung Freien Wissens
Wikimedia CH - Association for the advancement of free knowledge
www.wikimedia.ch
In response to recent comments in our code review tool about whether
some extensions should be merged into core MediaWiki or not, I would
like to try to initiate a productive conversation about this topic in
hopes that we can collaboratively define a set of guidelines for
evaluating when to move features out of core and into an extension, or
out of an extension and into core.
<unintended bias>
Arguments I have made/observed *against* merging things into core include:
1. Fewer developers have commit access to core, which pushes people
who would otherwise have been able to contribute directly to trunk
off into branches, inhibiting entry-level contribution.
2. Extensions encourage modularity and are easier to learn and work
on because they are smaller sets of code organized in discrete
bundles.
3. We should be looking to make core less inclusive, not more. The
line between Wikipedia and MediaWiki is already blurry enough as
it is.
Arguments I have made/observed *for* merging things into core include:
1. MediaWiki should be awesome out-of-the-box, so extensions that
would be good for virtually everyone seem silly to bury deep
within the poorly organized depths of the extensions folder.
2. When an extension is unable to do what it needs to do because it
depends on a limited set of hooks, none of which quite fits.
3. Because someone said so.
</unintended bias>
<obvious bias>
I will respond to these three pro-integration points, mostly because I
am generally biased against integration and would like to state why. I
realize that there are probably additional pro-integration points that
are far less biased than the three I've listed, but I am basing these on
arguments I've actually seen presented.
1. This is a very valid and important goal, but I am unconvinced that
merging extensions into core is the only way to achieve it. We
can, for instance, take advantage of the new installer that demon is
working on, which has the ability to automate the installation of
extensions at setup time.
2. This seems like a call for better APIs/a more robust set of hooks.
Integration for this sort of reason is more likely to introduce
cruft and noise than improve the software in any way.
3. Noting that "so-and-so said I should integrate this into core" is
not going to magically absolve anyone of having to stand behind
their decision to proceed with such an action and support it with
logic and reason.
</obvious bias>
If we are to develop guidelines for when to push things in/pull things
out of core, it's going to be important that we reach some general
consensus on the merits of these points. This mailing list is not
historically known for its efficacy in consensus building, but I still
feel like this conversation is going to be better off here than in a
series of disjointed code review comments.
- Trevor
Hi,
I have written a parser for MediaWiki syntax and have set up a test
site for it here:
http://libmwparser.kreablo.se/index.php/Libmwparsertest
and the source code is available here:
http://svn.wikimedia.org/svnroot/mediawiki/trunk/parsers/libmwparser
A preprocessor will take care of parser functions, magic words,
comment removal, and transclusion. But as it wasn't possible to
cleanly separate these functions from the existing preprocessor, some
preprocessing is disabled at the test site. It should be
straightforward to write a new preprocessor that provides only the required
functionality, however.
The parser is not feature complete, but the hard parts are solved. I
consider "the hard parts" to be:
* parsing apostrophes
* parsing html mixed with wikitext
* parsing headings and links
* parsing image links
And when I say "solved" I mean producing the same or equivalent output
as the original parser, as long as the behavior of the original parser
is well defined and produces valid HTML.
Here is a schematic overview of the design:
+-----------------------+
|                       |   Wikitext
| client application    +----------------------------------------+
|                       |                                        |
+-----------------------+                                        |
           ^                                                     |
           | Event stream                                        |
+----------+------------+        +-------------------------+     |
|                       |        |                         |     |
|    parser context     |<------>|         Parser          |     |
|                       |        |                         |     |
+-----------------------+        +-------------------------+     |
                                              ^                  |
                                              | Token stream     |
+-----------------------+        +------------+------------+     |
|                       |        |                         |     |
|    lexer context      |<------>|          Lexer          |<----+
|                       |        |                         |
+-----------------------+        +-------------------------+
The design is described in more detail in a series of posts on the
wikitext-l mailing list. The most important "trick" is to make sure
that the lexer never produces a spurious token. An end token for a
production will not appear unless the corresponding begin token has
already been produced, and the lexer maintains a block context to
only produce tokens that make sense in the current block.
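As a rough illustration of that discipline (a sketch only, not
libmwparser's actual code), one can think of the lexer as keeping a
stack of open productions and refusing to emit an end token whose
begin token was never produced:

  // Illustrative sketch only; not the actual libmwparser implementation.
  class OpenProductionTracker {
      private $open = array();

      public function beginToken( $production ) {
          array_push( $this->open, $production );
          return "BEGIN_$production";
      }

      public function endToken( $production ) {
          if ( in_array( $production, $this->open ) ) {
              // Also discard any productions left open inside this one.
              while ( array_pop( $this->open ) !== $production ) {
              }
              return "END_$production";
          }
          // No matching begin token was produced: emit plain text
          // instead of a spurious end token.
          return 'TEXT';
      }
  }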
I have used Antlr for generating both the parser and the lexer, as
Antlr supports semantic predicates that can be used for
context-sensitive parsing. I am also using a slightly patched version
of Antlr's C runtime environment, because the lexer needs to support
speculative execution in order to do context-sensitive lookahead.
A Swig-generated interface is used for providing the PHP API. The
parser processes the buffer of the PHP string directly, and writes its
output to an array of PHP strings. Only UTF-8 is supported at the
moment.
The performance seems to be about the same as for the original parser
on plain text. But with an increasing amount of markup, the original
parser runs slower. This new parser implementation maintains roughly
the same performance regardless of input.
I think that this demonstrates the feasibility of replacing the
MediaWiki parser. There is still a lot of work to do in order to turn
it into a full replacement, however.
Best regards,
Andreas
Hi everyone,
As many of you know, the results of the poll to keep Pending Changes
on through a short development cycle were approved for interim usage:
http://en.wikipedia.org/wiki/Wikipedia:Pending_changes/Straw_poll_on_interi…
Ongoing use of Pending Changes is contingent upon consensus after the
deployment of an interim release of Pending Changes in November 2010,
which is currently under development. The roadmap for this deployment
is described here:
http://www.mediawiki.org/wiki/Pending_Changes_enwiki_trial/Roadmap
An update on the date: we'd previously scheduled this for November 9.
However, because that week is the same week as the start of the
fundraiser (and accompanying futzing with the site), we'd like to move
the date one week later, to November 16.
Aaron Schulz is advising us as the author of the vast majority of the
code, having mostly implemented the "reject" button. Chad Horohoe and
Priyanka Dhanda are working on some of the short term development
items, and Brandon Harris is advising us on how we can make this
feature mesh with our long term usability strategy.
We're currently tracking the list of items we intend to complete in
Bugzilla. You can see the latest list here:
https://bugzilla.wikimedia.org/showdependencytree.cgi?id=25293
Many of the items in the list are things we're looking for feedback on:
Bug 25295 - "Improve reviewer experience when multiple simultaneous
users review Pending Changes"
https://bugzilla.wikimedia.org/show_bug.cgi?id=25295
Bug 25296 - "History style cleanup - investigate possible fixes and
detail the fixes"
https://bugzilla.wikimedia.org/show_bug.cgi?id=25296
Bug 25298 - "Figure out what (if any) new Pending Changes links there
should be in the side bar"
https://bugzilla.wikimedia.org/show_bug.cgi?id=25298
Bug 25299 - "Make pending revision status clearer when viewing page"
https://bugzilla.wikimedia.org/show_bug.cgi?id=25299
Bug 25300 - "Better names for special pages in Pending Changes configuration"
https://bugzilla.wikimedia.org/show_bug.cgi?id=25300
Bug 25301 - "Firm up the list of minor UI improvements for the
November 2010 Pending Changes release"
https://bugzilla.wikimedia.org/show_bug.cgi?id=25301
Please provide your input in Bugzilla if you're comfortable with that;
otherwise, please remark on the feedback page:
http://en.wikipedia.org/wiki/Wikipedia:Pending_changes/Feedback
Thanks!
Rob
Hey everybody, just a quick heads-up --
With Tim expected to be super busy in the following weeks, I'll be pitching
in an extra hour or two a day to help with code review and patch advice.
I've still got a pretty full plate over at StatusNet so I won't be available
all day, but what time I do have will be blocked out and dedicated to review
for MediaWiki.
My provisional 'code review office hours' today will be 7-9pm Pacific
(02:00-04:00 UTC); tomorrow I'll try balancing it with some morning time
which may be more accessible for some folks!
-- brion
Hi all,
just a quick status update: the dump is currently running at 2 req/s
and ignores all pages which have is_redirect set. I also changed the
storage method: the new files are appended to
/mnt/user-store/dewiki_static/articles.tar, as I noticed I was filling
up the inodes of the file system; storing everything inside a tarball
will prevent this, and I won't have to waste time downloading tons
of files to my PC, only one huge tarball when it's done.
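(For illustration only: from PHP, appending each page to that tar could
look roughly like the sketch below; the title and content are
placeholders.)

  // Illustrative sketch: append a downloaded page to one tar archive so
  // each article does not consume a separate inode.
  $title = 'Beispielartikel';      // placeholder page title
  $html  = '<html>...</html>';     // placeholder downloaded content
  $tar = new PharData( '/mnt/user-store/dewiki_static/articles.tar' );
  $tar->addFromString( "articles/$title.html", $html );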
I also managed to get a totally stripped-down version of the Vector
skin file loading an article via JSON (I won't release it now though,
it's a damn hack - nothing except loading works, as I have removed
every JS file... it should be pretty by Sunday).
The current dump position is at 92927; stripping out the redirects,
53171 articles have actually been downloaded, resulting in 770 MB of
uncompressed tar (I expect gzip or bz2 compression to save lots of
space though).
For the redirects: how do I get the redirect target page (maybe even
the #section)?
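(One possible approach, sketched here for illustration: ask the web API
to resolve redirects and read the reported target. The #section
fragment isn't returned this way and would have to be parsed out of the
redirect page's wikitext.)

  // Illustrative sketch: resolve a redirect via the MediaWiki web API,
  // using format=php (serialized PHP) for easy parsing.
  $title = 'Some_redirect_page';   // placeholder
  $url = 'http://de.wikipedia.org/w/api.php?action=query&redirects'
       . '&format=php&titles=' . urlencode( $title );
  $result = unserialize( file_get_contents( $url ) );
  if ( isset( $result['query']['redirects'] ) ) {
      foreach ( $result['query']['redirects'] as $r ) {
          echo "{$r['from']} -> {$r['to']}\n";
      }
  }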
Marco
PS: Are there any *fundamental* differences between the Vector skin
files of different languages apart from the localisation? Could this
maybe be converted to JavaScript, maybe $("#footer-info-lastmod").html("page
was last changed at foobar")?
--
VMSoft GbR
Nabburger Str. 15
81737 München
Geschäftsführer: Marco Schuster, Volker Hemmert
http://vmsoft-gbr.de
Playing the role of the average dumb user, it seems to me that
en.wikipedia.org is one of the rather slow websites among the many
websites I browse.
No matter what browser I use, it takes several seconds from the time I
click on a link to the time when the first bytes of the HTTP response
start flowing back to me.
Facebook seems more zippy.
Maybe MediaWiki is not "optimized".
I've recently put up a site that uses coordinate information from
Freebase and DBpedia, and I'm starting to think about how to clean up
certain data quality problems I'm encountering; for instance, see:
http://ookaboo.com/o/pictures/topic/209440/Oakville_Assembly
In this particular case, I've only got data from DBpedia, which drops
the point a few hundred km from where it really is... It's obvious that
this is a bad one because it's right in the middle of Lake Erie.
Freebase doesn't have any coordinate for this thing (it seems to me that
it should), and at the moment, Wikipedia has the right coordinates (at
least on Google Maps I see a big factory building). My guess is that
Wikipedia might have been wrong at one time, and has had it corrected.
It's also possible that the conversion wasn't done right in DBpedia,
since coordinates are represented differently in a few hundred different
infoboxes.
It seems to me that both the number of points and the quality of points
in Wikipedia have been improving dramatically over the last two years...
About a year ago I plotted the points for Staten Island Railroad
stations and found that the railroad was displaced a few km east and ran
right under the middle of the Tappan Zee Bridge... Now it's much better.
I can find examples where:
(a) DBpedia is right and Freebase is wrong (for instance, a town in
continental Europe gets its longitude sign flipped and ends up with the
wrecked ships west of the UK -- maybe here the point got fixed in
Wikipedia but not in Freebase);
(b) DBpedia is wrong and Freebase is right;
(c) a point is missing from DBpedia but is in Freebase (I see a lot of
these in Switzerland); and
(d) a point is missing from Freebase but is in DBpedia.
An analysis of this is tricky because there are a lot of things where
the coordinates are iffy: the location of 'Russia' could vary within a
few thousand kilometers, 'Tompkins County' could vary by ten or so
kilometers, etc.
Looking at a handful of points that have diverged, I get the impression
that Freebase is more accurate than DBpedia, but that I get better
results just looking at the coordinates in the human interface of
Wikipedia. Currently, it seems like a scan of the current coordinates
in Wikipedia (however Wikipedia extracts them from the infoboxes)
benefits the most from the human labor being done to fix points, and
also avoids errors and missed points from other people's extraction
pipelines.
From my viewpoint, I'd like to make a map that doesn't have
embarrassing errors in it... What's the best way to clean up this mess?