As of r73971, ResourceLoader no longer works like a global static
object. This affects a bunch of internals, but more to the point of this
message, this affects the hook that some of you have started making use
of: "ResourceLoaderRegisterModules".
Instead of calling ResourceLoader::register() statically within your
hook, you will now need to accept &$resourceLoader as an argument and
call $resourceLoader->register().
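For illustration, an updated hook handler might look roughly like the
sketch below (the function name, module name and module definition are
just placeholders, not part of the actual change):

  // Hypothetical example; module name and definition are placeholders.
  $wgHooks['ResourceLoaderRegisterModules'][] = 'efMyExtRegisterModules';

  function efMyExtRegisterModules( &$resourceLoader ) {
      // Register against the instance passed in by the hook, instead of
      // calling ResourceLoader::register() statically.
      $resourceLoader->register( 'ext.myExt', array(
          'scripts' => array( 'ext.myExt.js' ),
          'localBasePath' => dirname( __FILE__ ),
      ) );
      return true;
  }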
The documentation in docs/hooks.txt has been updated in r73972 to
reflect this change.
- Trevor
Greetings All,
My name is Zak Greant. For the last few months, I've been working as a
contractor for the Wikimedia Foundation.
My focus is on improving the MediaWiki developer documentation and the
processes around it.
Later today, I will be running an IRC office hours session [0] to talk
about what I've been working on and to find out what people would like
to see from me.
The session will be quite informal – I'll provide a bit of background
on what I've been doing and why I've been doing it, and I'm hoping
that participants will share the issues and ideas that they have about
the developer documentation with me.
The session is scheduled for 04:00 to 05:00 UTC on Thursday. For some
of us (myself included) the session will be on Wednesday night. See
the list below to find the time of the session relative to a city near
you.
If you'd like to prepare for the session, reading the log from the
past session (which I neglected to promote) will help get you up to
speed:
http://meta.wikimedia.org/wiki/IRC_office_hours/Office_hours_2010-09-29
I hope that many of you can make it!
==Local Times==
San Fran. Wed 21:00 - 22:00
New York Thu 00:00 - 01:00
London Thu 05:00 - 06:00
Bern Thu 06:00 - 07:00
New Delhi Thu 09:30 - 10:30
Beijing Thu 12:00 - 13:00
Tokyo Thu 13:00 - 14:00
Canberra Thu 14:00 - 15:00
[0] As per http://meta.wikimedia.org/wiki/IRC_office_hours
--
Zak Greant (Wikimedia Foundation Contractor)
Plans, reports + logs at http://mediawiki.org/wiki/User:Zakgreant
Want to talk about the MediaWiki developer docs?
Catch me on irc://irc.freenode.net#wikimedia-office Wed. from
16:00-18:00 UTC & Thu. from 04:00-06:00 UTC
Dear all,
Emmanuel has organised our 3rd openZIM Developers Meeting in Strasbourg:
We will meet on October 15th in the evening at:
Hotel PAX
24-36 rue du Faubourg National
67000 Strasbourg, Alsace
OpenStreetMap:
http://www.openstreetmap.org/export/embed.html?bbox=7.73142,48.58123,7.7396…
The Hotel provides us with accommodation, breakfast, coffee breaks,
meeting room, video projector, flipchart and lunch until Sunday evening.
Dinner for Friday and Saturday will be organised spontaneously on-site.
The costs for all of this are covered by our openZIM budget, sponsored
by Wikimedia CH; only travel costs cannot be covered.
Please register on the wiki if you have not yet done so. Emmanuel has
already booked rooms for those who have registered, and he needs to know
soon how many additional rooms to book.
http://openzim.org/Developer_Meetings/2010-2
Please also have a look at the agenda and add anything you find missing.
See you soon in Strasbourg!
Manuel
--
Regards
Manuel Schneider
Wikimedia CH - Verein zur Förderung Freien Wissens
Wikimedia CH - Association for the advancement of free knowledge
www.wikimedia.ch
In response to recent comments in our code review tool about whether
some extensions should be merged into core MediaWiki or not, I would
like to try to initiate a productive conversation about this topic in
hopes that we can collaboratively define a set of guidelines for
evaluating when to move features out of core and into an extension, or
out of an extension and into core.
<unintended bias>
Arguments I have made/observed *against* merging things into core include:
1. Fewer developers have commit access to core, which pushes people
who would otherwise have been able to contribute directly to trunk
off into branches, inhibiting entry-level contribution.
2. Extensions encourage modularity and are easier to learn and work
on because they are smaller sets of code organized in discrete
bundles.
3. We should be looking to make core less inclusive, not more. The
line between Wikipedia and MediaWiki is already blurry enough as
it is.
Arguments I have made/observed *for* merging things into core include:
1. MediaWiki should be awesome out-of-the-box, so extensions that
would be good for virtually everyone seem silly to bury deep
within the poorly organized depths of the extensions folder.
2. When an extension is unable to do what it needs to do because it
depends on a limited set of hooks, none of which quite fits.
3. Because someone said so.
</unintended bias>
<obvious bias>
I will respond to these three pro-integration points, mostly because I
am generally biased against integration and would like to state why. I
realize that there are probably additional pro-integration points that
are far less biased than the three I've listed, but I am basing these on
arguments I've actually seen presented.
1. This is a very valid and important goal, but I am unconvinced that
merging extensions into core is the only way to achieve it. We
can, for instance, take advantage of the new installer that demon is
working on, which has the ability to automate the installation of
extensions at setup time.
2. This seems like a call for better APIs/a more robust set of hooks.
Integration for this sort of reason is more likely to introduce
cruft and noise than improve the software in any way.
3. Noting that "so-and-so said I should integrate this into core" is
not going to magically absolve anyone of having to stand behind
their decision to proceed with such an action and support it with
logic and reason.
</obvious bias>
If we are to develop guidelines for when to push things in/pull things
out of core, it's going to be important that we reach some general
consensus on the merits of these points. This mailing list is not
historically known for its efficacy in consensus building, but I still
feel like this conversation is going to be better off here than in a
series of disjointed code review comments.
- Trevor
Hi,
I have written a parser for MediaWiki syntax and have set up a test
site for it here:
http://libmwparser.kreablo.se/index.php/Libmwparsertest
and the source code is available here:
http://svn.wikimedia.org/svnroot/mediawiki/trunk/parsers/libmwparser
A preprocessor will take care of parser functions, magic words,
comment removal, and transclusion. But as it wasn't possible to
cleanly separate these functions from the existing preprocessor, some
preprocessing is disabled at the test site. It should be
straightforward to write a new preprocessor that provides only the required
functionality, however.
The parser is not feature complete, but the hard parts are solved. I
consider "the hard parts" to be:
* parsing apostrophes
* parsing html mixed with wikitext
* parsing headings and links
* parsing image links
And when I say "solved" I mean producing the same or equivalent output
as the original parser, as long as the behavior of the original parser
is well defined and produces valid HTML.
Here is a schematic overview of the design:
+-----------------------+
|                       |   Wikitext
| client application    +----------------------------------------+
|                       |                                        |
+-----------------------+                                        |
           ^                                                     |
           | Event stream                                        |
+----------+------------+        +-------------------------+     |
|                       |        |                         |     |
|    parser context     |<------>|         Parser          |     |
|                       |        |                         |     |
+-----------------------+        +-------------------------+     |
                                              ^                  |
                                              | Token stream     |
+-----------------------+        +------------+------------+     |
|                       |        |                         |     |
|    lexer context      |<------>|          Lexer          |<----+
|                       |        |                         |
+-----------------------+        +-------------------------+
The design is described in more detail in a series of posts on the
wikitext-l mailing list. The most important "trick" is to make sure
that the lexer never produces a spurious token. An end token for a
production will not appear unless the corresponding begin token has
already been produced, and the lexer maintains a block context to
only produce tokens that make sense in the current block.
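As a rough illustration of that discipline (a sketch only, not
libmwparser's actual code), one can think of the lexer as keeping a
stack of open productions and refusing to emit an end token whose
begin token was never produced:

  // Illustrative sketch only; not the actual libmwparser implementation.
  class OpenProductionTracker {
      private $open = array();

      public function beginToken( $production ) {
          array_push( $this->open, $production );
          return "BEGIN_$production";
      }

      public function endToken( $production ) {
          if ( in_array( $production, $this->open ) ) {
              // Also discard any productions left open inside this one.
              while ( array_pop( $this->open ) !== $production ) {
              }
              return "END_$production";
          }
          // No matching begin token was produced: emit plain text
          // instead of a spurious end token.
          return 'TEXT';
      }
  }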
I have used Antlr for generating both the parser and the lexer, as
Antlr supports semantic predicates that can be used for
context-sensitive parsing. I am also using a slightly patched version
of Antlr's C runtime environment, because the lexer needs to support
speculative execution in order to do context-sensitive lookahead.
A Swig-generated interface is used for providing the PHP API. The
parser processes the buffer of the PHP string directly, and writes its
output to an array of PHP strings. Only UTF-8 is supported at the
moment.
The performance seems to be about the same as for the original parser
on plain text. But with an increasing amount of markup, the original
parser runs slower. This new parser implementation maintains roughly
the same performance regardless of input.
I think that this demonstrates the feasibility of replacing the
MediaWiki parser. There is still a lot of work to do in order to turn
it into a full replacement, however.
Best regards,
Andreas
Hi everyone,
As many of you know, the results of the poll to keep Pending Changes
on through a short development cycle were approved for interim usage:
http://en.wikipedia.org/wiki/Wikipedia:Pending_changes/Straw_poll_on_interi…
Ongoing use of Pending Changes is contingent upon consensus after the
deployment of an interim release of Pending Changes in November 2010,
which is currently under development. The roadmap for this deployment
is described here:
http://www.mediawiki.org/wiki/Pending_Changes_enwiki_trial/Roadmap
An update on the date: we'd previously scheduled this for November 9.
However, because that week is the same week as the start of the
fundraiser (and accompanying futzing with the site), we'd like to move
the date one week later, to November 16.
Aaron Schulz is advising us as the author of the vast majority of the
code, having mostly implemented the "reject" button. Chad Horohoe and
Priyanka Dhanda are working on some of the short term development
items, and Brandon Harris is advising us on how we can make this
feature mesh with our long term usability strategy.
We're currently tracking the list of items we intend to complete in
Bugzilla. You can see the latest list here:
https://bugzilla.wikimedia.org/showdependencytree.cgi?id=25293
Many of the items in the list are things we're looking for feedback on:
Bug 25295 - "Improve reviewer experience when multiple simultaneous
users review Pending Changes"
https://bugzilla.wikimedia.org/show_bug.cgi?id=25295
Bug 25296 - "History style cleanup - investigate possible fixes and
detail the fixes"
https://bugzilla.wikimedia.org/show_bug.cgi?id=25296
Bug 25298 - "Figure out what (if any) new Pending Changes links there
should be in the side bar"
https://bugzilla.wikimedia.org/show_bug.cgi?id=25298
Bug 25299 - "Make pending revision status clearer when viewing page"
https://bugzilla.wikimedia.org/show_bug.cgi?id=25299
Bug 25300 - "Better names for special pages in Pending Changes configuration"
https://bugzilla.wikimedia.org/show_bug.cgi?id=25300
Bug 25301 - "Firm up the list of minor UI improvements for the
November 2010 Pending Changes release"
https://bugzilla.wikimedia.org/show_bug.cgi?id=25301
Please provide your input in Bugzilla if you're comfortable with that;
otherwise, please remark on the feedback page:
http://en.wikipedia.org/wiki/Wikipedia:Pending_changes/Feedback
Thanks!
Rob
Hey everybody, just a quick heads-up --
With Tim expected to be super busy in the following weeks, I'll be pitching
in an extra hour or two a day to help with code review and patch advice.
I've still got a pretty full plate over at StatusNet so I won't be available
all day, but what time I do have will be blocked out and dedicated to review
for MediaWiki.
My provisional 'code review office hours' today will be 7-9pm Pacific
(02:00-04:00 UTC); tomorrow I'll try balancing it with some morning time
which may be more accessible for some folks!
-- brion
Hi all,
just a quick status update: the dump is currently running at 2 req/s
and ignores all pages which have is_redirect set. I also changed the
storage method: the new files are appended to
/mnt/user-store/dewiki_static/articles.tar, as I noticed I was filling
up the inodes of the file system; storing everything inside a tarball
will prevent this, and I won't have to waste time downloading tons
of files to my PC, only one huge tarball when it's done.
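(For illustration only: from PHP, appending each page to that tar could
look roughly like the sketch below; the title and content are
placeholders.)

  // Illustrative sketch: append a downloaded page to one tar archive so
  // each article does not consume a separate inode.
  $title = 'Beispielartikel';      // placeholder page title
  $html  = '<html>...</html>';     // placeholder downloaded content
  $tar = new PharData( '/mnt/user-store/dewiki_static/articles.tar' );
  $tar->addFromString( "articles/$title.html", $html );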
I also managed to get a totally stripped-down version of the Vector
skin file loading an article via JSON (I won't release it now though,
it's a damn hack - nothing except loading works, as I have removed
every JS file... it should be pretty by Sunday).
The current dump position is at 92927; stripping out the redirects,
53171 articles have actually been downloaded, resulting in 770 MB of
uncompressed tar (I expect gzip or bz2 compression to save lots of
space though).
For the redirects: how do I get the redirect target page (maybe even
the #section)?
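(One possible approach, sketched here for illustration: ask the web API
to resolve redirects and read the reported target. The #section
fragment isn't returned this way and would have to be parsed out of the
redirect page's wikitext.)

  // Illustrative sketch: resolve a redirect via the MediaWiki web API,
  // using format=php (serialized PHP) for easy parsing.
  $title = 'Some_redirect_page';   // placeholder
  $url = 'http://de.wikipedia.org/w/api.php?action=query&redirects'
       . '&format=php&titles=' . urlencode( $title );
  $result = unserialize( file_get_contents( $url ) );
  if ( isset( $result['query']['redirects'] ) ) {
      foreach ( $result['query']['redirects'] as $r ) {
          echo "{$r['from']} -> {$r['to']}\n";
      }
  }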
Marco
PS: Are there any *fundamental* differences between the Vector skin
files of different languages apart from the localisation? Could this
maybe be converted to JavaScript, maybe $("#footer-info-lastmod").html("page
was last changed at foobar")?
--
VMSoft GbR
Nabburger Str. 15
81737 München
Geschäftsführer: Marco Schuster, Volker Hemmert
http://vmsoft-gbr.de
Playing the role of the average dumb user, it seems to me that
en.wikipedia.org is one of the rather slow websites among the many
websites I browse.
No matter what browser I use, it takes several seconds from the time I
click on a link to the time when the first bytes of the HTTP response
start flowing back to me.
Facebook seems more zippy.
Maybe MediaWiki is not "optimized".
I've recently put up a site that uses coordinate information from
Freebase and DBpedia, and I'm starting to think about how to clean up
certain data quality problems I'm encountering; for instance, see:
http://ookaboo.com/o/pictures/topic/209440/Oakville_Assembly
In this particular case, I've only got data from DBpedia, which drops
the point a few hundred km from where it really is... It's obvious that
this is a bad one because it's right in the middle of Lake Erie.
Freebase doesn't have any coordinate for this thing (it seems to me that
it should), and at the moment, Wikipedia has the right coordinates (at
least on Google Maps I see a big factory building). My guess is that
Wikipedia might have been wrong at one time, and has had it corrected.
It's also possible that the conversion wasn't done right in DBpedia,
since coordinates are represented differently in a few hundred different
infoboxes.
It seems to me that both the number of points and the quality of points
in Wikipedia have been improving dramatically over the last two years...
About a year ago I plotted the points for Staten Island Railroad
stations and found that the railroad was displaced a few km east and ran
right under the middle of the Tappan Zee Bridge... Now it's much better.
I can find examples where:
(a) DBpedia is right and Freebase is wrong (for instance, a town in
continental Europe gets its longitude sign flipped and ends up with the
wrecked ships west of the UK -- maybe here the point got fixed in
Wikipedia but not in Freebase);
(b) DBpedia is wrong and Freebase is right;
(c) a point is missing from DBpedia but is in Freebase (I see a lot of
these in Switzerland); and
(d) a point is missing from Freebase but is in DBpedia.
An analysis of this is tricky because there are a lot of things where
the coordinates are iffy: the location of 'Russia' could vary within a
few thousand kilometers, 'Tompkins County' could vary by ten or so
kilometers, etc.
Looking at a handful of points that have diverged, I get the impression
that Freebase is more accurate than DBpedia, but that I get better
results just looking at the coordinates in the human interface of
Wikipedia. Currently, it seems like a scan of the current coordinates
in Wikipedia (however Wikipedia extracts them from the infoboxes)
benefits the most from the human labor being done to fix points, and
also avoids errors and missed points from other people's extraction
pipelines.
From my viewpoint, I'd like to make a map that doesn't have
embarrassing errors in it... What's the best way to clean up this mess?