Duesentrieb checked in RDFa support for MediaWiki in r58712:
http://www.mediawiki.org/wiki/Special:Code/MediaWiki/58712
I discussed this with him at some length, and Tim commented on how it
ties into the parser. I'd like to discuss this a bit more broadly
because we're talking about extending wikitext -- whatever markup we
allow on Wikipedia (and in this case, particularly on Commons) at the
next scap is probably going to have to be allowed forever by default
in MediaWiki, because everyone will start using it and pages will
break if we disable it.
RDFa is a way to embed data in HTML more robustly than with attributes
like class and title, which are reserved for author use or have
existing functionality. It allows you to specify an external
vocabulary that adds some semantics to your page that HTML is not
capable of expressing by itself. RDFa is based on the RDF standard,
and is relatively old. Microdata is a newer competing standard,
created last year as part of HTML5, that aims to be much simpler to
use.
The major use case we have is marking up Commons image licenses.
Either RDFa or Microdata could allow machines to more easily tell what
licenses the images we use are under. But in the long term, it seems
likely that only one of these technologies will win, and the other
will die. We don't want to have to support the loser forever. So IMO
we should choose the better one and go with that alone.
Now, which to choose? RDFa is better-established, and the W3C is
still attached to it, but Microdata has much greater support among the
parties that matter, including Google, Mozilla, Apple, and Opera (as
judged from discussions in the WHATWG and W3C). It's a lot more
concise and simpler to use, is better integrated into HTML, and can
represent any semantics we'd want. At the bottom of this post is an
example exhibiting how much simpler microdata is. Both RDFa+HTML and
Microdata are Working Drafts at the W3C right now, although RDFa in
XHTML1 (which we won't be using for much longer) is a Recommendation.
I should note that currently Google and a couple of others support
RDFa but not Microdata. But come on -- we're Wikipedia. Google
already screen-scrapes our templates to figure out what licenses we
use anyway; parsing microdata has got to be easier. We shouldn't let
existing market share deter us from picking the better technology.
My personal opinion on this is that we should enable Microdata by
default (which is much less intrusive than enabling RDFa -- just
whitelist a few extra attributes) and encourage Commons to use that
instead of RDFa. We can leave RDFa support in, but disabled by
default. What does everyone else think?
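(For reference, if I'm reading the current draft right, the entire
microdata vocabulary is five global attributes:

itemscope, itemtype, itemprop, itemid, itemref

so "enable Microdata" should amount to letting the sanitizer pass
those through on the elements we already allow.)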
== Example of RDFa vs. Microdata ==
Suppose we have the following markup right now:
[[
<div id="bodyContent">
...
<img src="http://upload.wikimedia.org/wikipedia/commons/e/ef/EmeryMolyneux-terrestria…"
width="640" height="480">
...
<p>EmeryMolyneux-terrestrialglobe-1592-20061127.jpg by Bob Smith is
licensed under a <a
href="http://creativecommons.org/licenses/by-sa/3.0/us/">Creative
Commons Attribution-Share Alike 3.0 United States License</a>.</p>
]]
Sample RDFa code to say an image is under a CC-BY-SA 3.0 license seems
to be something like this, based on the license generator on the CC
website:
[[
<div id="bodyContent">
...
<img src="http://upload.wikimedia.org/wikipedia/commons/e/ef/EmeryMolyneux-terrestria…"
width="640" height="480" id="mw-image">
...
<p><span xmlns:dc="http://purl.org/dc/elements/1.1/"
href="http://purl.org/dc/dcmitype/StillImage" property="dc:title"
rel="dc:type">EmeryMolyneux-terrestrialglobe-1592-20061127.jpg</span>
by <span xmlns:cc="http://creativecommons.org/ns#" href="#mw-image"
property="cc:attributionName" rel="cc:attributionURL">Bob Smith</span>
is licensed under a <a rel="license"
href="http://creativecommons.org/licenses/by-sa/3.0/us/">Creative
Commons Attribution-Share Alike 3.0 United States License</a>.</p>
]]
This adds an id to the image, rel="license" to the license link, and
two extra tags with lots of lengthy attributes. To be valid RDFa, we
would need to add further markup somewhere -- at least a version
attribute on the <html> element of every page, AFAIK. Equivalent
microdata is this:
<div id="bodyContent" itemscope="" itemtype="http://n.whatwg.org/work">
...
<img src="http://upload.wikimedia.org/wikipedia/commons/e/ef/EmeryMolyneux-terrestria…"
width="640" height="480" itemprop="work">
...
<p><span itemprop="title">EmeryMolyneux-terrestrialglobe-1592-20061127.jpg</span>
by <span itemprop="author">Bob Smith</span> is licensed under a <a
itemprop="license"
href="http://creativecommons.org/licenses/by-sa/3.0/us/">Creative
Commons Attribution-Share Alike 3.0 United States License</a>.</p>
]]
This adds two attributes to an ancestor to indicate that the contents
form a work -- these could be moved to lower elements if desired,
AFAICT, but then they'd have to be duplicated. Instead of adding an
id to the <img>, it uses itemprop="work" to directly say it's the work
being referred to. Instead of <span
xmlns:dc="http://purl.org/dc/elements/1.1/"
href="http://purl.org/dc/dcmitype/StillImage" property="dc:title"
rel="dc:type">, we have <span itemprop="title">. Instead of <span
xmlns:cc="http://creativecommons.org/ns#" href="#mw-image"
property="cc:attributionName" rel="cc:attributionURL">, we have <span
itemprop="author">.
Overall, I think it's clear from this example that microdata is much
more concise and also more coherent. It's easy to see exactly how
the microdata model works: you have a bunch of stuff grouped as an
item using itemscope, itemtype tells you what type of item it is, and
then itemprop tells you what role each piece has. It's barely longer
than the un-annotated markup. RDFa, by contrast, is a mess of
boilerplate that's impossible to understand unless you actually read
the specs. Microdata's syntax has actually been refined by a
usability study that Google ran.
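As a rough illustration of how easy the consuming side is, here's a
sketch in plain DOM calls (untested, and assuming exactly the markup
above; a real consumer would follow the microdata model properly,
nested items, itemref and all):

// Pull the work's title, author and license out of the example markup.
const work = document.querySelector("[itemscope]");
const title = work?.querySelector('[itemprop="title"]')?.textContent;
const author = work?.querySelector('[itemprop="author"]')?.textContent;
const license =
  work?.querySelector<HTMLAnchorElement>('[itemprop="license"]')?.href;
console.log(title, author, license);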
Would it be possible to generate a log or statistics of searches on
Wikipedia using the "Go" button that did not immediately reach an article?
Properly anonymized of course. I think it would be useful for finding
missing articles and redirects to create. There would be a lot of crap of
course, but probably also very useful information on what people have
trouble finding.
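To make that concrete, here is a sketch of the aggregation I have in
mind (the log format is a made-up placeholder: one query<TAB>hit pair
per line, with hit=1 meaning the Go button landed on an article):

// Tally the "Go" searches that missed, most frequent first.
import { readFileSync } from "fs";

const counts = new Map<string, number>();
for (const line of readFileSync("go-searches.log", "utf8").split("\n")) {
  const [query, hit] = line.split("\t");
  if (query && hit === "0") counts.set(query, (counts.get(query) ?? 0) + 1);
}
const top = [...counts.entries()].sort((a, b) => b[1] - a[1]).slice(0, 100);
for (const [query, n] of top) console.log(n, query);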
On 10/26/09 10:05 AM, Siebrand Mazeland wrote:
> P: I would like to try and make a proposal with other CMS projects (like
> Joomla, Drupal, Typo3, Wordpress, etc.) for a dev room. Reason for not
> applying for a "MediaWiki dev room" is that I expect that this will not be
> honored because it has too tight a scope. I spoke to one of the people of the
> program committee last year, and I was advised to find a broader scope.
> Alternatively we could request a dev room with other wiki engines (TikiWiki,
> DokuWiki, etc.). Personally I have no preference on which projects we would
> cooperate with, just as long as we will make a proposal that will get us the
> best chance to have a presence at FOSDEM 2010. Open to suggestions...
There's been some talk of trying to pair with OpenStreetMap as well;
has anybody attempted to contact any partner yet?
-- brion
%% No multiple edits on the same page %%
I think the edit page should be smarter. If a user opens the same
page twice, the second time they should be warned that the page is
already open. This may need some cross-window communication, which is
not something browsers love to do, but I guess it is possible with
DOMStorage/cookies or something else.
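A rough sketch of how that could look with DOMStorage (the key scheme
and the timings are made up):

// Warn if the same page is already open for editing in another window.
// A localStorage "lock" is refreshed while the edit page is open and
// treated as stale after 30 seconds.
const lockKey = "editlock:" + encodeURIComponent(location.pathname);
const last = Number(localStorage.getItem(lockKey) ?? "0");
if (last && Date.now() - last < 30_000) {
  alert("This page seems to be open for editing in another window.");
}
localStorage.setItem(lockKey, String(Date.now()));
const timer = setInterval(
  () => localStorage.setItem(lockKey, String(Date.now())),
  10_000
);
window.addEventListener("unload", () => {
  clearInterval(timer);
  localStorage.removeItem(lockKey);
});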
%% Autosave feature for edit %%
Again, in this day and age, losing data because of a computer crash is
a problem that should never occur. The edit page should save the
latest version of the edited text to a local persistent area
(DOMStorage?). That way, I can 'accidentally close' the edit page,
and when I reopen the edit, the page should detect "Hey, the page is
on the same revision, so my edited version is interesting" and let the
user continue editing: "You closed the edit page without saving. Do
you want to continue with the old version?" Somewhat like the
"drafts" feature in Gmail, or the autosave feature of OpenOffice.
%% Edit any file format in the browser when web editors for it are
available %%
When possible, documents should be editable inside the browser. This
means, as far as possible, adding an SVG editor, a math editor, an
HTML editor, a DOC editor or a PNG editor. "The PNG is wrong ->
download -> edit -> upload" is lame. "The PNG is wrong -> edit ->
save" is cool. The current edit page only supports wikicode :-(
%% Un-cruft mode %%
There seems to be a lot of "META" information on the wiki; all this
meta-information like "stub", etc. should be optional. There should
be a single checkbox option to disable it all. I don't want to read
even one more "citation needed" or "this is a stub" bloat. Maybe the
default should be to "show cruft".
%% Wikipedia Green, Blue and Orange %%
A way to fight deletionism could be to have something like "different
levels" on Wikipedia (a wiki). Put a group of pages in the "Blue
Book", for pages with a maintainer and pages approved by a superior
quality committee. Put a group of pages in the "Green Book" for pages
that meet the notability criteria, and in the "Yellow Book" for pages
that don't. Deletionism is binary; computers can work with more
values than 1 and 0. Hell, you could make the Green Book and Yellow
Book invisible to logged-out users, only available to logged-in users.
For this to work, templates should use different colors for the
different books. Colors would also account for quality. The German
wiki would be mostly blue, while others would have more green pages.
There would be a wiki with 90,000 green and 10,000 blue, and another
one with 10,000 green and 90,000 blue.
%% The Death of Wiki %%
Ultimately, all wikis lose the war against entropy and are abandoned.
This will hit all Wikipedia wikis, and everything based on MediaWiki.
While you can't stop that, you can code something so the resulting
dead body of the wiki is not pure shit. A possible idea could be to
"auto-protect" pages without an edit in N years (say 4 years), and
save the id of the "last known good version". This could be done by
flagging a page as "dirty" after an edit, and flagging it as "clean"
when a logged-in editor says so. Something like "page milestones".
Maybe the "History" view of a page should list the last 10 milestones
first (newest first), then the "dump" of all edits. There are
probably more than 10 years to think about this issue.
Hi,
As you may know, there are currently two entry points in MediaWiki for
JavaScript that wants to perform certain actions: action=ajax and
api.php. Only the following features still use action=ajax: AJAX
watch, upload license preview and upload warnings check. I don't
really see much point in two entry points where one would suffice.
These could all be readily migrated to the API (for the watch case,
something like the sketch below). However, this would mean that they
become unavailable if the API is disabled. Would that be considered a
problem?
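For illustration, the watch case through api.php could look something
like this (token fetching and error handling elided; action=watch is
the relevant module):

// Watch a page via the API instead of action=ajax.
async function watchPage(title: string, token: string): Promise<unknown> {
  const body = new URLSearchParams({ action: "watch", format: "json", title, token });
  const resp = await fetch("/w/api.php", { method: "POST", body });
  return resp.json();
}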
Bryan
Bawolff: various Wikinews-related extensions
Jonathan Williford: extensions developed for http://neurov.is/on
Ning Hu: Semantic NotifyMe
Rob Lanphier and Conrad Irwin have been added to the core committer group.
-- Tim Starling
I was wondering if anybody would have the time to create some sort of application status dashboard, similar to the ones found at Google (http://www.google.com/appsstatus#hl=en) or Amazon (http://status.aws.amazon.com/).
Essentially, something that acts like a simplified external-facing blog, where people could update the different pieces as problems are detected. Eventually, it would be nice to tie it into our various monitoring software, but to begin with it would be very convenient for our partners to be able to see the overall status of our different layers. For instance, if someone detects that m.wikipedia.org is not working, the first step would be to update said dashboard to inform the less technically savvy people, who are not necessarily on IRC, that there is a problem and that someone is looking at it / in the process of fixing it.
Another important feature would be to keep history on problems that have happened in the past, much like Google does. I know this is already done to some extent with the server admin log, but having an easy-to-read interface would in my opinion prove beneficial.
Anyway, any suggestions on additional features or requirements are welcome.
--Fred.
Hi all,
As part of the Wikimedia strategic planning process, we're trying to
get a sense of the current state of Wikimedia ops and MediaWiki. You
can see some claims on the strategy wiki right now:
http://strategy.wikimedia.org/wiki/Wikimedia_technology_infrastructure
http://strategy.wikimedia.org/wiki/MediaWiki
(Note: These are not my personal opinions; I just copied them over
from other places.) Would love to get people's feedback here. If you
could edit these pages with your thoughts on the current state of both
Wikimedia ops and MediaWiki, that would be great. Analysis and links
to existing docs would be much appreciated.
This information will help Wikimedia make good recommendations as to
what to invest in to improve these things.
Thanks!
=Eugene
--
======================================================================
Eugene Eric Kim ................................ http://xri.net/=eekim
Blue Oxen Associates ........................ http://www.blueoxen.com/
======================================================================
The procedure Brion used for the last couple of wmf-deployment updates
was:
* Create a new branch wmf-deployment-<date>, copied from trunk
* Spend a day or two merging in all the WMF-specific hacks and merging
out anything that's experimental or buggy
* Delete wmf-deployment
* Move wmf-deployment-<date> to wmf-deployment
I'm considering instead creating a permanent numbered branch for each
wmf-deployment major update. To deploy a major update, we would use
svn switch to change the checked-out directory. Then subsequent minor
updates would be merged to the most recent numbered branch.
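Concretely, a major update would look something like this (the
repository paths and branch name are illustrative):

# Branch a new numbered deployment branch from trunk.
svn copy $REPO/trunk/phase3 $REPO/branches/wmf/1.16wmfN \
    -m "Branch 1.16wmfN for deployment"
# ...merge the WMF hacks in, merge unready stuff out, then repoint
# the live checkout:
cd /path/to/live/checkout
svn switch $REPO/branches/wmf/1.16wmfN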
The main advantage would be that the logs would be easier to navigate.
Currently, if you want to know what happened in the wmf-deployment
branch before the last major update, you have to specify path
revisions to svn log, which is tedious and potentially confusing.
Navigating the history in viewvc is also difficult.
Another advantage is that we'd have a major version number that we can
say Wikimedia is running on. We had three deployments of 1.16 and
there may be a fourth before we branch it and start on 1.17, but these
branch points are unnamed. Instead we could call them 1.16wmf1,
1.16wmf2 and 1.16wmf3 in Special:Version, and we could keep track of
when those updates were deployed, so that users would have a better
idea of how to talk about the software that we're running.
I've used svn switch a few times in other situations, and haven't
encountered any problems with it. As long as there are no uncommitted
patches in the live working copy, it should go ahead without a hitch.
Any thoughts on that?
-- Tim Starling