I'm jumping back into the abstract schema project, and was
wondering what the workflow is for something like this. Should
I simply create a new branch and have everybody (assuming other
people have an interest) collaborate on that branch until we
are ready to start involving gerrit? Obviously, this will
be a large, all-at-once patch once finished.
Also, I'm still not clear on which mailing list would be more
appropriate for discussion of a feature like this. The descriptions
of mediawiki-l and wikitech-l both say "features and development".
I lean towards this list (wikitech) due to the higher traffic.
Thanks.
--
Greg Sabino Mullane greg(a)endpoint.com
End Point Corporation
PGP Key: 0x14964AC8
Hi all,
I'm in the process of developing a media handling extension for MediaWiki
that will allow users with WebGL-enabled browsers to manipulate 3D models
of large biological molecules, like proteins and DNA. I'm new to MediaWiki
development, and I've got some questions about how I should go forward with
development of this extension if I want to ultimately get it into official
Wikimedia MediaWiki deployments.
My initial goal is to put the kind of interactive model available at
http://webglmol.sourceforge.jp/glmol/viewer.html into infoboxes like the
one in http://en.wikipedia.org/wiki/FOXP2. The library enabling this
interactivity is called GLmol -- it's licensed under LGPL and described at
http://webglmol.sourceforge.jp/index-en.html. There is some more
background discussion on the extension at
http://en.wikipedia.org/wiki/Portal:Gene_Wiki/Discussion#Enabling_molecular….
I have a prototype of the extension working on a local deployment of
MediaWiki 1.18.1. I've tried to organize the extension's code roughly
along the lines of http://www.mediawiki.org/wiki/Extension:OggHandler. The
user workflow to get an interactive protein model into an article is to:
1) Upload a PDB file (e.g. http://www.rcsb.org/pdb/files/2A07.pdb)
representing the protein structure through MediaWiki's standard file upload
UI.
2) Add a wikilink to the resulting file, very similar to what's done with
images. For example, [[File:2A07.pdb]].
If the user's browser has WebGL enabled, an interactive model of the
macromolecule similar to one in the linked GLmol demo is then loaded onto
the page via an asynchronous request to get the 3D model's atomic
coordinate data. I've done work to decrease the time needed to render the
3D model and the size of the 3D model data (well beyond just gzipping), so my
prototype loads faster than the linked demo.
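For the curious, the handler registration in my setup file looks roughly
like the following (class and file names are simplified for illustration,
not necessarily what the prototype uses):

  <?php
  # Simplified sketch of the extension setup file.
  $wgExtensionCredits['media'][] = array(
      'name' => 'WebGLMolecules',
      'description' => 'Interactive 3D models of macromolecules via GLmol',
  );

  # Let the standard upload UI accept .pdb files.
  $wgFileExtensions[] = 'pdb';

  # Route files with the PDB MIME type to the extension's media
  # handler, analogous to what OggHandler does for application/ogg.
  $dir = dirname( __FILE__ ) . '/';
  $wgAutoloadClasses['PdbHandler'] = $dir . 'PdbHandler.php';
  $wgMediaHandlers['chemical/x-pdb'] = 'PdbHandler';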
A main element of this extension -- which I haven't yet developed -- is how
it will gracefully degrade for users without WebGL enabled. IE8 and IE9
don't support WebGL, and IE10 probably won't either. Safari 5.1.5 supports
WebGL, but not by default. WebGL is also not supported on many smartphones.
One idea is to fall back to a 2D canvas representation of the model,
perhaps like the 3D-to-2D examples at https://github.com/mrdoob/three.js/.
I see several drawbacks to this. First, it would not be a fallback for
clients with JavaScript disabled. Second, the GLmol molecular viewer
library doesn't currently support 2D canvas fallback, and it would
probably take substantial time and effort to add that feature. Third,
there are browser plug-ins for IE that enable WebGL, e.g.
http://iewebgl.com/.
Given that, my initial plan for handling browsers without WebGL enabled is
to fall back to a static image of the corresponding protein/DNA structure.
A few years ago I wrote a program to take in a PDB file and output a
high-quality static image of the corresponding structure. This resulted in
PDBbot (http://commons.wikimedia.org/wiki/User:PDBbot,
http://code.google.com/p/pdbbot/). That code could likely be repurposed in
this media handling extension to generate a static image upon the upload of
a PDB file. The PDBbot code is mostly Python 3, and it interacts with GIMP
(via scripts in scheme) and PyMOL (http://en.wikipedia.org/wiki/PyMOL,
freely licensed:
http://pymol.svn.sourceforge.net/viewvc/pymol/trunk/pymol/LICENSE?revision=…
).
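If those dependencies are workable, the generation step could hang off
the upload hook. A minimal sketch, where render_static_image.py is
hypothetical shorthand for the repurposed PDBbot pipeline:

  # Sketch only: shell out to a PDBbot-derived renderer after a PDB
  # upload finishes. 'render_static_image.py' is a placeholder name.
  $wgHooks['UploadComplete'][] = 'wfPdbRenderStaticImage';

  function wfPdbRenderStaticImage( $uploadBase ) {
      $file = $uploadBase->getLocalFile();
      if ( $file->getMimeType() !== 'chemical/x-pdb' ) {
          return true; # not a PDB file; nothing to do
      }
      $src = wfEscapeShellArg( $file->getPath() );
      $dst = wfEscapeShellArg( $file->getPath() . '.png' );
      $retval = 0;
      wfShellExec( "python3 render_static_image.py $src $dst", $retval );
      return true; # let other UploadComplete hooks run
  }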
Would requiring Python, GIMP and PyMOL to be installed on the server be
workable for a WMF MediaWiki deployment? If not, then there is a free web
service developed for Wikipedia (via Gene Wiki) available from the European
Bioinformatics Institute, which points to their pre-rendered static images
for macromolecules. The static images could thus be retrieved from a
remote server if it wouldn't be feasible to generate them locally on the
upload server. I see a couple of disadvantages to this approach, e.g.
relying on a remote third-party web service, but I thought I'd put the idea
out for consideration. If generating static images on the upload server
wouldn't be possible, would this be a workable alternative?
After I get answers to the questions above, I can begin working on that
next major part of the extension. This is a fairly blocking issue, so
feedback would definitely be appreciated.
Beyond that, and assuming this extension seems viable so far, I've got some
more questions:
1. Once I get the prototype more fully developed, what would be the
best next step for presenting it and getting it code reviewed? Should I set
up a demo on a random domain/third-party VPS, or maybe something like
http://deployment.wikimedia.beta.wmflabs.org/wiki/Main_Page? Or maybe the
former would come before the latter?
2. PDB (.pdb) is a niche file type that has a non-standard MIME type of
"chemical/x-pdb". See
http://en.wikipedia.org/wiki/Protein_Data_Bank_%28file_format%29 for more.
To upload files with this MIME type, in my local MediaWiki deployment I had
to relax a constraint in the 'image' database table on what MIME types are
allowed. If I recall correctly there was an enum that allowed only a small
handful of MIME types to be uploaded. I also had to adjust some other
configuration settings in Apache and MediaWiki so that .pdb files were
properly handled. Would these things be doable in an official WMF
deployment? If not, what are some possible workarounds? (A sketch of my
local changes follows after question 3.)
3. If at all possible, I'd like to have the molecular models be
interactive by default, i.e. be manipulable when the page loads without the
WebGL-enabled user having to click some control to replace a static image
with a model to enable interactivity. The advantage of this is that it
would make the feature easier to discover and quicker to use. From asking
around, the main potential disadvantage I've heard of this approach is that
it might take a long time to load. However, with the optimizations I've made
to GLmol I think it would be possible to have the interactive models load in
roughly the same time that images take to load on pages. Does having
model interactivity by default for WebGL-enabled users sound feasible?
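To make question 2 concrete, here is roughly what my local changes
amounted to (paths and the exact enum values are from memory):

  # LocalSettings.php additions on my local 1.18.1 install (sketch):
  $wgFileExtensions[] = 'pdb';

  # MediaWiki's bundled mime.types doesn't know chemical/x-pdb, so
  # point it at a copy with this line added:
  #   chemical/x-pdb pdb
  $wgMimeTypesFile = "$IP/pdb.mime.types";

  # The image table's img_major_mime column is an enum, so 'chemical'
  # had to be added to it, along the lines of:
  #   ALTER TABLE image MODIFY img_major_mime
  #       ENUM('unknown','application','audio','image','text','video',
  #            'message','model','multipart','chemical')
  #       NOT NULL DEFAULT 'unknown';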
Thanks in advance for any answers to these questions or general feedback.
Best,
Eric
http://en.wikipedia.org/wiki/User:Emw
>Hi all,
>
>I'm in the process of developing a media handling extension for MediaWiki
>that will allow users with WebGL-enabled browsers to manipulate 3D models
>of large biological molecules, like proteins and DNA. I'm new to MediaWiki
>development, and I've got some questions about how I should go forward with
>development of this extension if I want to ultimately get it into official
>Wikimedia MediaWiki deployments.
[..]
Awesome!
> 2. PDB (.pdb) is a niche file type that has a non-standard MIME type of
>"chemical/x-pdb". See
>http://en.wikipedia.org/wiki/Protein_Data_Bank_%28file_format%29 for more.
>To upload files with this MIME type, in my local MediaWiki deployment I had
>to relax a constraint in the 'image' database table on what MIME types are
>allowed. If I recall correctly there was an enum that allowed only a small
>handful of MIME types to be uploaded. I also had to adjust some other
>configuration settings in Apache and MediaWiki so that .pdb files were
>properly handled. Would these things be doable in an official WMF
>deployment? If not, what are some possible workarounds?
This is totally the least of your concerns regarding this extension,
but wouldn't the mime type "model/x-pdb" be more appropriate (or,
failing that, "application/x-pdb"? PDB does sound like a model format).
I'm not sure why a non-standard major mime type is needed. Then again
I guess we aren't the people who determine the mime type people use.
I also think static images would probably be the best fallback if
WebGL is unavailable.
>ImageMagick seems like it might also have the ability to programmatically
>autocrop an image and add a certain padding around the subject.
ImageMagick would definitely be good since it's already in use. I know
the netpbm programs can also programmatically crop things, but
ImageMagick is definitely the best choice if possible.
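Something along these lines should do the autocrop + padding (untested
sketch; $srcPath/$dstPath are whatever the thumbnailing code hands you):

  # -trim crops to the subject, +repage resets the canvas offset,
  # and -border adds uniform padding around it.
  $src = wfEscapeShellArg( $srcPath );
  $dst = wfEscapeShellArg( $dstPath );
  $retval = 0;
  wfShellExec( "convert $src -trim +repage -bordercolor white -border 20 $dst",
      $retval );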
-bawolff
Forwarding on behalf of the original user:
> Platonides wrote:
> K. Peachey wrote:
>> Why would we not want blocked accounts to be processed?
>
> Presumably because an indefinitely blocked name (or worse, oversighted)
> would then need a global block.
Of course, accounts blocked anywhere must not be processed.
Is there a need to unify old blocked sock/vandal accounts so they can be
globally blocked later? No.
Is there a need to unify oversighted accounts (hidden via the wpHideUser
function) so they can be globally oversighted later? No.
Do you know that before CentralAuth had the global oversight function,
accounts had to be locked at Meta, and then you had to go wiki by wiki
manually blocking with 'wpHideUser', and then manually oversight the Meta
logs?
- Creating global accounts for those users, most of them with personal
information, would be appalling.
Creating global accounts for blocked accounts brings no benefit; it would
be an absolute mistake, and it would only increase the amount of work
stewards (I am one) would have to do to fix the resulting problems.
So, I beg you *not* to include accounts blocked anywhere in this proposal.
Vandals, socks and abusive usernames do not need to be unified so that
they can continue vandalizing anywhere, exposing personal
information/libel, etc.
Best regards.
Following up on the earlier thread by Rob [1], Rob and I kicked around
the question of what code review metrics/targets we want to surface on an
ongoing basis. We're not going to invest in a huge dashboard project
right now, but we'll try to get at least some of the key metrics
generated and visualized automatically. Help is appreciated, starting
with deciding which metrics we should look at.
Here's what we came up with, by priority:
1) Most important: Time series graph of # of open changesets
Target: Number of open changesets should not exceed 200.
Optional breakdown:
- mediawiki/core
- mediawiki/extensions
- WMF-deployed extensions
- specific repos
2) Important: Aging trends.
- Time series graph of # open changesets older than a, b, c days
(to indicate troubling aging trends, e.g. a=3, b=5, c=7)
- Target: There should be 0 changes that haven't been looked at
at all for more than 7 days.
- Including only: Changes which have not received a -1 review, -1
verification, or -2
- Optional breakdown as above
- Rationale: We're looking for signs that submissions are being
completely neglected, which is why we have to exclude -1s or -2s.
3) Possibly useful:
- Per-reviewer or reviewee(?) statistics regarding merge activity,
number of -1s, neglected code, etc.
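As a rough starting point before any dashboard exists, something like
this could pull the numbers for (1) and (2) from Gerrit's SSH query
interface (it doesn't yet exclude changes with a -1/-2, and the field
names assume gerrit query's line-per-change JSON output):

  <?php
  # Sketch: count open changesets and bucket them by age in days.
  $cmd = 'ssh -p 29418 gerrit.wikimedia.org ' .
      'gerrit query --format=JSON status:open';
  $open = 0;
  $buckets = array( 3 => 0, 5 => 0, 7 => 0 );
  foreach ( explode( "\n", trim( shell_exec( $cmd ) ) ) as $line ) {
      $change = json_decode( $line, true );
      if ( !isset( $change['createdOn'] ) ) {
          continue; # trailing stats row
      }
      $open++;
      $ageDays = ( time() - $change['createdOn'] ) / 86400;
      foreach ( array( 3, 5, 7 ) as $days ) {
          if ( $ageDays > $days ) {
              $buckets[$days]++;
          }
      }
  }
  printf( "open: %d; >3d: %d; >5d: %d; >7d: %d\n",
      $open, $buckets[3], $buckets[5], $buckets[7] );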
Any obvious thinking errors in the above / do the targets make sense /
should we look at other metrics or approaches?
Erik
[1] http://lists.wikimedia.org/pipermail/wikitech-l/2012-April/059940.html
--
Erik Möller
VP of Engineering and Product Development, Wikimedia Foundation
Support Free Knowledge: https://wikimediafoundation.org/wiki/Donate
TL;DR: we are looking for a solution that can post messages to user talk
pages on any WMF wiki.
We (l10n team) are working on TranslationNotifications extension [1],
which allows translators to sign up and choose how they want
notifications of new translation requests to be delivered to them. One
of the available delivery methods is the user talk page on another
wiki.
We have some ideas about how to implement this, but we are uncertain
which solution we should use.
I'd like your feedback: please let us know which solution you think is
best and what the possible problems with each are, or even propose
another solution entirely.
Here is what we came up with:
Solution 1:
Use the LoadBalancer to open a connection to the other wiki.
* Does WikiPage support editing pages on another database like this?
* This way we could bypass things like blocked users/protected pages
unless we check for those ourselves
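A sketch of what we had in mind for solution 1 ('metawiki' is just an
example wiki ID; as far as I know WikiPage won't take a foreign
connection, hence the hand-written inserts):

  # Master connection to the target wiki via the load balancer.
  # WikiPage assumes the local wiki, so the page/revision/text rows
  # would have to be written by hand here -- which is exactly how
  # blocks and protection get bypassed unless we re-check them.
  $lb = wfGetLB( 'metawiki' );
  $dbw = $lb->getConnection( DB_MASTER, array(), 'metawiki' );
  $dbw->begin();
  # ... INSERT into text, revision and page, update link tables ...
  $dbw->commit();
  $lb->reuseConnection( $dbw );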
Solution 2:
Use the api.php to post to the user talk page.
* Has lots of failure cases:
** Connection error/Timeout
** May need to create user account separately
** The account may be protected
** Talk page might be protected
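A similar sketch of solution 2, with all of the failure handling above
still missing ($token would come from a prior
action=query&prop=info&intoken=edit request; the Http::post options are
from memory):

  # New talk page section via the remote api.php.
  $res = Http::post( 'http://meta.wikimedia.org/w/api.php', array(
      'postData' => array(
          'action'  => 'edit',
          'title'   => 'User talk:Example',
          'section' => 'new',
          'summary' => 'Translation notification',
          'text'    => $message,
          'token'   => $token,
          'format'  => 'json',
      ),
  ) );
  if ( $res === false ) {
      # connection error or timeout -- needs retry logic
  }
  $data = FormatJson::decode( $res, true );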
Solution 3:
''Suggest another solution.''
The idea is that we will create a new job in the job queue for each
user who has to be notified (the translation administrator has a special
page for sending out the translation requests), and the job will notify
the user by email or by posting to his/her talk page, depending on what
the user wants.
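A sketch of what that job might look like (class name and parameter
keys are illustrative only):

  # One queued job per translator to notify.
  $wgJobClasses['translationNotificationJob'] = 'TranslationNotificationJob';

  class TranslationNotificationJob extends Job {
      public function __construct( $title, $params ) {
          parent::__construct( 'translationNotificationJob', $title, $params );
      }

      public function run() {
          $user = User::newFromName( $this->params['user'] );
          if ( $this->params['method'] === 'email' ) {
              $user->sendMail( $this->params['subject'], $this->params['body'] );
          } else {
              # talk page delivery: solution 1 or 2 from above goes here
          }
          return true;
      }
  }

  # Queued by the translation administrator's special page:
  $job = new TranslationNotificationJob( $title, array(
      'user' => 'Example', 'method' => 'email',
      'subject' => $subject, 'body' => $body ) );
  $job->insert();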
[1] https://gerrit.wikimedia.org/r/gitweb?p=mediawiki/extensions/TranslationNot…
-Niklas
--
Niklas Laxström
Chad wrote:
> I think we should probably work on mediawiki.org (subpage of
> RfC?). This is something that needs to be thought out properly,
> and I think on-wiki would be more productive than on-list.
Okay. I will create an RFC page. Keep in mind that my time is limited
and I've only just started to write the proof of concept for it,
so please don't hold up version 1.20 to wait for me, ha ha ha!
Platonides wrote:
> It took me a while to remember that idea of replacing the
> sql files with an abstract schema.
Right. I've been thinking about this a lot, and the big picture plan
is to remove all the tables.sql files and create a single text file,
mediawiki.schema, which will have a generic schema, very similar to
the existing main tables.sql, but with a stricter syntax and more
generic definitions. This will be parsed and put into a php structure.
Each database will walk through this structure to build its own
'create' statements, using common methods and attributes whenever
possible. Similarly, we can use this for updates as well by simply
walking through and adding missing tables, rather than having to
create patch files or other shenanigans. Most of the above is written:
the tricky part is actually figuring out exactly what the intent of
each column definition in the current tables.sql is! There is a lack
of consistency and some questionable choices. This will also be a great
time to introduce some standards, e.g. naming of indexes.
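To give a flavor of it, the parsed structure and the per-database walk
might look something like this (the attribute names are illustrative,
not final):

  # One table from mediawiki.schema after parsing into PHP.
  $schema = array(
      'user' => array(
          'columns' => array(
              'user_id'   => array( 'type' => 'serial', 'notnull' => true ),
              'user_name' => array( 'type' => 'text', 'notnull' => true ),
          ),
          'indexes' => array(
              'user_name' => array( 'unique' => true,
                  'columns' => array( 'user_name' ) ),
          ),
      ),
  );

  # Each database walks the structure with its own type map to emit
  # its CREATE statements; updates can reuse the same walk to add
  # missing tables.
  function buildCreate( $name, $def, $typeMap ) {
      $cols = array();
      foreach ( $def['columns'] as $col => $attr ) {
          $cols[] = $col . ' ' . $typeMap[$attr['type']] .
              ( empty( $attr['notnull'] ) ? '' : ' NOT NULL' );
      }
      return "CREATE TABLE $name (\n  " . implode( ",\n  ", $cols ) . "\n)";
  }

  # MySQL vs. Postgres from the same definition:
  $mysql = buildCreate( 'user', $schema['user'], array(
      'serial' => 'INT UNSIGNED AUTO_INCREMENT PRIMARY KEY',
      'text'   => 'VARBINARY(255)' ) );
  $pg = buildCreate( 'user', $schema['user'], array(
      'serial' => 'SERIAL PRIMARY KEY', 'text' => 'TEXT' ) );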
Krinkle wrote:
> I'd say put it in Gerrit from the start (in a branch) so that everyone
> can check it out and send suggestions (either as a commit or through the
> feedback channels on the mailing list, wiki or Gerrit comments).
>
> Gerrit reviews are also enabled for branches, so you don't have to worry
> much about clashing with others, a commit to the branch on gerrit will
> not end up in the actual branch until it is reviewed.
I don't think this is a good idea: having to wait for a reviewer before
your changes show up elsewhere is great for the core code, but not so
great for fast-n-furious early prototyping of a big feature. Perhaps we
can add it to gerrit once everyone thinks we have a mostly stable
prototype?
Thanks for all the feedback. I will post here when I add a branch and/or
create an RFC page.
--
Greg Sabino Mullane greg(a)endpoint.com
End Point Corporation
PGP Key: 0x14964AC8
Is there a schedule for the 1.19 release?
It seems we're already rolling out 1.20 in production, so normally we'd
expect 1.19 to be done and out the door already. :)
-- brion