Hello!
MediaWiki-Codesniffer 13.0.0 is now available for use in your MediaWiki
extensions and other projects. This release features new sniffs and
improvements in existing ones. And as you may have noticed, we jumped
from 0.12.0 to 13.0.0. This is to comply with semantic versioning
(new releases will just bump the major version) and to indicate the
maturity of the project. The changelog for this release:
* Add sniff for @cover instead of @covers (James D. Forrester)
* Add sniff to find and replace deprecated constants (Kunal Mehta)
* Add sniff to find unused "use" statements (Kunal Mehta)
* Add space after keyword require_once, if needed (Umherirrender)
* Fix @returns and @throw in function docs (Umherirrender)
* Prohibit some globals (Max Semenik)
* Skip function comments with @deprecated (Umherirrender)
* Sniff & fix lowercase @inheritdoc (Gergő Tisza)
Libraryupgrader is submitting patches for most Gerrit repositories. :)
Thanks,
-- Legoktm
Hello.
Since yesterday, I have been getting a lot of emails about nonexistent
revisions in ruwiki articles. I did not change anything relevant in my
preferences, I think. A little investigation revealed the cause: these are
edits to the Wikidata items connected to the articles on my watchlist.
I know this is a problem in Special:Watchlist, which is why I do not turn
on Wikidata changes in my watchlist. But this is the first time I have seen
them in emails. How can I turn off these emails, and only these, please? I
couldn't find a suitable option in the preferences. At least it's only on
ruwiki, where I hardly work and have about a dozen pages on my watchlist,
so I get these emails about once an hour. If it happened on my home wiki,
with thousands of pages on my watchlist, it could be an email every second.
What can I do?
Thank you,
Igal (User:Ikhitron)
On 21.09.2017 at 17:18, Federico Leva wrote:
> (Offlist)
>
> Daniel Kinzler, 21/09/2017 17:24:
>> Hashing is a lot faster than loading the content. Since Special:Export needs to
>> load the content anyway, the extra cost of hashing is negligible.
>
> I trust you, but really? Even when exporting 5000 revisions?
Exporting 5000 revisions is likely to time out due to the time it takes to even
load all the data. If we can load the data, we can probably also hash it in
time. SHA1 is not that slow. Hashing all 1269 PHP files in the includes
directory takes half a second of CPU time on my system (about 2 seconds wall
clock time).
Hashing does put considerable load on the CPU though (on an otherwise I/O bound
operation), so it may cause problems if a lot of people do it. But since we have
a lot more edits than exports, and every edit needs hashing, I don't think that
makes much of a difference either.
--
Daniel Kinzler
Principal Platform Engineer
Wikimedia Deutschland
Gesellschaft zur Förderung Freien Wissens e.V.
We recently got a suggestion via Phabricator[1] to automatically map
between hiragana and katakana when searching on English Wikipedia and other
wiki projects. As an always-on feature, this isn't difficult to implement,
but major commercial search engines (Google.jp, Bing, Yahoo Japan,
DuckDuckGo, Goo) don't do that. They give different results when searching
for hiragana/katakana forms (for example, オオカミ/おおかみ "wolf"). They also give
different *numbers* of results, seeming to indicate that it's not just
re-ordering the same results (say, so that results in the same script are
ranked higher).[2] I want to know what they know that I don't!
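For the record, the character mapping itself is mechanically simple: the two
kana syllabaries occupy parallel Unicode blocks (hiragana U+3041..U+3096,
katakana U+30A1..U+30F6), offset by 0x60. Here's a minimal sketch in Lua 5.3
(using its utf8 library), purely to illustrate the folding being discussed --
the function name is made up and this is not how it would actually be wired
into the search analysis chain:

    -- Fold katakana onto hiragana by shifting codepoints down by 0x60.
    -- Halfwidth katakana (U+FF66..U+FF9D) and the prolonged sound mark
    -- are deliberately left alone; a real analyzer would have to decide
    -- what to do with those.
    local function fold_kana(s)
        local out = {}
        for _, cp in utf8.codes(s) do
            if cp >= 0x30A1 and cp <= 0x30F6 then
                cp = cp - 0x60
            end
            out[#out + 1] = utf8.char(cp)
        end
        return table.concat(out)
    end

    assert(fold_kana("オオカミ") == "おおかみ")  -- both spellings of "wolf" collapse

So the mapping is the easy part; the open question is whether doing it is a
good idea.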
Does anyone have any thoughts on whether this would be useful (it seems that
it would) and whether it would cause any problems (it must, or otherwise
all the other search engines would do it, right)?
Any idea why it might be different between a Japanese-language wiki and a
non-Japanese-language wiki? We often are more aggressive in matching
between characters that are not native to a given language--for example,
accents on Latin characters are generally ignored on English-language
wikis. So it might make sense to merge hiragana and katakana on
English-language wikis but not Japanese-language wikis.
Thanks very much for any suggestions or information!
—Trey
[1] https://phabricator.wikimedia.org/T176197
[2] Details of my tests at https://phabricator.wikimedia.org/T173650#3580309
Trey Jones
Sr. Software Engineer, Search Platform
Wikimedia Foundation
Hey all,
Just a quick note that MediaWiki 1.30 has been branched (REL1_30 started at
c3a0f25c040e4f69daba5223393d99a03074b0a0), with version bumped from alpha
to 1.30.0-rc.0. As normal, extensions, skins and vendor have also been
branched. The RC0 will be released for testing in the coming weeks, once
people have had time to back-port fixes.
This means that the train next week will be 1.31.0-wmf.1, master calls
itself "1.31.0-alpha", and there's now a RELEASE-NOTES-1.31 file. But
please don't merge lots of waiting-for-1.31 patches just yet, because
back-porting important changes is a pain, and doing so over the top of
RELEASE-NOTES changes, if nothing else, is not fun.
Our thanks as always to the brilliant Release Engineering team. :-)
J.
--
James D. Forrester
Lead Product Manager, Contributors
Wikimedia Foundation, Inc.
jforrester at wikimedia.org
@jdforrester
Hello, fellow developers,
I think the subject summarizes it all, so here are more details on what
I'm trying to do and what I'm looking for.
# Context
You might skip this section if you are not interested in contextual
verbiage. If you would like to react to anything stated in this section,
please change the email subject to reflect that.
So I'm currently thinking about ways to improve the factorization of
knowledge stored in Wiktionary.
I'm experimenting with multiple approaches there. On the one hand, I
just began a Wikiversity project
<https://fr.wikiversity.org/wiki/Recherche:Recueil_lexicologique_%C3%A0_l%E2…>
(in French) to establish a specification of how a DBMS should be
structured to be useful for Wiktionaries. It mainly emerged from my
view that the current data model
<https://www.mediawiki.org/wiki/Extension:WikibaseLexeme/Data_Model>
proposed for Wikidata for Wiktionary
<https://www.wikidata.org/wiki/Wikidata:Wiktionary> does not fit the needs
of Wiktionary contributors. I made some alternative proposals
<https://www.mediawiki.org/wiki/Extension_talk:WikibaseLexeme/Data_Model>,
and tried to gather initial feedback on this model from the French Wiktionary
<https://fr.wiktionary.org/wiki/Wiktionnaire:Wikid%C3%A9mie/septembre_2017#V…>
as well, which led me to create the Wikiversity research project,
because I was told to "specify the needs extensively before you model".
Now, on the other hand, I'm also trying to factorize some data within
Wiktionary with the currently available tools. One driving topic for that is
fixing the gender gap
<https://fr.wiktionary.org/wiki/Discussion_Projet:Parit%C3%A9_des_genres>,
and more broadly the inflection-form gap. That is, a feminine form will
generally be summarized in a laconic "feminine form of *some-term*",
rather than being treated as an entry in its own right. That's all the more
problematic in cases where a word only shares a subset of the relevant
definitions depending on which gender (or inflection form) it applies to.
# What I'm trying to do
I am trying to factorize data which pertains to several inflection
forms, so that each form can use it to build a stand-alone article
about a term. The current approach tends to be to gather everything
under a single lemma, even though some statements only pertain to
specific forms.
So far I have experimented with transclusion of subpages to share
definitions, examples and so on between inflection forms. From a
reading point of view it works, but from an editing point of view
it's anything but fine.
What I think would be interesting is to store this data in a Scribunto
data module (at least for now), and to let users change it while
editing a lexical-entry article. When using the visual editor, that
might happen through something like a modal popup. Wikitext editors will
probably be skilled enough to edit the relevant module directly, but for
the sake of convenience it might be interesting to allow passing a
parameter to the template, which would, at publish time, modify the data
module and remove the parameter from the generated wikitext.
Let's take an example to make this a bit clearer: the French pair
"contributeur"/"contributrice". In both articles, I would like the
definition to be generated by transclusion with something like
{{definition|vocable=contributrice|lang=French|gloss=contributor}}. Note
that this template might, by default, take into account the name of the
calling page, making the "vocable" parameter unnecessary. Also, "lang"
would be required in "contributrice", as this is a vocable which exists in
at least French and Italian, but it would not be required in
"contributeur" or "contributore". Finally, "gloss" is a string whose
purpose is to distinguish a given term in case of homonymy; when no
homonym exists, it can be skipped. So in "contributeur" one might
simply use {{definition}}, but in "contributrice" one should at least
use {{definition|lang=French}}. Now, that's the purely reading side
of the data.
On the backend side, my idea would be to store this data, at least for
now, in a Scribunto data module. So for example, in
"Module:Vocable/contributrice", one might store all the descriptive data
about this vocable. I haven't thought yet about the exact structure of
what would be stored in this kind of module, but the idea is that the
various templates such as *definition*, *example*, and so on would serve
as the interface to these modules, so most contributors would not need to
care about the structure.
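To make that a bit more concrete, here is a very rough sketch of what such a
data module and its template-facing accessor could look like. Everything here
is hypothetical: the page names, field names and structure are only meant to
illustrate the idea.

    -- Module:Vocable/contributrice (hypothetical structure)
    -- All descriptive data about the vocable, shared by every page that
    -- transcludes it.
    return {
        languages = {
            French = {
                lemma = 'contributeur',
                gender = 'feminine',
                definitions = {
                    { gloss = 'contributor', value = 'A person who contributes' },
                },
            },
        },
    }

and, as the template-facing side, a reader function in the module backing
{{definition}} (also hypothetical):

    local p = {}

    function p.definition(frame)
        local args = frame:getParent().args
        -- Default the vocable to the calling page's name, as suggested above.
        local vocable = args.vocable or mw.title.getCurrentTitle().text
        local data = mw.loadData('Module:Vocable/' .. vocable)
        -- A real implementation would detect the only language or require
        -- |lang=, and would use |gloss= to disambiguate homonyms; both are
        -- skipped here for brevity.
        local lang = args.lang or 'French'
        return data.languages[lang].definitions[1].value
    end

    return p

The write path, storing a |value= given in wikitext back into the module at
publish time, is the part described next, and the part I'm not sure is
currently possible.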
So, concretely, in the case of {{definition}}, one should be able to edit
the wikitext of the "contributeur" article and write something like
{{definition|value=A person who contribute}}. On publish, it would
store the given value in the appropriate module and change the wikicode
so that it only retains {{definition}}. Likewise, if someone wrote
{{definition|lang=French|value=Someone who [[contribute]]}}, then the
same module entry should be changed and the resulting wikicode should
replace the template invocation with {{definition|lang=French}}. The
same parameter preservation should apply to the gloss parameter.
Of course, from a visual editor point of view, all that should be even
easier, with the "value" parameter being mandatory and always filled
with the matching module value.
Well, at least, that is my current goal. If you have other suggestions,
I would be glad to read them. In any case, I would also be very
interested to know whether what I just described is currently possible.
If it is, which documentation or existing modules should I look at in
order to achieve it? If it's not, what about adding software support to
make it possible?
Kind regards,
mathieu
On 19 Sep 2017 5:40 pm, "C. Scott Ananian" <cananian(a)wikimedia.org> wrote:
I'm suggesting to proceed cautiously and have a proper
discussion of all the factors involved instead of over-simplifying this to
"community" vs "facebook".
For example, the top-line github stats are:
hhvm: 504 contributors (24,192 commits)
php-src: 496 contributors (104,566 commits)
HHVM seems to have a larger community of contributors despite a much
shorter active life.
By a difference of 8 contributors?
But note that the PHP github mirror has been broken since Jul 29 (!).
I'm not convinced an exclamation mark in brackets is required here.
On Sep 19, 2017 9:45 PM, "Tim Starling" <tstarling(a)wikimedia.org> wrote:
Facebook have been inconsistent with
HHVM, and have made it clear that they don't intend to cater to our needs.
I'm curious: is this conclusion based on your recent meeting with them, or
on past behavior? Their recent announcement had a lot of "we know we
haven't been great, but we promise to change" stuff in it ("reinvest in
open source") and I'm curious to know if they enumerated concrete steps
they planned to take, or whether even in your most recent meeting with them
they failed to show actual interest.
--scott
Hello All,
Over the past week there's been a significant increase in the number of
folks interested in participating in the upcoming Technical Debt SIG
sessions. As a result, there's also been a fair amount of discussion on
the challenges and value of having a large number of participants in a
meeting.
Despite these potentially large numbers, I've decided to move forward with
the sessions. However, we will be pivoting a bit on the intent of the
meeting. I've also decided that we will offer up an IRC meeting following
the two Hangout/Bluejeans sessions for those that prefer that platform.
That being said, I think it's important to note that the Technical Debt SIG
is more than a meeting. The plan is to provide many avenues of engagement
in an attempt to be as inclusive as possible. What this means is that you
need not worry about "missing out" if you don't attend a SIG session.
You'll have access to the same information through other collaborative
channels such as Wiki pages, newsletters, Google docs, etc...
Consider this week's sessions a kickoff for the Technical Debt SIG and
general information sharing about the Technical Debt program.
Again, all this information is or will be available via other channels as
well. We encourage you to participate in a way that suits your style best.
Agenda -
- Purpose of the Tech Debt SIG
- Overview of Tech Debt program
- What to expect moving forward
- Q&A
Cheers,
JR