Hello!
MediaWiki-Codesniffer 13.0.0 is now available for use in your MediaWiki
extensions and other projects. This release features new sniffs and
improvements in existing ones. And as you may have noticed, we jumped
from 0.12.0 to 13.0.0. This is to comply with semantic versioning
(new releases will just bump the major version) and to indicate the
maturity of the project. The changelog for this release:
* Add sniff for @cover instead of @covers (James D. Forrester)
* Add sniff to find and replace deprecated constants (Kunal Mehta)
* Add sniff to find unused "use" statements (Kunal Mehta)
* Add space after keyword require_once, if needed (Umherirrender)
* Fix @returns and @throw in function docs (Umherirrender)
* Prohibit some globals (Max Semenik)
* Skip function comments with @deprecated (Umherirrender)
* Sniff & fix lowercase @inheritdoc (Gergő Tisza)
Libraryupgrader is submitting patches for most Gerrit repositories. :)
Thanks,
-- Legoktm
Hello.
Since yesterday, I have been getting a lot of emails about nonexistent
revisions in ruwiki articles. I did not change anything relevant in my
preferences, I think. A little investigation revealed the cause: these are
edits to the Wikidata items connected to the articles on my watchlist.
I know this is a problem in Special:Watchlist, which is why I do not turn
on Wikidata changes in my watchlist. But this is the first time I have seen
them in emails. How can I turn off these emails, and only these, please? I
couldn't find a suitable option in the preferences. At least it's only on
ruwiki, where I hardly work and have about a dozen pages on my watchlist,
so I get these emails about once an hour. If it happened on my home wiki,
with thousands of pages on my watchlist, it could be an email every second.
What can I do?
Thank you,
Igal (User:Ikhitron)
On 21.09.2017 at 17:18, Federico Leva wrote:
> (Offlist)
>
> Daniel Kinzler, 21/09/2017 17:24:
>> Hashing is a lot faster than loading the content. Since Special:Export needs to
>> load the content anyway, the extra cost of hashing is negligible.
>
> I trust you, but really? Even when exporting 5000 revisions?
Exporting 5000 revisions is likely to time out due to the time it takes to even
load all the data. If we can load the data, we can probably also hash it in
time. SHA1 is not that slow. Hashing all 1269 PHP files in the includes
directory takes half a second of CPU time on my system (about 2 seconds wall
clock time).
Hashing does put considerable load on the CPU though (on an otherwise I/O bound
operation), so it may cause problems if a lot of people do it. But since we have
a lot more edits than exports, and every edit needs hashing, I don't think that
makes much of a difference either.
--
Daniel Kinzler
Principal Platform Engineer
Wikimedia Deutschland
Gesellschaft zur Förderung Freien Wissens e.V.
We recently got a suggestion via Phabricator[1] to automatically map
between hiragana and katakana when searching on English Wikipedia and other
wiki projects. As an always-on feature, this isn't difficult to implement,
but major commercial search engines (Google.jp, Bing, Yahoo Japan,
DuckDuckGo, Goo) don't do that. They give different results when searching
for hiragana/katakana forms (for example, オオカミ/おおかみ "wolf"). They also give
different *numbers* of results, seeming to indicate that it's not just
re-ordering the same results (say, so that results in the same script are
ranked higher).[2] I want to know what they know that I don't!
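For the record, the character mapping itself is mechanically simple: the two
kana syllabaries occupy parallel Unicode blocks (hiragana U+3041..U+3096,
katakana U+30A1..U+30F6), offset by 0x60. Here's a minimal sketch in Lua 5.3
(using its utf8 library), purely to illustrate the folding being discussed --
the function name is made up and this is not how it would actually be wired
into the search analysis chain:

    -- Fold katakana onto hiragana by shifting codepoints down by 0x60.
    -- Halfwidth katakana (U+FF66..U+FF9D) and the prolonged sound mark
    -- are deliberately left alone; a real analyzer would have to decide
    -- what to do with those.
    local function fold_kana(s)
        local out = {}
        for _, cp in utf8.codes(s) do
            if cp >= 0x30A1 and cp <= 0x30F6 then
                cp = cp - 0x60
            end
            out[#out + 1] = utf8.char(cp)
        end
        return table.concat(out)
    end

    assert(fold_kana("オオカミ") == "おおかみ")  -- both spellings of "wolf" collapse

So the mapping is the easy part; the open question is whether doing it is a
good idea.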
Does anyone have any thoughts on whether this would be useful (it seems that
it would) and whether it would cause any problems (it must, or otherwise
all the other search engines would do it, right)?
Any idea why it might be different between a Japanese-language wiki and a
non-Japanese-language wiki? We often are more aggressive in matching
between characters that are not native to a given language--for example,
accents on Latin characters are generally ignored on English-language
wikis. So it might make sense to merge hiragana and katakana on
English-language wikis but not Japanese-language wikis.
Thanks very much for any suggestions or information!
—Trey
[1] https://phabricator.wikimedia.org/T176197
[2] Details of my tests at https://phabricator.wikimedia.org/T173650#3580309
Trey Jones
Sr. Software Engineer, Search Platform
Wikimedia Foundation
Hey all,
Just a quick note that MediaWiki 1.30 has been branched (REL1_30 started at
c3a0f25c040e4f69daba5223393d99a03074b0a0), with version bumped from alpha
to 1.30.0-rc.0. As normal, extensions, skins and vendor have also been
branched. The RC0 will be released for testing in the coming weeks, once
people have had time to back-port fixes.
This means that the train next week will be 1.31.0-wmf.1, master calls
itself "1.31.0-alpha", and there's now a RELEASE-NOTES-1.31 file. But
please don't merge lots of waiting-for-1.31 patches just yet, because
back-porting important changes is a pain, and doing so over the top of
RELEASE-NOTES changes, if nothing else, is not fun.
Our thanks as always to the brilliant Release Engineering team. :-)
J.
--
James D. Forrester
Lead Product Manager, Contributors
Wikimedia Foundation, Inc.
jforrester at wikimedia.org
@jdforrester
Hello, fellow developers,
I think the subject summarizes it all, so here are more details on what
I'm trying to do and what I'm looking for.
# Context
You might skip this section if you are not interested in contextual
verbiage. If you would like to react to anything stated in this section,
please change the email subject to reflect that.
So I'm currently thinking about ways to improve the factorization of
knowledge stored in Wiktionary.
I'm experimenting with multiple approaches there. On the one hand, I
just began a Wikiversity project
<https://fr.wikiversity.org/wiki/Recherche:Recueil_lexicologique_%C3%A0_l%E2…>
(in French) to establish a specification of how a DBMS should be
structured to be useful for Wiktionaries. It mainly emerged from my
view that the current data model
<https://www.mediawiki.org/wiki/Extension:WikibaseLexeme/Data_Model>
proposed for Wikidata for Wiktionary
<https://www.wikidata.org/wiki/Wikidata:Wiktionary> does not fit the needs
of Wiktionary contributors. I made some alternative proposals
<https://www.mediawiki.org/wiki/Extension_talk:WikibaseLexeme/Data_Model>,
and tried to gather initial feedback on this model from the French Wiktionary
<https://fr.wiktionary.org/wiki/Wiktionnaire:Wikid%C3%A9mie/septembre_2017#V…>
as well, which led me to create the Wikiversity research project,
because I was told to "specify the needs extensively before you model".
Now, on the other hand, I'm also trying to factorize some data within
Wiktionary with the currently available tools. One driving topic for that is
fixing the gender gap
<https://fr.wiktionary.org/wiki/Discussion_Projet:Parit%C3%A9_des_genres>,
and more broadly the inflection-form gap. That is, a feminine form will
generally be summarized in a laconic "feminine form of *some-term*",
rather than being treated as an entry in its own right. That's all the more
problematic in cases where a word only shares a subset of the relevant
definitions depending on which gender (or inflection form) it applies to.
# What I'm trying to do
I am trying to factorize data which pertains to several inflection
forms, so that each form can use it to build a stand-alone article
about a term. The current approach tends to be to gather everything
under a single lemma, even though some statements only pertain to
specific forms.
So far I have experimented with transclusion of subpages to share
definitions, examples and so on between inflection forms. From a
reading point of view it works, but from an editing point of view
it's anything but fine.
What I think would be interesting is to store this data in a Scribunto
data module (at least for now), and to let users change it while
editing a lexical-entry article. When using the visual editor, that
might happen through something like a modal popup. Wikitext editors will
probably be skilled enough to edit the relevant module directly, but for
the sake of convenience it might be interesting to allow passing a
parameter to the template, which would, at publish time, modify the data
module and remove the parameter from the generated wikitext.
Let's take an example to make this a bit clearer: the French pair
"contributeur"/"contributrice". In both articles, I would like the
definition to be generated by transclusion with something like
{{definition|vocable=contributrice|lang=French|gloss=contributor}}. Note
that this template might, by default, take into account the name of the
calling page, making the "vocable" parameter unnecessary. Also, "lang"
would be required in "contributrice", as this is a vocable which exists in
at least French and Italian, but it would not be required in
"contributeur" or "contributore". Finally, "gloss" is a string whose
purpose is to distinguish a given term in case of homonymy; when no
homonym exists, it can be skipped. So in "contributeur" one might
simply use {{definition}}, but in "contributrice" one should at least
use {{definition|lang=French}}. Now, that's the purely reading side
of the data.
On the backend side, my idea would be to store this data, at least for
now, in a Scribunto data module. So for example, in
"Module:Vocable/contributrice", one might store all the descriptive data
about this vocable. I haven't thought yet about the exact structure of
what would be stored in this kind of module, but the idea is that the
various templates such as *definition*, *example*, and so on would serve
as the interface to these modules, so most contributors would not need to
care about the structure.
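To make that a bit more concrete, here is a very rough sketch of what such a
data module and its template-facing accessor could look like. Everything here
is hypothetical: the page names, field names and structure are only meant to
illustrate the idea.

    -- Module:Vocable/contributrice (hypothetical structure)
    -- All descriptive data about the vocable, shared by every page that
    -- transcludes it.
    return {
        languages = {
            French = {
                lemma = 'contributeur',
                gender = 'feminine',
                definitions = {
                    { gloss = 'contributor', value = 'A person who contributes' },
                },
            },
        },
    }

and, as the template-facing side, a reader function in the module backing
{{definition}} (also hypothetical):

    local p = {}

    function p.definition(frame)
        local args = frame:getParent().args
        -- Default the vocable to the calling page's name, as suggested above.
        local vocable = args.vocable or mw.title.getCurrentTitle().text
        local data = mw.loadData('Module:Vocable/' .. vocable)
        -- A real implementation would detect the only language or require
        -- |lang=, and would use |gloss= to disambiguate homonyms; both are
        -- skipped here for brevity.
        local lang = args.lang or 'French'
        return data.languages[lang].definitions[1].value
    end

    return p

The write path, storing a |value= given in wikitext back into the module at
publish time, is the part described next, and the part I'm not sure is
currently possible.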
So, concretely, in the case of {{definition}}, one should be able to edit
the wikitext of the "contributeur" article and write something like
{{definition|value=A person who contribute}}. On publish, it would
store the given value in the appropriate module and change the wikicode
so that it only retains {{definition}}. Likewise, if someone wrote
{{definition|lang=French|value=Someone who [[contribute]]}}, then the
same module entry should be changed and the resulting wikicode should
replace the template invocation with {{definition|lang=French}}. The
same parameter preservation should apply to the gloss parameter.
Of course, from a visual editor point of view, all that should be even
easier, with the "value" parameter being mandatory and always filled
with the matching module value.
Well, at least, that is my current goal. If you have other suggestions,
I would be glad to read them. In any case, I would also be very
interested to know whether what I just described is currently possible.
If it is, which documentation or existing modules should I look at in
order to achieve it? If it's not, what about adding software support to
make it possible?
Kind regards,
mathieu
On 19 Sep 2017 5:40 pm, "C. Scott Ananian" <cananian(a)wikimedia.org> wrote:
I'm suggesting to proceed cautiously and have a proper
discussion of all the factors involved instead of over-simplifying this to
"community" vs "facebook".
For example, the top-line github stats are:
hhvm: 504 contributors (24,192 commits)
php-src: 496 contributors (104,566 commits)
HHVM seems to have a larger community of contributors despite a much
shorter active life.
By a difference of 8 contributors?
But note that the PHP github mirror has been broken since Jul 29 (!).
I'm not convinced an exclamation mark in brackets is required here.
On Sep 19, 2017 9:45 PM, "Tim Starling" <tstarling(a)wikimedia.org> wrote:
Facebook have been inconsistent with
HHVM, and have made it clear that they don't intend to cater to our needs.
I'm curious: is this conclusion based on your recent meeting with them, or
on past behavior? Their recent announcement had a lot of "we know we
haven't been great, but we promise to change" stuff in it ("reinvest in
open source") and I'm curious to know if they enumerated concrete steps
they planned to take, or whether even in your most recent meeting with them
they failed to show actual interest.
--scott
Hello All,
Over the past week there's been a significant increase in the number of
folks interested in participating in the upcoming Technical Debt SIG
sessions. As a result, there's also been a fair amount of discussion on
the challenges and value of having a large number of participants in a
meeting.
Despite these potentially large numbers, I've decided to move forward with
the sessions. However, we will be pivoting a bit on the intent of the
meeting. I've also decided that we will offer up an IRC meeting following
the two Hangout/Bluejeans sessions for those that prefer that platform.
That being said, I think it's important to note that the Technical Debt SIG
is more than a meeting. The plan is to provide many avenues of engagement
in an attempt to be as inclusive as possible. What this means is that you
need not worry about "missing out" if you don't attend a SIG session.
You'll have access to the same information through other collaborative
channels such as Wiki pages, newsletters, Google docs, etc...
Consider this week's sessions a kickoff for the Technical Debt SIG and
general information sharing about the Technical Debt program.
Again, all this information is or will be available via other channels as
well. We encourage you to participate in a way that suits your style best.
Agenda -
- Purpose of the Tech Debt SIG
- Overview of Tech Debt program
- What to expect moving forward
- Q&A
Cheers,
JR