I think this is one of the most awesome things I've seen in a long time.
Has my full support! Thank you for building this and sharing it with everyone!
I've been asked whether we should be concerned about this gadget in terms of site load
or site performance. I'll share my analysis here also for Felipe's benefit and for
transparency to others.
*== Performance ==*
Firstly, I'll look at potential database and web server load. What is the exposure of
this feature? How many queries does it make for a typical interaction? Are those queries
expensive? Are they well-cached within MediaWiki? Are they cacheable at the HTTP level
(e.g. CDN/Varnish)?
Secondly, I'll look at how it makes edits. Does it always attribute edits to the
actor? What security risk might there be? Does it act on stale data? Does it prevent
conflicts under race conditions? Does it handle rate limits?
*== Server load ==*
The gadget looks well-written and well-guarded. It was easy to figure out what it does and
how it does it. For a typical interaction, the gadget initially makes ~100 requests to the
Action API, which can be seen in the browser DevTools under the Network tab. It uses the
*MediaWiki Action API* to fetch page metadata and raw page content.
There are a couple of things Synchronizer does that I like:
* The user interface requires specific input in the form of a Wikidata entity and a wiki
ID. This means someone casually stumbling on the feature won't trigger anything until
they understand what needs to be entered and enter it. Placeholder text shows a good
example, but you have to actually enter something before continuing.
* Once submitted, before anything else, Synchronizer checks whether the client is
logged-in and thus likely to be able to later make edits through this gadget. If
you're not logged-in, it won't make that burst of 100 API requests. Even though a
logged-out user could successfully make those API requests to render the initial overview,
this avoids significant load that would only leave the user stranded later on. This
removes the vast majority of risk right here. In the event of viral linking or
misunderstanding of what the feature is for, it would have little to no impact.
* When gathering data, it uses *Wikidata* to discover which wikis are known to have a
version of the same template. This greatly limits the fan-out to "only"
10-100 wikis as opposed to ~1000 wikis. It also makes sense from a correctness standpoint
as templates don't have the same name on all wikis, and those that do share a name may
not be compatible or otherwise conceptually the same. Wikidata is awesome for this.
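To make the Wikidata step concrete, here's a minimal sketch of the discovery logic. The entity shape matches what the `wbgetentities` API module returns; `extractSitelinks` is my own illustrative helper, not a function from the gadget.

```javascript
// Hypothetical helper: given a Wikidata entity object (as returned by the
// wbgetentities API module), list the wikis that have a local copy.
function extractSitelinks( entity ) {
	return Object.keys( entity.sitelinks || {} ).map( function ( wikiId ) {
		return { wiki: wikiId, title: entity.sitelinks[ wikiId ].title };
	} );
}

// Abbreviated entity shape for a template item:
var entity = {
	sitelinks: {
		enwiki: { site: 'enwiki', title: 'Template:Example' },
		frwiki: { site: 'frwiki', title: 'Modèle:Example' }
	}
};

console.log( extractSitelinks( entity ) );
// [ { wiki: 'enwiki', title: 'Template:Example' },
//   { wiki: 'frwiki', title: 'Modèle:Example' } ]
```

Each sitelink gives both the wiki ID and the local page title, so differently-named copies are found without guessing.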
Synchronizer uses the Action API to fetch page metadata and raw page content. This kind of
request is fairly cheap and benefits from various forms of caching within MediaWiki. These
API modules don't currently offer CDN caching, but I don't think that's
warranted today given they're fast enough. If this feature were accessible to
logged-out users and if it made these queries directly when merely visiting a URL, then
we'd need to think about HTTP caching to ensure that any spikes in traffic can be
absorbed by our edge cache data centres.
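For reference, a sketch of the kind of query this involves. The parameters below are standard Action API revision-query parameters; I'm not claiming these are the exact ones Synchronizer sends.

```javascript
// Sketch of a typical Action API query for a template page: fetch the
// latest revision's content and timestamp in one request. These are
// standard query/revisions parameters, not necessarily Synchronizer's.
function buildContentQuery( title ) {
	return {
		action: 'query',
		prop: 'revisions',
		titles: title,
		rvprop: 'content|timestamp',
		rvslots: 'main',
		formatversion: 2,
		format: 'json'
	};
}

console.log( buildContentQuery( 'Template:Example' ).rvprop );
// 'content|timestamp'
```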
There is one improvement I can suggest, which is to limit the number of concurrent
requests. It works fine today but it does technically violate *MediaWiki API Etiquette*
https://www.mediawiki.org/wiki/API:Etiquette, and may get caught in API throttling. To
mitigate this, you could tweak the for-loop in `getMasterData`, which currently
starts each `updateStatus` call in parallel with the others. If `updateStatus` returned a
Promise, this loop could instead chain the calls together in a linear sequence. For
example: `sequence = Promise.resolve(); for (…) { sequence = sequence.finally(
updateStatus.bind( null, module ) ); }`.
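Spelled out as a runnable sketch, with a stand-in `updateStatus` that simply resolves after a short delay:

```javascript
// Stand-in for the gadget's updateStatus: records the module it was
// called for, and resolves after a short delay (simulating an API call).
function updateStatus( module ) {
	return new Promise( function ( resolve ) {
		setTimeout( function () {
			order.push( module );
			resolve();
		}, 10 );
	} );
}

var order = [];
var modules = [ 'enwiki', 'frwiki', 'dewiki' ];

// Chain the calls into a linear sequence instead of firing them at once.
var sequence = Promise.resolve();
modules.forEach( function ( module ) {
	sequence = sequence.finally( updateStatus.bind( null, module ) );
} );

sequence.then( function () {
	console.log( order ); // [ 'enwiki', 'frwiki', 'dewiki' ]
} );
```

Each `finally` waits for the Promise returned by its callback before the next link in the chain starts, and keeps going even if an earlier call failed, so one broken wiki doesn't halt the whole run.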
*== Editing ==*
For editing, Synchronizer uses all the built-in affordances correctly, e.g. mw.Api and
mw.ForeignApi <https://doc.wikimedia.org/mediawiki-core/master/js/#!/api/mw.Api> as
exposed by the *mediawiki.api ResourceLoader module*. This gives the gadget a stable
surface, greatly reduces the complexity of the implementation, and automatically does all
the right things.
I really like that the gadget goes the extra mile of cleverly figuring out why a local
template differs from the central one. For example, does the local copy match one of
the previous versions of the central template? Or was it locally forked in a new
direction? It then also lets you preview a diff before syncing any given wiki. This
is really powerful and helps people understand their actions before taking them.
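My reading of that comparison, as a sketch (the function name and category labels are mine, not the gadget's): check the local wikitext against the central template's revision history, newest first.

```javascript
// Hypothetical classification of a local copy relative to the central
// template's revision history (centralRevisions is newest-first wikitext).
function classifyLocalCopy( localText, centralRevisions ) {
	if ( localText === centralRevisions[ 0 ] ) {
		return 'in sync';
	}
	if ( centralRevisions.indexOf( localText ) !== -1 ) {
		return 'outdated'; // matches an older central revision
	}
	return 'forked'; // diverged locally; needs human review
}

var history = [ 'v3', 'v2', 'v1' ]; // newest first
console.log( classifyLocalCopy( 'v3', history ) ); // 'in sync'
console.log( classifyLocalCopy( 'v1', history ) ); // 'outdated'
console.log( classifyLocalCopy( 'vX', history ) ); // 'forked'
```

Only the "forked" case needs a human decision; the other two can safely offer a one-click sync.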
To show what could go wrong if someone didn't use the provided mw.Api JS utility and
naively made requests directly to the API endpoint:
* edit() takes care of preventing unresolved *edit conflicts*. It uses the Edit API's
`basetimestamp` parameter, so that if someone else edited the template between when
Synchronizer fetched the metadata and when the actor saves their edit, it will not
overwrite that edit but instead fail safely.
* edit() takes care to ensure the user *remains logged-in*, so that if cross-wiki logins
expire or become invalid for any reason, the edit won't be saved anonymously but will
fail safely.
* edit() takes care to *not (re-)create a page* if the page in question was deleted in
the meantime. It uses the Edit API's `createonly` and `nocreate` parameters.
https://www.mediawiki.org/wiki/API:Edit
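Put together, that safety net looks roughly like this. The parameter names below are real Edit API and `assert` parameters; the helper itself is a hypothetical illustration, not Synchronizer's code.

```javascript
// Hypothetical helper showing the Edit API safety parameters discussed
// above. The parameter names are real; the helper is illustrative only.
function buildSafeEditParams( title, text, baseTimestamp, startTimestamp ) {
	return {
		action: 'edit',
		title: title,
		text: text,
		// Fail with an editconflict error if someone else saved after
		// the revision our change was based on:
		basetimestamp: baseTimestamp,
		starttimestamp: startTimestamp,
		// Fail rather than silently re-create a page that was deleted
		// in the meantime:
		nocreate: true,
		// Fail rather than save anonymously if the session expired:
		assert: 'user',
		format: 'json'
	};
}

var params = buildSafeEditParams(
	'Template:Example', 'new wikitext here',
	'2023-07-26T14:08:00Z', '2023-07-26T14:08:05Z'
);
console.log( params.nocreate ); // true
```

A naive direct POST to `action=edit` without these parameters would silently last-write-win, re-create deleted pages, and attribute edits to the IP address once the session lapses.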
--
Timo Tijhof,
Principal Engineer,
Wikimedia Foundation.
On Wed, 26 Jul 2023, at 14:08, Felipe Schenone wrote:
Hi! As many of you know, a central global repository
for Lua modules and templates has been a frequent request since the early days of the
movement.
This year, I programmed a JavaScript tool called Synchronizer
https://www.mediawiki.org/wiki/Synchronizer
(inspired by a previous tool called DiBabel by User:Yurik).
The tool allows you to synchronize (that is, automatically copy) Lua modules across Wikimedia
wikis, and provides other features to help developers update and maintain global modules.
I also re-wrote the documentation at
https://www.mediawiki.org/wiki/Multilingual_Templates_and_Modules to account for the new
tool. It basically describes how to develop a Lua module that can be copied unchanged to
any wiki, by abstracting things like user-readable strings and config.
Admittedly, this is a "poor man's version" of a proper solution to the
problem, but one I find invaluable while developing and maintaining modules. Hopefully
some of you may find it useful too!
_______________________________________________
Wikitech-l mailing list -- wikitech-l(a)lists.wikimedia.org
To unsubscribe send an email to wikitech-l-leave(a)lists.wikimedia.org
https://lists.wikimedia.org/postorius/lists/wikitech-l.lists.wikimedia.org/