I suggested it on T20209#3535024 back in August; thanks, Brad, for taking
care of it :)
Just to add a sidenote regarding user=0 and user_text with some non-IP
value - I saw it was quite common in the Wikidata recentchanges table a few
months ago with rc_type=5 (RC_EXTERNAL), though I can't find any such rows
anymore.
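
In case anyone wants to check their own data for this, here is a rough
sketch in Python of the test I have in mind: a row counts as "odd" when the
user ID is 0 but the user text does not parse as an IP address. The function
names are made up for illustration, and Python's ipaddress module is only an
approximation of MediaWiki's own IP handling, so treat the result as a hint
rather than a definitive answer.

```python
import ipaddress


def looks_like_ip(user_text: str) -> bool:
    """Return True if user_text parses as an IPv4 or IPv6 address."""
    try:
        ipaddress.ip_address(user_text)
        return True
    except ValueError:
        return False


def is_unattached_name(user_id: int, user_text: str) -> bool:
    """Hypothetical helper: user ID 0 combined with a non-IP user text.

    This mirrors the "user=0 and user_text with some non-IP value" case;
    it is not MediaWiki code.
    """
    return user_id == 0 and not looks_like_ip(user_text)


if __name__ == "__main__":
    # (user_id, user_text) pairs, e.g. rc_user / rc_user_text from a dump.
    rows = [(0, "Example"), (0, "192.0.2.1"), (42, "Example")]
    for user_id, user_text in rows:
        print(user_id, user_text, is_unattached_name(user_id, user_text))
```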
On Thu, Nov 30, 2017 at 7:31 PM, Brad Jorsch (Anomie) <bjorsch(a)wikimedia.org>
wrote:
> The proposal was approved by TechCom, the code has been merged, and it's
> live now on the Beta Cluster. I'm running the maintenance script now.
> Please test things there and report any bugs you encounter, either by
> replying to this message or by filing it in Phabricator and adding me as a
> subscriber. Assuming no major errors turn up that can't be quickly fixed,
> I'll probably start running the maintenance script on the production wikis
> the week of December 11 (and perhaps on mediawiki.org and testwiki the week
> before).
>
> If you're curious as to what the history of an existing imported page might
> look like after the maintenance script is run, see
> https://commons.wikimedia.beta.wmflabs.org/wiki/Template:Documentation?action=history
> for an example.
>
> On Tue, Oct 31, 2017 at 10:52 AM, Brad Jorsch (Anomie)
> <bjorsch(a)wikimedia.org> wrote:
>
> > Handling of usernames in imported edits in MediaWiki has long been weird
> > (T9240[1] was filed in 2006!).
> >
> > If the local user doesn't exist, we get a strange row in the revision
> > table where rev_user_text refers to a valid name while rev_user is 0,
> > which typically indicates an IP edit. Someone can later create the name,
> > but rev_user remains 0, so depending on which field a tool looks at the
> > revision may or may not be considered to actually belong to the
> > newly-created user.
> >
> > If the local user does exist when the import is done, the edit is
> > attributed to that user regardless of whether it's actually the same
> > user. See T179246[2] for an example where imported edits got attributed
> > to the wrong account in pre-SUL times.
> >
> > In Gerrit change 386625[3] I propose to change that.
> >
> > - If revisions are imported using the "Upload XML data" method, it
> >   will be required to fill in a new field to indicate the source of the
> >   edits, which is intended to be interpreted as an interwiki prefix.
> > - If revisions are imported using the "Import from another wiki"
> >   method, the specified source wiki will be used as the source.
> > - During the import, any usernames that don't exist locally (and can't
> >   be auto-created via CentralAuth[4]) will be imported as an
> >   otherwise-invalid name, e.g. an edit by User:Example from source 'en'
> >   would be imported as "en>Example".[5]
> > - There will be a checkbox on Special:Import to specify whether the
> >   same should be done for usernames that do exist locally (or can be
> >   created) or whether those edits should be attributed to the
> >   existing/autocreated local user.
> > - On history pages, log pages, and the like, these usernames will be
> >   displayed as interwiki links, much as might be generated by wikitext
> >   like "[[:en:User:Example|en>Example]]". No parenthesized 'tool' links
> >   (talk, block, and so on) will be generated for these rows.
> > - On WMF wikis, we'll run a maintenance script to clean up the
> >   existing rows with valid usernames and rev_user = 0. The current plan
> >   there is to attribute these edits to existing SUL users where possible
> >   and to prefix them with a generic prefix otherwise, but we could as
> >   easily prefix them all.
> > - Unfortunately it's impossible to retroactively determine the
> >   actual source of old imports automatically or to automatically do
> >   anything about imports that were misattributed to a different local
> >   user in pre-SUL times (e.g. T179246[2]).
> > - The same will be done for CentralAuth's global suppression
> >   blocks. In this case, on WMF wikis we can safely point them all at
> >   Meta.
> >
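
To make the naming rule in the list above concrete, here is a minimal sketch
in Python of how I read it. This is not the code from the Gerrit change; the
function and parameter names are hypothetical, and the two booleans stand in
for "the name exists locally or can be auto-created" and for the state of the
Special:Import checkbox.

```python
def imported_username(
    source_prefix: str,
    username: str,
    *,
    exists_or_creatable: bool,
    assign_to_local: bool,
) -> str:
    """Return the name an imported edit would be attributed to.

    Hypothetical illustration of the proposal as described in the list:
    - exists_or_creatable: the name exists locally or can be auto-created.
    - assign_to_local: the importer chose to attribute such edits to the
      existing/autocreated local user.
    """
    if exists_or_creatable and assign_to_local:
        # Attribute the edit to the real local account.
        return username
    # Otherwise use an otherwise-invalid name: "<prefix>><name>".
    return f"{source_prefix}>{username}"


if __name__ == "__main__":
    print(imported_username("en", "Example",
                            exists_or_creatable=False, assign_to_local=True))
    print(imported_username("en", "Example",
                            exists_or_creatable=True, assign_to_local=True))
    print(imported_username("en", "Example",
                            exists_or_creatable=True, assign_to_local=False))
```

Because ">" is not allowed in usernames, a prefixed name produced this way
can never collide with a real local account.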
> > If you have comments on this proposal, please reply here or on
> > https://gerrit.wikimedia.org/r/#/c/386625/.
> >
> >
> > Background: The upcoming actor table changes[6] require some change to
> > the handling of these imported names because we can't have separate
> > attribution to "Example as a non-registered user" and "Example as a
> > registered user" with the new schema. The options we've identified are:
> >
> > 1. This proposal, or something much like it.
> > 2. All the existing rows with rev_user = 0 would have to be attributed
> >    to the existing local user (if any), and in the future when a new
> >    user is created any existing edits attributed to that name will be
> >    automatically attributed to that new account.
> > 3. All the existing rows with rev_user = 0 and an existing local user
> >    would have to be re-attributed to different *valid* usernames,
> >    probably randomly-generated in some manner, and in the future when a
> >    new user is created any existing edits for that name would have to be
> >    similarly re-attributed.
> > 4. Like #2, except the creation (including SUL auto-creation) of the
> >    same-named account would not be allowed. Thus, an import before the
> >    local name exists would forever block that name from being used for
> >    an actual local account.
> > 5. Some less consistent combination of the "all the existing rows" and
> >    "when a new user is created" options from #2–4.
> >
> > Of these options, this proposal seems like the best one.
> >
> > [1]: https://phabricator.wikimedia.org/T9240
> > [2]: https://phabricator.wikimedia.org/T179246
> > [3]: https://gerrit.wikimedia.org/r/#/c/386625/
> > [4]: https://phabricator.wikimedia.org/T111605
> > [5]: ">" was chosen rather than the more typical ":" because the former
> > is already invalid in all usernames (and page titles). While a colon is
> > *now* disallowed in new usernames, existing names created before that
> > restriction was added can continue to be used (and there are over 12000
> > such usernames in WMF's SUL) and we decided it'd be better not to
> > suddenly break them.
> > [6]: https://phabricator.wikimedia.org/T167246
> >
> > --
> > Brad Jorsch (Anomie)
> > Senior Software Engineer
> > Wikimedia Foundation
> >
>
>
>
> --
> Brad Jorsch (Anomie)
> Senior Software Engineer
> Wikimedia Foundation
> _______________________________________________
> Wikitech-l mailing list
> Wikitech-l(a)lists.wikimedia.org
> https://lists.wikimedia.org/mailman/listinfo/wikitech-l
>