On 19.09.2017 at 20:48, Gergo Tisza wrote:
> On Tue, Sep 19, 2017 at 6:42 AM, Daniel Kinzler
> <daniel.kinzler(a)wikimedia.de> wrote:
>
> Can't you just split it into a separate table? Core would only need to
> touch it on insert/update, so that should resolve the performance concerns.

Yes, we could put it into a separate table. But that table would be exactly as
tall as the content table, and would be keyed to it. I see no advantage. But if
DBAs prefer a separate table with a 1:1 relation to the content table, that's
fine with me.
Note that the content table is indeed touched a lot less than the revision table.
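
To illustrate the 1:1 layout discussed above, here is a minimal sketch using
SQLite from Python. Only content_address and content_sha1 come from this
thread; the table name content_hash, the column content_id, and the helper
insert_content are hypothetical names chosen for the example, not the actual
schema.

```python
import sqlite3

# In-memory database, for illustration only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE content (
    content_id      INTEGER PRIMARY KEY,   -- hypothetical key column
    content_address TEXT NOT NULL
);
-- Hypothetical side table: one row per content row, keyed to it (1:1).
CREATE TABLE content_hash (
    content_id   INTEGER PRIMARY KEY REFERENCES content(content_id),
    content_sha1 TEXT NOT NULL
);
""")

def insert_content(conn, address, sha1):
    """Insert a content row plus its hash row; the side table is only
    written here, at insert time."""
    cur = conn.execute(
        "INSERT INTO content (content_address) VALUES (?)", (address,)
    )
    conn.execute(
        "INSERT INTO content_hash (content_id, content_sha1) VALUES (?, ?)",
        (cur.lastrowid, sha1),
    )
    return cur.lastrowid
```

Every insert writes one row to each table, so the side table always has
exactly as many rows as the content table, which is the point made above.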

> Also, since content is supposed to be deduplicated (so two revisions with
> the exact same content will have the same content_address), couldn't that
> replace content_sha1 for revert detection purposes?

Only if we could detect and track "manual" reverts. And the only reliable way
to do this right now is by looking at the sha1.
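
As a rough illustration of sha1-based detection of "manual" reverts, here is
a short Python sketch. It is not MediaWiki's actual code: the function names
and the (rev_id, sha1) history layout are assumptions made for the example,
and hex digests are used for simplicity.

```python
import hashlib

def sha1_hex(text):
    """Return the SHA-1 hex digest of the UTF-8 encoded text."""
    return hashlib.sha1(text.encode("utf-8")).hexdigest()

def find_reverted_to(history, new_text):
    """history: list of (rev_id, sha1) pairs, oldest first (hypothetical layout).

    Returns the rev_id of the most recent earlier revision whose content
    hash equals that of new_text, or None if there is no match.
    """
    new_hash = sha1_hex(new_text)
    for rev_id, rev_sha1 in reversed(history):
        if rev_sha1 == new_hash:
            return rev_id
    return None

# An editor pastes the old text back by hand; no undo/rollback action is
# recorded, but the hash still matches revision 1.
history = [(1, sha1_hex("old text")), (2, sha1_hex("vandalised text"))]
reverted_to = find_reverted_to(history, "old text")
```

A real check would probably also skip the immediate parent revision, since an
edit identical to its parent is a null edit rather than a revert.
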
--
Daniel Kinzler
Principal Platform Engineer
Wikimedia Deutschland
Gesellschaft zur Förderung Freien Wissens e.V.