[Labs-l] Lag reporting on lab db replicas

Ricordisamoa ricordisamoa at openmailbox.org
Sat Nov 28 20:03:28 UTC 2015


On 28/11/2015 20:58, Bryan Davis wrote:
> On Sat, Nov 28, 2015 at 12:28 PM, Ricordisamoa
> <ricordisamoa at openmailbox.org> wrote:
>> On 26/11/2015 09:19, Jaime Crespo wrote:
>>>> So even if the replicas don't get updated the heartbeat will report them
>>>> as up to date?
>>> Not sure exactly what you mean by that. The heartbeat on the masters will
>>> be updated continuously every 0.5 seconds (all slaves are read-only; no
>>> writes are done there). If replication works and the slaves get updated,
>>> they will receive the heartbeat through the same replication channel as the
>>> rest of the updates. If replication doesn't work and the replicas do not get
>>> updated, they will not receive the heartbeat either, as it arrives in order
>>> through replication. If replication stops or fails, heartbeat updates will
>>> stop (from the slave's perspective), and lag will start to increase from your
>>> perspective (the difference between the last timestamp written and the
>>> current time).
>>>
>>> This measures the replication lag (i.e. the difference from the master), not
>>> the last time an edit was made by a user, which is what the first link I
>>> sent measured. In other words, if jaimewiki receives user edits only once an
>>> hour, heartbeat will still write to its master every half second, thus
>>> proving that the replica is up to date to that resolution. You can still
>>> check the last user edit by checking recentchanges.
>>>
>>> The only way this could fail (heartbeat updated but wiki not) is if
>>> there were a specific filter denying replication but allowing heartbeat,
>>> which is only done for specific tables and private wikis. Also, the
>>> production master could have a problem, but that would affect the wikis
>>> themselves, not only labs.
>>>
>>> To give you an idea of the accuracy of this method, we (will) use it on
>>> production to decide if a slave is usable or not to return up-to-date data.
>>>
>>> For more information on how this works, check
>>> <https://www.percona.com/doc/percona-toolkit/2.1/pt-heartbeat.html#description>
>>>
>> I don't understand; please explain it as you would to a 5-year-old :-)
> I'll try:
>
> * Each master server has a "heartbeat" table where it updates a row
> every 0.5 seconds with a timestamp value.
> * Each replica server has a copy of this heartbeat table that only
> receives updated timestamps via replication.
> * Each replica server also has a view that shows the difference
> between the current system time and the heartbeat table's timestamp.
> * This difference (current system time - last timestamp seen from
> master) is the true replication delta between the master and the
> replica.
> * The only way that a replica server could see an updated timestamp
> from the master without also having all other changes locally would be
> via explicit configuration that allowed heartbeat updates but excluded
> others.
>
> Bryan

Thank you :-)
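
For anyone who finds a toy model easier to follow, the explanation above can be sketched roughly as below. This is only an illustration, not the real pt-heartbeat or the actual Labs view: the names Replica, master_events and run_until are made up for the example, and time is passed in by hand instead of being read from the system clock.

```python
from dataclasses import dataclass, field


@dataclass
class Replica:
    """Toy replica (hypothetical): applies master events strictly in order."""
    heartbeat_ts: float = 0.0                    # last heartbeat timestamp received
    applied: list = field(default_factory=list)  # all other replicated changes

    def apply(self, event):
        kind, payload = event
        if kind == "heartbeat":
            self.heartbeat_ts = payload
        else:
            self.applied.append(payload)

    def lag(self, now):
        # What the lag view would report: current time minus last heartbeat.
        return now - self.heartbeat_ts


def master_events(start, end, interval=0.5):
    """Heartbeat writes the master makes every `interval` seconds (assumed 0.5)."""
    events = []
    n = 0
    while start + n * interval <= end:
        events.append(("heartbeat", start + n * interval))
        n += 1
    return events


def run_until(replica, events, stop_at):
    """Apply events in order until replication 'breaks' at time `stop_at`."""
    for event in events:
        if event[0] == "heartbeat" and event[1] > stop_at:
            break  # nothing after this point reaches the replica
        replica.apply(event)


if __name__ == "__main__":
    replica = Replica()
    run_until(replica, master_events(0.0, 10.0), stop_at=4.0)
    print(replica.lag(now=4.5))   # 0.5: replica is within one heartbeat
    print(replica.lag(now=10.0))  # 6.0: replication stopped, lag keeps growing
```

Because everything travels through one ordered stream, a fresh heartbeat on the replica implies that every earlier change has arrived as well, unless a filter lets the heartbeat through while blocking other tables.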


