[Labs-l] db replica query weirdness

Jaime Crespo jcrespo at wikimedia.org
Mon Aug 3 07:44:48 UTC 2015


MariaDB [(none)]> SELECT CAST(now() AS int);
+--------------------+
| CAST(now() AS int) |
+--------------------+
|     20150803094329 |
+--------------------+
1 row in set (0.00 sec)

On Mon, Aug 3, 2015 at 9:34 AM, Golden Ring <goldenring.wp at gmail.com> wrote:

> Silly me, assuming that the subtraction operator would do the right
> thing.  Er, any idea what the subtraction operator _does_ do?
>
> On 3 August 2015 at 05:33, Huji Lee <huji.huji at gmail.com> wrote:
>
>> You are seeing those gaps because the method you use to calculate account
>> age is not correct. Run this query:
>>
>> select user_id, user_registration,
>> (now() - str_to_date(user_registration, '%Y%m%d%H%i%s'))/86400 as
>> registered_time
>> from user
>> where user_id >= 25650026
>> and user_id <= 25650027;
>>
>> And you will see that those two accounts which are created back to back,
>> one at the end of 6/30/2015 and one at the beginning of 7/1/2015 will
>> return an account age of ~2000 and ~1181 respectively (two numbers that are
>> not that close).
>>
>> Instead, run this:
>>
>> select user_id, user_registration,
>> datediff(now(), str_to_date(user_registration, '%Y%m%d%H%i%s')) as
>> acct_age
>> from user
>> where user_id >= 25650026
>> and user_id <= 25650027;
>>
>> And you will get the account age in days (33 and 34 days in this example).
>>
>>
>> On Sun, Aug 2, 2015 at 1:12 PM, Golden Ring <goldenring.wp at gmail.com>
>> wrote:
>>
>>> I'm hoping that someone here can explain some query weirdness to me.
>>>
>>> I'm trying to get a list of the n most-recently-registered users who
>>> have done page moves, and how many page moves they've done.  To do this,
>>> I'm using this query:
>>>
>>> select
>>>        user_id,
>>>        user_name,
>>>        (now() - str_to_date(user_registration, '%Y%m%d%H%i%s'))/86400 as
>>> registered_time,
>>>        count(log_id) as moves
>>> from
>>>        user,
>>>        logging_userindex
>>> where
>>>        user_id = log_user
>>>   and (log_action = 'move' or log_action = 'move_redir')
>>> group by user_id
>>> order by user_id desc
>>> limit {};
>>>
>>> For a limit of about 100 rows, this gives plausible-looking results.
>>> But increasing the limit to 1000 gives some pretty bizarre results.  There
>>> are big gaps in the registered_time variable.  The range of the
>>> registered_time variable is about 0 - 4,000 days, but there are no results
>>> with registered times between approx 350 - 1160 days, 1500 - 2300 days, or
>>> 2670 to 3480 days.
>>>
>>> July 2013 is about 700 days ago, but a query to find users with page
>>> moves who registered on July 15, 2013 shows that such users do exist:
>>>
>>> select count(*) from user, logging_userindex where user_id = log_user
>>> and log_action='move' and user_registration like '20130715%';
>>>
>>> Result: 63 move actions from users registered on 20130715.
>>>
>>> What's going on here?  I assume I've simply got the first query wrong in
>>> some way, but I can't see how.
>>>
>>> Note that all of the above is against the enwiki replica.
>>>
>>> Thanks for any help,
>>> GoldenRing
>>>
>>>
>>> _______________________________________________
>>> Labs-l mailing list
>>> Labs-l at lists.wikimedia.org
>>> https://lists.wikimedia.org/mailman/listinfo/labs-l
>>>
>>>
>>
>> _______________________________________________
>> Labs-l mailing list
>> Labs-l at lists.wikimedia.org
>> https://lists.wikimedia.org/mailman/listinfo/labs-l
>>
>>
>
> _______________________________________________
> Labs-l mailing list
> Labs-l at lists.wikimedia.org
> https://lists.wikimedia.org/mailman/listinfo/labs-l
>
>


-- 
Jaime Crespo
<http://wikimedia.org>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <https://lists.wikimedia.org/pipermail/labs-l/attachments/20150803/31bbfbef/attachment-0001.html>


More information about the Labs-l mailing list