[Engineering] [Analytics Cluster] Downtime announcement for Oozie/Hive - Dec 7 10AM CET

Goran Milovanovic goran.milovanovic_ext at wikimedia.de
Thu Dec 7 11:45:02 UTC 2017


Hi Luca,

well, given that you already have to deal with Hive today, I wanted to
report that over the past few days the HS2 server has rejected my queries
a few times, with the error indicating that the most likely cause is the
number of open connections. I guess some defensive programming in my R
scripts will take care of running the queries when the load is lighter;
however, nothing similar has happened in the previous months, so I wanted
to let you know.
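
A minimal sketch of that kind of defensive retry, written in Python rather
than R purely for illustration. Everything here is an assumption, not the
actual scripts: `run_query` is a hypothetical zero-argument callable that
submits one Hive query and raises an exception when HS2 rejects the
connection, and the attempt count and delays are arbitrary defaults.

```python
import random
import time


def run_with_retry(run_query, max_attempts=5, base_delay=2.0, sleep=time.sleep):
    """Call run_query(); on failure, wait with exponential backoff and retry.

    run_query is a hypothetical placeholder for whatever submits the Hive
    query; it is assumed to raise an exception when the server rejects it.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return run_query()
        except Exception:
            # A real script should catch only the client's connection errors;
            # the broad catch here keeps the sketch library-agnostic.
            if attempt == max_attempts:
                raise  # out of attempts: surface the last error
            # Wait base_delay, then twice that, and so on, plus up to one
            # second of jitter so parallel scripts do not retry in lockstep.
            delay = base_delay * 2 ** (attempt - 1)
            sleep(delay + random.uniform(0, 1))
```

The `sleep` parameter exists only so the waiting can be replaced in a dry
run; by default the function really sleeps between attempts.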

You know that I'm not in Data Engineering, so I have no idea whether this
is related to the HS2 settings as they were planned by
Analytics-Engineering. Maybe nothing needs to be changed. Just wanted to
let you know.

Good luck with the daemon.

Best,

Goran S. Milovanović, PhD
Data Scientist, Software Department
Wikimedia Deutschland

------------------------------------------------
"It's not the size of the dog in the fight,
it's the size of the fight in the dog."
- Mark Twain
------------------------------------------------

On Thu, Dec 7, 2017 at 12:36 PM, Luca Toscano <ltoscano at wikimedia.org>
wrote:

> Hi everybody,
>
> we are experiencing some issues with the Hive daemon, so currently Hive
> queries are not available. I am going to update this thread as soon as the
> issue is over.
>
> For more info, please contact me (elukey) on IRC (#wikimedia-analytics).
>
> Sorry for the trouble!
>
> Luca
>
> 2017-12-06 19:47 GMT+01:00 Luca Toscano <ltoscano at wikimedia.org>:
>
>> Hi everybody,
>>
>> we'd need to reboot the analytics1003 host for Linux kernel and openjdk
>> updates tomorrow Dec 07 at 10 AM CET. Hive and Oozie will stop for a
>> (hopefully) brief amount of time, but since they'll need to stop before the
>> reboot it might happen that in flight jobs/queries fail. We'll try to avoid
>> the reboot if too many jobs are running, but at some point we'll need to
>> pull the trigger.
>>
>> Please let me know on IRC (#wikimedia-analytics, elukey) or via email if
>> you have any issue with this maintenance.
>>
>> Thanks and sorry for the trouble!
>>
>> Luca (on behalf of the Analytics team)
>>