Hello!
Fyi, I just added 2 new panels to the Elasticsearch Grafana
dashboard [1] (located at the bottom).
1) Nodes with < 25% disk free
The idea is to be able to catch cluster imbalance earlier. There is a
filter on the data to show only nodes with < 25% disk free. At the
moment there are no nodes satisfying the criteria, so that graph is
empty. You can play with it by increasing the threshold...
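For illustration, the panel's filter boils down to a simple threshold check on free-disk percentage per node. A minimal sketch in Python (node names and byte counts are made up; the real panel works directly on the metrics Grafana already collects):

```python
def nodes_low_on_disk(nodes, threshold_pct=25.0):
    """Return names of nodes whose free-disk percentage is below threshold.

    `nodes` is a list of dicts like
    {"name": ..., "disk_total": <bytes>, "disk_free": <bytes>},
    roughly what Elasticsearch's _cat/allocation API reports per node.
    """
    low = []
    for node in nodes:
        free_pct = 100.0 * node["disk_free"] / node["disk_total"]
        if free_pct < threshold_pct:
            low.append(node["name"])
    return low


# Example: only the first node is below the 25% default threshold.
sample = [
    {"name": "elastic1001", "disk_total": 1000, "disk_free": 100},
    {"name": "elastic1002", "disk_total": 1000, "disk_free": 500},
]
print(nodes_low_on_disk(sample))
```

Raising `threshold_pct` is the equivalent of "playing with the threshold" in the panel until some nodes match.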
2) Request rate (nginx)
This is mostly the same data as the QPS graphs we already have, but at
a lower level. I like having different measurements of similar things,
so that when things are not as expected, we might have a chance to
understand why. For example, this shows ~500 requests/second in
eqiad, which are probably the Translate extension and the index
updates (I need to check).
Let me know if you have other ideas for things that make sense...
MrG
[1] https://grafana.wikimedia.org/dashboard/db/elasticsearch-percentiles
--
Guillaume Lederrey
Operations Engineer, Discovery
Wikimedia Foundation
Erik, Trey, David, Kevin, and I met this morning to discuss how we're going
to handle data collection for the upcoming TextCat test [1]. A big problem
in this particular case is that the system wasn't designed/engineered in a
way that's conducive to cross-wiki logging / session tracking. And
recently we even lost the ability to use referrer info to see which
page a user came from when navigating between wikis. (I was told this
was done for user privacy reasons.)
Erik said he had recently implemented a click event in the
TestSearchSatisfaction2 schema that we might be able to hook into to
measure clickthrough rate for users who are eligible for TextCat language
detection & get shown results in the language their non-English query
probably is written in. Whether we use this and how much we rely on this
particular method of measuring whether TextCat is successful (beyond just
measuring how it impacts the zero results rate) depends on the validation
[2] of the click events and how they compare to page visit events (which
cannot be fired in an interwiki context).
We also discussed an alternative approach which uses web requests with the
caveat being that if a user is selected for the test once, they'll be
selected every time. So if a particular IP+UA combination is part of the
test and performs 2 million searches (as is sometimes the case), then we'll
have to do some very careful filtering which will also exclude some
completely valid use cases (a computer lab in a school or a country with
only 2 public IP addresses). But we're shooting for being able to use
TestSearchSatisfaction2 :)
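To illustrate the caveat: a deterministic bucketing scheme like the sketch below (a hypothetical example, not the actual sampling code) always makes the same decision for a given IP+UA combination, which is exactly why a high-volume client that lands in the test stays in it for all of its searches:

```python
import hashlib


def in_test_bucket(ip, user_agent, sample_rate=0.01):
    """Deterministically decide whether an IP+UA combination is in the test.

    The decision is a pure function of IP+UA: the same client gets the
    same answer on every request, so one very busy client can contribute
    millions of events to the test bucket.
    """
    digest = hashlib.sha256(f"{ip}|{user_agent}".encode()).hexdigest()
    # Map the first 8 hex digits to a fraction in [0, 1).
    fraction = int(digest[:8], 16) / 2**32
    return fraction < sample_rate
```

The careful filtering mentioned above would then have to happen downstream, e.g. capping events per IP+UA, at the cost of also dropping legitimate shared-IP users.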
[1] https://phabricator.wikimedia.org/T121542
[2] https://phabricator.wikimedia.org/T132706
--
*Mikhail Popov* // Count Logula, Discovery
<https://www.mediawiki.org/wiki/Wikimedia_Discovery>
https://wikimediafoundation.org/
*Imagine a world in which every single human being can freely share in
the sum of all knowledge. That's our commitment.* Donate
<https://donate.wikimedia.org/>.
On Wed, Apr 13, 2016 at 6:16 PM, Kevin Smith <ksmith(a)wikimedia.org> wrote:
> This is a ping to remind you about action items we ended up with from our
> March retrospective, in case you haven't acted on them. Sorry this is so
> late!
>
> Mikhail: Check whether meta Discovery/Testing page is up-to-date
> Guillaume: talk to RobH to understand how procurement works
-> Done. I have probably not understood everything yet, but I'm
getting there (slowly). One of the things I've understood that might
interest some of you (you probably know it already, but you never know): there
are 2 workboards worth watching if you want to know the status of
hardware requests:
Hardware-request board [1]:
Tracks all hardware requests. If you are waiting for hardware but the
task does not show up on that board, your task is probably lost and
RobH is not working on it (and I'm probably not going to find it
either). Columns are self-explanatory and give you some visibility into
the status.
Procurement board [2]:
Once the hardware request is approved, a procurement task is created
and will move forward on the procurement board. As those tasks
contain price information that we are not allowed to share,
access is restricted.
The full procurement process is documented on Office wiki [3] (again,
some private info there, so not public).
> Should probably have an automatic task to announce each test
> Think more about velocity question: Hire more? Change process? Is it OK as
> is? Start doing guesstimations?
Before changing the way we do things (hire, process improvement,
estimation, ...) I think we should put in place a few metrics, so that
we know if our changes improve the situation or not. I had a quick
look into the reports available out of the box from Phabricator, and
they seem a bit lightweight (to say the least). I might just not be
looking in the right place (I have been known to do that).
The first metric I'd like to see is something about cycle time (how
long do we take to finish a task once we started working on it). Or a
Cumulative Flow Diagram, which should give us visibility on our cycle
time and might give more insight on its evolution over time.
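As a rough illustration of what I mean by cycle time (dates and helper names below are made up; Phabricator would be the real data source):

```python
from datetime import datetime


def cycle_time_days(started, finished):
    """Cycle time: elapsed days between starting and finishing a task.

    Dates are "YYYY-MM-DD" strings, e.g. from task activity timestamps.
    """
    fmt = "%Y-%m-%d"
    delta = datetime.strptime(finished, fmt) - datetime.strptime(started, fmt)
    return delta.days


def median_cycle_time(tasks):
    """Median cycle time over a list of (started, finished) date pairs.

    Median rather than mean, so one stuck task doesn't dominate.
    """
    times = sorted(cycle_time_days(s, f) for s, f in tasks)
    n = len(times)
    mid = n // 2
    return times[mid] if n % 2 else (times[mid - 1] + times[mid]) / 2
```

A Cumulative Flow Diagram is essentially the same per-task data, aggregated per day and per workboard column.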
As you can see, I'm a big fan of metrics. Now that I've made my point,
I'm not actually sure it makes sense to invest in better visibility for
our team. So far I have not felt like we have an issue delivering
value. Digging into how we work, creating metrics, and following them
does not come for free. So the usual "if it's not broken, don't fix it"
probably applies here. You know that better than I do...
> Announce the past test(s)
>
> I think Dan already did that third item. The fourth was probably on my
> plate, but I haven't had a chance to do it, so I'll probably leave it as an
> action item coming out of the April retro.
>
> Who would be best to handle that last item, or has someone already done it?
>
>
> Kevin Smith
> Agile Coach, Wikimedia Foundation
>
[1] https://phabricator.wikimedia.org/project/view/1014/
[2] https://phabricator.wikimedia.org/project/view/1155/
[3] https://office.wikimedia.org/wiki/Operations/Procurement
--
Guillaume Lederrey
Operations Engineer, Discovery
Wikimedia Foundation
@Antoine: you'll need to give a bit more context.
JustinO floated an idea on #wikimedia-discovery to provide an easy
way to use the dumps of our Elasticsearch indices. The idea itself
came from a Stack Overflow post [1]. At the moment, we do provide
the dumps [2], and while it is not rocket science to import them, it
isn't as straightforward as we could wish. Providing a Vagrant project
that takes care of using the correct version of Elasticsearch, has the
correct scripts, etc. would be nice.
As far as I know, there is no Phabricator ticket filed for this idea yet.
[1] http://stackoverflow.com/questions/36485614/import-wikipedias-indices-into-…
[2] https://dumps.wikimedia.org/other/cirrussearch/
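For what it's worth, the dumps are intended for bulk import (alternating action and source lines, as the Elasticsearch _bulk API expects), so an import helper mostly has to split them into reasonably sized chunks. A hypothetical sketch of that chunking step (the POSTing to _bulk is left out):

```python
def bulk_chunks(lines, chunk_size=500):
    """Split a dump in Elasticsearch bulk format into chunks of documents.

    `lines` yields the decompressed dump line by line; each document is
    two lines (an action line followed by a source line), so a chunk of
    `chunk_size` documents is 2 * chunk_size lines. Each yielded string
    is a ready-to-POST _bulk request body.
    """
    chunk = []
    for line in lines:
        chunk.append(line)
        if len(chunk) >= chunk_size * 2:
            yield "\n".join(chunk) + "\n"
            chunk = []
    if chunk:
        yield "\n".join(chunk) + "\n"
```

The Vagrant project would wrap this with `gzip.open()` on the dump file and an HTTP POST of each chunk to the local Elasticsearch's `/_bulk` endpoint.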
On Tue, Apr 12, 2016 at 5:23 PM, Antoine Boegli
<antoine.boegli(a)gmail.com> wrote:
> vagrant boxes are not finished for the moment, and I think they will not be
> available until next week (not enough time)
>
> Q : is it okay if I put the boxes themselves in the Atlas service by
> Hashicorp ? If yes, is there already some WMF account ?
>
> 2016-04-12 16:57 GMT+02:00 Giuseppe Lavagetto <glavagetto(a)wikimedia.org>:
>>
>> On Mon, Apr 11, 2016 at 12:06 PM, Guillaume Lederrey
>> <glederrey(a)wikimedia.org> wrote:
>> > Short status about the micro-hackathon that took place in my kitchen
>> > last Saturday:
>> >
>> > First of all, thanks to Joe for his support! And loads of thanks to
>> > Jan, Alex, Nicko and Antoine for participating!
>> >
>>
>> [CUT]
>>
>> This is seriously great and sorry for not being around more - I'd have
>> loved to actually help instead of delivering some random advice on IRC
>> in the evening.
>>
>> Thanks to everyone involved, and please bug me on irc/phabricator if
>> you need help/feedback with your patches :)
>>
>> Cheers,
>>
>> Giuseppe
>> --
>> Giuseppe Lavagetto, Ph.d.
>> Senior Technical Operations Engineer, Wikimedia Foundation
>
>
>
>
> --
> Antoine Boegli
> software engineer & linux expert
--
Guillaume Lederrey
Operations Engineer, Discovery
Wikimedia Foundation
Short status about the micro-hackathon that took place in my kitchen
last Saturday:
First of all, thanks to Joe for his support! And loads of thanks to
Jan, Alex, Nicko and Antoine for participating!
The main goal was to have fun and introduce a few people to what we
are doing. At the same time, we did manage to get some actual work
done.
We did have a look at the following:
* T128786 [1]: Improve robustness of es-tool
Implementation is done, needs to be tested somewhere and deployed.
Thanks Alex!
* T78342 [2]: Create a basic RSpec unit test for operations/puppet
Some work has been done, but it is not yet in a state where it can
be merged. Nicko will continue to look into it and let us know.
* T130861 [3]: Investigate possible simplification of Cassandra
Logstash filtering
Implementation is done, needs to be tested somewhere and deployed.
Thanks Jan!
* T131760 [4]: Add icinga monitoring for varnish statistics daemons
Implementation is done, needs to be tested somewhere and deployed.
Thanks Alex!
Antoine also had a look into offering a Vagrant image with a fully
working Elasticsearch (with indices from our dumps). Not yet working.
We might see Jan, Alex, Nicko and Antoine on IRC, trying to push their
changes up to completion...
Note: you know you have a great job when you are happy to continue
doing it on the weekend AND you have friends coming over to do it with you
just for fun!
[1] https://phabricator.wikimedia.org/T128786
[2] https://phabricator.wikimedia.org/T78342
[3] https://phabricator.wikimedia.org/T130861
[4] https://phabricator.wikimedia.org/T131760
--
Guillaume Lederrey
Operations Engineer, Discovery
Wikimedia Foundation
So wdqs1002 is not going well. Long story short: we're going to need
to reinstall it. That's a first for me, so I'm going to need some
help. I'll ping the Ops side to get the low-level stuff, but I'm
probably going to need a bit of time from Stas and a few pointers in
the right direction.
@Stas: expect to hear from me in your morning ...
I'll give you all the details tomorrow, but right now I need some sleep...
Good night.
--
Guillaume Lederrey
Operations Engineer, Discovery
Wikimedia Foundation
WDQS servers were scheduled for a reboot during the weekly deployment
window. While the first server rebooted without issue, things did not
go as well with the second one. We do not know yet what the issue is,
but we are investigating [1].
This has no direct impact on end users: we can run on a single
server, as long as we don't lose it as well...
[1] https://phabricator.wikimedia.org/T132387
--
Guillaume Lederrey
Operations Engineer, Discovery
Wikimedia Foundation