For Wikimedia folks who are interested in possible collaborations with OSM,
now seems like a good time to start thinking about possible presentations.
Staff from the Wikimedia Foundation, and/or Wikimedia volunteers from
around the US outside of the Seattle area, may want to start thinking about
travel plans.
If you're a Wikimedia volunteer outside of Cascadia Wikimedians territory,
you might consider applying for WMF Travel and Participation Support grants [1].
If you're inside of Cascadia Wikimedians territory and would like to attend
the conference, we may have funds in our budget that can support your
attendance. Contact me off-list for details.
Regards,
Pine
[1] https://meta.wikimedia.org/wiki/Grants:TPS
---------- Forwarded message ----------
From: Clifford Snow <clifford(a)snowandsnow.us>
Date: Tue, Mar 1, 2016 at 5:18 PM
Subject: [opensource-107] Seattle to host the 2016 OpenStreetMap State of
the Map US Conference
To: opensource-107-announce(a)meetup.com
I am excited to announce that Seattle was chosen to host the OpenStreetMap
2016 State of the Map US Conference. The conference will take place July
23-25 on SeattleU's campus. We chose SeattleU for its low cost, proximity
to Seattle, and access to public transit. The food trucks nearby didn't
hurt either.
We are looking for help! Let us know if you want to pitch in. The request
for presentation proposals should go out fairly soon. Start thinking about
what you want to present or teach.
The formal announcement can be found at:
https://openstreetmap.us/2016/02/sotmus-2016/
Clifford
Hi everybody,
Here's an update of what the team has been up to:
*Stats are updated!*
Yesterday we:
- updated www.wikipedia.org stats,
- updated the sister portals with the latest version from Meta.
- www.wikibooks.org
- www.wikinews.org
- www.wikiquote.org
- www.wikiversity.org
- www.wiktionary.org
In order to improve the process, Deborah created a recurring task for it:
T128546 <https://phabricator.wikimedia.org/T128546>.
It's pretty cool to see what's changed :)
- www.wikipedia.org
(https://github.com/wikimedia/wikimedia-portals/commit/864d5ebed066aa3267a17…)
- sister portals
(https://github.com/wikimedia/wikimedia-portals/commit/2e57bfbd83acce979a25b…)
*Deploying the enhanced search box to production*
Phab: https://phabricator.wikimedia.org/T125571
We had a list of improvements to make before pushing to production.
It took us longer than expected, especially because the language picker
implementation was not 100% production-ready:
- IE8 (and lower) users were excluded from the A/B test.
- We initially decided not to support IE8 for the A/B test in order
to save development time (in case the test results showed no
improvement).
- The mobile user experience was not optimal because of the custom dropdown.
- We decided to figure this out only once we had the test results,
again to save development time (in case the test results showed no
improvement).
- The language picker relied on JavaScript, and no solution was provided
for non-JS users.
We learned a lot from this first A/B test. The test showed a significant
improvement, and we are now getting ready for a deployment to
production. Here's the latest update:
- We replaced the custom dropdown, which was triggered by JavaScript, with
a native <select> element.
- We styled this native <select> element as closely as we could (to
match what we had in the A/B test),
- but it may look a little odd in old browsers (there isn't much
we can do with <select> elements).
- This solves the mobile user experience issues, because the device's
native selector will be triggered and is a lot easier to use than a
custom dropdown.
- This also handles non-JS traffic (~ 7%
<https://upload.wikimedia.org/wikipedia/commons/e/e6/Analysis_of_Wikipedia_P…>)
because it's a native <select> element (no JS required).
- Minor detail: only the custom arrow is not clickable, because that
would need a little JS hack.
*Status: *The patch made it to code review today. We will release when it's
approved.
As of today, here is what you can expect: [screenshot]
We have an idea on how to make it even better for old and weird browsers,
but we want to move forward with this now :)
And the new typeahead is fantastic!!!
*Next A/B test: Use language detection to re-arrange the primary links to
suit the user better*
(Primary links = the 10 wiki links in the screenshot above).
Phab: https://phabricator.wikimedia.org/T125472
For the test, we read the user's preferred languages (from the browser) and
show the corresponding wikis in the top positions.
Let's see if people click on these links more than usual!
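For the curious, the re-ranking idea can be sketched roughly like this. This is a minimal Python sketch, not the actual portal code (which runs in the browser); the function names and the sample link list are illustrative, and the preferred languages are assumed to arrive as an Accept-Language-style string:

```python
# Sketch: re-rank the portal's "primary links" so wikis matching the
# user's preferred languages come first. Names (PRIMARY_LINKS,
# rank_links) are illustrative, not from the real patch.

def parse_accept_language(header):
    """Return primary language subtags sorted by q-value, highest first."""
    langs = []
    for i, part in enumerate(header.split(",")):
        piece = part.strip()
        if not piece:
            continue
        q = 1.0
        if ";q=" in piece:
            piece, qval = piece.split(";q=", 1)
            try:
                q = float(qval)
            except ValueError:
                q = 0.0
        # Keep the primary subtag only: "en-US" -> "en"
        langs.append((q, -i, piece.split("-")[0].lower()))
    langs.sort(reverse=True)  # highest q first; ties keep header order
    seen, ordered = set(), []
    for _, _, code in langs:  # de-duplicate while preserving order
        if code not in seen:
            seen.add(code)
            ordered.append(code)
    return ordered

def rank_links(links, accept_language):
    """Move wikis matching the user's preferred languages to the front."""
    preferred = parse_accept_language(accept_language)
    def key(code):
        return preferred.index(code) if code in preferred else len(preferred)
    return sorted(links, key=key)  # stable: non-preferred keep their order

PRIMARY_LINKS = ["en", "de", "fr", "es", "ru", "ja", "it", "zh", "pt", "pl"]
print(rank_links(PRIMARY_LINKS, "fr-CH,fr;q=0.9,de;q=0.7,en;q=0.3"))
# -> ['fr', 'de', 'en', 'es', 'ru', 'ja', 'it', 'zh', 'pt', 'pl']
```

The sort is stable, so wikis the user has no preference for keep their original order.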
Please come to Deborah or us if you have any questions :)
*Status:* The patch made it to code review today.
We will merge it into its feature branch as soon as it's approved.
Then we will review the A/B test setup one more time and schedule a launch
date.
We hope to get the new search box in production before we launch the A/B
test.
*Improving performance*
Performance matters... it's a huge part of the user experience.
We only talk about it when it's bad, and we often forget it when it's good.
Performance improvements can definitely increase user engagement.
Take a look at what we've done since November:
https://grafana.wikimedia.org/dashboard/db/webpagetest-portals :)
For the Wikipedia.org portal team,
Julien.
Greetings language nerds,
I've completed the creation of a 21-language balanced (i.e., 200 queries
each) corpus of relatively clean queries for use in evaluating language
identification models. The 21 languages were chosen based on query
volume across wikis in those languages. I've also evaluated our current
version of TextCat against this corpus, using both the known 21 languages
and all 59 languages I have models for.
The 21 languages have pretty good models, because they were built on lots
of query volume. The full set of 59 is a bit more dodgy, esp. Igbo,
which is known to have a lot of English in its training data.
Indonesian is the most unexpectedly poor performer of the bunch (most
other poor performance is across language or script families and so is
expected).
The best model size among those tested (500 to 10,000) was the full 10,000!
However, performance at the 3,000-ngram model size (what we've been using
for A/B tests) was only a few percentage points worse.
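For context, TextCat-style identification compares ranked character-n-gram profiles: each language model is a frequency-ranked list of n-grams, and a query is assigned to the language whose profile it is least "out of place" against. A minimal Python sketch (toy training strings, not our real query corpora; function names are illustrative, and real models use thousands of n-grams rather than these tiny profiles):

```python
# Sketch of character-n-gram language identification in the style
# TextCat uses (Cavnar & Trenkle's out-of-place measure).

from collections import Counter

def ngrams(text, n_max=3):
    """Yield all 1..n_max character n-grams, with '_' marking word edges."""
    text = "_" + text.replace(" ", "_") + "_"
    for n in range(1, n_max + 1):
        for i in range(len(text) - n + 1):
            yield text[i:i + n]

def profile(text, size=3000):
    """Ranked list of the most frequent n-grams (the 'model')."""
    counts = Counter(ngrams(text.lower()))
    return [g for g, _ in counts.most_common(size)]

def distance(query_profile, lang_profile):
    """Sum of rank differences; n-grams missing from the model get
    a maximum out-of-place penalty."""
    penalty = len(lang_profile)
    rank = {g: i for i, g in enumerate(lang_profile)}
    return sum(abs(i - rank[g]) if g in rank else penalty
               for i, g in enumerate(query_profile))

def identify(query, models):
    """Pick the language whose profile is closest to the query's."""
    query_profile = profile(query)
    return min(models, key=lambda lang: distance(query_profile, models[lang]))

# Toy training data; real models are built from large query corpora.
models = {
    "en": profile("the quick brown fox jumps over the lazy dog and the cat"),
    "de": profile("der schnelle braune fuchs springt ueber den faulen hund und die katze"),
}
print(identify("der hund und die katze", models))  # -> de
```

The `size` cap on `profile` is the "model size" discussed above: larger profiles capture rarer n-grams at the cost of bigger models.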
Full write up with lots more details here:
https://www.mediawiki.org/wiki/User:TJones_(WMF)/Notes/Balanced_Language_Id…
I'll commit models for the rest of these 21 languages after verifying that
they won't mess up our A/B tests.
Cheers,
—Trey
Trey Jones
Software Engineer, Discovery
Wikimedia Foundation
I have updated the Discovery page
<https://www.mediawiki.org/wiki/Wikimedia_Discovery#The_team> on
mediawiki.org to convey who is working on what. If you're curious, take a
look. Please note the disclaimer: this is only intended to roughly convey
who is working on what, and there are no guarantees that this is accurate
to any particular level of detail.
Thanks!
Dan
--
Dan Garry
Lead Product Manager, Discovery
Wikimedia Foundation
I seem to have forgotten when we last discussed this, but the week of March
22, when we plan to roll the feature to prod, is the same week tech ops is
planning to test shifting all traffic to our failover data center in
Dallas. They have requested no deployments that week to limit the variance
for bugs that will invariably crop up. I think Discovery is mostly ready,
but we will need to move comms a bit faster.
We could also push back to the following week. I will be out mid-week,
traveling to Israel, but I trust dcausse and gehel can handle the staged
rollout regardless of my availability.
I am finishing the upgrade of Elasticsearch to 1.7.5 for codfw (eqiad
still to do). For this, I used a small script [1], heavily inspired by
(copied / stolen from / ...) bd808. The script is ugly, but it does
the job. It runs the deployment over a list of hosts, pausing for
manual steps / validations along the way. The script runs locally on
my workstation, so it is subject to loss of connectivity and local
crashes, and it is hard to hand over.
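The shape of such a script can be sketched like this. This is a hedged Python sketch of the general pattern (run host by host, pause for operator validation between hosts), not the actual script linked below; the host names and the upgrade command are placeholders:

```python
# Sketch of a rolling-upgrade driver: run an upgrade command on one
# host at a time, pausing so an operator can validate cluster health
# before continuing. Hosts and command are placeholders.

import subprocess

HOSTS = ["elastic2001.codfw.example", "elastic2002.codfw.example"]
UPGRADE_CMD = "sudo apt-get install -y elasticsearch=1.7.5"  # placeholder

def upgrade(host, dry_run=True):
    """Run the upgrade command on one host over ssh (or just echo it)."""
    cmd = ["ssh", host, UPGRADE_CMD]
    if dry_run:
        return "DRY RUN: " + " ".join(cmd)
    result = subprocess.run(cmd, check=True, capture_output=True, text=True)
    return result.stdout

def rolling_upgrade(hosts, confirm=input, dry_run=True):
    """Upgrade hosts one by one, pausing for manual validation between them."""
    for host in hosts:
        print(upgrade(host, dry_run=dry_run))
        # Pause: the operator checks cluster health before the next host.
        answer = confirm(f"{host} done. Continue to next host? [y/N] ")
        if answer.strip().lower() != "y":
            print("Stopping rollout.")
            break

# Demo with auto-confirm so the sketch runs non-interactively.
rolling_upgrade(HOSTS, confirm=lambda prompt: "y")
```

Injecting `confirm` keeps the pause-for-validation step testable; in real use the default `input` makes each step interactive.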
Do we have a central place for these kinds of scripts? I'd like to
version it somewhere more obvious than my personal GitHub repo. Do we
have examples of similar scripts? A specific tool for this? Rundeck
[2] comes to mind. Note that I'm not a huge fan of Rundeck, as it
brings far too much complexity for simple tasks, but the concept of
having a central place for reusable operational components is
appealing.
[1] https://github.com/gehel/elasticsearch-utility-scripts
[2] http://rundeck.org/
We took some time with Brandon last Friday to have a presentation on
the Varnish / caching infrastructure. Brandon did not have a lot of
time to prepare this, so it ended up being more of a conversation,
mainly driven by our questions. Brandon did have some slides, but they
were just a very light support to our conversation.
Honestly, that's my preferred format. (Thank you, Brandon, for being
busy with more important stuff and still taking the time to answer our
questions!)
If keeping this fairly unstructured format helps to have more regular
Ops Sessions, I'm all for it!
I also had two friends / former coworkers with me for this session.
They both work on the caching infrastructure of Nespresso,
and they found it really interesting. If we could open up some of our
Ops Sessions, I think there could be quite a few people interested in
watching them. And in my understanding, this would be quite aligned
with our mission of disseminating the world's knowledge (and we do
have a sizeable body of technical knowledge to disseminate).
I can see a few constraints in opening those Ops Sessions to a wider
audience. We still need some private time to discuss the things that
are sensitive. And we need to find a way for this not to add
significant overhead to our busy schedules. Still, I think it would be
great if we could do it...
What do you think?
And again, big thanks to Brandon for his time!
MrG