Hey all,
I'm new and trying to wrap my head around how we present all the work
that's going on in Discovery. I have a thought I wanted to run by folks
that (I think) will help us provide a more consistent introduction to the
various projects being worked on.
Right now Wikimedia Discovery
<https://www.mediawiki.org/wiki/Wikimedia_Discovery> is a wealth of
information, but there isn't much consistency on what each project page
contains. Some are labeled as "Wikimedia engineering activity", some don't
have a page about the project (Search links to the extension), etc.
My thought is to bring the minimum for each project to a consistent level.
For example, the page for Project A would cover what it is, what we are
trying to do, the members involved, the roadmap, current progress, where to
give feedback, etc. Each project would have its own page with this minimum
and a consistent layout.
Related, I also have the sub-hub for the Portal marked for translation
<https://meta.wikimedia.org/wiki/Special:PageTranslation> and I think it
would be wise to do the same elsewhere.
If there's any dissent, let me know; otherwise, full steam ahead.
--
Yours,
Chris Koerner
Community Liaison - Discovery
Wikimedia Foundation
Hi!
I'd like to describe the refactoring we are doing on SearchEngine/prefix
search. The goal is to bring prefix search into the SearchEngine API
and use that unified API for prefix searches, which will also allow us to
use the new Elasticsearch completion suggester in many places where prefix
search is done.
Please comment if you see any problem in this or have any suggestions.
The current plan is as follows:
1. SearchEngine gets the following new API functions:
public function defaultPrefixSearch( $search );
public function completionSearch( $search );
public function completionSearchWithVariants( $search );
defaultPrefixSearch is for simple prefix searches (namespace lists,
special pages, etc.).
completionSearch* is for completions that need scoring, fuzzy matching,
and so on.
2. There's also an internal function:
protected function completionSearchBackend( $search )
This is what SearchEngine implementations (like CirrusSearch) will
override. The SearchEngine base class deals with namespace handling,
result ordering, etc.
3. TitlePrefixSearch and StringPrefixSearch will be deprecated (but will
stay in the code almost unchanged for now); using the SearchEngine APIs is
recommended instead: SearchEngine::defaultPrefixSearch() or
completionSearch(), depending on whether simple or advanced handling is
desired.
4. The PrefixSearchBackend and PrefixSearchExtractNamespace hooks will be
deprecated; overriding completionSearchBackend() and
normalizeNamespaces() in SearchEngine is recommended instead. For now,
these hooks will be supported by the base SearchEngine implementation, but
not by CirrusSearch.
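The division of labor in steps 1-4 resembles a template-method pattern: the
base class owns namespace handling and ordering, while backends override one
protected hook. A rough Python analogy (the method names mirror the proposed
PHP API, but the bodies are invented for illustration and are not MediaWiki
code; completionSearchWithVariants is omitted for brevity):

```python
# Illustrative analogy of the proposed SearchEngine API. Names mirror
# the PHP plan; the logic is a simplified sketch, not MediaWiki's.

class SearchEngine:
    def default_prefix_search(self, search):
        # Simple prefix matching (namespace lists, special pages, etc.).
        return self._completion_search_backend(search)

    def completion_search(self, search):
        # Scored/fuzzy completions: the base class normalizes namespaces
        # and orders results, then delegates matching to the backend.
        _namespaces, term = self._normalize_namespaces(search)
        return sorted(self._completion_search_backend(term))

    def _completion_search_backend(self, search):
        # Implementations (a CirrusSearch equivalent) override this.
        raise NotImplementedError

    def _normalize_namespaces(self, search):
        # Strip a leading "Namespace:" prefix, if any (simplified).
        if ':' in search:
            ns, _, rest = search.partition(':')
            return [ns], rest
        return [], search

class CirrusLikeEngine(SearchEngine):
    def _completion_search_backend(self, search):
        # A real backend would query Elasticsearch's completion
        # suggester; here we filter a tiny hard-coded title list.
        titles = ['Paris', 'Park', 'Parsing', 'Poland']
        return [t for t in titles if t.lower().startswith(search.lower())]

engine = CirrusLikeEngine()
print(engine.completion_search('par'))  # matching titles, sorted
```

The point of the split is that hooks like PrefixSearchBackend become
unnecessary: extensions override the backend method instead.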
The task for this is https://phabricator.wikimedia.org/T121430; it also
links the patches as they stand now (still work in progress).
--
Stas Malyshev
smalyshev(a)wikimedia.org
Hey all,
A couple of weeks ago we ran an A/B test on the Wikipedia portal
(www.wikipedia.org) to test whether a more prominent search box,
optionally combined with additional metadata such as small images in
the search results, would increase the rate at which people clicked
through from the portal to one of our projects.
We are delighted to say that the test showed a 1-5% increase in the
clickthrough rate where both a prominent search box and metadata are
used. Accordingly, once we've resolved concerns about the design's
non-JavaScript usability, we hope to deploy it for all users.
The report can be seen at
https://commons.wikimedia.org/wiki/File:First_Portal_Test.pdf - please
let me know if you have any questions.
For Discovery Analytics,
--
Oliver Keyes
Count Logula
Wikimedia Foundation
These results are really promising. I'm excited to share them with an
audience larger than discovery-l if that's appropriate. Any suggestions on
a good audience to start with? I took a stab at updating the Portal
Improvements page
<https://meta.wikimedia.org/w/index.php?title=Wikipedia.org_Portal_Improveme…>
and would appreciate any feedback/corrections.
--
Yours,
Chris Koerner
Community Liaison
Wikimedia Foundation
Hello,
I have a few questions about the Discovery portal and API:
- Is it possible to be included in the 0.05% who tried the A/B test?
- Do you have, or plan to have, a feature for discovery between topics? E.g.
suggestions connecting two topics. I've been working on this.
- What is the difference in API:Search between list=search and
generator=search? The first seems faster to me.
- I need to fetch the pageIDs of the results, with redirects already
resolved and the results decorated with images and a snippet/excerpt.
With generator=search I can do it; I can't with list=search.
Could you help me grasp the query parameters?
#generator=search
params = {'action': 'query', 'generator': 'search', 'gsrnamespace': 0,
          'gsrsearch': keywords, 'gsrlimit': 20, 'prop': 'pageimages|extracts',
          'pilimit': 'max', 'exintro': '', 'explaintext': '', 'exsentences': 3,
          'exlimit': 'max', 'redirects': ''}
#list=search
params = {'action': 'query', 'list': 'search', 'srsearch': keywords,
          'srlimit': 20, 'srprop': 'size', 'indexpageids': 1}  # ??
I'm trying it in the API sandbox with no success; only search results come
back, no matter what:
https://en.wikipedia.org/wiki/Special:ApiSandbox#action=query&list=search&f…
- I also found that formatversion=2 is another parameter used with
list=search. Does it have a special meaning?
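A hedged sketch of the difference the question asks about (the dictionaries
below are hand-made samples shaped like documented action=query responses,
with invented titles and page IDs, not live API data): with list=search,
each hit in query['search'] already carries a pageid; with generator=search,
pages arrive in query['pages'] keyed by page ID, decorated by prop modules
such as extracts and pageimages.

```python
# Hand-made sample responses (invented titles/IDs), shaped like the
# documented action=query output; not real API data.

# list=search: hits live in query['search'], each with its own pageid.
list_response = {
    'query': {
        'search': [
            {'pageid': 12345, 'title': 'Example page', 'snippet': '...'},
        ]
    }
}
pageids = [hit['pageid'] for hit in list_response['query']['search']]

# generator=search: the same matches come back as query['pages'], keyed
# by page ID, and prop modules (extracts, pageimages) attach their
# fields to each page entry.
generator_response = {
    'query': {
        'pages': {
            '12345': {'pageid': 12345, 'title': 'Example page',
                      'extract': 'First sentences of the article...'},
        }
    }
}
gen_pageids = [p['pageid']
               for p in generator_response['query']['pages'].values()]

print(pageids, gen_pageids)
```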
I just added a section to the Discovery process page[1] documenting the
guidelines that the Portal team uses for story point estimation. I would
like to include a corresponding section for the Analysis team. It can be
entirely different, since story point values never cross team boundaries.
[1] https://www.mediawiki.org/wiki/Wikimedia_Discovery/Process#Story_Points
Kevin Smith
Agile Coach, Wikimedia Foundation
About fetching pageIDs with a generator module on list=search,
could you please debrief?
I'll try to be more specific.
I understood that generators take a list of titles, and I would like to
return the pageIDs of the titles in that list.
But I cannot get it to work; it looks like the generator gives a different
output, although the documentation says it doesn't.
As an example, I query "DJ Tiesto" with list=search,
then use generator=allpages with indexpageids selected.
The results in query['search'] are relevant, but the pageIDs in
query['pageids'] are not related to the listed titles:
https://en.wikipedia.org/wiki/Special:ApiSandbox#action=query&list=search&f…
How can I have the items in query['search'] completed with pageIDs along
their snippets, or get a dictionary of matching titles?
Searching with generator=search (rather than list=search) does work:
https://en.wikipedia.org/wiki/Special:ApiSandbox#action=query&prop=extracts…
I don't understand how the two (generator=search and list=search) differ.
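One way to read the behavior described above (hedged: this is my reading of
the API's documented design, not an authoritative answer): within one
action=query request, a list module and a generator run independently. The
generator feeds prop modules, while list=search results are a separate
array, so generator=allpages cannot "index" the search hits. Since every
list=search hit already includes a pageid, a title-to-pageID dictionary can
be built without any generator:

```python
# Within one action=query request, list modules and generators run
# independently: generator=allpages feeds prop modules, and its pages
# are unrelated to the list=search hits. Every list=search hit already
# carries a pageid, so a title -> pageID mapping needs no generator.

sample = {  # invented sample shaped like an action=query response
    'query': {
        'search': [
            {'pageid': 1001, 'title': 'Example A', 'snippet': '...'},
            {'pageid': 1002, 'title': 'Example B', 'snippet': '...'},
        ],
        # Pages produced by a simultaneous generator=allpages would
        # appear here, unrelated to the search hits above:
        'pages': {'7': {'pageid': 7, 'title': 'Unrelated page'}},
    }
}

title_to_pageid = {hit['title']: hit['pageid']
                   for hit in sample['query']['search']}
print(title_to_pageid)
```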
Hi,
http://en-suggesty.wmflabs.org/suggest.html is updated with a score that
integrates pageviews.
Pageviews solve most of the problems we encountered with the previous
formula; unfortunately, we now see some porn-related suggestions:
- 'x' will suggest 'xxx'
- 'po' will suggest 'pornhub' just below 'poland', in 2nd position; it is
also ranked #6 for the query 'p'
I just wanted to let you know about this and would like to know if it's
something we should address.
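The scoring formula isn't described in this message, so purely as a hedged
illustration: a sketch assuming a score that gates on the prefix match and
then weights by log-scaled pageviews, which shows why one or two typed
letters can surface very high-traffic titles. The candidate titles and
pageview counts are invented, and this is not the actual Suggesty formula.

```python
import math

# Illustrative only: assumes prefix-gated, log-pageview-weighted
# scoring. The titles and pageview counts below are invented.

def score(title, prefix, pageviews):
    if not title.lower().startswith(prefix.lower()):
        return 0.0
    return math.log1p(pageviews)

candidates = {'Poland': 9_000_000, 'Pornhub': 5_000_000, 'Pottery': 90_000}
ranked = sorted(candidates,
                key=lambda t: score(t, 'po', candidates[t]),
                reverse=True)
print(ranked)  # highest-pageview prefix matches first
```

Under such a scheme, any mitigation would have to come from outside the
pageview term (e.g. filtering), since the traffic signal itself is what
promotes those titles.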
Thanks for your feedback.
David.
As we prepare to bring our new Discovery ops person on board in a couple
weeks, we have been talking about how to integrate Guillaume's workflow
into our process. The following strawdog proposals are based on
conversations that involved me, Dan, Guillaume, and others.
1. Meetings
We propose having Guillaume attend the search team standups for now, since
90+% of his initial work will relate to search. He would probably not attend
any of the sprint planning or backlog grooming meetings. It's not clear whether
we should have a weekly Ops planning meeting, which would resemble the
Analysis planning meetings we already have.
2. Phabricator
We would create a Discovery-ops-sprint project/board, which will represent
the work of the Discovery ops team. This aligns with how we have handled
the Maps and WDQS sub-teams, which are also very small teams.
3. Learn and iterate
Whatever we end up trying as a starting point, we'll inspect and adapt it
as we go.
Any questions, comments, or concerns?
Kevin Smith
Agile Coach, Wikimedia Foundation
The Discovery Portal team has been thinking about a Portal Labs page.
The idea is that we can implement some revolutionary ideas for our portal
page, things completely different from what the current portal page looks
like, and deploy them on this site for real users to try, without imposing
a disruptive experience on our users. We could put a link to this page on
the production portal page (at the bottom?), and users would have the
option to bookmark the page, and maybe make one of the experiments their
default.
We would have two trains:
- Slow train: running regular A/B tests (like the one we just ran) on the
official portal page and deploying small improvements as we learn.
- Faster train: "revolutionary" prototypes in Labs, where we also collect
traffic and clickthrough rates to measure user satisfaction. We could also
implement a "Send Feedback" feature (or have a link on the prototype page
that points to a Phab ticket where the community can add comments/feedback).
To give you an example of what we mean by revolutionary ideas, I uploaded
some of my research time work:
https://people.wikimedia.org/~jgirault/
Pay closer attention to:
Trending, showing top 9 articles (grid)
<https://people.wikimedia.org/~jgirault/react-top9-cards/>
Trending, showing top 10 articles (full screen)
<https://people.wikimedia.org/~jgirault/react-top10-fs/>
This would allow us to think outside the box and test different
layouts/features with real users who choose to participate.
This is kind of a crazy idea, and we want to know what you all think about
it. Also we would need some naming ideas for it. Portal labs, or beta
portal, or something else.
Please let us know what you think about this, how you think we can go
towards making this happen and what we should name it.
Thanks!
Julien