Hello,
I'm the new Product Manager for the Discovery Portal team and I'm happy to
be here with all the cool kids! :)
I'm also very happy to announce that last night we launched a new A/B
test on the Wikipedia portal (www.wikipedia.org). This test aims to
identify whether a more prominent search box, and/or a more informative
set of search results (featuring a small image and a descriptive label
for each result), increases the rate at which users click through from
the portal to our projects.
This test will be seen by a small sample (0.05%) of Wikipedia portal
visitors. It is due to run for a week, ending on 20 January, and we hope
to have the results of our analysis soon after.
If you have any questions about the test or about what the portal team is
working on and/or planning next, please let me know.
Cheers,
Deb
--
Deb Tankersley
Product Manager, Discovery
Wikimedia Foundation
Since the so-called Discovery-Cirrus-Sprint is actually a sprint used for
search work in general, which may or may not relate to the CirrusSearch
extension, this morning I renamed Discovery-Cirrus-Sprint to
Discovery-Search-Sprint. Apart from the name, this does not affect anything.
As always, if there are any questions, let me know!
Thanks,
Dan
--
Dan Garry
Lead Product Manager, Discovery
Wikimedia Foundation
Hi all,
Last year (er, several weeks ago) we decided to set up a dedicated space
for experimental dashboards so that anybody (including members of the
community who want to do new things with existing data) can deploy a
dashboard. The homepage for this space is
http://discovery-experimental.wmflabs.org/
For example, I recently began work on a predictive modeling project with
the goal of forecasting usage of our services and detecting outliers
(e.g. when a labs bot went haywire on WDQS several weeks ago and we
suddenly saw >21 million SPARQL requests). I've deployed a prototype
dashboard to the experimental space so it doesn't interfere with our
existing dashboards: http://discovery-experimental.wmflabs.org/forecast/
(Currently it only forecasts Cirrus API usage, but more will be added
soon.)
To add your Shiny-powered dashboard, you need to add it as a submodule to
the wikimedia/discovery/experimental repository on gerrit
<https://git.wikimedia.org/summary/?r=wikimedia/discovery/experimental.git>.
Instructions for doing this can be found in this README
<https://github.com/wikimedia/wikimedia-discovery-experimental/blob/master/R…>
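For anyone who hasn't worked with submodules before, here's a minimal
local sketch of the workflow. Everything below uses throwaway stand-in
repositories in a temp directory; in reality the parent repo is
wikimedia/discovery/experimental on gerrit and the submodule is your own
dashboard repository (see the README above for the exact steps):

```shell
#!/bin/sh
# Minimal local sketch of the git submodule workflow. All repositories
# here are throwaway stand-ins; the real parent repo is
# wikimedia/discovery/experimental on gerrit.
set -e
tmp=$(mktemp -d)
cd "$tmp"

# Stand-in for your Shiny dashboard repository
git init -q dashboard
(cd dashboard && git -c user.email=you@example.com -c user.name=you \
    commit -q --allow-empty -m "initial dashboard commit")

# Stand-in for the wikimedia/discovery/experimental parent repository
git init -q experimental
cd experimental
git -c user.email=you@example.com -c user.name=you \
    commit -q --allow-empty -m "initial parent commit"

# The key step: register the dashboard as a submodule, then commit.
# (protocol.file.allow=always is only needed for this local file:// demo;
# it is not needed with a real remote URL.)
git -c protocol.file.allow=always submodule add "$tmp/dashboard" my-dashboard
git -c user.email=you@example.com -c user.name=you \
    commit -q -m "Add my-dashboard as a submodule"
git submodule status
```

With a real dashboard you would then push the commit to gerrit for review
as usual.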
With warm regards,
Mikhail Popov on behalf of Discovery's Analysts
Hi!
Since we've talked about maybe using TextCat-based algorithms, I've made
an implementation of TextCat as a PHP class/utility, which may be useful:
https://github.com/smalyshev/textcat
Please feel free to comment. It is based on what I found at
http://odur.let.rug.nl/~vannoord/TextCat/, which is pretty old, so we may
want to patch it up, but I think it works as a starting point (provided
we want to pursue this route).
I'll work on improving the loading latency (converting the LM format to
PHP) and making it into a proper Composer module, and maybe also add some
tests. Improvement suggestions are welcome, of course.
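For anyone unfamiliar with the approach: TextCat is the classic
n-gram-based language identifier from Cavnar & Trenkle (1994). As a rough
illustration only (in Python rather than PHP, with toy training text
instead of real language models), the core idea of ranked n-gram profiles
compared by an "out-of-place" distance looks something like this:

```python
from collections import Counter

def ngram_profile(text, max_n=3, top=300):
    """Build a ranked character n-gram profile (Cavnar & Trenkle style)."""
    text = "_" + text.lower().replace(" ", "_") + "_"
    counts = Counter()
    for n in range(1, max_n + 1):
        for i in range(len(text) - n + 1):
            counts[text[i:i + n]] += 1
    # Map each of the most frequent n-grams to its rank
    return {g: rank for rank, (g, _) in enumerate(counts.most_common(top))}

def out_of_place(doc_profile, lang_profile):
    """Sum of rank differences; n-grams missing from the language model
    get the maximum penalty."""
    max_penalty = len(lang_profile)
    return sum(abs(rank - lang_profile.get(g, max_penalty))
               for g, rank in doc_profile.items())

def classify(text, lang_profiles):
    """Pick the language whose profile is closest to the document's."""
    doc = ngram_profile(text)
    return min(lang_profiles,
               key=lambda lang: out_of_place(doc, lang_profiles[lang]))

# Toy demo; real models are trained on much larger corpora.
profiles = {
    "english": ngram_profile(
        "the quick brown fox jumps over the lazy dog and the cat"),
    "german": ngram_profile(
        "der schnelle braune fuchs springt ueber den faulen hund "
        "und die katze"),
}
print(classify("the dog and the fox", profiles))
```

This is just a sketch of the technique, not a mirror of the PHP code's
API, but it should make the shape of the algorithm clear.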
--
Stas Malyshev
smalyshev(a)wikimedia.org
Hey all,
After several weeks of work to switch all the scripts over and
backfill, all the Discovery dashboards now have the ability to filter
crawlers and automated software out of graphs where that is
relevant. You should notice a simple checkbox on, for example, the
Zero Results Rate or Wikidata Query Service traffic graphs.
While a bit of backfilling is still waiting on the servers syncing up,
this work is essentially complete, and it provides another way to look
at data on how people are using search (and who those people are). It
was a heck of a lot of work by both Mikhail and me, but it's hopefully
valuable :).
For Discovery Analytics,
--
Oliver Keyes
Count Logula
Wikimedia Foundation