Here's a task that captures some of the things to consider for server-side
enrichment of X-Analytics (in this case it would be the Mobile Content
Service doing the work, I think).
Here are the quarterly goals. The thought to reflect counting in a more
efficient way kind of entered a little later in the quarter, sorry about
that (and thanks for helping us figure out short and mid-term approach).
-Adam
On Wed, Aug 19, 2015 at 9:27 AM, Andrew Otto <aotto(a)wikimedia.org> wrote:
Ya, we can probably tweak pageview definition to use page_id / page_title
if they exist, and only use the rest of the logic if they don’t.
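
A rough sketch of that ordering in Python, just to make the idea concrete. The function names, the semicolon-separated key=value parsing, and the toy legacy rule are all illustrative assumptions, not the actual pageview definition code, and other filters (status codes, content types, etc.) are left out:

```python
from typing import Callable, Dict


def parse_x_analytics(header: str) -> Dict[str, str]:
    """Split an X-Analytics value such as 'page_id=42;https=1' into a dict.

    Assumes semicolon-separated key=value pairs; malformed pairs are skipped.
    """
    fields = {}
    for pair in header.split(";"):
        key, sep, value = pair.strip().partition("=")
        if sep and key:
            fields[key] = value
    return fields


def is_pageview(x_analytics: str, uri: str,
                legacy_rule: Callable[[str], bool]) -> bool:
    """Trust an explicit page_id / page_title tag; otherwise fall back."""
    fields = parse_x_analytics(x_analytics)
    if fields.get("page_id") or fields.get("page_title"):
        return True
    # No explicit tag: use the existing pattern-based definition.
    return legacy_rule(uri)


def toy_legacy_rule(uri: str) -> bool:
    # Hypothetical stand-in for the current pattern-based logic.
    return uri.startswith("/wiki/")
```

Passing the legacy rule in as a callable keeps the existing logic untouched; only the order in which the two checks run changes.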
On Aug 19, 2015, at 12:24, Oliver Keyes <okeyes(a)wikimedia.org> wrote:
It'll need to be; some requests don't know pageID in advance, which I
think was the reason Apps initially didn't implement this.
On 19 August 2015 at 12:19, Andrew Otto <aotto(a)wikimedia.org> wrote:
> If your app/site/etc. is creating a request that it wants to count as a
> pageview, add an X-Analytics header with pageview_id=<page_id> or
> pageview_title=<page_title>
>
>
> page_id is the current key, so let’s keep that. page_title would be good to
> have too. Let’s make it an and/or.
>
>
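
On the client side, the and/or rule above might look roughly like this. A minimal sketch only: the helper name is made up, the semicolon-joined key=value layout is assumed, and percent-encoding the title is my own assumption rather than anything agreed in this thread:

```python
from typing import Dict, Optional
from urllib.parse import quote


def pageview_header(page_id: Optional[int] = None,
                    page_title: Optional[str] = None) -> Dict[str, str]:
    """Build an X-Analytics header carrying page_id and/or page_title."""
    if page_id is None and page_title is None:
        raise ValueError("need page_id, page_title, or both")
    parts = []
    if page_id is not None:
        parts.append(f"page_id={page_id}")
    if page_title is not None:
        # Assumption: percent-encode so ';' or '=' in a title cannot
        # break the key=value pairs.
        parts.append(f"page_title={quote(page_title, safe='')}")
    return {"X-Analytics": ";".join(parts)}
```

A client that only knows the title ahead of time would call pageview_header(page_title="Dilbert") and attach the result to whichever request it wants counted.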
> On Aug 19, 2015, at 12:17, Bernd Sitzmann <bernd(a)wikimedia.org> wrote:
>
>> If your app/site/etc. is creating a request that it wants to count as a
>> pageview, add an X-Analytics header with pageview_id=<page_id> or
>> pageview_title=<page_title>
>
>
> Ideally the page id would be the way to go. From a client's perspective I
> prefer the page title since clients don't always know the page id ahead of
> time. (We could put that header into the second request of loading the page
> but I cannot guarantee that we will always have a second request in the
> future.)
>
> --Cheers,
> Bernd
>
> On Wed, Aug 19, 2015 at 8:53 AM, Dan Andreescu <dandreescu(a)wikimedia.org>
> wrote:
>>
>> This (making pageviews proactive) is a great idea, and we should follow
>> through. Here's a simple start:
>>
>> If your app/site/etc. is creating a request that it wants to count as a
>> pageview, add an X-Analytics header with pageview_id=<page_id> or
>> pageview_title=<page_title>
>>
>> If we can make this change uniformly, I think we'd be in a very good
>> place.
>>
>> On Wed, Aug 19, 2015 at 10:23 AM, Oliver Keyes <okeyes(a)wikimedia.org>
>> wrote:
>>>
>>> On 19 August 2015 at 10:19, Andrew Otto <aotto(a)wikimedia.org> wrote:
>>>>> If we /do/ include RESTBase requests we will not only have to
>>>>> rewrite the pageview definition for the apps to recognise the new URL
>>>>> scheme
>>>>
>>>> I really think that apps and APIs should do something proactive to tag
>>>> or log a pageview. With more ways of viewing content, it is going to get
>>>> harder and harder to maintain a pattern based definition. A pageview
>>>> should be an event that is logged, not something that is pattern matched
>>>> out of a very noisy stream of data.
>>>>
>>>> Most MediaWiki requests do this now, via the page_id field in the
>>>> X-Analytics header, but we can’t use this for all pageviews because APIs
>>>> are more complicated (e.g. more than one page can be served in a single
>>>> request, etc.). In the long term, there should be a pageview event stream
>>>> just like rcstream! :)
>>>
>>> This is an excellent point. IIRC we'd been asking Apps to do this for
>>> kind of a while, so...
>>>
>>>>
>>>> -Ao
>>>>
>>>>
>>>>
>>>>> On Aug 18, 2015, at 19:58, Oliver Keyes <okeyes(a)wikimedia.org> wrote:
>>>>>
>>>>> On 18 August 2015 at 19:11, Bernd Sitzmann <bernd(a)wikimedia.org>
>>>>> wrote:
>>>>>> This discussion is about needed updates of the definition and Analytics
>>>>>> implementation for mobile apps page view metrics. There is also an
>>>>>> associated Phab task[4]. Please add the proper Analytics project there.
>>>>>>
>>>>>> Background / Changes
>>>>>>
>>>>>> As you probably remember, the Android app splits a page view into two
>>>>>> requests: one for the lead section and metadata, plus another one for
>>>>>> the remainder.
>>>>>>
>>>>>> The mobile apps are going to change the way they load pages in two
>>>>>> different ways:
>>>>>>
>>>>>> 1. We'll add a link preview when someone clicks on a link from a page.
>>>>>> 2. We're planning on switching over to using RESTBase for loading pages
>>>>>>    and also the link preview (initially just the Android beta, later
>>>>>>    more).
>>>>>>
>>>>>
>>>>> Woah woah woah woah woah. By RESTBase do you mean Gabriel's RESTful
>>>>> service API?
>>>>>
>>>>> Last time I checked that wasn't even consumed by HDFS. Is it now being
>>>>> consumed by HDFS?
>>>>>
>>>>> More importantly the actual URLs are going to look /totally/
>>>>> different. If we do not include RESTBase requests, we will miss the
>>>>> apps. If we /do/ include RESTBase requests we will not only have to
>>>>> rewrite the pageview definition for the apps to recognise the new URL
>>>>> scheme, we will also potentially have to rewrite every /other/ bit of
>>>>> the definition to /not/ incorporate those requests.
>>>>>
>>>>> (I use "we" in a collective sense. This isn't my baby any more,
>>>>> although if Joseph et al want help with the refactor here I'm happy to
>>>>> spend my volunteer time on it).
>>>>>
>>>>> But basically every other bit of your email is important but now
>>>>> secondary: this is a potentially massive change, all on its own, even
>>>>> without the link preview, even if the substance of the requests going
>>>>> to RESTBase were identical.
>>>>>
>>>>>> This will have implications for the pageviews definition and how we
>>>>>> count user engagement.
>>>>>>
>>>>>> The big question is
>>>>>>
>>>>>> Should we count link previews as a page view since it's an indication
>>>>>> of user engagement? Or should there be a separate metric for link
>>>>>> previews?
>>>>>>
>>>>>> Counting page views
>>>>>>
>>>>>> IIRC we currently count action=mobileview&sections=0 query parameters
>>>>>> of api.php as a page view. When we publish link previews for all
>>>>>> Android app users then we would either want to count also the calls to
>>>>>> action=query&prop=extracts as a page view or add them to another
>>>>>> metric.
>>>>>>
>>>>>> Once the apps use RESTBase the HTTPS requests will be very different:
>>>>>>
>>>>>> Page view: Instead of action=mobileview&sections=0 on the PHP API
>>>>>> mentioned above, the app would call the RESTBase endpoint for the lead
>>>>>> request[1]. Then it would call [2].
>>>>>> Link preview: Instead of action=query&prop=extracts it would call the
>>>>>> lead request[1], too, since there is a lot of overlap. At least that's
>>>>>> our current plan. The advantage of that is that the client doesn't need
>>>>>> to execute the lead request a second time if the user clicks on the
>>>>>> link preview (either through caching or app logic).
>>>>>>
>>>>>> So, in the RESTBase case we either want to count the
>>>>>> mobile-html-sections-lead requests or the
>>>>>> mobile-html-sections-remaining
>>>>>> requests depending on what our definition for page views actually is.
>>>>>> We could also add a query parameter or extra HTTP header to one of the
>>>>>> mobile-html-sections-lead requests if we need to distinguish between
>>>>>> previews and page views.
>>>>>>
>>>>>> Both the current PHP API and the RESTBase based metrics would need to
>>>>>> be compatible and be collected in parallel since we cannot control
>>>>>> when users update their apps.
>>>>>>
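
To make the "collected in parallel" part concrete, here is a minimal sketch that labels a request URL by the four request shapes described above. The label strings and the check for a path ending in /api.php are assumptions for illustration; which labels end up counting as a page view (versus a preview) is exactly the open question, so the sketch deliberately does not decide it:

```python
from urllib.parse import parse_qs, urlsplit

REST_PREFIX = "/api/rest_v1/page/"


def classify(url: str) -> str:
    """Label a request by the endpoints named in the message above.

    Returns 'api-lead', 'api-extract', 'rest-lead', 'rest-remaining',
    or 'other'.
    """
    parts = urlsplit(url)
    if parts.path.endswith("/api.php"):
        query = parse_qs(parts.query)
        action = query.get("action", [""])[0]
        if action == "mobileview" and query.get("sections", [""])[0] == "0":
            return "api-lead"
        props = query.get("prop", [""])[0].split("|")
        if action == "query" and "extracts" in props:
            return "api-extract"
        return "other"
    if parts.path.startswith(REST_PREFIX):
        # First path segment after the prefix is the endpoint name.
        endpoint = parts.path[len(REST_PREFIX):].split("/", 1)[0]
        if endpoint == "mobile-html-sections-lead":
            return "rest-lead"
        if endpoint == "mobile-html-sections-remaining":
            return "rest-remaining"
    return "other"
```

With something like this, old and new app versions land in the same small set of labels, and the page view vs. preview decision can be made once, on the labels, rather than per URL scheme.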
>>>>>> [1]
>>>>>> https://en.wikipedia.org/api/rest_v1/page/mobile-html-sections-lead/Dilbert
>>>>>> [2]
>>>>>> https://en.wikipedia.org/api/rest_v1/page/mobile-html-sections-remaining/Di…
>>>>>> [3]
>>>>>> https://www.mediawiki.org/wiki/Wikimedia_Apps/Team/RESTBase_services_for_ap…
>>>>
>>>> [4] https://phabricator.wikimedia.org/T109383
>>>>
>>>>
>>>> Cheers,
>>>>
>>>> Bernd
>>>>
>>>>
>>>> _______________________________________________
>>>> Analytics mailing list
>>>> Analytics(a)lists.wikimedia.org
>>>> https://lists.wikimedia.org/mailman/listinfo/analytics
>>>>
>>>
>>>
>>>
>>> --
>>> Oliver Keyes
>>> Count Logula
>>> Wikimedia Foundation