Hi Ismael,

You're right to be confused. We left the work and documentation in a messy state after the team members who worked on this dataset departed, and we have not yet been able to prioritize cleaning it up.

The basic idea was that pageviews_complete was going to be a combined dataset, in a uniform format, as compressed as possible, with everything we retained about pageviews to Wikimedia projects since 2007.  Currently, data is only available for download from 2011 onward, and we still link to the old data and deprecated datasets that pageviews_complete is supposed to replace.  As a result, it's very confusing.  If you tell me exactly what you're looking for, I can try to direct you.  And if you'd like to help improve the documentation, we very much welcome that, and I can point you in the right direction there as well.

On Mon, Jan 16, 2023 at 8:08 AM Ismael Olea <ismael@olea.org> wrote:
Hi:

I can't find any (official?) definition of the nature and practical use of the «Pageview» and «Pageview complete» dumps, or of the difference between them, other than a brief description at Dumps[1].

Searching for «pageview complete» returns no results at:
Maybe it is because the complete ones are, by design, related to Wikipedia Administrative Pages Analytics[3]? Or because they include data from before the latest pageview definition (2015)? Maybe both?

[1] https://dumps.wikimedia.org/other/analytics/
[2] https://dumps.wikimedia.org/other/pageview_complete/readme.html
[3] https://meta.wikimedia.org/wiki/Wikipedia_Administrative_Pages_Analytics
--

Ismael Olea

http://olea.org/diario/