> Is there a simple description of the pageview API available anywhere
> already?
The API is deployed publicly, but we don't want to announce it yet because
the data is not fully loaded (there are many terabytes of it). In the
meantime, the spec is pretty easy to read:
https://github.com/wikimedia/restbase/blob/master/mods/pageviews.yaml
(the public endpoint is a bit different, but the spec gives you an idea of
the routes available for now).
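
For anyone who wants a feel for how a request might be put together, here
is a rough sketch in Python. It only builds a URL and makes no network
call. Everything specific in it is an assumption for illustration: the
base URL is a placeholder (the public endpoint is not given here), and the
per-article route shape, the default segment values and the YYYYMMDDHH
timestamps should be checked against the spec linked above.

```python
from urllib.parse import quote

# Placeholder base URL (assumption): the real public endpoint differs.
BASE_URL = "https://example.org/pageviews"


def per_article_url(project, article, start, end,
                    access="all-access", agent="all-agents",
                    granularity="daily", base_url=BASE_URL):
    """Build a URL for an assumed per-article route.

    Assumed route shape (verify against the spec):
    /per-article/{project}/{access}/{agent}/{article}/{granularity}/{start}/{end}
    """
    segments = [project, access, agent, article, granularity, start, end]
    # Percent-encode each segment so a title containing "/" stays one segment.
    path = "/".join(quote(segment, safe="") for segment in segments)
    return f"{base_url}/per-article/{path}"


# Illustrative values only; timestamps are assumed to be YYYYMMDDHH.
url = per_article_url("en.wikipedia", "Main_Page", "2015050100", "2015053100")
print(url)
```

Encoding each path segment separately is deliberate: article titles can
contain slashes, which would otherwise be read as extra path segments.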
> Specifically, how far back will the API access historical data?
Right now, we're going to load data back to May of this year. That's as
far back as we have high-quality data. We are collecting use cases from
people who need the other data, available in pagecounts-raw and/or
pagecounts-all-sites. So let us know if you need that data and what you use
it for; it'll help.
> And, maybe even more importantly, will either pagecounts-raw,
> pagecounts-all-sites or both of those datasets, be discontinued in the near
> future due to the (presumably) more powerful pageview API?
We have no plans to discontinue those right now, but the differences
between the datasets are getting very confusing. Even if we do discontinue
or merge some of those datasets, we'll talk about it on this list first and
get opinions. And we'll always have some type of static file dump for
people who need to do bulk work. We just have to figure out which type(s)
of data to keep and which to remove to simplify everyone's life :)