Brilliant, thanks very much Marko!
Joe
On 11 October 2017 at 14:19, Marko Obrovac <mobrovac(a)wikimedia.org>
wrote:
Hello Joe,
On 11 October 2017 at 14:27, Joe Wass <jwass(a)crossref.org> wrote:
> Hi there,
>
> I hope this is the right list for a RESTBase query? Let me know if
> this is the wrong list, or I should head over to Phabricator.
>
> I'm visiting a large number of Wikipedia pages' specific versions (for
> the Crossref Event Data service, if you're interested -
> https://www.eventdata.crossref.org/guide ). I'm getting page ids /
> versions from EventStreams. I'm using the RESTBase API because it gives the
> cleanest HTML and it was recommended to me for the volume of queries, e.g.
>
> https://ceb.wikipedia.org/api/rest_v1/page/html/Quebrada_Fantasma/13659774
>
> I want to get the *canonical URL* for that version page, e.g.
>
> https://ceb.wikipedia.org/wiki/Quebrada_Fantasma
>
> The 'normal' HTML view of a page supplies the canonical URL as a <link
> rel="canonical"> tag, but the RESTBase response doesn't. It does supply an
> isVersionOf link though:
>
> <link rel="dc:isVersionOf" href="//ceb.wikipedia.org/wiki/Quebrada_Fantasma"/>
>
> Questions:
>
> 1 - Is the isVersionOf URL in RESTBase identical to the "official"
> canonical URL that I would get from the HTML metadata (using https:)?
>
Yes, it is :)
>
> 2 - Is the "title" component of the RESTBase URL the same as used in
> the Canonical URL? The Swagger docs say "Page title. Use underscores
> instead of spaces. Example: Main_Page". I'm not clear if that is the same
> thing.
>
Yes, that is the canonical title of the page, with the exception that
forward slashes need to be encoded when contacting the REST API, whereas
that is not needed (but allowed) for the canonical URL. So for the page
entitled "Page/SubPage", you need to provide "Page%2FSubPage" to the REST
API. Note that you will still get the correct canonical URL in the
`dc:isVersionOf` field.
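
A minimal Python sketch of the encoding rule above, for illustration only. The helper name `rest_html_url` and its parameters are hypothetical, not part of RESTBase; the only things taken from this thread are the URL layout and the need to send "/" as "%2F".

```python
from urllib.parse import quote


def rest_html_url(domain, title, revision=None):
    """Build a RESTBase page/html URL (hypothetical helper, not an official API)."""
    # The Swagger docs ask for underscores instead of spaces.
    # safe="" makes quote() percent-encode "/" too, so a subpage
    # title such as "Page/SubPage" becomes "Page%2FSubPage".
    encoded = quote(title.replace(" ", "_"), safe="")
    url = f"https://{domain}/api/rest_v1/page/html/{encoded}"
    if revision is not None:
        # Optional revision id, as in the example URL earlier in the thread.
        url += f"/{revision}"
    return url
```

Note that `quote(..., safe="")` also percent-encodes other reserved characters such as ":"; that is an assumption of this sketch, not something the thread requires.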
>
> 3 - Is there a general recommended way of getting the canonical URL
> for a page from RESTBase?
>
You can either use the `dc:isVersionOf` field, or use the simple transform:
https://{domain}/api/rest_v1/page/html/{title} => https://{domain}/wiki/{title}
which is guaranteed to work.
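
Both options can be sketched in Python roughly as follows. This is an illustration under stated assumptions, not an official client: the function and class names are hypothetical, the URL transform simply drops anything after the title segment (such as a revision id), and the protocol-relative `//` href is assumed to resolve to https.

```python
from html.parser import HTMLParser
from urllib.parse import urlsplit

REST_PREFIX = "/api/rest_v1/page/html/"


def canonical_from_rest_url(rest_url):
    """Apply the simple transform: REST page/html URL -> /wiki/{title} URL."""
    parts = urlsplit(rest_url)
    if not parts.path.startswith(REST_PREFIX):
        raise ValueError("not a RESTBase page/html URL: " + rest_url)
    # The title is the first segment after the prefix; an encoded slash
    # (%2F) stays inside that segment, and a trailing revision id is dropped.
    title = parts.path[len(REST_PREFIX):].split("/")[0]
    return f"https://{parts.netloc}/wiki/{title}"


class _VersionOfFinder(HTMLParser):
    """Remember the href of the first <link rel="dc:isVersionOf"> tag."""

    def __init__(self):
        super().__init__()
        self.href = None

    def handle_starttag(self, tag, attrs):
        attr = dict(attrs)
        if tag == "link" and attr.get("rel") == "dc:isVersionOf" and self.href is None:
            self.href = attr.get("href")


def canonical_from_html(html):
    """Read the canonical URL from the dc:isVersionOf link, or return None."""
    finder = _VersionOfFinder()
    finder.feed(html)
    href = finder.href
    if href is None:
        return None
    # Assumption: the protocol-relative href is meant to be served over https.
    return "https:" + href if href.startswith("//") else href
```

The transform needs no extra request, while reading `dc:isVersionOf` relies on the HTML that has already been fetched.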
Cheers,
Marko
Marko Obrovac, PhD
Senior Services Engineer
Wikimedia Foundation
>
> Thanks in advance!
>
> Joe Wass
>
> https://en.wikipedia.org/wiki/User:Afandian
> Crossref
>
> _______________________________________________
> Services mailing list
> Services(a)lists.wikimedia.org
> https://lists.wikimedia.org/mailman/listinfo/services
>
>