[Labs-l] [Analytics] doubt on GeoData / how to obtain articles with coords

Bryan White bgwhite at gmail.com
Mon Mar 2 22:52:00 UTC 2015


Marc,

If anybody would know, it would be Kolossos. He is one of the people
responsible for GeoHack, integration with OpenStreetMap and other
geographical referencing doohickeys.

He is more active on the German site, see
https://de.wikipedia.org/wiki/Benutzer:Kolossos. His email link is there,
or you can search through this mailing list for it.

Bryan

On Mon, Mar 2, 2015 at 3:42 PM, Marc Miquel <marcmiquel at gmail.com> wrote:

> Hi Max and Oliver,
>
> Thanks for your answers. The geo_tags table seems quite incomplete. I just
> checked some random articles in, for instance, the Nepali Wikipedia: the
> article on its capital, Kathmandu, has coords in the actual article, but
> they don't appear in geo_tags. So it doesn't seem to be an option.
>
> Marc
>
> 2015-03-02 23:38 GMT+01:00 Oliver Keyes <okeyes at wikimedia.org>:
>
>> Max's idea is an improvement but still a lot of requests. We really need
>> to start generating these dumps :(.
>>
>> Until the dumps are available, the fastest way to do it is probably
>> Quarry (http://quarry.wmflabs.org/), an open MySQL client to our public
>> database tables. So, you want the geo_tags table; getting all the
>> coordinate sets on the English-language Wikipedia would be something like:
>>
>> SELECT * FROM enwiki_p.geo_tags;
>>
>> This should be available for all of our production wikis (SHOW DATABASES
>> is your friend): you want [project]_p rather than [project]. Hope that
>> helps!
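
A minimal Python sketch of the [project]_p idea above: it only builds the SQL text for one wiki, which could then be pasted into Quarry. The join to the page table and the column names (gt_page_id, gt_lat, gt_lon, gt_primary, gt_country, gt_region, page_title, page_namespace) are assumptions about the replica schema, not something stated in this thread, and the helper name geo_tags_query is hypothetical.

```python
import re


def geo_tags_query(project, primary_only=True):
    """Build a query listing main-namespace pages with coordinates for one wiki.

    `project` is a database name without the `_p` suffix, e.g. "enwiki".
    Column names are assumed, not confirmed by the thread.
    """
    # The database name is interpolated into the SQL text, so accept only
    # plain identifiers such as "enwiki" or "newiki".
    if not re.fullmatch(r"[a-z0-9_]+", project):
        raise ValueError(f"unexpected project name: {project!r}")
    db = f"{project}_p"
    sql = (
        "SELECT p.page_title, g.gt_lat, g.gt_lon, g.gt_country, g.gt_region\n"
        f"FROM {db}.geo_tags AS g\n"
        f"JOIN {db}.page AS p ON p.page_id = g.gt_page_id\n"
        "WHERE p.page_namespace = 0"
    )
    if primary_only:
        # Assumed flag marking the main coordinate of a page.
        sql += "\n  AND g.gt_primary = 1"
    return sql + ";"


if __name__ == "__main__":
    print(geo_tags_query("enwiki"))
```

Running the script prints one statement for enwiki; calling geo_tags_query with other database names would give one query per language edition.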
>>
>> On 2 March 2015 at 17:35, Max Semenik <maxsem.wiki at gmail.com> wrote:
>>
>>> Use generators:
>>> api.php?action=query&generator=allpages&gapnamespace=0&prop=coordinates&gaplimit=max&colimit=max
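
A rough Python sketch of how the generator call above might be paged through, using only the standard library. The endpoint layout (https://<lang>.wikipedia.org/w/api.php), the format=json parameter, the User-Agent string, and the handling of the "continue" object are assumptions added for illustration; the helper names are hypothetical.

```python
import json
import urllib.parse
import urllib.request


def api_url(lang):
    # Assumed endpoint layout for a language edition, e.g. "de" or "fr".
    return f"https://{lang}.wikipedia.org/w/api.php"


def build_params(cont=None):
    """Parameters from the api.php call above, plus JSON output."""
    params = {
        "action": "query",
        "generator": "allpages",
        "gapnamespace": "0",
        "gaplimit": "max",
        "prop": "coordinates",
        "colimit": "max",
        "format": "json",
    }
    # Merge the continuation values returned by the previous response, if any.
    params.update(cont or {})
    return params


def extract_coords(response):
    """Yield (title, lat, lon) for every coordinate listed in one response."""
    pages = response.get("query", {}).get("pages", {})
    for page in pages.values():
        # Pages without coordinates simply have no "coordinates" key.
        for coord in page.get("coordinates", []):
            yield page["title"], coord["lat"], coord["lon"]


def iter_coords(lang):
    """Walk all main-namespace pages of one wiki, one batch per request."""
    cont = None
    while True:
        url = api_url(lang) + "?" + urllib.parse.urlencode(build_params(cont))
        req = urllib.request.Request(
            url, headers={"User-Agent": "geo-sketch/0.1 (example)"}
        )
        with urllib.request.urlopen(req) as resp:
            data = json.load(resp)
        yield from extract_coords(data)
        cont = data.get("continue")
        if not cont:
            break
```

Iterating over iter_coords("ne") would then yield one tuple per coordinate. This still issues one request per batch of pages, so it cuts the number of requests rather than removing them.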
>>>
>>> On Mon, Mar 2, 2015 at 2:33 PM, Marc Miquel <marcmiquel at gmail.com>
>>> wrote:
>>>
>>>> Hi guys,
>>>>
>>>> I am doing some research and I am struggling a bit to obtain geolocated
>>>> articles in several languages. I was told that the best tool to obtain
>>>> the geolocation for each article would be the GeoData API. But as far as
>>>> I can see, I need to enter each article name, and I don't know if that
>>>> is the best way.
>>>>
>>>> I am thinking, for instance, that for big Wikipedias like French or
>>>> German I might need to make a million queries to get only those with
>>>> coords... Also, I would like to obtain the region according to ISO 3166-2,
>>>> which seems to be there.
>>>>
>>>> My objective is to obtain different lists of articles related to
>>>> countries and regions.
>>>>
>>>> I don't know if using Wikidata with Python would be a better option.
>>>> But I see that the region isn't there. Maybe I could combine Wikidata
>>>> and some other tool to give me the region.
>>>> Could anyone help me?
>>>>
>>>> Thanks a lot.
>>>>
>>>> Marc Miquel
>>>>
>>>> _______________________________________________
>>>> Analytics mailing list
>>>> Analytics at lists.wikimedia.org
>>>> https://lists.wikimedia.org/mailman/listinfo/analytics
>>>>
>>>>
>>>
>>>
>>> --
>>> Best regards,
>>> Max Semenik ([[User:MaxSem]])
>>>
>>
>>
>> --
>> Oliver Keyes
>> Research Analyst
>> Wikimedia Foundation
>>
>
> _______________________________________________
> Labs-l mailing list
> Labs-l at lists.wikimedia.org
> https://lists.wikimedia.org/mailman/listinfo/labs-l
>
>