You might try using the query to the full text search backend. I'm on
mobile or I would whip up an example. I believe the syntax is
incategory:"you_category ". You might need to use CirrusSearch
(srbackend=CirrusSearch) if the category is added by a template.
I don't know if it's be faster but you can try.
On Sep 6, 2014 5:30 PM, "Nikolas Everett" <neverett(a)wikimedia.org> wrote:
I don't know the MySQL syntax super well but that
query implied that it
will always use the time index and then iterate in time order looking for a
link to the category you want. Unless that category is an appreciable
fraction of the total links I don't imagine that is a good plan.
On Sep 6, 2014 3:58 PM, "Brad Jorsch (Anomie)" <bjorsch(a)wikimedia.org>
wrote:
> The database query for that is simple enough:
>
> SELECT /* ApiQueryCategoryMembers::run Anomie */
> cl_from,cl_sortkey,cl_type,page_namespace,page_title,cl_timestamp FROM
> `page`,`categorylinks` FORCE INDEX (cl_timestamp) WHERE cl_to =
> 'Copy_to_Wikimedia_Commons_(bot-assessed)' AND (cl_from=page_id) ORDER BY
> cl_timestamp,cl_from LIMIT 501;
>
> And the PHP code doesn't do anything complicated either. Maybe Sean can
> give us more insight if there's some subtle database thing going on here.
>
> Note, though, that you're not getting anything at all random here; you're
> always getting the files that have been in the category longest first.
>
>
>
> On Fri, Sep 5, 2014 at 11:52 PM, This, that and the other <
> at.light(a)live.com.au> wrote:
>
>> A tool I have written, For the Common Good [1], uses the following type
>> of query to fetch a list of "random" files that users may like to
transfer
>> to Commons. The category name may differ but the structure is the same:
>>
>>
https://en.wikipedia.org/w/api.php?format=xml&cmnamespace=6&cmtitle=
>> Category%3ACopy%20to%20Wikimedia%20Commons%20(bot-
>> assessed)&action=query&list=categorymembers&cmsort=
>> timestamp&cmprop=title&cmlimit=500
>>
>> In 2011 when I was first writing FtCG, this query ran at an acceptable
>> speed. Recently, though, it has become extremely slow, to the point where
>> timeouts are now a regular occurrence. It sometimes takes 4 or 5 tries (and
>> several minutes) before results are returned. From then on, however, it
>> works quickly. If you run this exact query now, there's a good chance it
>> will work quickly because others have been running the query before you.
>>
>> The cause seems to be the "cmsort=timestamp" portion of the request.
If
>> this is removed, it works essentially instantaneously. However, I don't
>> really want the files in alphabetical order, as it doesn't seem very
>> "random".
>>
>> Four questions:
>> 1. Why does this query take so long?
>> 2. Can anything be done on the server side to make it faster?
>> 3. Why does it take so much longer now than it did in 2011?
>> 4. Is there a better way to fetch a random cross-section of files in a
>> particular category?
>>
>> TTO
>>
>> [1]
https://en.wikipedia.org/wiki/User:This,_that_and_the_other/
>> For_the_Common_Good
>>
>>
>> _______________________________________________
>> Mediawiki-api mailing list
>> Mediawiki-api(a)lists.wikimedia.org
>>
https://lists.wikimedia.org/mailman/listinfo/mediawiki-api
>>
>
>
>
> --
> Brad Jorsch (Anomie)
> Software Engineer
> Wikimedia Foundation
>
> _______________________________________________
> Mediawiki-api mailing list
> Mediawiki-api(a)lists.wikimedia.org
>
https://lists.wikimedia.org/mailman/listinfo/mediawiki-api
>
>