Let's start with the minimum (1 thread?), with uploads spread as far apart as possible over the day, and see how it goes. We'll keep an eye on the server load every day and see if there's room to increase the rate. Weekdays would be highly preferable for us.
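
To make "spread as far apart as possible" concrete, here is a rough sketch of how the start times could be worked out. It is only an illustration: the function name `spread_uploads`, the 24-hour window and the example count of 200 are assumptions, not something either side has agreed on.

```python
from datetime import datetime, timedelta


def spread_uploads(n_files, day_start, window_hours=24.0):
    """Return n_files start times spaced evenly across the window.

    Hypothetical helper: the window length and the idea of a fixed
    daily batch size are assumptions made for illustration.
    """
    if n_files <= 0:
        return []
    # Equal gap between consecutive uploads over the whole window.
    gap = timedelta(hours=window_hours) / n_files
    return [day_start + i * gap for i in range(n_files)]


if __name__ == "__main__":
    times = spread_uploads(200, datetime(2014, 5, 21))
    print("gap between uploads:", times[1] - times[0])
```

With 200 files in a 24-hour window, that gives one upload roughly every 7 minutes on a single thread.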


On Tue, May 20, 2014 at 2:27 PM, Fæ <faewik@gmail.com> wrote:
On 20 May 2014 13:12, Gilles Dubuc <gilles@wikimedia.org> wrote:
>> I have some rather nice >100MB tiffs
>
> How large is that batch?
>
> We're still working on technical changes, nothing has been merged since the
> last outage.

It should be small, as files over 50MB are the exception, let
alone 100MB. Some of it is tidying up where I skipped 100MB files
previously (the 19thC. British Cartoons collection). I would *guess*
no more than 100 or 200 in a day. I can choose my XML so as to
limit the overall daily number if that is a concern and you would like
to suggest a number. (Sidenote - preparing the XML metadata to
discover which files to upload is slow due to LoC API limits of 15
requests per minute for "security" reasons - I was unaware of this
until I contacted the LoC a couple of days ago. This is not a project
that can be rushed through.)
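
For a sense of what the 15 requests per minute limit means in practice, below is a minimal throttle sketch. It assumes the limit can be respected by simply spacing calls at least 60/15 = 4 seconds apart; the class name `MinIntervalThrottle` and its parameters are hypothetical, and it does not reflect how the LoC API actually enforces the limit.

```python
import time


class MinIntervalThrottle:
    """Space calls so that no more than max_per_minute happen per minute.

    Hypothetical helper for illustration only. The clock and sleep
    functions are injectable so the behaviour can be checked without
    really waiting.
    """

    def __init__(self, max_per_minute=15, clock=time.monotonic, sleep=time.sleep):
        self.interval = 60.0 / max_per_minute  # 4.0 seconds at 15 per minute
        self._clock = clock
        self._sleep = sleep
        self._next_allowed = None

    def wait(self):
        """Block until the next request is allowed, then reserve the slot."""
        now = self._clock()
        if self._next_allowed is not None and now < self._next_allowed:
            self._sleep(self._next_allowed - now)
            now = self._next_allowed
        self._next_allowed = now + self.interval


def minutes_needed(n_requests, max_per_minute=15):
    """Lower bound, in minutes, on the time to issue n_requests."""
    return n_requests / max_per_minute
```

Calling `throttle.wait()` before each metadata request would keep a single script under the limit. At this rate, 9,000 requests would take at least 600 minutes, i.e. ten hours.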

I am happy to kick these off on 2 threads maximum, which should cap
the throughput for large files at something under c.500 in a day.
1 thread would presumably be half that.

I will put aside the small number of remaining NYPL map files - there
is no hurry and it would be good to use these "trouble-making" files
to test out the technical changes when they are implemented.

PS Beta cluster is still not working for me today, I get the standard
server down page every time I run GWT there - I have not tried the
production environment in the last week.

Fae