[Labs-l] Accessing the databases from labs - A comparison with the toolserver

Fri Jul 12 18:24:05 UTC 2013

On 07/12/2013 01:59 PM, Platonides wrote:
> These connections are cached, so if I connected to fiwiki
> and then to eowiki, the same db object would be returned.

I don't think that any putative performance or resource gain from this
is worth the added complexity; but if you insist on doing that caching,
you should key it on the actual cluster IP, not on the name you used,
since only the former is guaranteed to be valid in all cases.

(Well, strictly speaking, only the [host,port] tuple is, but all the
ports will always remain the same since we do portmapping)
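
To make the point about the cache key concrete, here is a rough sketch
of a cache keyed on the resolved [host,port] tuple rather than on the
requested name.  Everything in it is an assumption for illustration:
the ConnectionCache class, the FAKE_DNS table, the addresses (taken
from the documentation range) and the FakeConnection stand-in are all
hypothetical, and a real tool would plug in actual name resolution and
a real database driver.

```python
from typing import Callable, Dict, Tuple

Endpoint = Tuple[str, int]  # (host IP, port)


class ConnectionCache:
    """Cache connections by resolved (host, port), not by the name requested."""

    def __init__(self,
                 resolve: Callable[[str], Endpoint],
                 connect: Callable[[Endpoint], object]) -> None:
        self._resolve = resolve    # name -> (ip, port)
        self._connect = connect    # (ip, port) -> connection object
        self._by_endpoint: Dict[Endpoint, object] = {}

    def get(self, name: str) -> object:
        # Resolve first, so that two names pointing at the same
        # cluster share a single cache entry.
        endpoint = self._resolve(name)
        if endpoint not in self._by_endpoint:
            self._by_endpoint[endpoint] = self._connect(endpoint)
        return self._by_endpoint[endpoint]


# Hypothetical name-to-endpoint table standing in for real resolution.
FAKE_DNS = {
    "fiwiki": ("192.0.2.10", 3306),
    "eowiki": ("192.0.2.10", 3306),
    "dewiki": ("192.0.2.20", 3306),
}


class FakeConnection:
    """Stand-in for a real database connection object."""

    def __init__(self, endpoint: Endpoint) -> None:
        self.endpoint = endpoint


cache = ConnectionCache(FAKE_DNS.__getitem__, FakeConnection)
same = cache.get("fiwiki") is cache.get("eowiki")        # same endpoint
different = cache.get("dewiki") is not cache.get("fiwiki")
```

With this layout the name is only ever an input to resolution, so a
name that later moves to another cluster simply resolves to a different
key instead of returning a stale connection.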

>> In this particular case, it's not avoidable (for user databases).
> Why?

We use a double underscore as the guaranteed cannot-occur-in-a-username
mark.  But beyond that, the usernames are different because we don't
handle credentials the same way.  Since the allowable databases are
derived from the username, the database names are necessarily
different (picking between "u_foo" and "u_p12345g12345" is no harder
than having to pick between "u_foo" and "p12345g12345__foo" --
hardcoding the database name breaks either way).
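
As an illustration of building the name instead of hardcoding it, here
is a minimal sketch that assumes the "<username>__<suffix>" layout
described above.  The function names are hypothetical, and how a tool
obtains its credential username is left out on purpose.

```python
SEPARATOR = "__"  # guaranteed not to occur inside a username


def user_database_name(credential_user: str, suffix: str) -> str:
    """Build a user database name from the credential username."""
    if SEPARATOR in credential_user:
        raise ValueError("username must not contain a double underscore")
    return f"{credential_user}{SEPARATOR}{suffix}"


def owner_of(database_name: str) -> str:
    """Recover the owning username: everything before the first '__'."""
    owner, sep, _ = database_name.partition(SEPARATOR)
    if not sep:
        raise ValueError("not a user database name")
    return owner


name = user_database_name("p12345g12345", "foo")
```

Because the separator cannot appear in a username, splitting on its
first occurrence is unambiguous even if the suffix itself happens to
contain a double underscore.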

> There is also a grid engine on TS.

Yes, but while I think its use has recently been made mandatory, most
tools do not in fact use it and need to be adapted.

> It's not possible to specify a
> cluster as requisite in labs, BTW.

That would be because, by design, any execution node has access to every
cluster.  Having this be a requested resource would be akin to being
able to request that your job be provided with a CPU to execute it.  :-)

-- Marc