Hi Tim
Thanks for bringing some light into the DBO_TRX stuff. Seems like few knew it
existed, and hardly anyone understood what it means or how it should be used.
I'll give my thoughts inline and propose a solution at the bottom.
On 25.09.2012 13:38, Tim Starling wrote:
On 25/09/12 19:33, Daniel Kinzler wrote:
So, can someone shed light on what DBO_TRX is
intended to do, and how it is
supposed to work?
Maybe you should have asked that before you broke it with I8c0426e1.
Well, I did ask about nested transactions on the list. Nobody mentioned the
scheme you describe. Is it documented somewhere?
Anyway, I just added warnings, the behavior didn't change.
DBO_TRX provides the following benefits:
* It provides improved consistency of write operations for code which
is not transaction-aware, for example rollback-on-error.
But it *breaks* write consistency for code that *is* transaction aware. Calling
begin() will prematurely commit the already open transaction.
* It provides a snapshot for consistent reads, which
improves
application correctness when concurrent writes are occurring.
Ok.
Initially, I set up a scheme where transactions were
"nested", in the
sense that begin() incremented the transaction level and commit()
decremented it. When it was decremented to zero, an actual COMMIT was
issued. So you would have a call sequence like:
* begin() -- sends BEGIN
* begin() -- does nothing
* commit() -- does nothing
* commit() -- sends COMMIT
This scheme soon proved to be inappropriate, since it turned out that
the most important thing for performance and correctness is for an
application to be able to commit the current transaction after some
particular query has completed. Database::immediateCommit() was
introduced to support this use case -- its function was to immediately
reduce the transaction level to zero and commit the underlying
transaction.
Ok.
When it became obvious that that every
Database::commit() call should
really be Database::immediateCommit(), I changed the semantics,
effectively renaming Database::immediateCommit() to
Database::commit(). I removed the idea of nested transactions in
favour of a model of cooperative transaction length management:
* Database::begin() became effectively a no-op for web requests and
was sometimes omitted for brevity.
But it isn't! It's not a no-op if there is an active transaction! It *breaks*
the active transaction! I think that is the crucial point here.
* Database::commit() should be called after completion
of a sequence
of write operations where atomicity is desired, or at the earliest
opportunity when contended locks are held.
Ok, so it's basically a savepoint.
Using that scheme, a new transaction should be started immediately after the
commit(). I guess when DBO_TRX is set, query() will do that.
In cases where transactions end up being too short due
to the need for
a called function to commit a transaction when the caller already has
a transaction open, it is the responsibility of the callers to
introduce some suitable abstraction for serializing the transactions.
In the presence of hooks implemented by extensions, this frankly seems
impossible. Also, it would require functions to document exactly if and when
they are using transactions, and hooks have to document whether implementors can
use transactions.
Currently, the only safe way for an extension to use transactions seems to be to
check the trxLevel explicitly, and only start a transaction if there is none
already in progress. Which effectively brings us back to the level-counting scheme.
When transactions too long, you hit performance
problems due to lock
contention.
Yes... which makes me wonder why it's a good idea to start a transaction upon
the first select, even for requests that do not write to the database at all.
When transactions are too short, you hit consistency
problems when requests fail. The scheme I introduced favours
performance over consistency. It resolves conflicts between callers
and callees by using the shortest transaction time. I think was an
appropriate choice for Wikipedia, both then and now, and I think it is
probably appropriate for many other medium to high traffic wikis.
In terms of performance, perhaps it would be feasible to use short
transactions with an explicit begin() with savepoints for nesting. But
then you would lose the consistency benefits of DBO_TRX that I
mentioned at the start of this post.
I'm trying to think of a way to implement this scheme without breaking
transactions and causing creeping inconsistencies.
For the semantics you propose, begin() and commit() seem to be misleading names
- flush() would be more descriptive of what is going on, and implies no nesting.
So, how about this:
* normally, flush() will commit any currently open DBO_TRX transaction (and the
next query will start a new one). In cli (auto-commit) mode, it would do nothing.
* begin() and end() (!) use counted levels. In auto-commit mode, the outer-most
level of them starts and commits a transaction.
* Between begin and end, flush does nothing.
This would allow us to use the "earliest commit" semantics, but also control
blocks of DB operations that should be consistent and atomic. And it would make
it explicit to programmers what they are doing.
-- daniel
PS: for the time being, I suggest to just disable the transaction warnings if
the DBO_TRX flag is set. That lets us keep warnings about the use of nested
transactions in unit tests without flooding the logs.