G'day everyone,
I've spent several hours stripping a lot of code out of memcached and replacing it
with BerkeleyDB library hooks.
The package is available at
http://dammit.lt/dbcached.tgz, though the guys on #mediawiki
have already told me it is 'notmemcached'. It has no ./configure yet and the Makefile
still has to be tweaked by hand, but as it is open source, someone might fix that ;-)
Anyway, it implements the set/get methods of the memcached interface, but has a
persistent on-disk store and cache management, and may get transactions and the like
(that takes just two or three additional lines of code at initialisation). That would
give an ACID store at roughly the cost of memcached. It would also allow analysis of
the internal cached data structures, data maintenance from outside the software (as
its store can be accessed through the BerkeleyDB library), and other things.
So what we've got is a near-line memory store, or a near-line disk store: something in
between.
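To make the idea concrete, here is a rough sketch of memcached-style set/get on top of a persistent on-disk key/value store. It is not dbcached's code: the `PersistentCache` class and `demo` function are made up for illustration, and Python's standard `dbm.dumb` module stands in for BerkeleyDB so that the example runs without extra libraries.

```python
import dbm.dumb
import os
import tempfile


class PersistentCache:
    """Toy cache with memcached-style set/get, backed by an on-disk store.

    Hypothetical illustration only: dbm.dumb stands in for BerkeleyDB.
    """

    def __init__(self, path):
        self.path = path
        self.db = dbm.dumb.open(path, "c")  # create the store if it is missing

    def set(self, key, value):
        self.db[str(key).encode()] = value
        return True

    def get(self, key):
        # Like memcached, a miss returns None rather than raising.
        return self.db.get(str(key).encode())

    def close(self):
        self.db.close()


def demo():
    path = os.path.join(tempfile.mkdtemp(), "cache")

    cache = PersistentCache(path)
    cache.set(1, b"x" * 512)   # half a kilobyte of data
    cache.close()              # simulate a daemon restart

    cache = PersistentCache(path)
    value = cache.get(1)       # still there after reopening the store
    missing = cache.get(2)     # never stored
    cache.close()
    return value, missing
```

The point of the sketch is only that the data survives a restart and that the same files could be opened by any other program using the same storage library.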
I've run some simple benchmarks, which show that for cached operations its speed is
equivalent to memcached's, and in some cases slightly better. I haven't tested with
large data sets yet, but that is already application-specific.
I'm offering to deploy this on the Wikimedia servers, with other possible future
uses (such as a distributed search store, which I discussed with some p2p gurus on
freenode, session caches, a 100%-effective parser cache, some other object store...)
If the project is interested, I'd like to put a module into the wikipedia CVS or some
other CVS and clean up, debug, test and improve the code.
Cheers,
Domas
P.S. Some benchmarks:
Processes in action:
  PID USER   PR NI  VIRT  RES  SHR S %CPU %MEM   TIME+ COMMAND
21989 midom  15  0 90128  18m 2224 S  6.0  0.8 0:03.82 dbcached
23296 midom  15  0 16720  14m 1320 R  5.9  0.6 0:02.11 memcached
READ ACCESS:
METHOD:
    i = 0
    while i < 10000:
        i += 1
        key = i
        # All keys exist and hold half a kilobyte of data
        mc.get(key)
$ time python memtest.py
real 0m9.091s
user 0m0.692s
sys 0m1.799s
$ time python dbtest.py
real 0m9.043s
user 0m0.796s
sys 0m1.776s
WRITE ACCESS:
    i = 0
    while i < 10000:
        i += 1
        key = i
        mc.set(key, data)
$ time python dbtest.py
real 0m7.969s
user 0m0.713s
sys 0m0.962s
$ time python memtest.py
real 0m7.723s
user 0m0.735s
sys 0m0.844s
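
For anyone who wants to repeat the measurements, the read and write loops above could be wrapped in a small timing harness along these lines. This is only a sketch under assumptions: `FakeClient` and `bench` are hypothetical names, and the in-process fake exists just so the script runs on its own; it is not what memtest.py or dbtest.py contain.

```python
import time


class FakeClient:
    """Hypothetical in-process stand-in exposing get/set like a memcached client."""

    def __init__(self):
        self.store = {}

    def set(self, key, value):
        self.store[key] = value
        return True

    def get(self, key):
        return self.store.get(key)


def bench(mc, count=10000, size=512):
    """Time `count` sets followed by `count` gets.

    Returns (write_seconds, read_seconds, hits).
    """
    data = "x" * size  # roughly half a kilobyte per value

    start = time.perf_counter()
    for i in range(1, count + 1):
        mc.set(str(i), data)
    write_s = time.perf_counter() - start

    hits = 0
    start = time.perf_counter()
    for i in range(1, count + 1):
        if mc.get(str(i)) is not None:
            hits += 1
    read_s = time.perf_counter() - start

    return write_s, read_s, hits


if __name__ == "__main__":
    write_s, read_s, hits = bench(FakeClient())
    print("write: %.3fs  read: %.3fs  hits: %d" % (write_s, read_s, hits))
```

To compare the two daemons, pass in a real client object connected to memcached or dbcached instead of `FakeClient()`; the harness only assumes the object has `get` and `set` methods.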