David Gerard wrote:
On 29/02/2008, Domas Mituzas <midom.lists(a)gmail.com> wrote:
> That gave performance similar to the MySQL fulltext index *BUT* when I
> queried the same index with Luke (which is Java), the query was
> *fast*.
> Sorry, I can't find the mailing list posts about that.

Zend Lucene is 100x slower than Java Lucene.

We were running a Mono version of Lucene for a while, weren't we? How
did that compare?

It was moderately slower than the Java, but on the same order of
magnitude for most stuff. Performance differences here were mainly about
the Mono VM being a bit slower (at least at the time) and in some cases
the regex library being much less efficient (index generation).
The reasons for using Mono at the time over Sun Java or GCJ were:
* Sun Java - fast, but not open source enough
* GCJ - fast, open source, but mystery memory leaks
* Mono - a bit slower, open source, no mystery memory leaks
Of course over time, mystery memory leaks crept into the system. ;)
Eventually, Sun Java became more and more open to the point where we
don't really care anymore (if we get real pissy about it again we could
start running an OpenJDK-based VM such as IcedTea), and the guy who
picked up development on our Lucene server again preferred to work with
the Java version instead of the C# one. (Among other things, this gives
you access to the latest Lucene version instead of an older port.)
There's no real reason to choose Mono for this sort of task if you were
starting over today. Hypothetically, if we wanted to ship a Lucene-based
tool by default, we could attempt to have backends supporting both the
PHP Zend Lucene and the Java one... assuming you can get even vaguely
useful performance out of the PHP one. :)
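
To make the two-backend idea a bit more concrete, here is a very rough
sketch of how a caller might pick whichever backend happens to be
available. Everything in it is made up for illustration: the
SearchBackend interface, the InMemoryBackend stand-in (naive substring
matching, not real Lucene), the pick_backend helper and the backend
names are all hypothetical, and it is written in Python rather than PHP
just to keep it short.

```python
from abc import ABC, abstractmethod
from typing import List, Sequence


class SearchBackend(ABC):
    """Hypothetical common interface both backends would implement."""

    name: str

    @abstractmethod
    def available(self) -> bool:
        """Report whether this backend can be used on this install."""

    @abstractmethod
    def search(self, query: str) -> List[str]:
        """Return the titles matching the query."""


class InMemoryBackend(SearchBackend):
    """Stand-in for a real backend: naive substring match over titles."""

    def __init__(self, name: str, titles: Sequence[str], is_available: bool = True):
        self.name = name
        self._titles = list(titles)
        self._is_available = is_available

    def available(self) -> bool:
        return self._is_available

    def search(self, query: str) -> List[str]:
        q = query.lower()
        return [t for t in self._titles if q in t.lower()]


def pick_backend(backends: Sequence[SearchBackend]) -> SearchBackend:
    """Return the first available backend, in order of preference."""
    for backend in backends:
        if backend.available():
            return backend
    raise RuntimeError("no search backend available")


# Assumed scenario: the Java daemon is not installed, so we fall back
# to the (slower) PHP-side implementation.
titles = ["Lucene", "Mono (software)", "OpenJDK"]
java = InMemoryBackend("java-lucene", titles, is_available=False)
php = InMemoryBackend("zend-lucene", titles)
backend = pick_backend([java, php])
results = backend.search("mono")
```

The preference order is just the order of the list, so a site that does
have the Java server running would get it automatically and only fall
back to the PHP one otherwise.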
-- brion vibber (brion @ wikimedia.org)