On 4/14/05, Brion Vibber <brion(a)pobox.com> wrote:
> The choice of initial languages is driven by availability; English,
> German, and Russian word-stem normalizers are included in the main
> Lucene package and I wrote a quickie Esperanto one as a test since I was
> familiar with the language. There are a number of other analyzers
> available in contrib packages which I'll be setting up over the next
> couple of days, as well as indexing the rest of the wikis in the supported
> languages.
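[For context: the stemmers Brion mentions are full Porter/Snowball-style algorithms shipped as Lucene analyzers in Java. As a rough illustration of what a word-stem normalizer does, here is a toy suffix-stripping sketch in Python; the suffix list and minimum-stem length are invented for the example and are not Lucene's actual rules.]

```python
# Toy word-stem normalizer: strip a known inflectional suffix so that
# related word forms map to the same index term. Real stemmers (Porter,
# Snowball) use far more careful rules; this is only a sketch.
SUFFIXES = ["edly", "ing", "ed", "ly", "es", "s"]  # longest first

def stem(word: str) -> str:
    """Strip the first matching suffix, keeping at least 3 letters of stem."""
    word = word.lower()
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word

# Applying the same normalizer at index time and at query time is what
# lets a search for "index" match documents containing "indexing":
for w in ["indexing", "indexes", "indexed", "wikis"]:
    print(w, "->", stem(w))
```

A real stemmer for a new language amounts to encoding that language's inflectional morphology as rules of this general shape, which is why a heavily agglutinative or irregular language is much harder than this toy suggests.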
How hard would it be to come up with these word-stem normalizers for other
languages (i.e. did you base the Esperanto one on another similar language,
or were you able to come up with it yourself relatively easily)? Is there a
good description somewhere of how to write one?
Dori