Brian Suda wrote:
I have been on this list a while. When I originally joined, I was
interested in the possibility of exporting the Wiktionary data in
.dict format. Now that the newest version of OS X (10.4) has a built-in
dictionary that uses dict:// to look up words, I was interested to
see if anyone on the technical side would like to explore the
possibility of either exporting the Wiktionary database in .dict
format, or running a dictionary daemon that would access the Wiktionary
database server and return dict entries. It would be read-only, but it
would be another interesting way to access Wiktionary besides the
web interface.
Does anyone on the tech list know if this is even possible? I'm not
asking you to do it (I can write the export). I was wondering if there
is some sort of database schema available to extract the data into
dict format, or are the entries too fragmented to even attempt an
export?
-brian
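
For reference, here is a minimal sketch of what the output side of such
an export could look like. It assumes the two-file layout used by the
dictd server: a .dict file holding the entry texts back to back, and a
.index file with one "headword TAB offset TAB length" line per entry,
where the numbers are written in dictd's base-64 digits. The names
b64_number and build_dict_files are made up for illustration, and the
sort used here is a plain case-insensitive one; dictd's own tooling
(dictfmt) applies its own ordering rules, so treat this as a sketch and
not as a replacement for it.

```python
# Digits used for offsets and lengths in a dictd .index file
# (assumption: the standard base-64 alphabet, most significant digit first).
B64 = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/"


def b64_number(n):
    """Encode a non-negative integer using the base-64 digits above."""
    if n == 0:
        return B64[0]
    digits = []
    while n > 0:
        digits.append(B64[n % 64])
        n //= 64
    return "".join(reversed(digits))


def build_dict_files(entries):
    """Build the contents of a .dict and a .index file.

    entries maps each headword to its definition text.
    Returns (dict_bytes, index_text).
    """
    body = bytearray()
    index_lines = []
    # Simplified ordering; real dictd indexes follow dictfmt's rules.
    for word in sorted(entries, key=str.lower):
        text = word + "\n" + entries[word].rstrip("\n") + "\n"
        data = text.encode("utf-8")
        # Offset and length are counted in bytes, not characters.
        index_lines.append(
            "%s\t%s\t%s" % (word, b64_number(len(body)), b64_number(len(data)))
        )
        body += data
    return bytes(body), "\n".join(index_lines) + "\n"
```

Writing the two return values to, say, wiktionary.dict and
wiktionary.index would give files in the shape a dictd-style server
reads, provided the ordering caveat above is dealt with.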
In fact exporting the Wiktionaries is almost impossible right now. I
once tried to write a script in Python to convert an English Wiktionary
entry into some common, logical format. There are too many different
possible ways an entry can be built up. Maybe I'll try it again one day,
but my time has become very limited lately. I'm not saying it's entirely
impossible, but any solution will need to have some manual input. It's
very hard to automate it all the way.
Another thing is that the Wiktionary content is mostly not ready yet.
We've done an amazing amount of work already and some entries are
already quite good, but most of the content needs a few (maybe 10 or 20)
more years of work before we can say it is usable by the public.
We should be thinking about making it possible to export to various
formats right now, though. If all you want to do is look up a word,
take whatever is currently on its page (or only the part describing
the English word) and simply present that to the user, that should be
possible and would be very easy to do. Maybe I should look into this
dict format, to know whether that would work. I'll go do that right now...
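
The "only grab the part describing the English word" idea could be
sketched roughly like this. It assumes that each language on a page
starts with a level-2 heading such as ==English== and runs until the
next level-2 heading. Many entries do not follow that layout exactly
(for example, headings wrapped in links), so this is a best-effort cut
and not a real parser. The function name extract_language_section is
made up for illustration.

```python
import re

# A level-2 heading: exactly two '=' on each side, e.g. "==English==".
# "===Noun===" does not match because the first title character may not be '='.
HEADING = re.compile(r"^==[ \t]*([^=].*?)[ \t]*==[ \t]*$", re.MULTILINE)


def extract_language_section(wikitext, language="English"):
    """Return the raw text under the given language heading, or None."""
    matches = list(HEADING.finditer(wikitext))
    for i, m in enumerate(matches):
        if m.group(1).strip().lower() == language.lower():
            # The section ends where the next level-2 heading begins.
            if i + 1 < len(matches):
                end = matches[i + 1].start()
            else:
                end = len(wikitext)
            return wikitext[m.end():end].strip()
    return None
```

The returned text is still raw wiki markup. Presenting it through a
dict client would need at least some cleanup of links and templates,
which is where the manual work mentioned above comes in.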
Jo