[Foundation-l] Printed on Demand Books from Wikipedia articles

Magnus Manske magnus.manske at web.de
Thu Jul 13 13:15:09 UTC 2006


Anthony wrote:
> On 7/13/06, Volker Haas <volker.haas at brainbot.com> wrote:
>   
>> Hi Robert
>>
>> Robert Scott Horning wrote:
>>
>>     
>>> I am curious how you plan to compile these books in a format that looks
>>> good on paper and not strictly as a web format.  While this can be
>>> automated to an extent, I do think there are some issues that come from
>>> trying to move content from a web format to a printed page, and not all
>>> of these can be completely automated.
>>>       
>> As you have pointed out, the conversion of html (or mediawiki
>> markup) into latex cannot be automated in a way that makes the book
>> absolutely perfect.
>> But right now we are pretty satisfied with the results - even though we
>> are still working on improvements to the conversion.
>> The conversion is done with a parser we developed from scratch - as
>> mentioned in a previous post. The mediawiki markup is translated
>> into an internal representation which then gets transformed to latex (to
>> be more precise "context") - this is the hard part. It is slightly easier to
>> transform the internal representation back to html since some css-style
>> information can be maintained.
>>
>>     
> As I've mentioned before, I'd love to see a Wikimedia project which
> does exactly this.  Then the latex could be edited collaboratively to
> make things more "absolutely perfect".  It's nice to see it's at least
> somewhat possible, though I'd say the quality of the previews right
> now is fairly low.
>   
My wiki2xml converter is in the subversion repository. It is a set of
scripts to convert MediaWiki markup into XML, and from there into other
formats, including plain text, HTML, DocBook, and ODT
(OpenDocument/OpenOffice format). OpenOffice is also open source and can
generate PDFs natively.
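
As a rough illustration of the markup -> XML -> output format idea (this
is not the wiki2xml code; the function names wiki_to_xml, xml_to_text and
xml_to_html are made up for this sketch, and it only handles
== headings ==, '''bold''' and plain paragraphs), something like the
following Python would do:

```python
import re
import xml.etree.ElementTree as ET
from html import escape


def wiki_to_xml(markup: str) -> ET.Element:
    """Parse a tiny subset of wiki markup into an XML tree (hypothetical helper)."""
    root = ET.Element("article")
    # Blocks are separated by one or more blank lines.
    for block in re.split(r"\n\s*\n", markup.strip()):
        block = block.strip()
        heading = re.fullmatch(r"==\s*(.+?)\s*==", block)
        if heading:
            ET.SubElement(root, "heading").text = heading.group(1)
            continue
        para = ET.SubElement(root, "paragraph")
        # Splitting on '''bold''' leaves plain text at even indexes
        # and the bold text at odd indexes.
        pieces = re.split(r"'''(.+?)'''", " ".join(block.split()))
        para.text = pieces[0]
        for i in range(1, len(pieces), 2):
            bold = ET.SubElement(para, "bold")
            bold.text = pieces[i]
            bold.tail = pieces[i + 1]
    return root


def xml_to_text(root: ET.Element) -> str:
    """Render the XML tree as plain text; headings become upper case."""
    lines = []
    for node in root:
        text = "".join(node.itertext())
        lines.append(text.upper() if node.tag == "heading" else text)
    return "\n\n".join(lines)


def xml_to_html(root: ET.Element) -> str:
    """Render the same XML tree as a minimal HTML fragment."""
    out = []
    for node in root:
        if node.tag == "heading":
            out.append(f"<h2>{escape(node.text or '')}</h2>")
            continue
        parts = [escape(node.text or "")]
        for child in node:  # only <bold> children exist in this sketch
            parts.append(f"<b>{escape(child.text or '')}</b>")
            parts.append(escape(child.tail or ""))
        out.append(f"<p>{''.join(parts)}</p>")
    return "\n".join(out)


sample = "== Printing ==\n\nWikis are '''web''' pages.\n\nBooks are paper."
tree = wiki_to_xml(sample)
print(xml_to_text(tree))
print(xml_to_html(tree))
```

The point of the intermediate XML step is that the markup is parsed only
once; each additional output format is then just another small function
walking the same tree.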


I admit that wiki2xml is currently less advanced than the impressive
pediapress software, as it does not render <math> tags, does not make a
list of figures, etc. This is due to my being busy, and being too lazy
to merge in patches from others ;-)

Magnus
