Re: [Offline-l] Javascript ZIM reader?

1 Jan 2013

Hi Renaud,

Here are some I have seen:
* LZMA-JS

https://github.com/nmrugg/LZMA-JS/

Online demo: http://nmrugg.github.com/LZMA-JS/demos/advanced_demo.html

* js-lzma

http://code.google.com/p/js-lzma/

I have not seen any LZMA2 or XZ Javascript libraries.  I've written much
of the XZ container support, and am looking at the LZMA2 support, but
there seems to be a lot of unnecessary baggage in the XZ/LZMA2 containers.
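
A rough way to put a number on the container overhead is to compress the
same data both ways. The sketch below uses Python's standard lzma module
(a wrapper around liblzma) rather than a JavaScript library, and the sample
payload and preset are assumptions chosen only for illustration; real ZIM
clusters will differ.

```python
import lzma

# Hypothetical sample payload standing in for a small cluster of HTML entries.
payload = b"<html><body><p>Example article text.</p></body></html>\n" * 200

# Full XZ container: stream header, block header, LZMA2 chunks, index,
# integrity check and stream footer.
xz_bytes = lzma.compress(payload, format=lzma.FORMAT_XZ, preset=6)

# Raw LZMA1 stream: no container at all, so the filter settings have to be
# known to the reader by some other means.
raw_filters = [{"id": lzma.FILTER_LZMA1, "preset": 6}]
raw_bytes = lzma.compress(payload, format=lzma.FORMAT_RAW, filters=raw_filters)

overhead = len(xz_bytes) - len(raw_bytes)
print(f"xz: {len(xz_bytes)} bytes, raw LZMA1: {len(raw_bytes)} bytes, "
      f"difference: {overhead} bytes")
```

The difference per stream is small in absolute terms, so it would matter
mostly for files with many small clusters.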

First tests suggest LZMA decompression in JS is very slow, but I would
not want to draw any conclusions until analysing it further.

If raw LZMA1 were an option then it could also be useful to include the
CRC32 for each cluster blob so that the decompressor could exit as soon
as the blob was decoded rather than having to decompress the entire
cluster to verify the XZ checks.  The CRC32 could be included after each
blob offset.  If the JS decompressor is slow then such optimizations
might be essential, and might help battery life on mobile devices.
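
The early-exit idea can be sketched as follows. This is only an
illustration in Python (stdlib lzma and zlib), not the ZIM format and not a
JavaScript implementation: the index layout of (start, end, crc32) per blob,
the helper names build_cluster and read_blob, and the filter settings are
all assumptions made for the example.

```python
import lzma
import random
import zlib

# Assumed filter settings; with a raw stream they must be stored out of band.
FILTERS = [{"id": lzma.FILTER_LZMA1, "preset": 6}]

def build_cluster(blobs):
    """Compress all blobs as one raw LZMA1 stream.

    Returns (compressed_bytes, index) where index holds a hypothetical
    (start, end, crc32) entry for each blob in the uncompressed data.
    """
    index, pos = [], 0
    for blob in blobs:
        index.append((pos, pos + len(blob), zlib.crc32(blob)))
        pos += len(blob)
    compressed = lzma.compress(b"".join(blobs), format=lzma.FORMAT_RAW,
                               filters=FILTERS)
    return compressed, index

def read_blob(compressed, index, n, chunk_size=256):
    """Decompress only up to the end of blob n, then verify its CRC32.

    Returns (blob, fed) where fed is the number of compressed bytes that
    were handed to the decompressor before stopping.
    """
    start, end, expected_crc = index[n]
    decomp = lzma.LZMADecompressor(format=lzma.FORMAT_RAW, filters=FILTERS)
    out = bytearray()
    fed = 0
    while len(out) < end:
        if decomp.eof:
            raise ValueError("stream ended before the end of the blob")
        if decomp.needs_input:
            if fed >= len(compressed):
                raise ValueError("compressed data ended before the blob")
            chunk = compressed[fed:fed + chunk_size]
            fed += len(chunk)
        else:
            chunk = b""  # decoder still holds buffered input
        # Never produce more output than is needed to reach the blob end.
        out += decomp.decompress(chunk, max_length=end - len(out))
    blob = bytes(out[start:end])
    if zlib.crc32(blob) != expected_crc:
        raise ValueError("CRC32 mismatch for blob %d" % n)
    return blob, fed

# Pseudo-random (poorly compressible) blobs make the early exit visible.
rng = random.Random(0)
blobs = [bytes(rng.getrandbits(8) for _ in range(4000)) for _ in range(3)]
compressed, index = build_cluster(blobs)
blob, fed = read_blob(compressed, index, 0)
print(f"read blob 0 after feeding {fed} of {len(compressed)} compressed bytes")
```

Because each blob carries its own CRC32, the reader can stop as soon as the
wanted blob is decoded and still check it, at the cost of four extra bytes
per blob in the index.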

Regards
Douglas Crosher

On 01/02/2013 12:23 AM, renaud gaudin wrote:
...
  Douglas,

 There are multiple JS LZMA libraries. I haven't looked at any of them,
 but have you? It might be enough for you to get a sense of performance.

 renaud

 On Tue, Jan 1, 2013 at 1:18 PM, Douglas Crosher <dtc(a)scieneer.com> wrote:

     On 01/01/2013 08:09 PM, Emmanuel Engelhart wrote:
     > Hi Douglas
     >
     > On 01/01/2013 02:22 AM, Douglas Crosher wrote:
     >> Has anyone considered a pure Javascript ZIM file reader and
     >> Wikipedia reader?
     >
     > No, this is complicated to do... although this could be practical.
     > I'm also not sure if we could achieve acceptable performance.

     I'll hack something together to explore the performance question, and
     follow up.

     >> I have made a small start, writing some hack code to open a ZIM
     >> file and it gets to the point of needing to uncompress a cluster.
     >> A start has been made on the needed XZ decompress code but it's
     >> not done yet.
     >
     > Great. Yes, xz decompression is the most complicated part.

     Would it be very limiting on ZIM files if the XZ decoder were restricted
     to the 'XZ embedded' format, supporting only the 'LZMA2' filter?  See:
     http://tukaani.org/xz/embedded.html

     Do ZIM files really need the XZ/LZMA2 containers, or could they just use
     raw LZMA1 compression?  This could be added as a new cluster compression
     type for compatibility.

     Two possible uses for XZ/LZMA2 may be for large entries and/or entries
     with distinct regions that are compressible and not compressible.
     However perhaps a significant amount of content does not need this.

     I expect that typical HTML entries would be relatively small.  It would
     seem pointless for a cluster to use multiple XZ blocks and/or streams
     when these could be avoided by placing entries in separate clusters.  So
     perhaps there is a case for clusters with just one LZMA1 block.  Further,
     entries are likely to be either compressible or not, and could be placed
     in separate clusters rather than exploiting the LZMA2 support for such
     content.

     It might even save space not having the XZ container overhead.

     Regards
     Douglas Crosher

     _______________________________________________
     Offline-l mailing list
     Offline-l(a)lists.wikimedia.org
     https://lists.wikimedia.org/mailman/listinfo/offline-l

