Changes between Version 20 and Version 21 of WikiStart


Timestamp:
06/02/10 08:24:51 (3 years ago)
Author:
faltet
Comment:

Cosmetic changes

  • WikiStart

    v20 v21  
    8 8 Blosc is a high-performance compressor optimized for binary data. It has been designed to transmit data to the processor cache faster than the traditional, non-compressed, direct memory fetch approach. Blosc is the first (that I'm aware of) of a series of compressors that are meant not only to reduce the size of large datasets on disk or in memory, but also to accelerate computations that are currently memory-bound.
    9 9
    10   It uses the blocking technique (as described in this [http://www.pytables.org/docs/CISE-12-2-ScientificPro.pdf article]) to reduce activity on the memory bus as much as possible.  In short, the blocking technique works by dividing datasets into blocks that are small enough to fit in the L1 cache of modern processors and performing compression/decompression there. You can find more info about Blosc, as well as some preliminary benchmarks, in the last part of this [http://www.pytables.org/docs/StarvingCPUs.pdf presentation].
      10 It uses the blocking technique (as described in this [http://www.pytables.org/docs/CISE-12-2-ScientificPro.pdf article]) to reduce activity on the memory bus as much as possible.  In short, the blocking technique works by dividing datasets into blocks that are small enough to fit in the L1 cache of modern processors and performing compression/decompression there. It also uses threading on today's multicore processors to accelerate compression/decompression as much as possible.
      11
      12 You can find more info about Blosc in the last part of this [http://www.pytables.org/docs/StarvingCPUs.pdf presentation].  Some recent benchmarks are available in SyntheticBenchmarks.
    11 13
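The blocking-plus-threading idea described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not Blosc's actual implementation: `zlib` from the standard library stands in for the Blosc codec, the 32 KiB `BLOCK_SIZE` is an assumed L1-cache-sized value, and the helper names (`split_blocks`, `compress_blocks`, `decompress_blocks`) are hypothetical.

```python
import zlib
from concurrent.futures import ThreadPoolExecutor

# Assumed block size, chosen to be in the range of a typical L1 data cache.
# Hypothetical value for illustration; not taken from Blosc itself.
BLOCK_SIZE = 32 * 1024


def split_blocks(data, block_size=BLOCK_SIZE):
    """Divide a bytes object into consecutive blocks of at most block_size bytes."""
    return [data[i:i + block_size] for i in range(0, len(data), block_size)]


def compress_blocks(data, block_size=BLOCK_SIZE, workers=4):
    """Compress each block independently, spreading blocks over worker threads.

    zlib is a stand-in codec here, not the compressor Blosc uses.
    """
    blocks = split_blocks(data, block_size)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # pool.map preserves input order, so chunk i corresponds to block i.
        return list(pool.map(zlib.compress, blocks))


def decompress_blocks(chunks, workers=4):
    """Decompress every chunk in parallel and join the blocks in order."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return b"".join(pool.map(zlib.decompress, chunks))


if __name__ == "__main__":
    payload = bytes(range(256)) * 1000          # 256000 bytes of sample data
    compressed = compress_blocks(payload)
    assert decompress_blocks(compressed) == payload
    print(len(compressed), "blocks,", sum(map(len, compressed)), "compressed bytes")
```

Because each block is compressed independently, any block can be decompressed without touching the others, and the work divides naturally among threads. A pure-Python sketch like this shows only the structure; the cache-level speedups discussed above depend on a native implementation.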
    12 14 == Where Blosc Can Be Used? ==
     
    16 18 == Is It Stable? ==
    17 19
    18    No, not yet, so please be careful when using it.  Blosc is still a young project (in terms of what is needed for a compressor to be considered stable), and it is currently undergoing very intensive testing on many different kinds of datasets.  That said, I have frozen the Blosc format as of version 0.8, so it is at least guaranteed that the format will not change for a long while.  Also, it is being included in the [http://www.pytables.org/download/preliminary/ 2.2] version of !PyTables, so it is probably being tested quite intensively in many places.  But still, Blosc really needs much more testing before it can be declared stable enough for production purposes.
       20 No, not yet, so please be careful when using it.  Blosc is still a young project (in terms of what is needed for a compressor to be considered stable), and it is currently undergoing very intensive testing on many different kinds of datasets.  That said, I have frozen the Blosc format as of version 0.8, so it is at least guaranteed that the format will not change for a long while.  The API is not frozen yet either (freezing it will mark the 1.0 release).
       21
       22 I'm currently testing it very hard, and I'm happy to say that, from 0.9.1 on, it has worked flawlessly while compressing several thousand terabytes on Windows and Unix machines, in both 32-bit and 64-bit modes.  Also, it is being included in the [http://www.pytables.org/download/preliminary/ 2.2] version of !PyTables, so it is probably being tested quite intensively in many other places.
    19 23
    20 24 == Want To Contribute? ==