Changes between Version 21 and Version 22 of WikiStart


Timestamp:
06/12/10 15:25:28 (3 years ago)
Author:
faltet
Comment:

Updated info about the testing process in home page

  • WikiStart

    v21 v22  
    88Blosc is a high performance compressor optimized for binary data. It has been designed to transmit data to the processor cache faster than the traditional, non-compressed, direct memory fetch approach. Blosc is the first (that I'm aware of) of a series of compressors that are meant not only to reduce the size of large datasets on-disk or in-memory, but also to accelerate computations that are currently memory-bound. 
    99 
    10 It uses the blocking technique (as described in this [http://www.pytables.org/docs/CISE-12-2-ScientificPro.pdf article]) to reduce activity on the memory bus as much as possible.  In short, the blocking technique works by dividing datasets in blocks that are small enough to fit in L1 cache of modern processor and perform compression/decompression there. It also leverages threading in nowadays multicore processors so as to accelerate the compression/decompression process to a maximum. 
     10It uses the blocking technique (as described in this [http://www.pytables.org/docs/CISE-12-2-ScientificPro.pdf article]) to reduce activity on the memory bus as much as possible.  In short, the blocking technique works by dividing datasets in blocks that are small enough to fit in L1 cache of modern processor and perform compression/decompression there. It also leverages multimedia extensions (SSE2) and multi-threading capabilities in nowadays multicore processors so as to accelerate the compression/decompression process to a maximum. 
    1111 
    1212You may want to see more info about Blosc in the last part of this [http://www.pytables.org/docs/StarvingCPUs.pdf presentation].  You can see some recent benchmarks in SyntheticBenchmarks. 
     
    1414== Where Blosc Can Be Used? == 
    1515 
    16 Blosc is being developed mainly for the needs of the [http://www.pytables.org/ PyTables] database, although it may be used elsewhere.  Although it is still in beta state, it is expected to allow !PyTables to perform arithmetic (for example, see [http://pytables.org/moin/ComputingKernel]) and indexing operations with large datasets well beyond the speed of more traditional approaches (like memmap'ed access to files). 
     16Blosc is being developed mainly for the needs of the [http://www.pytables.org/ PyTables] database, although it may be used elsewhere.  Although it is still a young project, it is expected to allow !PyTables to perform arithmetic (for example, see [http://pytables.org/moin/ComputingKernel]) and indexing operations with large datasets well beyond the speed of more traditional approaches (like memmap'ed access to files). 
    1717 
    1818== Is It Stable? == 
    1919 
    20 No, not yet, so please be careful when using it.  Blosc is still a young project (in terms of what is needed for a compressor to be considered stable), and it is currently undergoing very intensive testing on many different kinds of datasets.  Being said this, since 0.8 version I've frozen the format of Blosc, so at least it is guaranteed that the format will not change in a long while.  The API is not yet frozen too (once this is done, that will mark the 1.0 release). 
     20No, not yet, so please be careful when using it.  Being said this, since 0.8 version the format has been frozen, so at least it is guaranteed that it will not change in a long while.  The API has been frozen in release 0.9.5 too.  The only part that remains is testing Blosc extensively and broadly. 
    2121 
    22 I'm currently testing it very hard, and I'm happy to say that, since 0.9.1 on, it worked flawlessly compressing several thousands of terabytes on Windows and Unix machines, both in 32-bit and 64-bit.  Also, it is being included in the [http://www.pytables.org/download/preliminary/ 2.2] version of !PyTables, so it is probably being tested quite intensively in many other places. 
     22Part of the !PyTables community is currently testing Blosc very hard, and I'm happy to say that, from 0.9.5 on, it has worked flawlessly compressing several thousands of terabytes on many different Windows and Unix boxes, both in 32-bit and 64-bit.  Also, it is being included in the [http://www.pytables.org/download/preliminary/ 2.2] version of !PyTables, so it is probably being tested quite intensively in many other places.  When this testing process ends (very soon now), that will mark the beginning of the 1.x series. 
    2323 
    2424== Want To Contribute? == 
    2525 
    26 Your cooperation is very important to make Blosc stable as soon as possible so, if you detect some bug or want to propose an enhancement, feel free to open a new ticket.  Also, you can contribute to this project by simply compiling and running a small benchmark as explained in the SyntheticBenchmarks page and mailing back the results for your platform to [http://pytables.org/moin/FrancescAlted me]. 
     26Your cooperation is very important to make Blosc stable as soon as possible so, if you detect some bug or want to propose an enhancement, feel free to open a new ticket.  Also, you can contribute to this project by simply compiling and running different benchmark and test suites as explained in the SyntheticBenchmarks page. 
    2727 
    2828== Blosc License == 
     
    3838== Source tarball == 
    3939 
    40 There is not a source tarball as such yet.  I'll provide one once Blosc will become stable. 
     40There is not a source tarball as such yet.  I'll provide one once Blosc is declared stable. 
    4141 
    4242== About This Site ==