Blosc supports threads now
==========================

It just happened: Blosc can now run in threaded mode for both
compressing and decompressing. However, threaded Blosc is not faster
than the serial version in all cases. The reason is the overhead of
threads and, most especially, the cost of synchronization between
them.

In order to reduce the overhead of threads as much as possible, I've
decided to implement a pool of threads (the workers) that wait for
the main process (the master) to send them jobs (basically,
compressing or decompressing small blocks of the initial buffer).

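The master/worker arrangement described above can be sketched roughly
as follows. This is only an illustration with POSIX threads, not the
actual Blosc code: every name in it (pool_t, pool_run, process_block,
NWORKERS) is hypothetical, the mutex/condition-variable scheme is one
possible synchronization choice among several, and process_block is a
stand-in that inverts bytes instead of compressing them.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>
#include <stdint.h>

#define NWORKERS 4  /* assumed pool size, for illustration only */

/* Hypothetical shared state; none of these names come from Blosc. */
typedef struct {
    pthread_mutex_t lock;
    pthread_cond_t  job_ready;  /* master -> workers: blocks available */
    pthread_cond_t  job_done;   /* workers -> master: all blocks done  */
    uint8_t *buf;
    size_t nbytes, blocksize;
    size_t next_block, nblocks, finished;
    int shutdown;
    pthread_t threads[NWORKERS];
} pool_t;

/* Stand-in for compressing one block: it just inverts every byte. */
static void process_block(uint8_t *p, size_t n) {
    for (size_t i = 0; i < n; i++) p[i] = (uint8_t)~p[i];
}

static void *worker(void *arg) {
    pool_t *pool = arg;
    pthread_mutex_lock(&pool->lock);
    for (;;) {
        /* Sleep until a block is available or the pool shuts down. */
        while (!pool->shutdown && pool->next_block >= pool->nblocks)
            pthread_cond_wait(&pool->job_ready, &pool->lock);
        if (pool->shutdown) break;
        /* Claim the next block while still holding the lock. */
        size_t off = pool->next_block++ * pool->blocksize;
        size_t left = pool->nbytes - off;
        size_t len = left < pool->blocksize ? left : pool->blocksize;
        uint8_t *p = pool->buf + off;
        pthread_mutex_unlock(&pool->lock);
        process_block(p, len);  /* the real work runs unlocked */
        pthread_mutex_lock(&pool->lock);
        if (++pool->finished == pool->nblocks)
            pthread_cond_signal(&pool->job_done);
    }
    pthread_mutex_unlock(&pool->lock);
    return NULL;
}

/* Start the workers once; they stay parked between jobs.
   Sketch only: on failure, already started threads are not joined. */
static int pool_init(pool_t *pool) {
    pthread_mutex_init(&pool->lock, NULL);
    pthread_cond_init(&pool->job_ready, NULL);
    pthread_cond_init(&pool->job_done, NULL);
    pool->next_block = pool->nblocks = pool->finished = 0;
    pool->shutdown = 0;
    for (int i = 0; i < NWORKERS; i++)
        if (pthread_create(&pool->threads[i], NULL, worker, pool) != 0)
            return -1;
    return 0;
}

/* Master side: publish a buffer, wake the workers, wait for the end. */
static void pool_run(pool_t *pool, uint8_t *buf, size_t nbytes,
                     size_t blocksize) {
    assert(blocksize > 0);
    pthread_mutex_lock(&pool->lock);
    pool->buf = buf;
    pool->nbytes = nbytes;
    pool->blocksize = blocksize;
    pool->finished = 0;
    pool->next_block = 0;
    pool->nblocks = (nbytes + blocksize - 1) / blocksize;
    pthread_cond_broadcast(&pool->job_ready);
    while (pool->finished < pool->nblocks)
        pthread_cond_wait(&pool->job_done, &pool->lock);
    pthread_mutex_unlock(&pool->lock);
}

static void pool_destroy(pool_t *pool) {
    pthread_mutex_lock(&pool->lock);
    pool->shutdown = 1;
    pthread_cond_broadcast(&pool->job_ready);
    pthread_mutex_unlock(&pool->lock);
    for (int i = 0; i < NWORKERS; i++)
        pthread_join(pool->threads[i], NULL);
    pthread_mutex_destroy(&pool->lock);
    pthread_cond_destroy(&pool->job_ready);
    pthread_cond_destroy(&pool->job_done);
}
```

A caller would run pool_init once, call pool_run for each buffer, and
call pool_destroy at exit. Creating the threads once and parking them
between jobs avoids paying the thread start-up cost on every call,
but each job still costs at least one wake-up and one completion
signal, which is the kind of synchronization cost mentioned above.
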
Despite this and many other internal optimizations in the threaded
code, it is not faster than the serial version for buffer sizes of
around 128 KB or less (Intel Quad Core2 / Linux). This is why Blosc
falls back to the serial version for such 'small' buffers.

In contrast, for buffers larger than 128 KB, the threaded version
starts to perform significantly better, with the sweet spot at 1 MB.
For buffers larger than 1 MB, the threaded code slows down, but it is
still considerably faster than the serial code.

For this reason, I decided that Blosc will automatically enable the
threaded version only when the buffer to be compressed/decompressed
is larger than 128 KB, while still using the serial version for
smaller buffer sizes.

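As a minimal sketch, the size-based switch described above could look
like the snippet below. The names (THREADED_MIN_NBYTES, run_mode,
choose_mode) are hypothetical and do not come from the Blosc sources.
The extra check on the number of threads is my own assumption, not
something stated above. Only the 128 KB cutoff is taken from the
text.

```c
#include <assert.h>
#include <stddef.h>

/* Cutoff quoted in the text; the macro name is hypothetical. */
#define THREADED_MIN_NBYTES ((size_t)128 * 1024)

typedef enum { MODE_SERIAL, MODE_THREADED } run_mode;

/* Pick the code path for a buffer of nbytes bytes.
   Assumption: with a single thread there is nothing to parallelize,
   so the serial path is always used in that case. */
static run_mode choose_mode(size_t nbytes, int nthreads) {
    assert(nthreads >= 1);
    if (nthreads == 1) return MODE_SERIAL;
    /* Threads are used only for buffers strictly larger than 128 KB. */
    return nbytes > THREADED_MIN_NBYTES ? MODE_THREADED : MODE_SERIAL;
}
```

With four threads, a buffer of exactly 128 KB would still take the
serial path in this sketch, and a 1 MB buffer would take the threaded
one.
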
The 128 KB limit might seem a bit arbitrary, and it certainly is. I
still have to study other multi-core processors to fine-tune it.

Francesc Alted
2010-04-28