Synthetic Benchmarks (And How You Can Contribute Yours)
To assess the performance of Blosc in a variety of scenarios, a benchmark is provided that enables a fair comparison between different platforms. This page shows a series of plots of Blosc performance on a selected set of platforms. It is fun to see how hardware and software have evolved over the last few years in terms of the speed of Blosc compared with a plain OS memcpy(). The same conclusions apply to the evolution of the ratio of computing power to memory bandwidth in general.
If you want to compare your own platform against the ones listed here, instructions on how to compile, run and report back the results of this benchmark follow the plots. You can also contribute to testing Blosc by running the hard or extreme suites. Please read the sections below for the gory details.
Contributed outputs
Processor model: Intel Core i7 3930K Unlocked (6 x 3.2 GHz) 12 MB Cache
Compiler: GCC version 4.6.1 20110908 (Red Hat 4.6.1-9)
OS: Fedora 15 2.6.42.9-1.fc15.x86_64 #1 SMP (64 bit)
Contributed by: Alvaro Tejero
Processor model: 2 x Six-Core Intel Xeon CPU X5690 @ 3.47GHz
(12 physical cores, hyperthreading disabled)
Compiler: GCC 4.4.5 (Red Hat 4.4.5-6)
OS: Red Hat Enterprise Linux 6 (2.6.32-131.0.15.el6.x86_64), 64-bit
Contributed by: Valentin Haenel
Processor model: 4 x Six-Core AMD Opteron(tm) Processor 8431 (24 cores @ 2.4 GHz)
Compiler: GCC (Debian 4.4.5-8) 4.4.5
OS: Debian 6.0.2 (Squeeze), 64-bit
Contributed by: Valentin Haenel
Processor model: Six-Core Intel i7 (Bloomfield) @ 3.33 GHz / 1 processor
Compiler: GCC 4.4.5
OS: Debian 6.0.2 (Squeeze), 64-bit
Contributed by: Francesc Alted
Processor model: Quad-Core Intel Xeon (Nehalem) @ 2.93 GHz / 2 processors
Compiler: GCC 4.5
OS: Mac OSX Snow Leopard (10.6.3), 64-bit
Contributed by: Louis Wicker
Processor model: AMD Phenom II X6 @ 3.7 GHz
Compiler: MINGW64/GCC version 4.4.5
OS: Windows 7, 64-bit
Contributed by: Francesc Alted
Processor model: Intel Core2 Quad @ 3 GHz
Compiler: MSVC 2008
OS: Windows, 64-bit
Contributed by: Christoph Gohlke
Processor model: Intel Core2 Quad Q8400 @ 2.66 GHz
Compiler: GCC version 4.4.1
OS: openSUSE 11.2, 64-bit
Contributed by: Francesc Alted
Processor model: Intel Core2 Duo E8400 @ 3 GHz
Compiler: GCC version 4.4.3
OS: OpenSUSE Linux 11.2, 64-bit
Contributed by: Francesc Alted
Processor model: Dual-Core AMD Opteron 1214 @ 2.2 GHz
Compiler: GCC version 4.4.3
OS: Ubuntu Linux 10.04, 64-bit
Contributed by: Tony Theodore
Processor model: Intel Pentium4 @ 3.2 GHz (with hyper-threading)
Compiler: GCC version 4.4.3
OS: Ubuntu Linux 10.04, 32-bit
Contributed by: Gabriel Beckers
Processor model: Intel Atom 330 @ 1.6 GHz (2 physical cores, with hyper-threading)
Compiler: GCC version 4.5.2
OS: Ubuntu Linux 11.04, 64-bit
Contributed by: Valentin Haenel
Processor model: Intel Atom N270 @ 1.6 GHz (with hyper-threading)
Compiler: GCC version 4.4.3
OS: Ubuntu Linux 10.04, 32-bit
Contributed by: Francesc Alted
Processor model: PowerPC G4 @ 1.2 GHz / 512 KB L2 cache
Compiler: GCC version 4.0.1
OS: Mac OSX Tiger, 32-bit
Contributed by: Ivan Vilata
How to compile (or get binaries for) the benchmark suite for Blosc
First, check out the master version from:
https://github.com/FrancescAlted/blosc
Then, compile the sources:
GCC (Unix):
$ cd your_blosc_sources/bench
$ gcc -O3 -msse2 -o bench bench.c ../blosc/*.c -lpthread
MINGW (Windows):
> cd your_blosc_sources\bench
> gcc -O3 -msse2 -o bench bench.c ..\blosc\*.c
MSVC 2008 or higher (Windows):
> cd your_blosc_sources\bench
> cl /Ox /Febench.exe bench.c ..\blosc\*.c
If you are on Windows and do not have a compiler installed (I strongly recommend the free MINGW), you can use the prebuilt binaries attached to this page: bench-win64.zip for Windows 64-bit or bench-win32.zip for Windows 32-bit.
Running and plotting the different suites in benchmark
Now that you have the benchmark executable, you can run it by passing the 'suite' parameter, followed by the number of cores in your machine, to the bench program, i.e. something like:
$ ./bench suite [nthreads]
A small suite will then run that checks the speed of Blosc for the specified number of threads. You can convert its output into a plot with the bench/plot-speeds.py script (you will need the matplotlib library installed). The script prints a short usage help:
$ python plot-speeds.py -h
Usage: plot-speeds.py [-o outfile] [-t title ] [-d|-c] filename

Options:
  -h, --help            show this help message and exit
  -o OUTFILE, --outfile=OUTFILE
                        filename for output (many extensions supported, e.g.
                        .png, .jpg, .pdf)
  -t TITLE, --title=TITLE
                        title of the plot
  -l LIMIT, --limit=LIMIT
                        expression to limit number of threads shown
  -x XMAX, --xmax=XMAX  limit the x-axis
  -d, --decompress      plot decompression data
  -c, --compress        plot compression data
For example, if you have, say, 4 cores in your machine, and want to get the plots interactively, proceed like this:
$ ./bench suite 4 > mysuite.out
$ python plot-speeds.py -c mysuite.out  # get the compression plot
$ python plot-speeds.py -d mysuite.out  # get the decompression one
Alternatively, you can write the plot straight to a file by using the -o flag:
$ python plot-speeds.py -o plot.png -c mysuite.out
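If you prefer to drive these steps from a script, the following is a minimal sketch in Python. It only uses the commands and flags shown above (suite, -o, -c, -d). The function names and the output file names compr.png and decompr.png are hypothetical, and it assumes plot-speeds.py runs under the same Python interpreter as the script itself.

```python
import subprocess
import sys


def suite_commands(nthreads, outfile="mysuite.out",
                   bench="./bench", plot_script="plot-speeds.py"):
    """Build the bench command and the two plotting commands as argument lists."""
    run = [bench, "suite", str(nthreads)]
    plots = [
        # compr.png / decompr.png are illustrative names, not fixed by the script
        [sys.executable, plot_script, "-o", "compr.png", "-c", outfile],
        [sys.executable, plot_script, "-o", "decompr.png", "-d", outfile],
    ]
    return run, plots


def run_suite(nthreads, outfile="mysuite.out"):
    """Run the suite, save its output to outfile, then write both plots."""
    run, plots = suite_commands(nthreads, outfile)
    with open(outfile, "w") as out:
        subprocess.run(run, stdout=out, check=True)
    for cmd in plots:
        subprocess.run(cmd, check=True)
```

Calling run_suite(4) from the bench directory would then mirror the three manual commands for a 4-core machine.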
Sometimes the legend may cover some of the data. In this case you can increase the limit of the x-axis (compression ratio) using the -x switch (10 is quite a good value):
$ python plot-speeds.py -x 10 -c mysuite.out
If you have many, many threads, the output can become quite confusing and you may want to take a look at the -l switch. This limits the number of displayed threads using an arbitrary Python expression, like a list or an iterator over ints (indexing starts at 1, not 0):
$ python plot-speeds.py -l '[1]' -c mysuite.out
$ python plot-speeds.py -l 'range(1, 8)' mysuite.out
$ python plot-speeds.py -l 'range(1, 8, 2)' mysuite.out
$ python plot-speeds.py -l '[1, 3, 28]' mysuite.out
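To get a feel for what such expressions select, here is a small illustrative sketch. It is not the code of plot-speeds.py: the helper select_threads is hypothetical, and both the use of eval and the choice to silently drop thread counts that are not present in the data are assumptions made for the example.

```python
def select_threads(limit_expr, available):
    """Evaluate a -l style expression and keep the thread counts that exist."""
    # Only 'range' is exposed to the expression; eval is still unsafe for
    # untrusted input, so this is for illustration only.
    wanted = set(eval(limit_expr, {"__builtins__": {}}, {"range": range}))
    return [n for n in available if n in wanted]


available = list(range(1, 13))  # thread counts 1..12, numbered from 1

print(select_threads("[1]", available))             # [1]
print(select_threads("range(1, 8, 2)", available))  # [1, 3, 5, 7]
print(select_threads("[1, 3, 28]", available))      # [1, 3]
```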
Also, if you have spare CPU cycles available, you may want to run the hardsuite, which is a series of tests that are much more comprehensive (and costly) than the suite above. It will take between 1 and 6 hours to run, depending on your machine and the number of cores, and will compress/decompress around 4 TB of data, checking that it has had a good round-trip. Running it is easy:
$ ./bench hardsuite 4 > myhardsuite.out
$ gzip -9 < myhardsuite.out > myhardsuite.out.gz  # use zip or 7z compressors if on Windows
IMPORTANT: In order to get reliable results, please be sure that you are not running other heavy processes while running the suites.
You can search the output for the FAILED string to see if something went wrong. If FAILED does not appear anywhere, you can be pretty sure that Blosc works well on your platform. If failures appear, please report them to me.
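If the gzip command is not at hand (for instance on Windows), the same compression step can be sketched with the Python standard library. The helper name gzip_file is hypothetical; level 9 corresponds to the -9 flag used above.

```python
import gzip
import shutil


def gzip_file(src, dst=None, level=9):
    """Compress src into dst (default: src + '.gz') and return the new path."""
    dst = dst or src + ".gz"
    with open(src, "rb") as fin, gzip.open(dst, "wb", compresslevel=level) as fout:
        shutil.copyfileobj(fin, fout)  # stream, so large outputs are fine
    return dst
```

For example, gzip_file("myhardsuite.out") would produce myhardsuite.out.gz next to the original file.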
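The check for FAILED can also be automated. The sketch below simply scans a saved output file for that string and reports the matching lines; the function name find_failures is hypothetical, and nothing is assumed about the output format beyond the presence of the FAILED marker.

```python
def find_failures(path, marker="FAILED"):
    """Return (line number, line text) pairs for every line containing marker."""
    hits = []
    with open(path, errors="replace") as f:
        for lineno, line in enumerate(f, start=1):
            if marker in line:
                hits.append((lineno, line.rstrip("\n")))
    return hits
```

An empty list from find_failures("myhardsuite.out") means the marker did not appear anywhere in the file.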
NOTE: You cannot use plot-speeds.py to plot the results of the hardsuite, as it is only meant for plotting suite output purposes.
[Incidentally, I've added a new suite called extremesuite that performs a crazy check on many, many possible inputs to Blosc. It works similarly to the hardsuite, but it can take between 2 and 3 days to finish on a relatively recent CPU, and can account for up to 60 TB of data checked. Really, this is not for everyone, but in case you are brave enough you might want to give it a try.]
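As a rough sanity check on the figures quoted above, the average throughput implied by each suite can be worked out directly. This is only back-of-the-envelope arithmetic, assuming 1 TB = 10^12 bytes and 1 MB = 10^6 bytes; the helper name is hypothetical.

```python
TB = 10**12  # assumption: decimal terabytes


def avg_throughput_mb_s(total_bytes, hours):
    """Average throughput in MB/s for total_bytes processed in the given hours."""
    return total_bytes / (hours * 3600.0) / 10**6


# hardsuite: around 4 TB in 1 to 6 hours
print("%.0f" % avg_throughput_mb_s(4 * TB, 1))    # 1111
print("%.0f" % avg_throughput_mb_s(4 * TB, 6))    # 185
# extremesuite: up to 60 TB in 2 to 3 days
print("%.0f" % avg_throughput_mb_s(60 * TB, 48))  # 347
print("%.0f" % avg_throughput_mb_s(60 * TB, 72))  # 231
```

So both suites imply sustained averages somewhere between a couple of hundred MB/s and roughly 1 GB/s, depending on the machine.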
Reporting your results back
If you want to help with the fine-tuning of Blosc for other processors, please send your own output of these suites (either suite, hardsuite or both) to me. That info will be extremely useful for achieving better compression ratios and performance in future versions. Please be sure that you also provide the following information:
CPU info: (vendor, model or cache sizes)
Operating System: (e.g. Linux/Windows/MacOSX/Solaris and version)
Compiler used: (e.g. GCC/ICC/MSVC/MINGW and version)
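Part of this information can be pre-filled automatically. The sketch below uses Python's standard platform module to produce the three lines; the function name platform_report is hypothetical, the compiler has to be filled in by hand, and cache sizes are not reported by this module, so treat the result as a starting point to complete manually.

```python
import platform


def platform_report(compiler="(fill in, e.g. GCC 4.4.5)"):
    """Return a three-line report with CPU, OS and compiler information."""
    lines = [
        # platform.processor() may be empty on some systems
        "CPU info: %s (%s)" % (platform.processor() or "unknown", platform.machine()),
        "Operating System: %s %s" % (platform.system(), platform.release()),
        "Compiler used: %s" % compiler,
    ]
    return "\n".join(lines)


print(platform_report())
```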
Thanks!
-- Francesc Alted
Attachments
- compr-4t-linux64-small.png (38.7 KB), added by faltet 3 years ago.
- compr-6t-wingw64-small.png (46.2 KB), added by faltet 3 years ago.
- decompr-4t-linux64-small.png (41.3 KB), added by faltet 3 years ago.
- decompr-6t-wingw64-small.png (51.9 KB), added by faltet 3 years ago.
- suite-linux-core2.out (7.0 KB), added by faltet 3 years ago.
- suite-windows7-amdx6.out (10.7 KB), added by faltet 3 years ago.
- mysuite-powerpc-g4.out (1.7 KB), added by faltet 3 years ago.
- compr-8t-osx-nehalem-small.png (46.1 KB), added by faltet 3 years ago.
- decompr-8t-osx-nehalem-small.png (51.0 KB), added by faltet 3 years ago.
- suite-nehalem-gcc45.out (14.0 KB), added by faltet 3 years ago.
- suite-core2-ubuntu-gcc44.out (3.2 KB), added by faltet 3 years ago.
- bench-win32.zip (27.3 KB), added by faltet 3 years ago.
- bench-win64.zip (35.9 KB), added by faltet 3 years ago.
- compr-2t-linux32-pentium4-ht-small.png (28.8 KB), added by faltet 3 years ago.
- decompr-2t-linux32-pentium4-ht-small.png (31.1 KB), added by faltet 3 years ago.
- suite-2t-linux32-pentium4-ht.out (3.5 KB), added by faltet 3 years ago.
- compr-4t-win64-core2-quad-small.png (35.7 KB), added by faltet 3 years ago.
- decompr-4t-win64-core2-quad-small.png (39.1 KB), added by faltet 3 years ago.
- suite-4t-win64-core2-quad.out (7.0 KB), added by faltet 3 years ago.
- compr-ubuntu-opteron-2t-small.png (29.0 KB), added by faltet 3 years ago.
- decompr-ubuntu-opteron-2t-small.png (33.8 KB), added by faltet 3 years ago.
- mysuite-ubuntu-opteron-2t.out (3.6 KB), added by faltet 3 years ago.
- compr-1t-osx-powerpc-G4-small.png (24.9 KB), added by faltet 3 years ago.
- decompr-1t-osx-powerpc-G4-small.png (24.8 KB), added by faltet 3 years ago.
- compr-2t-linux-core2-small.png (30.3 KB), added by faltet 3 years ago.
- decompr-2t-linux-core2-small.png (33.7 KB), added by faltet 3 years ago.
- suite-core2-opensuse-gcc44.out (3.5 KB), added by faltet 3 years ago.
- suite-2t-linux32-atom-ht.out (3.5 KB), added by faltet 3 years ago.
- compr-ubuntu-atom-2t-small.png (28.0 KB), added by faltet 3 years ago.
- decompr-ubuntu-atom-2t-small.png (29.9 KB), added by faltet 3 years ago.
- suite-2t-linux32-atom-ht.2.out (3.5 KB), added by faltet 3 years ago.
- linux-i7-980x-6-compr.png (85.5 KB), added by faltet 21 months ago.
- linux-i7-980x-6-decompr.png (102.3 KB), added by faltet 21 months ago.
- linux-i7-980x-6.out (11.1 KB), added by faltet 21 months ago.
- linux-i7-980x-6-compr-small.png (54.7 KB), added by faltet 21 months ago.
- linux-i7-980x-6-decompr-small.png (61.0 KB), added by faltet 21 months ago.
- intel-atom-330-suite-decompression.png (75.7 KB), added by faltet 14 months ago.
- intel-atom-330-suite-compression.png (66.3 KB), added by faltet 14 months ago.
- intel-atom-330-suite-compression-small.png (55.1 KB), added by faltet 14 months ago.
- intel-atom-330-suite-decompression-small.png (58.6 KB), added by faltet 14 months ago.
- intel-atom-330-suite.out (7.0 KB), added by faltet 14 months ago.
- four-six-opteron-suite.out (42.0 KB), added by faltet 14 months ago.
- four-six-opteron-suite-decompression-small.png (181.5 KB), added by faltet 14 months ago.
- four-six-opteron-suite-compression-small.png (168.4 KB), added by faltet 14 months ago.
- viznode07.out (21.1 KB), added by vhaenel 14 months ago.
- viznode07-compression-small.png (66.7 KB), added by vhaenel 14 months ago.
- viznode07-decompression-small.png (85.7 KB), added by vhaenel 14 months ago.
- linux-3930K.out (21.1 KB), added by faltet 14 months ago.
- linux-3930K-compr-small.png (80.3 KB), added by vhaenel 14 months ago.
- linux-3930K-decompr-small.png (102.7 KB), added by vhaenel 14 months ago.