Releases: OpenMathLib/OpenBLAS
OpenBLAS 0.2.20 version
Version 0.2.20
24-Jul-2017
common:
* Improved CMake support
* Fixed several thread race and locking bugs
* Fixed default LAPACK optimization level
* Updated LAPACK to 3.7.0
* Added ReLAPACK (https://github.com/HPAC/ReLAPACK), make BUILD_RELAPACK=1
POWER:
* Optimizations for Power9
* Fixed several Power8 assembly bugs
ARM:
* New optimized Vulcan and ThunderX2T99 targets
* Support for ARMV7 SOFT_FP ABI (make ARM_SOFTFP_ABI=1)
* Detect all cpu cores including offline ones
* Fix compilation with CLANG
* Support building a shared library for Android
MIPS:
* Fixed several threading issues
* Fix compilation with CLANG
x86_64:
* Detect Intel Bay Trail and Apollo Lake
* Detect Intel Sky Lake and Kaby Lake
* Detect Intel Knights Landing
* Detect AMD A8, A10, A12 and Ryzen
* Support 64bit builds with Visual Studio
* Fix building with Intel and PGI compilers
* Fix building with MINGW and TDM-GCC
* Fix cmake builds for Haswell and related cpus
* Fix building for Sandybridge with CLANG 3.9
* Add support for the FLANG compiler
IBM Z:
* New target z13 with BLAS3 optimizations
md5sum
e0d47385423944cbd14bcb9e58930ff9 OpenBLAS-0.2.20.zip
48637eb29f5b492b91459175dcc574b1 OpenBLAS-0.2.20.tar.gz
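The values listed under md5sum can be compared against a downloaded archive. The following is a minimal sketch in Python, assuming the archive has been saved locally under the file name shown in the list. The helper names `md5_of_file` and `matches_published` are illustrative and are not part of OpenBLAS.

```python
import hashlib


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of the file at `path`, read in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # Read fixed-size chunks so large archives are not loaded at once.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published(path, published_md5):
    """Compare a file's MD5 digest with a published value, ignoring case."""
    return md5_of_file(path) == published_md5.strip().lower()


# Published value for OpenBLAS-0.2.20.tar.gz, copied from the list above.
PUBLISHED_TAR_GZ_MD5 = "48637eb29f5b492b91459175dcc574b1"

# Usage (assumes the archive is in the current directory):
# print(matches_published("OpenBLAS-0.2.20.tar.gz", PUBLISHED_TAR_GZ_MD5))
```

An MD5 match is a check against accidental corruption during download; it is not a strong guarantee against deliberate tampering.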
OpenBLAS 0.2.19 version
Version 0.2.19
1-Sep-2016
common:
* Improved cross compiling.
* Fix the bug on musl libc.
POWER:
* Optimize BLAS on Power8
* Fixed Julia+OpenBLAS bugs on Power8
MIPS:
* Optimize BLAS on MIPS P5600 and I6400 (Thanks, Shivraj Patil, Kaustubh Raste)
ARM:
* Improved on ARM Cortex-A57. (Thanks, Ashwin Sekhar T K)
md5sum
4b155ffcded04d72203b9e0e0bf008da OpenBLAS-0.2.19.zip
28c998054fd377279741c6f0b9ea7941 OpenBLAS-0.2.19.tar.gz
OpenBLAS 0.2.18 version
Version 0.2.18
12-Apr-2016
common:
- If MAKE_NB_JOBS is set to a value less than or equal to zero, make runs without the -j option.
x86/x86_64:
- Support building Visual Studio static library. (#813, Thanks, theoractice)
- Fix bugs to pass buildbot CI tests (http://build.openblas.net)
ARM:
- Provide DGEMM 8x4 kernel for Cortex-A57 (Thanks, Ashwin Sekhar T K)
POWER:
- Optimize S and C BLAS3 on Power8
- Optimize BLAS2/1 on Power8
md5sum
4ca49eb1c45b3ca82a0034ed3cc2cef1 OpenBLAS-0.2.18.zip
805e7f660877d588ea7e3792cda2ee65 OpenBLAS-0.2.18.tar.gz
OpenBLAS 0.2.17 version
OpenBLAS 0.2.16 version
Version 0.2.16
15-Mar-2016
common:
- Upgrade LAPACK to 3.6.0 version.
  Add BUILD_LAPACK_DEPRECATED option in Makefile.rule to build
  LAPACK deprecated functions.
- Add MAKE_NB_JOBS option in Makefile.
  Force number of make jobs. This is particularly
  useful when using distcc. (#735. Thanks, Jerome Robert.)
- Redesign unit test. Run unit/regression test at every build (Travis-CI and Appveyor).
- Disable multi-threading for small size swap and ger. (#744. Thanks, Jerome Robert)
- Improve small zger, zgemv, ztrmv using stack allocation (#727. Thanks, Jerome Robert)
- Let openblas_get_num_threads return the number of active threads.
  (#760. Thanks, Jerome Robert)
- Support illumos(OmniOS). (#749. Thanks, Lauri Tirkkonen)
- Fix LAPACK Dormbr, Dormlq bug. (#711, #713. Thanks, Brendan Tracey)
- Update scipy benchmark script. (#745. Thanks, John Kirkham)
- Avoid potential getenv segfault. (#716)
- Import LAPACK svn bugfix #142-#147,#150-#155
x86/x86_64:
- Optimize trsm kernels for AMD Bulldozer, Piledriver, Steamroller.
- Detect Intel Avoton.
- Detect AMD Trinity, Richland, E2-3200.
- Fix gemv performance bug on Mac OSX Intel Haswell.
- Fix some bugs with CMake and Visual Studio
- Optimize c/zgemv for AMD Bulldozer, Piledriver, Steamroller
- Fix bug with scipy linalg test.
ARM:
- Support and optimize Cortex-A57 AArch64.
  (#686. Thanks, Ashwin Sekhar TK)
- Fix Android build on ARMV7 (#778. Thanks, Paul Mustiere)
- Update ARMV6 kernels.
- Improve DGEMM for ARM Cortex-A57. (Thanks, Ashwin Sekhar T K)
POWER:
- Fix detection of POWER architecture
  (#684. Thanks, Sebastien Villemot)
- Optimize D and Z BLAS3 functions for Power8.
md5sum
8fae7cebfefa073c8640e99c4454dc03 OpenBLAS-0.2.16.zip
fef46ab92463bdbb1479dcec594ef6dc OpenBLAS-0.2.16.tar.gz
OpenBLAS 0.2.15 version
Version 0.2.15
27-Oct-2015
common:
- Support cmake on x86/x86-64. Natively compiling on MS Visual Studio.
  (experimental. Thank Hank Anderson for the initial cmake porting work.)
  On Linux and Mac OSX, OpenBLAS cmake supports assembly kernels.
  e.g. cmake .
       make
       make test (Optional)
  On Windows MS Visual Studio, OpenBLAS cmake only supports C kernels.
  (OpenBLAS uses AT&T style assembly, which is not supported by MSVC.)
  e.g. cmake -G "Visual Studio 12 Win64" .
       Open OpenBLAS.sln and build.
- Enable MAX_STACK_ALLOC flags by default.
  Improve ger and gemv for small matrices.
- Improve gemv parallel with small m and large n case.
- Improve ?imatcopy when lda==ldb (#633. Thanks, Martin Koehler)
- Add vecLib benchmarks (#565. Thanks, Andreas Noack.)
- Fix LAPACK lantr for row major matrices (#634. Thanks, Dan Kortschak)
- Fix LAPACKE lansy (#640. Thanks, Dan Kortschak)
- Import bug fixes for LAPACKE s/dormlq, c/zunmlq
- Raise the signal when pthread_create fails (#668. Thanks, James K. Lowden)
- Remove g77 from compiler list.
- Enable AppVeyor Windows CI.
x86/x86-64:
- Support pure C generic kernels for x86/x86-64.
- Support Intel Broadwell and Skylake by Haswell kernels.
- Support AMD Excavator by Steamroller kernels.
- Optimize s/d/c/zdot for Intel SandyBridge and Haswell.
- Optimize s/d/c/zdot for AMD Piledriver and Steamroller.
- Optimize s/d/c/zaxpy for Intel SandyBridge and Haswell.
- Optimize s/d/c/zaxpy for AMD Piledriver and Steamroller.
- Optimize d/c/zscal for Intel Haswell, dscal for Intel SandyBridge.
- Optimize d/c/zscal for AMD Bulldozer, Piledriver and Steamroller.
- Optimize s/dger for Intel SandyBridge.
- Optimize s/dsymv for Intel SandyBridge.
- Optimize ssymv for Intel Haswell.
- Optimize dgemv for Intel Nehalem and Haswell.
- Optimize dtrmm for Intel Haswell.
ARM:
- Support Android NDK armeabi-v7a-hard ABI (-mfloat-abi=hard)
  e.g. make HOSTCC=gcc CC=arm-linux-androideabi-gcc NO_LAPACK=1 TARGET=ARMV7
POWER:
- Support ppc64le platform (ELF ABI v2. #612. Thanks, Matthew Brandyberry.)
- Support POWER7/8 by POWER6 kernels. (#612. Thanks, Fábio Perez.)
md5sum
c9c181a981897e1cb0192f8589130a6c OpenBLAS-0.2.15.zip
b1190f3d3471685f17cfd1ec1d252ac9 OpenBLAS-0.2.15.tar.gz
OpenBLAS 0.2.14 Version
Version 0.2.14
24-Mar-2015
common:
- Improve OpenBLASConfig.cmake. (#474, #475. Thanks, xantares.)
- Improve ger and gemv for small matrices by stack allocation.
  e.g. make -DMAX_STACK_ALLOC=2048 (#482. Thanks, Jerome Robert.)
- Introduce openblas_get_num_threads and openblas_get_num_procs. (#497. Thanks, Erik Schnetter.)
- Add ATLAS-style ?geadd function. (#509. Thanks, Martin Köhler.)
- Fix c/zsyr bug with negative incx. (#492.)
- Fix race condition during shutdown causing a crash in gotoblas_set_affinity(). (#508. Thanks, Ton van den Heuvel.)
x86/x86-64:
- Support AMD Steamroller.
ARM:
- Add Cortex-A9 and Cortex-A15 targets.
md5sum
a57e197a34ba8b651347e6453961baab OpenBLAS-0.2.14.zip
53cda7f420e1ba0ea55de536b24c9701 OpenBLAS-0.2.14.tar.gz
OpenBLAS 0.2.13 version
Version 0.2.13
3-Dec-2014
common:
- Add SYMBOLPREFIX and SYMBOLSUFFIX makefile options
  for adding a prefix or suffix to all exported symbol names
  in the shared library. (#459, Thanks Tony Kelman)
- Provide OpenBLASConfig.cmake at installation.
- Fix Fortran compiler detection on FreeBSD.
  (#470, Thanks Mike Nolta)
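To illustrate what SYMBOLPREFIX and SYMBOLSUFFIX do to exported names, the sketch below models the renaming as plain string concatenation around each symbol. This is an assumption made for illustration, not the build system's actual code; the function `decorated_symbol` and the sample option values are hypothetical.

```python
def decorated_symbol(name, prefix="", suffix=""):
    """Model a renamed export as prefix + original name + suffix."""
    return f"{prefix}{name}{suffix}"


# Hypothetical option value, as if building with SYMBOLSUFFIX=64_.
for original in ("dgemm_", "cblas_dgemm"):
    print(original, "->", decorated_symbol(original, suffix="64_"))
```

Renaming every exported symbol this way lets two differently configured builds of the library be loaded into one process without their symbol names colliding.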
x86/x86-64:
- Add generic kernel files for x86-64. make TARGET=GENERIC
- Fix a bug of sgemm kernel on Intel Sandy Bridge.
- Fix c_check bug on some amd64 systems. (#471, Thanks Mike Nolta)
ARM:
- Support APM's X-Gene 1 AArch64 processors.
Optimize trmm and sgemm. (#465, Thanks Dave Nuechterlein)
md5sum
8147e68c9e3e294a1a26280050bcf6a2 OpenBLAS-0.2.13.zip
bba7b37b5a5b6674fda91dcb2faab145 OpenBLAS-0.2.13.tar.gz
OpenBLAS 0.2.12 version
Version 0.2.12
13-Oct-2014
common:
* Added CBLAS interface for ?omatcopy and ?imatcopy.
* Enable ?gemm3m functions.
* Added benchmark for ?gemm3m.
* Optimized multithreading lower limits.
* Disabled SYMM3M and HEMM3M functions
because of segmentation violations.
x86/x86-64:
* Improved axpy and symv performance on AMD Bulldozer.
* Improved gemv performance on modern Intel and AMD CPUs.
md5sum
4889408c500aa7fa818d92c0c71d9098 /tmp/OpenBLAS-0.2.12.zip
5df7f175b2db6a2b02ca4ff932e39bc7 /tmp/OpenBLAS-0.2.12.tar.gz
OpenBLAS 0.2.11 version
OpenBLAS ChangeLog
Version 0.2.11
18-Aug-2014
common:
- Added some benchmark codes.
- Fix link error on Linux/musl. (Thanks Isaac Dunham)
x86/x86-64:
- Improved s/c/zgemm performance for Intel Haswell.
- Improved s/d/c/zgemv performance.
- Support big NUMA machines. (EXPERIMENTAL)
ARM:
- Fix detection when cpuinfo uses "Processor". (Thanks Isaiah)
md5sum
946434ece1d7a12ba938902665b47434 OpenBLAS-0.2.11.zip
c456f3c5e84c3ab69ef89b22e616627a OpenBLAS-0.2.11.tar.gz