Releases: OpenMathLib/OpenBLAS

OpenBLAS 0.2.20 version

24 Jul 04:11

Version 0.2.20
24-Jul-2017

common:

    * Improved CMake support
    * Fixed several thread race and locking bugs
    * Fixed default LAPACK optimization level
    * Updated LAPACK to 3.7.0
    * Added ReLAPACK (https://github.com/HPAC/ReLAPACK); build with make BUILD_RELAPACK=1

POWER:

    * Optimizations for Power9
    * Fixed several Power8 assembly bugs

ARM:

    * New optimized Vulcan and ThunderX2T99 targets
    * Support for ARMV7 SOFT_FP ABI (make ARM_SOFTFP_ABI=1)
    * Detect all cpu cores including offline ones
    * Fix compilation with CLANG
    * Support building a shared library for Android

MIPS:

    * Fixed several threading issues
    * Fix compilation with CLANG

x86_64:

    * Detect Intel Bay Trail and Apollo Lake
    * Detect Intel Sky Lake and Kaby Lake
    * Detect Intel Knights Landing
    * Detect AMD A8, A10, A12 and Ryzen
    * Support 64bit builds with Visual Studio
    * Fix building with Intel and PGI compilers
    * Fix building with MINGW and TDM-GCC
    * Fix cmake builds for Haswell and related cpus
    * Fix building for Sandybridge with CLANG 3.9
    * Add support for the FLANG compiler

IBM Z:

    * New target z13 with BLAS3 optimizations

md5sum
e0d47385423944cbd14bcb9e58930ff9 OpenBLAS-0.2.20.zip
48637eb29f5b492b91459175dcc574b1 OpenBLAS-0.2.20.tar.gz
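
The md5sum values listed with each release can be checked after download. The following is a minimal sketch in Python, assuming the archive has already been saved locally; the helper names `md5_of_file` and `matches_release` are hypothetical and not part of OpenBLAS. The two digests are copied from the 0.2.20 entry above. Note that MD5 only guards against accidental corruption, not deliberate tampering.

```python
import hashlib

# Digests copied from the 0.2.20 release notes above.
EXPECTED_MD5 = {
    "OpenBLAS-0.2.20.zip": "e0d47385423944cbd14bcb9e58930ff9",
    "OpenBLAS-0.2.20.tar.gz": "48637eb29f5b492b91459175dcc574b1",
}


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of the file at `path`, read in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # Read fixed-size chunks so large archives are not loaded at once.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_release(path, name):
    """True if the file's MD5 equals the published value for `name`."""
    return md5_of_file(path) == EXPECTED_MD5[name]
```

For example, `matches_release("OpenBLAS-0.2.20.tar.gz", "OpenBLAS-0.2.20.tar.gz")` would be expected to return True for an intact download placed in the current directory.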

OpenBLAS 0.2.19 version

01 Sep 04:03

Version 0.2.19
1-Sep-2016

common:

    * Improved cross compiling.
    * Fix a bug on musl libc.

POWER:

    * Optimize BLAS on Power8
    * Fixed Julia+OpenBLAS bugs on Power8

MIPS:

    * Optimize BLAS on MIPS P5600 and I6400 (Thanks, Shivraj Patil, Kaustubh Raste)

ARM:

    * Improved performance on ARM Cortex-A57. (Thanks, Ashwin Sekhar T K)

md5sum
4b155ffcded04d72203b9e0e0bf008da OpenBLAS-0.2.19.zip
28c998054fd377279741c6f0b9ea7941 OpenBLAS-0.2.19.tar.gz

OpenBLAS 0.2.18 version

12 Apr 19:34

Version 0.2.18
12-Apr-2016

common:

  • If the MAKE_NB_JOBS flag is set to a value less than or equal to zero, make runs without -j.

x86/x86_64:

ARM:

  • Provide DGEMM 8x4 kernel for Cortex-A57 (Thanks, Ashwin Sekhar T K)

POWER:

  • Optimize S and C BLAS3 on Power8
  • Optimize BLAS2/1 on Power8

md5sum
4ca49eb1c45b3ca82a0034ed3cc2cef1 OpenBLAS-0.2.18.zip
805e7f660877d588ea7e3792cda2ee65 OpenBLAS-0.2.18.tar.gz

OpenBLAS 0.2.17 version

21 Mar 00:55

Version 0.2.17
20-Mar-2016

common:

  • Enable BUILD_LAPACK_DEPRECATED=1 by default.

md5sum
5f04c53183a5bc785b9900eb9bab44ac /tmp/OpenBLAS-0.2.17.zip
664a12807f2a2a7cda4781e3ab2ae0e1 /tmp/OpenBLAS-0.2.17.tar.gz


OpenBLAS 0.2.16 version

15 Mar 19:03

Version 0.2.16
15-Mar-2016

common:

  • Upgrade LAPACK to 3.6.0 version.
    Add BUILD_LAPACK_DEPRECATED option in Makefile.rule to build
    LAPACK deprecated functions.
  • Add MAKE_NB_JOBS option in Makefile.
    Force the number of make jobs. This is particularly
    useful when using distcc. (#735. Thanks, Jerome Robert.)
  • Redesign unit test. Run unit/regression test at every build (Travis-CI and Appveyor).
  • Disable multi-threading for small size swap and ger. (#744. Thanks, Jerome Robert)
  • Improve small zger, zgemv, ztrmv using stack allocation (#727. Thanks, Jerome Robert)
  • Let openblas_get_num_threads return the number of active threads.
    (#760. Thanks, Jerome Robert)
  • Support illumos(OmniOS). (#749. Thanks, Lauri Tirkkonen)
  • Fix LAPACK Dormbr, Dormlq bug. (#711, #713. Thanks, Brendan Tracey)
  • Update scipy benchmark script. (#745. Thanks, John Kirkham)
  • Avoid potential getenv segfault. (#716)
  • Import LAPACK svn bugfix #142-#147,#150-#155

x86/x86_64:

  • Optimize trsm kernels for AMD Bulldozer, Piledriver, Steamroller.
  • Detect Intel Avoton.
  • Detect AMD Trinity, Richland, E2-3200.
  • Fix gemv performance bug on Mac OSX Intel Haswell.
  • Fix some bugs with CMake and Visual Studio
  • Optimize c/zgemv for AMD Bulldozer, Piledriver, Steamroller
  • Fix bug with scipy linalg test.

ARM:

  • Support and optimize Cortex-A57 AArch64.
    (#686. Thanks, Ashwin Sekhar T K)
  • Fix Android build on ARMV7 (#778. Thanks, Paul Mustiere)
  • Update ARMV6 kernels.
  • Improve DGEMM for ARM Cortex-A57. (Thanks, Ashwin Sekhar T K)

POWER:

  • Fix detection of POWER architecture
    (#684. Thanks, Sebastien Villemot)
  • Optimize D and Z BLAS3 functions for Power8.

md5sum
8fae7cebfefa073c8640e99c4454dc03 OpenBLAS-0.2.16.zip
fef46ab92463bdbb1479dcec594ef6dc OpenBLAS-0.2.16.tar.gz


OpenBLAS 0.2.15 version

27 Oct 20:51

Version 0.2.15
27-Oct-2015

common:

  • Support cmake on x86/x86-64, including native compilation with MS Visual Studio.
    (Experimental. Thanks to Hank Anderson for the initial cmake porting work.)

      On Linux and Mac OSX, OpenBLAS cmake supports assembly kernels.
      e.g. cmake .
           make
           make test (Optional)
    
      On Windows MS Visual Studio, OpenBLAS cmake only supports C kernels.
      (OpenBLAS uses AT&T style assembly, which is not supported by MSVC.)
      e.g. cmake -G "Visual Studio 12 Win64" .
           Open OpenBLAS.sln and build.
    
  • Enable MAX_STACK_ALLOC flags by default.
    Improve ger and gemv for small matrices.

  • Improve parallel gemv for the case of small m and large n.

  • Improve ?imatcopy when lda==ldb (#633. Thanks, Martin Koehler)

  • Add vecLib benchmarks (#565. Thanks, Andreas Noack.)

  • Fix LAPACK lantr for row major matrices (#634. Thanks, Dan Kortschak)

  • Fix LAPACKE lansy (#640. Thanks, Dan Kortschak)

  • Import bug fixes for LAPACKE s/dormlq, c/zunmlq

  • Raise the signal when pthread_create fails (#668. Thanks, James K. Lowden)

  • Remove g77 from compiler list.

  • Enable AppVeyor Windows CI.

x86/x86-64:

  • Support pure C generic kernels for x86/x86-64.
  • Support Intel Broadwell and Skylake by Haswell kernels.
  • Support AMD Excavator by Steamroller kernels.
  • Optimize s/d/c/zdot for Intel SandyBridge and Haswell.
  • Optimize s/d/c/zdot for AMD Piledriver and Steamroller.
  • Optimize s/d/c/zaxpy for Intel SandyBridge and Haswell.
  • Optimize s/d/c/zaxpy for AMD Piledriver and Steamroller.
  • Optimize d/c/zscal for Intel Haswell, dscal for Intel SandyBridge.
  • Optimize d/c/zscal for AMD Bulldozer, Piledriver and Steamroller.
  • Optimize s/dger for Intel SandyBridge.
  • Optimize s/dsymv for Intel SandyBridge.
  • Optimize ssymv for Intel Haswell.
  • Optimize dgemv for Intel Nehalem and Haswell.
  • Optimize dtrmm for Intel Haswell.

ARM:

  • Support Android NDK armeabi-v7a-hard ABI (-mfloat-abi=hard)

      e.g. make HOSTCC=gcc CC=arm-linux-androideabi-gcc NO_LAPACK=1 TARGET=ARMV7
    
  • Fix lock, rpcc bugs (#616, #617. Thanks, Grazvydas Ignotas)

POWER:

  • Support ppc64le platform (ELF ABI v2. #612. Thanks, Matthew Brandyberry.)
  • Support POWER7/8 by POWER6 kernels. (#612. Thanks, Fábio Perez.)

md5sum
c9c181a981897e1cb0192f8589130a6c OpenBLAS-0.2.15.zip
b1190f3d3471685f17cfd1ec1d252ac9 OpenBLAS-0.2.15.tar.gz

OpenBLAS 0.2.14 Version

24 Mar 20:12

Version 0.2.14
24-Mar-2015

common:

  • Improve OpenBLASConfig.cmake. (#474, #475. Thanks, xantares.)
  • Improve ger and gemv for small matrices by stack allocation.
    e.g. make -DMAX_STACK_ALLOC=2048 (#482. Thanks, Jerome Robert.)
  • Introduce openblas_get_num_threads and openblas_get_num_procs. (#497. Thanks, Erik Schnetter.)
  • Add ATLAS-style ?geadd function. (#509. Thanks, Martin Köhler.)
  • Fix c/zsyr bug with negative incx. (#492.)
  • Fix race condition during shutdown causing a crash in gotoblas_set_affinity(). (#508. Thanks, Ton van den Heuvel.)
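
The ATLAS-style ?geadd routine listed above is understood to update a matrix in place as C := beta*C + alpha*A. The following pure-Python sketch models that arithmetic for row-major data held in flat lists; the function name `geadd_rowmajor` and its argument order are assumptions made for illustration and are not the OpenBLAS C interface.

```python
def geadd_rowmajor(rows, cols, alpha, a, lda, beta, c, ldc):
    """Update c in place so that C := beta*C + alpha*A.

    a and c are flat lists holding rows x cols matrices in row-major
    order; lda and ldc are the leading dimensions (row strides).
    """
    for i in range(rows):
        for j in range(cols):
            c[i * ldc + j] = alpha * a[i * lda + j] + beta * c[i * ldc + j]
    return c
```

With alpha = 1 and beta = 1 this reduces to a plain matrix addition; a leading dimension larger than `cols` lets the sketch operate on a sub-block of a bigger array.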

x86/x86-64:

  • Support AMD Steamroller.

ARM:

  • Add Cortex-A9 and Cortex-A15 targets.

md5sum
a57e197a34ba8b651347e6453961baab OpenBLAS-0.2.14.zip
53cda7f420e1ba0ea55de536b24c9701 OpenBLAS-0.2.14.tar.gz

OpenBLAS 0.2.13 version

03 Dec 15:18

Version 0.2.13
3-Dec-2014

common:

  • Add SYMBOLPREFIX and SYMBOLSUFFIX makefile options
    for adding a prefix or suffix to all exported symbol names
    in the shared library. (#459, Thanks Tony Kelman)
  • Provide OpenBLASConfig.cmake at installation.
  • Fix Fortran compiler detection on FreeBSD.
    (#470, Thanks Mike Nolta)

x86/x86-64:

  • Add generic kernel files for x86-64. make TARGET=GENERIC
  • Fix a bug of sgemm kernel on Intel Sandy Bridge.
  • Fix c_check bug on some amd64 systems. (#471, Thanks Mike Nolta)

ARM:

  • Support APM's X-Gene 1 AArch64 processors.
    Optimize trmm and sgemm. (#465, Thanks Dave Nuechterlein)

md5sum
8147e68c9e3e294a1a26280050bcf6a2 OpenBLAS-0.2.13.zip
bba7b37b5a5b6674fda91dcb2faab145 OpenBLAS-0.2.13.tar.gz

OpenBLAS 0.2.12 version

13 Oct 09:12

Version 0.2.12
13-Oct-2014

common:

* Added CBLAS interface for ?omatcopy and ?imatcopy.
* Enable ?gemm3m functions.
* Added benchmark for ?gemm3m.
* Optimized multithreading lower limits.
* Disabled SYMM3M and HEMM3M functions because of segmentation violations.
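
For readers unfamiliar with ?omatcopy: it is an out-of-place scaled copy, B := alpha*op(A), where op is either the identity or a transpose (the conjugating variants are left out here). The sketch below models that behavior in pure Python for row-major flat lists; the name `omatcopy_rowmajor` and its parameter list are hypothetical and only loosely mirror the CBLAS argument order.

```python
def omatcopy_rowmajor(trans, rows, cols, alpha, a, lda, b, ldb):
    """Write alpha * op(A) into b and return b.

    a holds a rows x cols matrix in row-major order with row stride lda.
    If trans is False, b receives a rows x cols matrix (stride ldb).
    If trans is True, b receives the cols x rows transpose (stride ldb).
    """
    for i in range(rows):
        for j in range(cols):
            value = alpha * a[i * lda + j]
            if trans:
                b[j * ldb + i] = value
            else:
                b[i * ldb + j] = value
    return b
```

The in-place counterpart, ?imatcopy, performs the same operation but overwrites the input buffer, which is why the lda==ldb case mentioned in other release notes can be treated specially.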

x86/x86-64:
* Improved axpy and symv performance on AMD Bulldozer.
* Improved gemv performance on modern Intel and AMD CPUs.

md5sum
4889408c500aa7fa818d92c0c71d9098 /tmp/OpenBLAS-0.2.12.zip
5df7f175b2db6a2b02ca4ff932e39bc7 /tmp/OpenBLAS-0.2.12.tar.gz

OpenBLAS 0.2.11 version

26 Aug 08:19

OpenBLAS ChangeLog

Version 0.2.11
18-Aug-2014

common:

  • Added some benchmark codes.
  • Fix link error on Linux/musl. (Thanks Isaac Dunham)

x86/x86-64:

  • Improved s/c/zgemm performance for Intel Haswell.
  • Improved s/d/c/zgemv performance.
  • Support big NUMA machines. (EXPERIMENTAL)

ARM:

  • Fix detection when cpuinfo uses "Processor". (Thanks Isaiah)

md5sum
946434ece1d7a12ba938902665b47434 OpenBLAS-0.2.11.zip
c456f3c5e84c3ab69ef89b22e616627a OpenBLAS-0.2.11.tar.gz