Inconsistency between openblas_get_num_threads and openblas_set_num_threads #760

Closed
luc-j-bourhis opened this issue Jan 25, 2016 · 2 comments

Comments

@luc-j-bourhis

Consider the following C code:

#include <cblas.h>
#include <stdio.h>

int main() {
  /* Thread count reported before any change */
  int initial_thread_count = openblas_get_num_threads();
  if(initial_thread_count >= 2) {
    /* Ask for half as many threads, then read the value back */
    int new_thread_count = initial_thread_count/2;
    openblas_set_num_threads(new_thread_count);
    printf("Initially: #threads = %d\n", initial_thread_count);
    printf("Setting #threads to %d: ", new_thread_count);
    int reported_threads = openblas_get_num_threads();
    if(reported_threads == new_thread_count) {
      printf("passed!\n");
    }
    else {
      printf("failed (#threads = %d)\n", reported_threads);
    }
  }
  return 0;
}

On my MacPro, it prints

OpenBLAS build configuration:
DYNAMIC_ARCH NO_AFFINITY Nehalem
Initially: #threads = 6
Setting #threads to 3: failed (#threads = 6)

So openblas_set_num_threads appears to have failed to reduce the number of threads. My machine is a MacPro 2010 with 6 physical cores, and I compiled OpenBLAS with

make CC=clang FC=gfortran USE_THREAD=1 NUM_THREADS=16 DYNAMIC_ARCH=1 NO_STATIC=1

I specified NUM_THREADS because I wanted libopenblas.dylib to be usable on other machines with more cores. Here are the versions of the compilers:

~> clang --version
Apple LLVM version 7.0.2 (clang-700.1.81)
Target: x86_64-apple-darwin15.3.0
Thread model: posix

~> gfortran --version
GNU Fortran (MacPorts gcc5 5.3.0_0) 5.3.0

Either I am missing something badly or this is a bug!

@jeromerobert
Contributor

I see the same thing. I would say there is a bug in goto_set_num_threads (blas_server.c) because it only sets the blas_num_threads global if num_threads > blas_num_threads.

@jeromerobert
Contributor

I was wrong. We have 2 globals:

  • blas_num_cpu: the number of threads used
  • blas_num_threads: the number of created threads

blas_num_threads can only grow (threads that are no longer used stay dormant). So we want openblas_get_num_threads to return blas_num_cpu, not blas_num_threads.

There is also an openblas_get_num_procs function, but it does not return blas_num_cpu. It returns the number of physical CPUs on the machine.
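
To make the distinction concrete, here is a toy model of the two counters as they are described in this comment. It is only a sketch: every model_* name, the MODEL_MAX_THREADS cap and the starting value of 6 are made up for illustration, and none of this is the actual code in blas_server.c.

```c
/* Toy model of the two counters discussed above.
   All names are hypothetical; this is NOT OpenBLAS source code. */

#define MODEL_MAX_THREADS 16        /* stands in for the NUM_THREADS build option */

static int model_num_threads = 6;   /* threads created so far; can only grow */
static int model_num_cpu     = 6;   /* threads actually used for computation */

/* Models the setter: create more threads only when needed,
   but always record how many should be used. */
static void model_set_num_threads(int n) {
  if (n < 1) n = 1;
  if (n > MODEL_MAX_THREADS) n = MODEL_MAX_THREADS;
  if (n > model_num_threads) {
    model_num_threads = n;          /* a real library would spawn workers here */
  }
  model_num_cpu = n;                /* shrinking leaves extra threads dormant */
}

/* A getter that returns the created-thread count: never decreases. */
static int model_get_created_threads(void) { return model_num_threads; }

/* A getter that returns the used-thread count: follows the setter. */
static int model_get_used_threads(void) { return model_num_cpu; }
```

In this model, calling model_set_num_threads(3) after starting at 6 leaves model_get_created_threads() at 6 while model_get_used_threads() drops to 3, which matches the "failed (#threads = 6)" output in the report if the getter reads the first counter.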
