multithreading - OpenMP's mechanism for spreading threads out evenly
OpenMP tries to spread threads out across cores as evenly as possible, but how does this work? Ultimately, the OS decides how to spread them. Does OpenMP merely make a recommendation to the OS (similar to using the likely macro or the register keyword in C)?
If we're running a job with num_threads threads on a machine with num_cores cores, none of which are in use, is it fair to assume that the threads will be spread out across the cores evenly (and, assuming num_threads <= num_cores, that we'll have pure parallelism), since the OS should be working in our best interest and spreading the load nicely?
I see graphs of strong scaling where the x axis is the number of cores. Do they assume that the maximum number of threads used to run the job is <= the number of cores, and that the cores are relatively idle?
Or is this sort of a moot point?
The scheduling of OpenMP threads on the cores and/or hardware threads of the machine is the responsibility of the operating system. It decides, based on its own heuristics, where and when to start / stop / migrate them...
However, OpenMP gives you some tools to direct / restrict the span of choices the OS has when taking its decisions. For example, you have access to:
- the number of OpenMP threads to launch on a parallel region: the OMP_NUM_THREADS environment variable, the num_threads clause, the omp_set_num_threads() function
- the logical cores where the threads can be scheduled by the OS: the OMP_PLACES environment variable
- the optional pinning policy of the threads: the OMP_PROC_BIND environment variable, the proc_bind clause
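The controls listed above can be combined in code. The following is only a minimal sketch under stated assumptions: run_spread_region is a hypothetical helper name, not part of OpenMP, and the _OPENMP guard is added here so the file still builds (serially) when no OpenMP flag is passed to the compiler.

```c
#include <assert.h>
#ifdef _OPENMP
#include <omp.h>
#endif

/* Hypothetical helper: opens a parallel region that requests `requested`
 * threads and asks the runtime to spread them over the available places.
 * Returns the team size the runtime actually granted, which may be
 * smaller than the request. */
int run_spread_region(int requested)
{
    int granted = 1;            /* serial fallback without OpenMP */

    assert(requested > 0);      /* num_threads needs a positive value */
#ifdef _OPENMP
    #pragma omp parallel num_threads(requested) proc_bind(spread)
    {
        /* One thread records the team size; the implicit barrier at the
         * end of the single construct makes the write visible to all. */
        #pragma omp single
        granted = omp_get_num_threads();
    }
#else
    (void)requested;            /* unused in the serial build */
#endif
    return granted;
}
```

With GCC this could be built with -fopenmp and run with, for instance, OMP_PLACES=cores set in the environment, so that the spread policy distributes the threads over whole cores. The clauses only narrow the choices; the OS still performs the actual scheduling within them.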
With that, you have some level of control to steer the OS decisions but, ultimately, the OS remains in control of the actual scheduling. So the decisions it takes might not be what you would have thought (especially when you don't use placement or binding), since the machine workload and the global scheduling policy it applies might interfere with what you think would have been optimal for your code. For example, on a NUMA (Non-Uniform Memory Access) machine, considerations such as the memory used on the various NUMA nodes and the memory segment where the process belongs might prevent a seemingly natural spreading of the threads across the chips, leading to CPU-local contention...
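Since the outcome depends on whether placement and binding are in effect, it can help to check at run time what the OpenMP runtime believes its settings are. The sketch below is an illustration only: binding_is_active, place_count and report_affinity are hypothetical helper names, and the _OPENMP guard is an assumption added so the code also compiles without OpenMP support. It reports the runtime's settings, not where the OS has actually put each thread.

```c
#include <stdio.h>
#ifdef _OPENMP
#include <omp.h>
#endif

/* Hypothetical helper: 1 if a thread-binding policy is in effect for the
 * next parallel region, 0 if threads are left free to migrate. */
int binding_is_active(void)
{
#ifdef _OPENMP
    return omp_get_proc_bind() != omp_proc_bind_false;
#else
    return 0;                   /* no OpenMP runtime, hence no binding */
#endif
}

/* Hypothetical helper: number of places in the runtime's place list
 * (influenced by OMP_PLACES); 0 when built without OpenMP. */
int place_count(void)
{
#ifdef _OPENMP
    return omp_get_num_places();
#else
    return 0;
#endif
}

/* Prints a one-line summary of the settings queried above. */
void report_affinity(void)
{
    printf("binding active: %s, places: %d\n",
           binding_is_active() ? "yes" : "no", place_count());
}
```

Calling report_affinity() once with and once without OMP_PROC_BIND and OMP_PLACES set is a quick way to confirm that the environment variables were picked up before interpreting any scaling measurements.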