Tuesday, 8 January 2013

OpenMP C and C++ Application Program Interface - 3

Directives


Directives are based on #pragma directives defined in the C and C++ standards.
Compilers that support the OpenMP C and C++ API will include a command-line
option that activates and allows interpretation of all OpenMP compiler directives.


Directive Format
The syntax of an OpenMP directive is formally specified by the grammar in
Appendix C, and informally as follows:

#pragma omp directive-name [clause[ [,] clause]...] new-line


Each directive starts with #pragma omp, to reduce the potential for conflict with
other (non-OpenMP or vendor extensions to OpenMP) pragma directives with the
same names. The remainder of the directive follows the conventions of the C and
C++ standards for compiler directives. In particular, white space can be used before
and after the #, and sometimes white space must be used to separate the words in a
directive. Preprocessing tokens following the #pragma omp are subject to macro
replacement.
Directives are case-sensitive. The order in which clauses appear in directives is not
significant. Clauses on directives may be repeated as needed, subject to the
restrictions listed in the description of each clause. If variable-list appears in a clause,
it must specify only variables. Only one directive-name can be specified per directive.
For example, the following directive is not allowed:


/* ERROR - multiple directive names not allowed */
#pragma omp parallel barrier

An OpenMP directive applies to at most one succeeding statement, which must be a
structured block.

Conditional Compilation
The _OPENMP macro name is defined by OpenMP-compliant implementations as the
decimal constant yyyymm, which will be the year and month of the approved
specification. This macro must not be the subject of a #define or a #undef
preprocessing directive.


#ifdef _OPENMP
iam = omp_get_thread_num() + index;
#endif


If vendors define extensions to OpenMP, they may specify additional predefined
macros.

parallel Construct
The following directive defines a parallel region, which is a region of the program
that is to be executed by multiple threads in parallel. This is the fundamental
construct that starts parallel execution.


#pragma omp parallel [clause[ [, ]clause] ...] new-line
structured-block


The clause is one of the following:
if(scalar-expression)
private(variable-list)
firstprivate(variable-list)
default(shared | none)
shared(variable-list)
copyin(variable-list)
reduction(operator: variable-list)
num_threads(integer-expression)


When a thread encounters a parallel construct, a team of threads is created if one of
the following cases is true:
- No if clause is present.
- The if expression evaluates to a nonzero value.
This thread becomes the master thread of the team, with a thread number of 0, and
all threads in the team, including the master thread, execute the region in parallel. If
the value of the if expression is zero, the region is serialized.
To determine the number of threads that are requested, the following rules will be
considered in order. The first rule whose condition is met will be applied:
1. If the num_threads clause is present, then the value of the integer expression is
the number of threads requested.
2. If the omp_set_num_threads library function has been called, then the value
of the argument in the most recently executed call is the number of threads
requested.
3. If the environment variable OMP_NUM_THREADS is defined, then the value of this
environment variable is the number of threads requested.
4. If none of the methods above were used, then the number of threads requested is
implementation-defined.
If the num_threads clause is present, it supersedes the number of threads
requested by the omp_set_num_threads library function or the
OMP_NUM_THREADS environment variable, but only for the parallel region to
which it is applied. Subsequent parallel regions are not affected by it.
The number of threads that execute the parallel region also depends upon whether
or not dynamic adjustment of the number of threads is enabled. If dynamic
adjustment is disabled, then the requested number of threads will execute the
parallel region. If dynamic adjustment is enabled, then the requested number of
threads is the maximum number of threads that may execute the parallel region.
If a parallel region is encountered while dynamic adjustment of the number of
threads is disabled, and the number of threads requested for the parallel region
exceeds the number that the run-time system can supply, the behavior of the
program is implementation-defined. An implementation may, for example, interrupt
the execution of the program, or it may serialize the parallel region.
The omp_set_dynamic library function and the OMP_DYNAMIC environment
variable can be used to enable and disable dynamic adjustment of the number of
threads.


The number of physical processors actually hosting the threads at any given time is
implementation-defined. Once created, the number of threads in the team remains
constant for the duration of that parallel region. It can be changed either explicitly
by the user or automatically by the run-time system from one parallel region to
another.
The statements contained within the dynamic extent of the parallel region are
executed by each thread, and each thread can execute a path of statements that is
different from the other threads. Directives encountered outside the lexical extent of
a parallel region are referred to as orphaned directives.
There is an implied barrier at the end of a parallel region. Only the master thread of
the team continues execution at the end of a parallel region.
If a thread in a team executing a parallel region encounters another parallel
construct, it creates a new team, and it becomes the master of that new team. Nested
parallel regions are serialized by default. As a result, by default, a nested parallel
region is executed by a team composed of one thread. The default behavior may be
changed by using either the runtime library function omp_set_nested or the
environment variable OMP_NESTED. However, the number of threads in a team that
execute a nested parallel region is implementation-defined.
Restrictions to the parallel directive are as follows:
- At most one if clause can appear on the directive.
- It is unspecified whether any side effects inside the if expression or
  num_threads expression occur.
- A throw executed inside a parallel region must cause execution to resume within
  the dynamic extent of the same structured block, and it must be caught by the
  same thread that threw the exception.
- Only a single num_threads clause can appear on the directive. The
  num_threads expression is evaluated outside the context of the parallel region,
  and must evaluate to a positive integer value.
- The order of evaluation of the if and num_threads clauses is unspecified.













