Support, Getting Involved, and FAQ
==================================

Please do not hesitate to reach out to us via openmp-dev@lists.llvm.org or join
one of our :ref:`regular calls <calls>`. Some common questions are answered in
the :ref:`faq`.

.. _calls:

Calls
-----

OpenMP in LLVM Technical Call
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

-   Development updates on OpenMP (and OpenACC) in the LLVM Project, including Clang, optimization, and runtime work.
-   Join the `OpenMP in LLVM Technical Call <https://bluejeans.com/544112769//webrtc>`__.
-   Time: Weekly call every Wednesday at 7:00 AM Pacific time.
-   Meeting minutes are `here <https://docs.google.com/document/d/1Tz8WFN13n7yJ-SCE0Qjqf9LmjGUw0dWO9Ts1ss4YOdg/edit>`__.
-   Status tracking `page <https://openmp.llvm.org/docs>`__.


OpenMP in Flang Technical Call
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
-   Development updates on OpenMP and OpenACC in the Flang Project.
-   Join the `OpenMP in Flang Technical Call <https://bit.ly/39eQW3o>`_.
-   Time: Weekly call every Thursday at 8:00 AM Pacific time.
-   Meeting minutes are `here <https://docs.google.com/document/d/1yA-MeJf6RYY-ZXpdol0t7YoDoqtwAyBhFLr5thu5pFI>`__.
-   Status tracking `page <https://docs.google.com/spreadsheets/d/1FvHPuSkGbl4mQZRAwCIndvQx9dQboffiD-xD0oqxgU0/edit#gid=0>`__.


.. _faq:

FAQ
---

.. note::
   The FAQ is a work in progress and most of the expected content is not yet
   available. While you can expect changes, we always welcome feedback and
   additions. Please contact us, e.g., via ``openmp-dev@lists.llvm.org``.


Q: How to contribute a patch to the webpage or any other part?
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

All patches go through the regular `LLVM review process
<https://llvm.org/docs/Contributing.html#how-to-submit-a-patch>`_.


.. _build_offload_capable_compiler:

Q: How to build an OpenMP GPU offload capable compiler?
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
To build an *effective* OpenMP offload capable compiler, only one extra CMake
option, `LLVM_ENABLE_RUNTIMES="openmp"`, is needed when building LLVM (generic
information about building LLVM is available `here
<https://llvm.org/docs/GettingStarted.html>`__). Make sure all backends that
are targeted by OpenMP are enabled; by default, Clang is built with all
backends enabled. When building with `LLVM_ENABLE_RUNTIMES="openmp"`, OpenMP
should not be enabled in `LLVM_ENABLE_PROJECTS` because it is enabled by
default.
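
For example, a configuration along these lines should work; the generator,
build type, and install prefix below are placeholders to adapt to your setup.

.. code-block:: shell

  # Run from an empty build directory next to the llvm-project checkout.
  cmake -G Ninja ../llvm-project/llvm \
    -DCMAKE_BUILD_TYPE=Release \
    -DCMAKE_INSTALL_PREFIX=$HOME/llvm-offload \
    -DLLVM_ENABLE_PROJECTS="clang" \
    -DLLVM_ENABLE_RUNTIMES="openmp"
  ninja install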

For Nvidia offload, please see :ref:`build_nvidia_offload_capable_compiler`.
For AMDGPU offload, please see :ref:`build_amdgpu_offload_capable_compiler`.

.. note::
  The compiler that generates the offload code should be the same (version) as
  the compiler that builds the OpenMP device runtimes. The OpenMP host runtime
  can be built by a different compiler.

.. _advanced_builds: https://llvm.org/docs/AdvancedBuilds.html

.. _build_nvidia_offload_capable_compiler:

Q: How to build an OpenMP NVidia offload capable compiler?
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
The CUDA SDK is required on the machine that will execute the OpenMP application.

If your build machine is not the target machine or automatic detection of the
available GPUs failed, you should also set the following CMake variables
(a combined invocation is sketched after the list):

- `CLANG_OPENMP_NVPTX_DEFAULT_ARCH=sm_XX` where `XX` is the architecture of your GPU, e.g., 80.
- `LIBOMPTARGET_NVPTX_COMPUTE_CAPABILITIES=YY` where `YY` is the numeric compute capability of your GPU, e.g., 75.

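As a sketch, the two variables can be passed on the CMake configure line like
this (the ``sm_80`` and ``75`` values are examples; substitute the values for
your GPU):

.. code-block:: shell

  cmake -G Ninja ../llvm-project/llvm \
    -DLLVM_ENABLE_PROJECTS="clang" \
    -DLLVM_ENABLE_RUNTIMES="openmp" \
    -DCLANG_OPENMP_NVPTX_DEFAULT_ARCH=sm_80 \
    -DLIBOMPTARGET_NVPTX_COMPUTE_CAPABILITIES=75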

.. _build_amdgpu_offload_capable_compiler:

Q: How to build an OpenMP AMDGPU offload capable compiler?
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
A subset of the `ROCm <https://github.com/radeonopencompute>`_ toolchain is
required to build the LLVM toolchain and to execute the OpenMP application.
Either install ROCm somewhere that CMake's ``find_package`` can locate it, or
build the required subcomponents ROCt and ROCr from source.

The two components used are ROCT-Thunk-Interface, roct, and ROCR-Runtime, rocr.
ROCt is the userspace part of the Linux driver. It calls into the driver that
ships with the Linux kernel. It is an implementation detail of ROCr from
OpenMP's perspective. ROCr is an implementation of `HSA
<http://www.hsafoundation.com>`_.

.. code-block:: text

  SOURCE_DIR=same-as-llvm-source # e.g. the checkout of llvm-project, next to openmp
  BUILD_DIR=somewhere
  INSTALL_PREFIX=same-as-llvm-install

  cd $SOURCE_DIR
  git clone git@github.com:RadeonOpenCompute/ROCT-Thunk-Interface.git -b roc-4.1.x \
    --single-branch
  git clone git@github.com:RadeonOpenCompute/ROCR-Runtime.git -b rocm-4.1.x \
    --single-branch

  cd $BUILD_DIR && mkdir roct && cd roct
  cmake $SOURCE_DIR/ROCT-Thunk-Interface/ -DCMAKE_INSTALL_PREFIX=$INSTALL_PREFIX \
    -DCMAKE_BUILD_TYPE=Release -DBUILD_SHARED_LIBS=OFF
  make && make install

  cd $BUILD_DIR && mkdir rocr && cd rocr
  cmake $SOURCE_DIR/ROCR-Runtime/src -DIMAGE_SUPPORT=OFF \
    -DCMAKE_INSTALL_PREFIX=$INSTALL_PREFIX -DCMAKE_BUILD_TYPE=Release \
    -DBUILD_SHARED_LIBS=ON
  make && make install

``IMAGE_SUPPORT`` requires building ROCr with Clang and is not used by OpenMP.

Provided CMake's ``find_package`` can find the ROCR-Runtime package, LLVM will
build a tool ``bin/amdgpu-arch`` which will print a string like ``gfx906`` when
run if it recognizes a GPU on the local system. LLVM will also build a shared
library, ``libomptarget.rtl.amdgpu.so``, which is linked against ROCr.
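
For instance, assuming an install prefix of ``$INSTALL_PREFIX`` as in the build
sketch above, the detection tool can be invoked directly:

.. code-block:: shell

  # Prints the detected architecture, e.g. "gfx906", if a supported GPU is
  # recognized on the local system.
  $INSTALL_PREFIX/bin/amdgpu-arch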

With those libraries installed, and LLVM built and installed, try:

.. code-block:: shell

    clang -O2 -fopenmp -fopenmp-targets=amdgcn-amd-amdhsa example.c -o example && ./example

Q: What are the known limitations of OpenMP AMDGPU offload?
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
``LD_LIBRARY_PATH`` is presently required to find the OpenMP libraries.
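
For example, assuming LLVM was installed to ``$INSTALL_PREFIX`` as above,
something along these lines is needed before running an offloading program:

.. code-block:: shell

  # Hypothetical path; point this at the lib directory of your LLVM install.
  export LD_LIBRARY_PATH=$INSTALL_PREFIX/lib:$LD_LIBRARY_PATH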

There is no libc on the device. That is, ``malloc`` and ``printf`` do not
exist. There is also no libm, so functions like ``cos(double)`` will not work
from target regions.

Cards from the gfx10 line, 'navi', that use wave32 are not yet implemented.

Some versions of the driver for the Radeon VII (gfx906) will error unless the
environment variable ``HSA_IGNORE_SRAMECC_MISREPORT=1`` is exported.

The AMDGPU offload support is a recent addition to LLVM and the implementation
differs from that which has been shipping in ROCm and AOMP for some time. Early
adopters will encounter bugs.

Q: Does OpenMP offloading support work in pre-packaged LLVM releases?
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
For now, the answer is most likely *no*. Please see :ref:`build_offload_capable_compiler`.

Q: Does OpenMP offloading support work in packages distributed as part of my OS?
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
For now, the answer is most likely *no*. Please see :ref:`build_offload_capable_compiler`.


.. _math_and_complex_in_target_regions:

Q: Does Clang support `<math.h>` and `<complex.h>` operations in OpenMP target on GPUs?
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Yes, LLVM/Clang allows math functions and complex arithmetic inside of OpenMP
target regions that are compiled for GPUs.

Clang provides a set of wrapper headers that are found first when `math.h` and
`complex.h`, for C, or `cmath` and `complex`, for C++, or similar headers are
included by the application. These wrappers will eventually include the system
version of the corresponding header file after setting up a target device
specific environment. The inclusion of the system header is important because
system headers differ based on the architecture and operating system and may
contain preprocessor, variable, and function definitions that need to be
available in the target region regardless of the targeted device architecture.
However, various functions may require specialized device versions, e.g.,
`sin`, and others are only available on certain devices, e.g., `__umul64hi`. To
provide "native" support for math and complex on the respective architecture,
Clang will wrap the "native" math functions, e.g., as provided by the device
vendor, in an OpenMP begin/end declare variant. These functions will then be
picked up instead of the host versions while host-only variables and function
definitions are still available. Complex arithmetic and functions are supported
through a similar mechanism. It is worth noting that this support requires
`extensions to the OpenMP begin/end declare variant context selector
<https://clang.llvm.org/docs/AttributeReference.html#pragma-omp-declare-variant>`__
that are exposed through LLVM/Clang to the user as well.
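
As a quick check, a sketch like the following (file names are hypothetical and
an NVPTX offload target is used as an example) should compile and run, calling
``sin`` from inside a target region:

.. code-block:: shell

  cat > math_example.c <<'EOF'
  #include <math.h>
  #include <stdio.h>

  int main(void) {
    double result = 0.0;
  #pragma omp target map(tofrom : result)
    { result = sin(0.5); }
    printf("sin(0.5) = %f\n", result);
    return 0;
  }
  EOF
  clang -O2 -fopenmp -fopenmp-targets=nvptx64-nvidia-cuda math_example.c -o math_example
  ./math_example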

Q: What is a way to debug errors from mapping memory to a target device?
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

An experimental way to debug these errors is to use :ref:`remote process
offloading <remote_offloading_plugin>`.
By using ``libomptarget.rtl.rpc.so`` and ``openmp-offloading-server``, it is
possible to explicitly perform memory transfers between processes on the host
CPU and run sanitizers while doing so in order to catch these errors.

Q: Why does my application say "Named symbol not found" and abort when I run it?
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

This is most likely caused by trying to use OpenMP offloading with static
libraries. Static libraries do not contain any device code, so when the runtime
attempts to execute the target region it will not be found and you will get an
error like this.

.. code-block:: text

   CUDA error: Loading '__omp_offloading_fd02_3231c15__Z3foov_l2' Failed
   CUDA error: named symbol not found
   Libomptarget error: Unable to generate entries table for device id 0.

Currently, the only solution is to change how the application is built and avoid
the use of static libraries.
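
As a rough sketch under that constraint, code that would otherwise go into a
static archive can be compiled into object files and linked directly into the
final executable (the file names and the offload target here are hypothetical):

.. code-block:: shell

  # Instead of archiving kernels.o into libkernels.a and linking against it,
  # pass the offloading objects straight to the link step.
  clang -O2 -fopenmp -fopenmp-targets=nvptx64-nvidia-cuda main.o kernels.o -o app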

Q: Can I use dynamically linked libraries with OpenMP offloading?
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Dynamically linked libraries can only be used if there is no device code split
between the library and the application. Anything declared on the device inside
the shared library will not be visible to the application when it is linked.

Q: How to build an OpenMP offload capable compiler with an outdated host compiler?
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Enabling the OpenMP runtime will perform a two-stage build for you.
If your host compiler is different from your system-wide compiler, you may need
to set the CMake variable `GCC_INSTALL_PREFIX` so clang will be able to find the
correct GCC toolchain in the second stage of the build.

For example, if your system-wide GCC installation is too old to build LLVM and
you would like to use a newer GCC, set the CMake variable `GCC_INSTALL_PREFIX`
to inform clang of the GCC installation you would like to use in the second stage.
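
As a hypothetical example, a configuration pointing clang at a newer GCC
installed under ``/opt/gcc-11`` could look like this:

.. code-block:: shell

  cmake -G Ninja ../llvm-project/llvm \
    -DLLVM_ENABLE_PROJECTS="clang" \
    -DLLVM_ENABLE_RUNTIMES="openmp" \
    -DGCC_INSTALL_PREFIX=/opt/gcc-11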

Q: How can I include OpenMP offloading support in my CMake project?
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Currently, there is an experimental CMake find module for OpenMP target
offloading provided by LLVM. It will attempt to find OpenMP target offloading
support for your compiler. The flags necessary for OpenMP target offloading will
be loaded into the ``OpenMPTarget::OpenMPTarget_<device>`` target or the
``OpenMPTarget_<device>_FLAGS`` variable if successful. Currently supported
devices are ``AMDGPU`` and ``NVPTX``.

To use this module, simply add the path to CMake's current module path and call
``find_package``. The module will be installed with your OpenMP installation by
default. Including OpenMP offloading support in an application should now only
require a few additions.

.. code-block:: cmake

  cmake_minimum_required(VERSION 3.13.4)
  project(offloadTest VERSION 1.0 LANGUAGES CXX)

  list(APPEND CMAKE_MODULE_PATH "${PATH_TO_OPENMP_INSTALL}/lib/cmake/openmp")

  find_package(OpenMPTarget REQUIRED NVPTX)

  add_executable(offload)
  target_link_libraries(offload PRIVATE OpenMPTarget::OpenMPTarget_NVPTX)
  target_sources(offload PRIVATE ${CMAKE_CURRENT_SOURCE_DIR}/src/Main.cpp)

Using this module requires at least CMake version 3.13.4. Supported languages
are C and C++, with Fortran support planned in the future. Compiler support is
best for Clang, but this module should also work for other compiler vendors
such as IBM and GNU.
