NVIDIA GPU Prometheus Exporter
------------------------------

This is a [Prometheus Exporter](https://prometheus.io/docs/instrumenting/exporters/) for
exporting NVIDIA GPU metrics. It uses the [Go bindings](https://github.com/mindprince/gonvml)
for the [NVIDIA Management Library](https://developer.nvidia.com/nvidia-management-library-nvml)
(NVML), a C-based API for monitoring NVIDIA GPU devices.
Unlike some other similar exporters, it does not call the
[`nvidia-smi`](https://developer.nvidia.com/nvidia-system-management-interface) binary.

## Building

The repository includes `nvml.h`, so the build environment has no special
requirements. `go get` should be able to build the exporter binary.

```
go get github.com/mindprince/nvidia_gpu_prometheus_exporter
```

## Running

The exporter requires the following:
- access to the NVML library (`libnvidia-ml.so.1`).
- access to the GPU devices.

To make sure that the exporter can access the NVML libraries, either add them
to the search path for shared libraries or set `LD_LIBRARY_PATH` to point to
their location.

By default the metrics are exposed on port `9445`. This can be changed using
the `-web.listen-address` flag.

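As a rough sketch of the steps above, the helper below looks for `libnvidia-ml.so.1` in a list of candidate directories so that the result can be passed to `LD_LIBRARY_PATH`. The function name `find_nvml_dir`, the candidate directories, and the `:9500` listen address are illustrative assumptions, not values taken from the exporter. The binary name assumes the exporter was installed with `go get` as described under Building.

```shell
# Print the first directory from the arguments that contains libnvidia-ml.so.1.
# Returns non-zero if none of them does.
find_nvml_dir() {
  for dir in "$@"; do
    if [ -e "$dir/libnvidia-ml.so.1" ]; then
      printf '%s\n' "$dir"
      return 0
    fi
  done
  return 1
}

# Example use (candidate directories and the listen address are assumptions):
#   NVML_DIR=$(find_nvml_dir /usr/lib/x86_64-linux-gnu /usr/lib64) &&
#     LD_LIBRARY_PATH="$NVML_DIR" nvidia_gpu_prometheus_exporter -web.listen-address ":9500"
```

If the library is already on the system's shared library search path, the helper is unnecessary and the exporter can be started directly.
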
## Running inside a container

There's a Docker image available on Docker Hub at
[mindprince/nvidia_gpu_prometheus_exporter](https://hub.docker.com/r/mindprince/nvidia_gpu_prometheus_exporter/).

If you are running the exporter inside a container, you will need to pass the
following options to give the container access to the NVML library:
```
-e LD_LIBRARY_PATH=<path-where-nvml-is-present>
--volume <above-path>:<above-path>
```

You will also need to do one of the following to give it access to the GPU
devices:
- Run with `--privileged`
- If you are on Docker v17.04.0-ce or above, run with `--device-cgroup-rule 'c 195:* mrw'`
- Run with `--device /dev/nvidiactl:/dev/nvidiactl --device /dev/nvidia0:/dev/nvidia0 --device /dev/nvidia1:/dev/nvidia1 <and-so-on-for-all-nvidia-devices>`

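Listing every device by hand gets tedious on multi-GPU hosts. The sketch below generates the `--device` flags for the last option. The function name `nvidia_device_flags` is hypothetical, and it assumes the device nodes are named `nvidiactl`, `nvidia0`, `nvidia1`, and so on, as in the list above.

```shell
# Print one "--device SRC:DST" flag per line for nvidiactl and each numbered
# GPU node (nvidia0, nvidia1, ...) found in the given directory (default /dev).
nvidia_device_flags() {
  dev_dir="${1:-/dev}"
  for dev in "$dev_dir"/nvidiactl "$dev_dir"/nvidia[0-9]*; do
    # An unmatched glob stays literal, so skip paths that do not exist.
    [ -e "$dev" ] || continue
    printf '%s\n' "--device $dev:$dev"
  done
}

# Example use (the NVML path is a placeholder you must fill in):
#   NVML_DIR=<path-where-nvml-is-present>
#   docker run -p 9445:9445 \
#     -e LD_LIBRARY_PATH="$NVML_DIR" --volume "$NVML_DIR:$NVML_DIR" \
#     $(nvidia_device_flags) \
#     mindprince/nvidia_gpu_prometheus_exporter:0.1
```

The command substitution is left unquoted on purpose so that the shell splits the output into separate `--device` arguments.
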
If you don't want to do the above, you can run it using nvidia-docker.

## Running using [nvidia-docker](https://github.com/NVIDIA/nvidia-docker)

```
nvidia-docker run -p 9445:9445 -ti mindprince/nvidia_gpu_prometheus_exporter:0.1
```