Name          Date         Size     #Lines  LOC
..            03-May-2022  -
android/      26-Mar-2020  -        951     748
cmake/        03-May-2022  -        100     82
include/      03-May-2022  -        374     269
results/      26-Mar-2020  -
snap/         26-Mar-2020  -        28      23
src/          03-May-2022  -        2,289   1,694
.gitignore    26-Mar-2020  15       2       2
.gitmodules   26-Mar-2020  148      5       4
.travis.yml   26-Mar-2020  1.6 KiB  54      47
CREDITS       26-Mar-2020  86       3       1
LICENSE       26-Mar-2020  1.2 KiB  25      20
README.md     26-Mar-2020  1.7 KiB  67      53
README.md

# clpeak

[![Build Status](https://travis-ci.com/krrishnarraj/clpeak.svg?branch=master)](https://travis-ci.com/krrishnarraj/clpeak)
[![Snap Status](https://build.snapcraft.io/badge/krrishnarraj/clpeak.svg)](https://build.snapcraft.io/user/krrishnarraj/clpeak)

A synthetic benchmarking tool that measures the peak capabilities of OpenCL devices. It reports only the peak metrics achievable with vector operations and does not represent a real-world use case.

## Building

```console
git submodule update --init --recursive --remote
mkdir build
cd build
cmake ..
cmake --build .
```

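For scripted or CI builds, the same steps can be driven from Python. This is only a sketch: the helper names `build_steps` and `run_build` are hypothetical and not part of clpeak, and it assumes `git` and `cmake` are on `PATH` and that `source_dir` points at a checkout of the repository.

```python
import subprocess
from pathlib import Path


def build_steps(source_dir):
    """Return the build commands as (argv, working_directory) pairs."""
    src = Path(source_dir)
    build = src / "build"
    return [
        # Fetch the submodules, as in the first console step.
        (["git", "submodule", "update", "--init", "--recursive", "--remote"], src),
        # Configure from inside build/, then compile.
        (["cmake", ".."], build),
        (["cmake", "--build", "."], build),
    ]


def run_build(source_dir):
    """Create build/ and run every step, stopping at the first failure."""
    (Path(source_dir) / "build").mkdir(exist_ok=True)
    for argv, cwd in build_steps(source_dir):
        subprocess.run(argv, cwd=cwd, check=True)


# Usage (not executed here): run_build("path/to/clpeak")
```

Keeping the command list separate from the runner makes it possible to log or inspect the steps without executing them.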
## Sample

```text
Platform: NVIDIA CUDA
  Device: Tesla V100-SXM2-16GB
    Driver version  : 390.77 (Linux x64)
    Compute units   : 80
    Clock frequency : 1530 MHz

    Global memory bandwidth (GBPS)
      float   : 767.48
      float2  : 810.81
      float4  : 843.06
      float8  : 726.12
      float16 : 735.98

    Single-precision compute (GFLOPS)
      float   : 15680.96
      float2  : 15674.50
      float4  : 15645.58
      float8  : 15583.27
      float16 : 15466.50

    No half precision support! Skipped

    Double-precision compute (GFLOPS)
      double   : 7859.49
      double2  : 7849.96
      double4  : 7832.96
      double8  : 7799.82
      double16 : 7740.88

    Integer compute (GIOPS)
      int   : 15653.47
      int2  : 15654.40
      int4  : 15655.21
      int8  : 15659.04
      int16 : 15608.65

    Transfer bandwidth (GBPS)
      enqueueWriteBuffer         : 10.64
      enqueueReadBuffer          : 11.92
      enqueueMapBuffer(for read) : 9.97
        memcpy from mapped ptr   : 8.62
      enqueueUnmap(after write)  : 11.04
        memcpy to mapped ptr     : 9.16

    Kernel launch latency : 7.22 us
```
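One way to sanity-check a report like this is to compare the measured single-precision figure with a back-of-the-envelope peak derived from the reported compute units and clock. The sketch below parses an excerpt of the sample output and does that arithmetic. It rests on assumptions that are not stated in this README: that each V100 compute unit has 64 FP32 lanes, that a fused multiply-add counts as two FLOPs, and that the reported clock is the one in effect during the run. `parse_report` is a hypothetical helper, not part of clpeak.

```python
import re

# Excerpt of the sample report (Tesla V100-SXM2-16GB).
REPORT = """\
Platform: NVIDIA CUDA
  Device: Tesla V100-SXM2-16GB
    Driver version  : 390.77 (Linux x64)
    Compute units   : 80
    Clock frequency : 1530 MHz

    Single-precision compute (GFLOPS)
      float   : 15680.96
      float2  : 15674.50
      float4  : 15645.58
      float8  : 15583.27
      float16 : 15466.50
"""


def parse_report(text):
    """Split a report into device properties and per-section numeric results."""
    props, sections, current = {}, {}, None
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            current = None  # a blank line ends the current result section
            continue
        heading = re.fullmatch(r"([^:]+) \((\w+)\)", line)
        if heading:  # e.g. "Single-precision compute (GFLOPS)"
            current = sections.setdefault(heading.group(1), {})
            continue
        key, sep, value = line.partition(":")
        if not sep:
            continue  # informational line with no "key : value" pair
        key, value = key.strip(), value.strip()
        if current is not None:
            current[key] = float(value)
        else:
            props[key] = value
    return props, sections


props, sections = parse_report(REPORT)
compute_units = int(props["Compute units"])
clock_ghz = float(props["Clock frequency"].split()[0]) / 1000.0

FP32_LANES_PER_CU = 64  # assumption: FP32 lanes per V100 compute unit
FLOPS_PER_FMA = 2       # assumption: one fused multiply-add counts as 2 FLOPs

theoretical = compute_units * FP32_LANES_PER_CU * FLOPS_PER_FMA * clock_ghz
measured = max(sections["Single-precision compute"].values())
ratio = measured / theoretical
print(f"theoretical {theoretical:.1f} GFLOPS, measured {measured:.2f} GFLOPS, ratio {ratio:.3f}")
```

Under these assumptions the estimate is 80 × 64 × 2 × 1.53 GHz ≈ 15667 GFLOPS, so the measured 15680.96 GFLOPS sits essentially at the estimated peak. A ratio slightly above 1 would be consistent with the device running marginally above the reported clock, though the report alone does not confirm that.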