Name | Date | Size | #Lines | LOC
---|---|---|---|---
android/ | 26-Mar-2020 | - | 951 | 748
cmake/ | 03-May-2022 | - | 100 | 82
include/ | 03-May-2022 | - | 374 | 269
results/ | 26-Mar-2020 | - | |
snap/ | 26-Mar-2020 | - | 28 | 23
src/ | 03-May-2022 | - | 2,289 | 1,694
.gitignore | 26-Mar-2020 | 15 | 2 | 2
.gitmodules | 26-Mar-2020 | 148 | 5 | 4
.travis.yml | 26-Mar-2020 | 1.6 KiB | 54 | 47
CREDITS | 26-Mar-2020 | 86 | 3 | 1
LICENSE | 26-Mar-2020 | 1.2 KiB | 25 | 20
README.md | 26-Mar-2020 | 1.7 KiB | 67 | 53

README.md
# clpeak

[![Build Status](https://travis-ci.com/krrishnarraj/clpeak.svg?branch=master)](https://travis-ci.com/krrishnarraj/clpeak)
[![Snap Status](https://build.snapcraft.io/badge/krrishnarraj/clpeak.svg)](https://build.snapcraft.io/user/krrishnarraj/clpeak)

A synthetic benchmarking tool to measure peak capabilities of OpenCL devices. It only measures the peak metrics that can be achieved using vector operations and does not represent a real-world use case.

## Building

```console
git submodule update --init --recursive --remote
mkdir build
cd build
cmake ..
cmake --build .
```

## Sample

```text
Platform: NVIDIA CUDA
  Device: Tesla V100-SXM2-16GB
    Driver version  : 390.77 (Linux x64)
    Compute units   : 80
    Clock frequency : 1530 MHz

    Global memory bandwidth (GBPS)
      float   : 767.48
      float2  : 810.81
      float4  : 843.06
      float8  : 726.12
      float16 : 735.98

    Single-precision compute (GFLOPS)
      float   : 15680.96
      float2  : 15674.50
      float4  : 15645.58
      float8  : 15583.27
      float16 : 15466.50

    No half precision support! Skipped

    Double-precision compute (GFLOPS)
      double   : 7859.49
      double2  : 7849.96
      double4  : 7832.96
      double8  : 7799.82
      double16 : 7740.88

    Integer compute (GIOPS)
      int   : 15653.47
      int2  : 15654.40
      int4  : 15655.21
      int8  : 15659.04
      int16 : 15608.65

    Transfer bandwidth (GBPS)
      enqueueWriteBuffer              : 10.64
      enqueueReadBuffer               : 11.92
      enqueueMapBuffer(for read)      : 9.97
        memcpy from mapped ptr        : 8.62
      enqueueUnmap(after write)       : 11.04
        memcpy to mapped ptr          : 9.16

    Kernel launch latency : 7.22 us
```
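
Because the report is plain text, it can be post-processed with a few lines of scripting, for example to pick the best vector width per test. The following is a minimal sketch, not part of clpeak. It assumes the output keeps the layout of the sample above: a section title ending in a unit in parentheses, then `name : value` rows, with blank lines between sections. The function name `parse_clpeak_report` and the `"General"` bucket for rows outside any section are hypothetical names chosen for illustration.

```python
import re

# Assumed layout, taken from the sample report: a section title such as
# "Global memory bandwidth (GBPS)" followed by "name : value" rows.
SECTION_RE = re.compile(r"^(?P<title>.+?)\s*\((?P<unit>[A-Z]+)\)$")
ROW_RE = re.compile(
    r"^(?P<name>.+?)\s*:\s*(?P<value>\d+(?:\.\d+)?)(?:\s+(?P<unit>\w+))?$"
)


def parse_clpeak_report(text):
    """Return {section: {row name: float}} for the numeric rows of a report."""
    results = {}
    section = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            section = None  # a blank line ends the current section
            continue
        match = SECTION_RE.match(line)
        if match:
            section = match.group("title")
            results.setdefault(section, {})
            continue
        match = ROW_RE.match(line)
        if match:
            # Numeric rows outside any section go into a hypothetical
            # "General" bucket (compute units, clock, launch latency).
            bucket = results.setdefault(section or "General", {})
            bucket[match.group("name")] = float(match.group("value"))
    return results


# Excerpt of the sample report, used as example input.
SAMPLE = """\
Platform: NVIDIA CUDA
  Device: Tesla V100-SXM2-16GB
    Compute units   : 80
    Clock frequency : 1530 MHz

    Global memory bandwidth (GBPS)
      float   : 767.48
      float2  : 810.81

    Kernel launch latency : 7.22 us
"""

report = parse_clpeak_report(SAMPLE)
# Vector width with the highest measured global memory bandwidth.
best = max(report["Global memory bandwidth"].items(), key=lambda kv: kv[1])
print(best)
```

Non-numeric lines such as `Platform:`, `Device:`, `Driver version` and the "No half precision support! Skipped" notice are ignored by this sketch. Units are dropped from the stored values, so a caller that needs them would have to keep the `unit` groups as well.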