Performance Counters
====================

FFTW measures execution time in the planning stage, optionally taking advantage
of hardware performance counters. This document describes the supported
counters and the additional steps needed to enable each on different architectures.

See `./configure --help` for the flags that enable each supported counter.
See [kernel/cycle.h](kernel/cycle.h) for the code that accesses the counters.

ARMv7-A (armv7a)
================

`CNTVCT`: Virtual Count Register in VMSA
--------------------------------------

A 64-bit counter, part of the Virtual Memory System Architecture.
Section B4.1.34 in the ARM Architecture Reference Manual ARMv7-A/ARMv7-R.

For access from user mode, requires `CNTKCTL.PL0VCTEN == 1`, which must
be set in kernel mode on each CPU:

    #define CNTKCTL_PL0VCTEN 0x2 /* B4.1.26 in ARM Architecture Reference */
    uint32_t r;
    asm volatile("mrc p15, 0, %0, c14, c1, 0" : "=r"(r)); /* read */
    r |= CNTKCTL_PL0VCTEN;
    asm volatile("mcr p15, 0, %0, c14, c1, 0" :: "r"(r)); /* write */

Kernel module source *which can be patched with the above code* is available at:
https://github.com/thoughtpolice/enable_arm_pmu

`PMCCNTR`: Performance Monitors Cycle Count Register in VMSA
----------------------------------------------------------

A 32-bit counter, part of the Virtual Memory System Architecture.
Section B4.1.113 in the ARM Architecture Reference Manual ARMv7-A/ARMv7-R.

For access from user mode, requires user-mode access to the PMU to be enabled
(`PMUSERENR.EN == 1`), which must be done from kernel mode on each CPU:

    #define PERF_DEF_OPTS (1 | 16)
    /* enable user-mode access to counters */
    asm volatile("mcr p15, 0, %0, c9, c14, 0" :: "r"(1));
    /* Program PMU and enable all counters */
    asm volatile("mcr p15, 0, %0, c9, c12, 0" :: "r"(PERF_DEF_OPTS));
    asm volatile("mcr p15, 0, %0, c9, c12, 1" :: "r"(0x8000000f));

Kernel module source with the above code is available at:
[GitHub thoughtpolice/enable\_arm\_pmu](https://github.com/thoughtpolice/enable_arm_pmu)

More information:
http://neocontra.blogspot.com/2013/05/user-mode-performance-counters-for.html

ARMv8-A (aarch64)
=================

`CNTVCT_EL0`: Counter-timer Virtual Count Register
------------------------------------------------

A 64-bit counter, part of Generic Registers.
Section D8.5.17 in the ARM Architecture Reference Manual ARMv8-A.

For user-mode access, requires `CNTKCTL_EL1.EL0VCTEN == 1`, which
must be set from kernel mode for each CPU:

    #define CNTKCTL_EL0VCTEN 0x2
    uint64_t r; /* MRS/MSR transfer a 64-bit general-purpose register */
    asm volatile("mrs %0, CNTKCTL_EL1" : "=r"(r)); /* read */
    r |= CNTKCTL_EL0VCTEN;
    asm volatile("msr CNTKCTL_EL1, %0" :: "r"(r)); /* write */

*WARNING*: The above code was not tested.

`PMCCNTR_EL0`: Performance Monitors Cycle Count Register
------------------------------------------------------

A 64-bit counter, part of Performance Monitors.
Section D8.4.2 in the ARM Architecture Reference Manual ARMv8-A.

For access from user mode, requires user-mode access to the PMU
(`PMUSERENR_EL0.EN == 1`), which must be set from kernel mode for each CPU:

    #define PERF_DEF_OPTS (1 | 16)
    /* enable user-mode access to counters */
    asm volatile("msr PMUSERENR_EL0, %0" :: "r"(1));
    /* Program PMU and enable all counters */
    asm volatile("msr PMCR_EL0, %0" :: "r"(PERF_DEF_OPTS));
    asm volatile("msr PMCNTENSET_EL0, %0" :: "r"(0x8000000f));
    asm volatile("msr PMCCFILTR_EL0, %0" :: "r"(0));

Kernel module source with the above code is available at:
[GitHub rdolbeau/enable\_arm\_pmu](https://github.com/rdolbeau/enable_arm_pmu)
or in [Pull Request #2 at thoughtpolice/enable\_arm\_pmu](https://github.com/thoughtpolice/enable_arm_pmu/pull/2)
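
Once user-mode access has been enabled as described in the sections above, a
user-space read of the virtual counter could look roughly like the sketch
below. This is not the code from kernel/cycle.h: `read_ticks` and `ticks_for`
are hypothetical names, and the `clock_gettime` fallback for other
architectures is an assumption added only so that the example builds anywhere.
On ARM the read is expected to trap if the enable bit described above is not
set, or if the core has no generic timer.

```c
#define _POSIX_C_SOURCE 199309L
#include <stdint.h>
#include <time.h>

/* Hypothetical helper (not FFTW's API): returns a non-decreasing tick count. */
static inline uint64_t read_ticks(void)
{
#if defined(__aarch64__)
    uint64_t t;
    /* CNTVCT_EL0: needs CNTKCTL_EL1.EL0VCTEN == 1 for user-mode access */
    __asm__ __volatile__("mrs %0, CNTVCT_EL0" : "=r"(t));
    return t;
#elif defined(__arm__) && defined(__ARM_ARCH_7A__)
    uint32_t lo, hi;
    /* CNTVCT: 64-bit read via MRRC; needs CNTKCTL.PL0VCTEN == 1 */
    __asm__ __volatile__("mrrc p15, 1, %0, %1, c14" : "=r"(lo), "=r"(hi));
    return ((uint64_t)hi << 32) | lo;
#else
    /* Assumed fallback for other architectures: monotonic clock in ns */
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (uint64_t)ts.tv_sec * 1000000000u + (uint64_t)ts.tv_nsec;
#endif
}

/* Hypothetical helper: ticks elapsed during one call of fn. */
static inline uint64_t ticks_for(void (*fn)(void))
{
    uint64_t start = read_ticks();
    fn();
    return read_ticks() - start;
}
```

The units differ per branch (timer ticks on ARM, nanoseconds in the fallback),
so values are only comparable between runs on the same machine and build.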