.\"	$OpenBSD: pctr.4,v 1.8 2019/10/05 20:53:49 matthieu Exp $
.\"
.\" Pentium performance counter driver for OpenBSD.
.\" Copyright 1996 David Mazieres <dm@lcs.mit.edu>.
.\"
.\" Modification and redistribution in source and binary forms is
.\" permitted provided that due credit is given to the author and the
.\" OpenBSD project by leaving this copyright notice intact.
.\"
.Dd $Mdocdate: October 5 2019 $
.Dt PCTR 4 amd64
.Os
.Sh NAME
.Nm pctr
.Nd driver for CPU performance counters
.Sh SYNOPSIS
.Cd "pseudo-device pctr 1"
.Sh DESCRIPTION
The
.Nm
device provides access to the performance counters on AMD and Intel brand
processors, and to the TSC on others.
.Pp
Intel processors have two 40-bit performance counters which can be
programmed to count events such as cache misses, branch target buffer hits,
TLB misses, dual-issues, interrupts, pipeline flushes, and more.
While AMD processors have four 48-bit counters, their precision is decreased
to 40 bits.
.Pp
There is one
.Em ioctl
call to read the status of all counters, and one
.Em ioctl
call to program the function of each counter.
Both require the following includes:
.Bd -literal -offset indent
#include <sys/types.h>
#include <machine/cpu.h>
#include <machine/pctr.h>
.Ed
.Pp
The current state of all counters can be read with the
.Dv PCIOCRD
.Em ioctl ,
which takes an argument of type
.Dv "struct pctrst" :
.Bd -literal -offset indent
#define PCTR_NUM	4
struct pctrst {
	u_int pctr_fn[PCTR_NUM];
	pctrval pctr_tsc;
	pctrval pctr_hwc[PCTR_NUM];
};
.Ed
.Pp
In this structure,
.Em pctr_fn
contains the functions of the counters, as previously set by the
.Dv PCIOCS0 ,
.Dv PCIOCS1 ,
.Dv PCIOCS2
and
.Dv PCIOCS3
ioctls (see below).
.Em pctr_hwc
contains the actual value of the hardware counters.
.Em pctr_tsc
is a free-running, 64-bit cycle counter.
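.Pp
As a rough illustration of how two readings of this structure might be
compared, the sketch below computes the events and cycles elapsed between
two snapshots.
It is not taken from the driver: the struct ex_pctrst mirror, the ex_
helper names, the 64-bit width of the counter fields and the wrap of the
event counters at 40 bits are all assumptions made so that the example
compiles on any system.
Real code would fill a struct pctrst through the PCIOCRD ioctl instead.

```c
#include <stdint.h>

/*
 * Hypothetical mirror of the documented layout, so the sketch compiles
 * anywhere; real code would use struct pctrst from <machine/pctr.h>.
 * Assumption: pctrval is an unsigned 64-bit integer.
 */
#define EX_PCTR_NUM 4
struct ex_pctrst {
	unsigned int fn[EX_PCTR_NUM];	/* counter functions (pctr_fn) */
	uint64_t tsc;			/* time stamp counter (pctr_tsc) */
	uint64_t hwc[EX_PCTR_NUM];	/* hardware counters (pctr_hwc) */
};

/* Assumption: event counters hold 40 significant bits and wrap modulo 2^40. */
#define EX_HWC_MASK ((UINT64_C(1) << 40) - 1)

/* Events counted on counter i between two snapshots, tolerating one wrap. */
static uint64_t
ex_hwc_delta(const struct ex_pctrst *before, const struct ex_pctrst *after,
    int i)
{
	return (after->hwc[i] - before->hwc[i]) & EX_HWC_MASK;
}

/* Cycles elapsed between two snapshots; the TSC uses all 64 bits. */
static uint64_t
ex_tsc_delta(const struct ex_pctrst *before, const struct ex_pctrst *after)
{
	return after->tsc - before->tsc;
}
```

Dividing the result of ex_hwc_delta by the result of ex_tsc_delta gives
an approximate event rate per cycle for the measured interval.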
.Pp
The functions of the counters can be programmed with ioctls
.Dv PCIOCS0 ,
.Dv PCIOCS1 ,
.Dv PCIOCS2
and
.Dv PCIOCS3
which require a writeable file descriptor and take an argument of type
.Dv "unsigned int" . \&
The meaning of this integer is dependent on the particular CPU.
.Ss Time stamp counter
The time stamp counter is available on most of the AMD and Intel CPUs.
It is set to zero at boot time, and then increments with each cycle.
Because the counter is 64 bits wide, it does not overflow in practice.
.Pp
The value of the time stamp counter is returned by the
.Dv PCIOCRD
.Em ioctl ,
so that one can get an exact timestamp on readings of the hardware
event counters.
.Pp
The performance counters can be read directly from user mode without
invoking the kernel.
The macro
.Fn rdpmc ctr
takes 0, 1, 2 or 3 as an argument to specify a counter, and returns that
counter's 40-bit value (which will be of type
.Em pctrval ) .
This is generally preferable to making a system call as it introduces
less distortion in measurements.
.Pp
Counter functions supported by these CPUs contain several parts.
The most significant byte (an 8-bit integer shifted left by
.Dv PCTR_CM_SHIFT )
contains a
.Em "counter mask" .
If non-zero, this sets a threshold for the number of times an event
must occur in one cycle for the counter to be incremented.
The
.Em "counter mask"
can therefore be used to count cycles in which an event
occurs at least some number of times.
The next byte contains several flags:
.Bl -tag -width PCTR_EN
.It Dv PCTR_U
Enables counting of events that occur in user mode.
.It Dv PCTR_K
Enables counting of events that occur in kernel mode.
You must set at least one of
.Dv PCTR_K
and
.Dv PCTR_U
to count anything.
.It Dv PCTR_E
Counts edges rather than cycles.
For some functions this allows you
to get an estimate of the number of events rather than the number of
cycles occupied by those events.
.It Dv PCTR_EN
Enable counters.
This bit must be set in the function for counter 0
in order for either of the counters to be enabled.
This bit should probably be set in counter 1 as well.
.It Dv PCTR_I
Inverts the sense of the
.Em "counter mask" . \&
When this bit is set, the counter only increments on cycles in which
there are no
.Em more
events than specified in the
.Em "counter mask" .
.El
.Pp
The next byte (shifted left by
.Dv PCTR_UM_SHIFT )
contains flags specific to the event being counted, also known as the
.Em "unit mask" .
.Pp
For events dealing with the L2 cache, the following flags are valid
on Intel brand processors:
.Bl -tag -width PCTR_UM_M
.It Dv PCTR_UM_M
Count events involving modified cache coherency state lines.
.It Dv PCTR_UM_E
Count events involving exclusive cache coherency state lines.
.It Dv PCTR_UM_S
Count events involving shared cache coherency state lines.
.It Dv PCTR_UM_I
Count events involving invalid cache coherency state lines.
.El
.Pp
To measure all L2 cache activity, all these bits should be set.
They can be set with the macro
.Dv PCTR_UM_MESI
which contains the bitwise OR of all of the above.
.Pp
For event types dealing with bus transactions, there is another flag
that can be set in the
.Em "unit mask" :
.Bl -tag -width PCTR_UM_A
.It Dv PCTR_UM_A
Count all appropriate bus events, not just those initiated by the
processor.
.El
.Pp
Events marked
.Em (MESI)
require the
.Dv PCTR_UM_[MESI]
bits in the
.Em "unit mask" . \&
Events marked
.Em (A)
can take the
.Dv PCTR_UM_A
bit.
.Pp
Finally, the least significant byte of the counter function is the
event type to count.
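.Pp
The four parts of a counter function described above can be packed and
unpacked with a few shifts and masks.
The following is a minimal sketch only: every EX_ constant and ex_ helper
is a hypothetical stand-in whose value is assumed from the byte layout in
this section, and the event number used in the usage note is an arbitrary
placeholder.
Real code should use the PCTR_ macros from <machine/pctr.h>.

```c
#include <stdint.h>

/*
 * Hypothetical stand-ins for the macros in <machine/pctr.h>.
 * The values are assumptions that follow the layout described above:
 * counter mask | flags | unit mask | event type, one byte each.
 */
#define EX_CM_SHIFT	24		/* counter mask: most significant byte */
#define EX_UM_SHIFT	8		/* unit mask: second-lowest byte */
#define EX_U		0x010000u	/* count user-mode events */
#define EX_K		0x020000u	/* count kernel-mode events */
#define EX_E		0x040000u	/* count edges rather than cycles */
#define EX_EN		0x400000u	/* enable counters */
#define EX_I		0x800000u	/* invert the counter mask */

/* Pack the four parts into a single counter function value. */
static uint32_t
ex_make_fn(uint8_t cmask, uint32_t flags, uint8_t umask, uint8_t event)
{
	return ((uint32_t)cmask << EX_CM_SHIFT) |
	    (flags & 0x00ff0000u) |	/* keep only the flag byte */
	    ((uint32_t)umask << EX_UM_SHIFT) |
	    event;
}

/* Unpack individual parts, for example from a value read back in pctr_fn. */
static uint8_t
ex_fn_cmask(uint32_t fn)
{
	return (uint8_t)(fn >> EX_CM_SHIFT);
}

static uint8_t
ex_fn_umask(uint32_t fn)
{
	return (uint8_t)((fn >> EX_UM_SHIFT) & 0xff);
}

static uint8_t
ex_fn_event(uint32_t fn)
{
	return (uint8_t)(fn & 0xff);
}
```

For instance, ex_make_fn(2, EX_U | EX_K | EX_EN, 0x0f, 0x24) would describe
cycles in which placeholder event 0x24 occurs at least twice, in user or
kernel mode, with all four low unit mask bits set.
A value built this way is what the PCIOCS0 to PCIOCS3 ioctls take as
their unsigned int argument.
.Pp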
A list of possible event functions can be obtained by running
.Xr pctr 1
with the
.Fl l
option.
.Sh FILES
.Bl -tag -width /dev/pctr -compact
.It Pa /dev/pctr
.El
.Sh ERRORS
.Bl -tag -width "[ENODEV]"
.It Bq Er ENODEV
An attempt was made to set the counter functions on a CPU that does
not support counters.
.It Bq Er EINVAL
An invalid counter function was provided as an argument to the
.Dv PCIOCSx
.Em ioctl .
.It Bq Er EPERM
An attempt was made to set the counter functions, but the device was
not open for writing.
.El
.Sh SEE ALSO
.Xr pctr 1 ,
.Xr ioctl 2
.Sh HISTORY
A
.Nm
device first appeared in
.Ox 2.0 .
Support for the amd64 architecture appeared in
.Ox 4.3 .
.Sh AUTHORS
.An -nosplit
The
.Nm
device was written by
.An David Mazieres Aq Mt dm@lcs.mit.edu .
Support for the amd64 architecture was written by
.An Mike Belopuhov Aq Mt mikeb@openbsd.org .
.Sh BUGS
Not all counter functions are completely accurate.
Some of the functions may not make any sense at all.
Also be aware that an interrupt between
invocations of
.Fn rdpmc
can decrease the accuracy of measurements.