.\"
.\" Copyright (c) 2013 Hiren Panchasara <hiren.panchasara@gmail.com>
.\" All rights reserved.
.\"
.\" Redistribution and use in source and binary forms, with or without
.\" modification, are permitted provided that the following conditions
.\" are met:
.\" 1. Redistributions of source code must retain the above copyright
.\"    notice, this list of conditions and the following disclaimer.
.\" 2. Redistributions in binary form must reproduce the above copyright
.\"    notice, this list of conditions and the following disclaimer in the
.\"    documentation and/or other materials provided with the distribution.
.\"
.\" THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND
.\" ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
.\" IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
.\" ARE DISCLAIMED.  IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE
.\" FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
.\" DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
.\" OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
.\" HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
.\" LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
.\" OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
.\" SUCH DAMAGE.
.\"
.Dd November 21, 2014
.Dt PMC.HASWELLXEON 3
.Os
.Sh NAME
.Nm pmc.haswellxeon
.Nd measurement events for
.Tn Intel
.Tn Haswell Xeon
family CPUs
.Sh LIBRARY
.Lb libpmc
.Sh SYNOPSIS
.In pmc.h
.Sh DESCRIPTION
.Tn Intel
.Tn "Haswell"
CPUs contain PMCs conforming to version 2 of the
.Tn Intel
performance measurement architecture.
These CPUs may contain up to two classes of PMCs:
.Bl -tag -width "Li PMC_CLASS_IAP"
.It Li PMC_CLASS_IAF
Fixed-function counters that count only one hardware event per counter.
.It Li PMC_CLASS_IAP
Programmable counters that may be configured to count one of a defined
set of hardware events.
.El
.Pp
The number of PMCs available in each class and their widths need to be
determined at run time by calling
.Xr pmc_cpuinfo 3 .
.Pp
Intel Haswell Xeon PMCs are documented in
.Rs
.%B "Intel(R) 64 and IA-32 Architectures Software Developer's Manual"
.%T "Combined Volumes: 1, 2A, 2B, 2C, 3A, 3B and 3C"
.%N "Order Number: 325462-052US"
.%D September 2014
.%Q "Intel Corporation"
.Re
.Ss HASWELL FIXED FUNCTION PMCS
These PMCs and their supported events are documented in
.Xr pmc.iaf 3 .
.Ss HASWELL PROGRAMMABLE PMCS
The programmable PMCs support the following capabilities:
.Bl -column "PMC_CAP_INTERRUPT" "Support"
.It Em Capability Ta Em Support
.It PMC_CAP_CASCADE Ta \&No
.It PMC_CAP_EDGE Ta Yes
.It PMC_CAP_INTERRUPT Ta Yes
.It PMC_CAP_INVERT Ta Yes
.It PMC_CAP_READ Ta Yes
.It PMC_CAP_PRECISE Ta \&No
.It PMC_CAP_SYSTEM Ta Yes
.It PMC_CAP_TAGGING Ta \&No
.It PMC_CAP_THRESHOLD Ta Yes
.It PMC_CAP_USER Ta Yes
.It PMC_CAP_WRITE Ta Yes
.El
.Ss Event Qualifiers
Event specifiers for these PMCs support the following common
qualifiers:
.Bl -tag -width indent
.It Li rsp= Ns Ar value
Configure the Off-core Response bits.
.Bl -tag -width indent
.It Li DMND_DATA_RD
Counts the number of demand and DCU prefetch data reads of full
and partial cachelines as well as demand data page table entry
cacheline reads.
Does not count L2 data read prefetches or instruction fetches.
.It Li REQ_DMND_RFO
Counts the number of demand and DCU prefetch reads for ownership (RFO)
requests generated by a write to data cacheline.
Does not count L2 RFO prefetches.
.It Li REQ_DMND_IFETCH
Counts the number of demand and DCU prefetch instruction cacheline reads.
Does not count L2 code read prefetches.
.It Li REQ_WB
Counts the number of writeback (modified to exclusive) transactions.
.It Li REQ_PF_DATA_RD
Counts the number of data cacheline reads generated by L2 prefetchers.
.It Li REQ_PF_RFO
Counts the number of RFO requests generated by L2 prefetchers.
.It Li REQ_PF_IFETCH
Counts the number of code reads generated by L2 prefetchers.
.It Li REQ_PF_LLC_DATA_RD
L2 prefetcher to L3 for loads.
.It Li REQ_PF_LLC_RFO
RFO requests generated by L2 prefetcher.
.It Li REQ_PF_LLC_IFETCH
L2 prefetcher to L3 for instruction fetches.
.It Li REQ_BUS_LOCKS
Bus lock and split lock requests.
.It Li REQ_STRM_ST
Streaming store requests.
.It Li REQ_OTHER
Any other request that crosses IDI, including I/O.
.It Li RES_ANY
Catch-all value for any response type.
.It Li RES_SUPPLIER_NO_SUPP
No Supplier Information available.
.It Li RES_SUPPLIER_LLC_HITM
M-state initial lookup state in L3.
.It Li RES_SUPPLIER_LLC_HITE
E-state.
.It Li RES_SUPPLIER_LLC_HITS
S-state.
.It Li RES_SUPPLIER_LLC_HITF
F-state.
.It Li RES_SUPPLIER_LOCAL
Local DRAM Controller.
.It Li RES_SNOOP_SNP_NONE
No details on snoop-related information.
.It Li RES_SNOOP_SNP_NO_NEEDED
No snoop was needed to satisfy the request.
.It Li RES_SNOOP_SNP_MISS
A snoop was needed and it missed all snooped caches:
-For LLC Hit, ReslHitl was returned by all cores
-For LLC Miss, Rspl was returned by all sockets and data was returned from
DRAM.
.It Li RES_SNOOP_HIT_NO_FWD
A snoop was needed and it hit in at least one snooped cache.
Hit denotes a cache-line was valid before snoop effect.
This includes:
-Snoop Hit w/ Invalidation (LLC Hit, RFO)
-Snoop Hit, Left Shared (LLC Hit/Miss, IFetch/Data_RD)
-Snoop Hit w/ Invalidation and No Forward (LLC Miss, RFO Hit S)
In the LLC Miss case, data is returned from DRAM.
.It Li RES_SNOOP_HIT_FWD
A snoop was needed and data was forwarded from a remote socket.
This includes:
-Snoop Forward Clean, Left Shared (LLC Hit/Miss, IFetch/Data_RD/RFT).
.It Li RES_SNOOP_HITM
A snoop was needed and it HitM-ed in local or remote cache.
HitM denotes a cache-line was in modified state before effect as a result of snoop.
This includes:
-Snoop HitM w/ WB (LLC miss, IFetch/Data_RD)
-Snoop Forward Modified w/ Invalidation (LLC Hit/Miss, RFO)
-Snoop MtoS (LLC Hit, IFetch/Data_RD).
.It Li RES_NON_DRAM
Target was non-DRAM system address.
This includes MMIO transactions.
.El
.It Li cmask= Ns Ar value
Configure the PMC to increment only if the number of configured
events measured in a cycle is greater than or equal to
.Ar value .
.It Li edge
Configure the PMC to count the number of de-asserted to asserted
transitions of the conditions expressed by the other qualifiers.
If specified, the counter will increment only once whenever a
condition becomes true, irrespective of the number of clocks during
which the condition remains true.
.It Li inv
Invert the sense of comparison when the
.Dq Li cmask
qualifier is present, making the counter increment when the number of
events per cycle is less than the value specified by the
.Dq Li cmask
qualifier.
.It Li os
Configure the PMC to count events happening at processor privilege
level 0.
.It Li usr
Configure the PMC to count events occurring at privilege levels 1, 2
or 3.
.El
.Pp
If neither the
.Dq Li os
nor the
.Dq Li usr
qualifier is specified, the default is to enable both.
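.Pp
The following C sketch models how the cmask, inv and edge qualifiers
described above shape a counter value.
It is a simplified software model, not libpmc or hardware code: the
function name model_count and the per-cycle events array are assumptions
made for illustration, and inv and edge are only modelled together with a
non-zero cmask.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Simplified model (an illustration, not libpmc code) of the cmask,
 * inv and edge qualifiers.  events[i] is the number of raw events
 * observed in cycle i.
 */
unsigned long
model_count(const unsigned *events, size_t ncycles,
    unsigned cmask, bool inv, bool edge)
{
	unsigned long total = 0;
	bool prev = false;

	/* This model only handles inv and edge with a non-zero cmask. */
	assert(cmask != 0 || (!inv && !edge));

	for (size_t i = 0; i < ncycles; i++) {
		if (cmask == 0) {
			/* No threshold: accumulate the raw event count. */
			total += events[i];
			continue;
		}
		/* Threshold compare; inv turns ">=" into "<". */
		bool cond = inv ? (events[i] < cmask) : (events[i] >= cmask);
		if (edge) {
			/* Count only false-to-true transitions. */
			if (cond && !prev)
				total++;
		} else if (cond) {
			/* Count every cycle in which the condition holds. */
			total++;
		}
		prev = cond;
	}
	return total;
}
```

For the per-cycle counts {0, 2, 3, 0, 1, 4}, this model yields 10 with no
qualifiers, 3 with cmask=2, and 2 with cmask=2 and edge.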
.Ss Event Specifiers (Programmable PMCs)
Haswell programmable PMCs support the following events:
.Bl -tag -width indent
.It Li LD_BLOCKS.STORE_FORWARD
.Pq Event 03H , Umask 02H
Loads blocked by overlapping with store buffer that
cannot be forwarded.
.It Li MISALIGN_MEM_REF.LOADS
.Pq Event 05H , Umask 01H
Speculative cache-line split load uops dispatched to
L1D.
.It Li MISALIGN_MEM_REF.STORES
.Pq Event 05H , Umask 02H
Speculative cache-line split Store-address uops
dispatched to L1D.
.It Li LD_BLOCKS_PARTIAL.ADDRESS_ALIAS
.Pq Event 07H , Umask 01H
False dependencies in MOB due to partial compare
on address.
.It Li DTLB_LOAD_MISSES.MISS_CAUSES_A_WALK
.Pq Event 08H , Umask 01H
Misses in all TLB levels that cause a page walk of any
page size.
.It Li DTLB_LOAD_MISSES.WALK_COMPLETED_4K
.Pq Event 08H , Umask 02H
Completed page walks due to demand load misses
that caused 4K page walks in any TLB levels.
.It Li DTLB_LOAD_MISSES.WALK_COMPLETED_2M_4M
.Pq Event 08H , Umask 04H
Completed page walks due to demand load misses
that caused 2M/4M page walks in any TLB levels.
.It Li DTLB_LOAD_MISSES.WALK_COMPLETED
.Pq Event 08H , Umask 0EH
Completed page walks in any TLB of any page size
due to demand load misses.
.It Li DTLB_LOAD_MISSES.WALK_DURATION
.Pq Event 08H , Umask 10H
Cycles PMH is busy with a walk.
.It Li DTLB_LOAD_MISSES.STLB_HIT_4K
.Pq Event 08H , Umask 20H
Load misses that missed DTLB but hit STLB (4K).
.It Li DTLB_LOAD_MISSES.STLB_HIT_2M
.Pq Event 08H , Umask 40H
Load misses that missed DTLB but hit STLB (2M).
.It Li DTLB_LOAD_MISSES.STLB_HIT
.Pq Event 08H , Umask 60H
Number of cache load STLB hits.
No page walk.
.It Li DTLB_LOAD_MISSES.PDE_CACHE_MISS
.Pq Event 08H , Umask 80H
DTLB demand load misses with low part of linear-to-physical
address translation missed.
.It Li INT_MISC.RECOVERY_CYCLES
.Pq Event 0DH , Umask 03H
Cycles waiting to recover after Machine Clears
except JEClear.
Set Cmask = 1.
.It Li UOPS_ISSUED.ANY
.Pq Event 0EH , Umask 01H
Increments each cycle the # of Uops issued by the
RAT to RS.
Set Cmask = 1, Inv = 1, Any = 1 to count stalled cycles
of this core.
.It Li UOPS_ISSUED.FLAGS_MERGE
.Pq Event 0EH , Umask 10H
Number of flags-merge uops allocated.
Such uops add delay.
.It Li UOPS_ISSUED.SLOW_LEA
.Pq Event 0EH , Umask 20H
Number of slow LEA or similar uops allocated.
Such a uop has 3 sources (e.g., 2 sources + immediate),
regardless of whether it results from an LEA instruction.
.It Li UOPS_ISSUED.SINGLE_MUL
.Pq Event 0EH , Umask 40H
Number of multiply packed/scalar single precision
uops allocated.
.It Li L2_RQSTS.DEMAND_DATA_RD_MISS
.Pq Event 24H , Umask 21H
Demand Data Read requests that missed L2, no
rejects.
.It Li L2_RQSTS.DEMAND_DATA_RD_HIT
.Pq Event 24H , Umask 41H
Demand Data Read requests that hit L2 cache.
.It Li L2_RQSTS.ALL_DEMAND_DATA_RD
.Pq Event 24H , Umask E1H
Counts any demand and L1 HW prefetch data load
requests to L2.
.It Li L2_RQSTS.RFO_HIT
.Pq Event 24H , Umask 42H
Counts the number of store RFO requests that hit
the L2 cache.
.It Li L2_RQSTS.RFO_MISS
.Pq Event 24H , Umask 22H
Counts the number of store RFO requests that miss
the L2 cache.
.It Li L2_RQSTS.ALL_RFO
.Pq Event 24H , Umask E2H
Counts all L2 store RFO requests.
.It Li L2_RQSTS.CODE_RD_HIT
.Pq Event 24H , Umask 44H
Number of instruction fetches that hit the L2 cache.
.It Li L2_RQSTS.CODE_RD_MISS
.Pq Event 24H , Umask 24H
Number of instruction fetches that missed the L2
cache.
.It Li L2_RQSTS.ALL_DEMAND_MISS
.Pq Event 24H , Umask 27H
Demand requests that miss L2 cache.
.It Li L2_RQSTS.ALL_DEMAND_REFERENCES
.Pq Event 24H , Umask E7H
Demand requests to L2 cache.
.It Li L2_RQSTS.ALL_CODE_RD
.Pq Event 24H , Umask E4H
Counts all L2 code requests.
.It Li L2_RQSTS.L2_PF_HIT
.Pq Event 24H , Umask 50H
Counts all L2 HW prefetcher requests that hit L2.
.It Li L2_RQSTS.L2_PF_MISS
.Pq Event 24H , Umask 30H
Counts all L2 HW prefetcher requests that missed
L2.
.It Li L2_RQSTS.ALL_PF
.Pq Event 24H , Umask F8H
Counts all L2 HW prefetcher requests.
.It Li L2_RQSTS.MISS
.Pq Event 24H , Umask 3FH
All requests that missed L2.
.It Li L2_RQSTS.REFERENCES
.Pq Event 24H , Umask FFH
All requests to L2 cache.
.It Li L2_DEMAND_RQSTS.WB_HIT
.Pq Event 27H , Umask 50H
Not rejected writebacks that hit L2 cache.
.It Li LONGEST_LAT_CACHE.REFERENCE
.Pq Event 2EH , Umask 4FH
This event counts requests originating from the core
that reference a cache line in the last level cache.
.It Li LONGEST_LAT_CACHE.MISS
.Pq Event 2EH , Umask 41H
This event counts each cache miss condition for
references to the last level cache.
.It Li CPU_CLK_UNHALTED.THREAD_P
.Pq Event 3CH , Umask 00H
Counts the number of thread cycles while the thread
is not in a halt state.
The thread enters the halt state when it is running the HLT instruction.
The core frequency may change from time to time due to
power or thermal throttling.
.It Li CPU_CLK_THREAD_UNHALTED.REF_XCLK
.Pq Event 3CH , Umask 01H
Increments at the frequency of XCLK (100 MHz)
when not halted.
.It Li L1D_PEND_MISS.PENDING
.Pq Event 48H , Umask 01H
Increments the number of outstanding L1D misses
every cycle.
Set Cmask = 1 and Edge = 1 to count occurrences.
.It Li DTLB_STORE_MISSES.MISS_CAUSES_A_WALK
.Pq Event 49H , Umask 01H
Miss in all TLB levels causes a page walk of any
page size (4K/2M/4M/1G).
.It Li DTLB_STORE_MISSES.WALK_COMPLETED_4K
.Pq Event 49H , Umask 02H
Completed page walks due to store misses in one or
more TLB levels of 4K page structure.
.It Li DTLB_STORE_MISSES.WALK_COMPLETED_2M_4M
.Pq Event 49H , Umask 04H
Completed page walks due to store misses in one or
more TLB levels of 2M/4M page structure.
.It Li DTLB_STORE_MISSES.WALK_COMPLETED
.Pq Event 49H , Umask 0EH
Completed page walks due to store miss in any TLB
levels of any page size (4K/2M/4M/1G).
.It Li DTLB_STORE_MISSES.WALK_DURATION
.Pq Event 49H , Umask 10H
Cycles PMH is busy with this walk.
.It Li DTLB_STORE_MISSES.STLB_HIT_4K
.Pq Event 49H , Umask 20H
Store misses that missed DTLB but hit STLB (4K).
.It Li DTLB_STORE_MISSES.STLB_HIT_2M
.Pq Event 49H , Umask 40H
Store misses that missed DTLB but hit STLB (2M).
.It Li DTLB_STORE_MISSES.STLB_HIT
.Pq Event 49H , Umask 60H
Store operations that miss the first TLB level but hit
the second and do not cause page walks.
.It Li DTLB_STORE_MISSES.PDE_CACHE_MISS
.Pq Event 49H , Umask 80H
DTLB store misses with low part of linear-to-physical
address translation missed.
.It Li LOAD_HIT_PRE.SW_PF
.Pq Event 4CH , Umask 01H
Non-SW-prefetch load dispatches that hit fill buffer
allocated for S/W prefetch.
.It Li LOAD_HIT_PRE.HW_PF
.Pq Event 4CH , Umask 02H
Non-SW-prefetch load dispatches that hit fill buffer
allocated for H/W prefetch.
.It Li L1D.REPLACEMENT
.Pq Event 51H , Umask 01H
Counts the number of lines brought into the L1 data
cache.
.It Li MOVE_ELIMINATION.INT_NOT_ELIMINATED
.Pq Event 58H , Umask 04H
Number of integer Move Elimination candidate uops
that were not eliminated.
.It Li MOVE_ELIMINATION.SIMD_NOT_ELIMINATED
.Pq Event 58H , Umask 08H
Number of SIMD Move Elimination candidate uops
that were not eliminated.
.It Li MOVE_ELIMINATION.INT_ELIMINATED
.Pq Event 58H , Umask 01H
Number of integer Move Elimination candidate uops
that were eliminated.
.It Li MOVE_ELIMINATION.SIMD_ELIMINATED
.Pq Event 58H , Umask 02H
Number of SIMD Move Elimination candidate uops
that were eliminated.
.It Li CPL_CYCLES.RING0
.Pq Event 5CH , Umask 01H
Unhalted core cycles when the thread is in ring 0.
.It Li CPL_CYCLES.RING123
.Pq Event 5CH , Umask 02H
Unhalted core cycles when the thread is not in ring 0.
.It Li RS_EVENTS.EMPTY_CYCLES
.Pq Event 5EH , Umask 01H
Cycles the RS is empty for the thread.
.It Li OFFCORE_REQUESTS_OUTSTANDING.DEMAND_DATA_RD
.Pq Event 60H , Umask 01H
Offcore outstanding Demand Data Read transactions
in SQ to uncore.
Set Cmask = 1 to count cycles.
.It Li OFFCORE_REQUESTS_OUTSTANDING.DEMAND_CODE_RD
.Pq Event 60H , Umask 02H
Offcore outstanding Demand Code Read transactions
in SQ to uncore.
Set Cmask = 1 to count cycles.
.It Li OFFCORE_REQUESTS_OUTSTANDING.DEMAND_RFO
.Pq Event 60H , Umask 04H
Offcore outstanding RFO store transactions in SQ to uncore.
Set Cmask = 1 to count cycles.
.It Li OFFCORE_REQUESTS_OUTSTANDING.ALL_DATA_RD
.Pq Event 60H , Umask 08H
Offcore outstanding cacheable data read
transactions in SQ to uncore.
Set Cmask = 1 to count cycles.
.It Li LOCK_CYCLES.SPLIT_LOCK_UC_LOCK_DURATION
.Pq Event 63H , Umask 01H
Cycles in which the L1D and L2 are locked, due to a
UC lock or split lock.
.It Li LOCK_CYCLES.CACHE_LOCK_DURATION
.Pq Event 63H , Umask 02H
Cycles in which the L1D is locked.
.It Li IDQ.EMPTY
.Pq Event 79H , Umask 02H
Counts cycles the IDQ is empty.
.It Li IDQ.MITE_UOPS
.Pq Event 79H , Umask 04H
Increment each cycle # of uops delivered to IDQ from
MITE path.
Set Cmask = 1 to count cycles.
.It Li IDQ.DSB_UOPS
.Pq Event 79H , Umask 08H
Increment each cycle # of uops delivered to IDQ
from DSB path.
Set Cmask = 1 to count cycles.
.It Li IDQ.MS_DSB_UOPS
.Pq Event 79H , Umask 10H
Increment each cycle # of uops delivered to IDQ
when MS_busy by DSB.
Set Cmask = 1 to count cycles.
Add Edge = 1 to count # of delivery.
.It Li IDQ.MS_MITE_UOPS
.Pq Event 79H , Umask 20H
Increment each cycle # of uops delivered to IDQ
when MS_busy by MITE.
Set Cmask = 1 to count cycles.
.It Li IDQ.MS_UOPS
.Pq Event 79H , Umask 30H
Increment each cycle # of uops delivered to IDQ from
MS by either DSB or MITE.
Set Cmask = 1 to count cycles.
.It Li IDQ.ALL_DSB_CYCLES_ANY_UOPS
.Pq Event 79H , Umask 18H
Counts cycles DSB is delivered at least one uop.
Set Cmask = 1.
.It Li IDQ.ALL_DSB_CYCLES_4_UOPS
.Pq Event 79H , Umask 18H
Counts cycles DSB is delivered four uops.
Set Cmask = 4.
.It Li IDQ.ALL_MITE_CYCLES_ANY_UOPS
.Pq Event 79H , Umask 24H
Counts cycles MITE is delivered at least one uop.
Set Cmask = 1.
.It Li IDQ.ALL_MITE_CYCLES_4_UOPS
.Pq Event 79H , Umask 24H
Counts cycles MITE is delivered four uops.
Set Cmask = 4.
.It Li IDQ.MITE_ALL_UOPS
.Pq Event 79H , Umask 3CH
# of uops delivered to IDQ from any path.
.It Li ICACHE.MISSES
.Pq Event 80H , Umask 02H
Number of Instruction Cache, Streaming Buffer and
Victim Cache Misses.
Includes UC accesses.
.It Li ITLB_MISSES.MISS_CAUSES_A_WALK
.Pq Event 85H , Umask 01H
Misses in ITLB that cause a page walk of any page
size.
.It Li ITLB_MISSES.WALK_COMPLETED_4K
.Pq Event 85H , Umask 02H
Completed page walks due to misses in ITLB 4K page
entries.
.It Li ITLB_MISSES.WALK_COMPLETED_2M_4M
.Pq Event 85H , Umask 04H
Completed page walks due to misses in ITLB 2M/4M
page entries.
.It Li ITLB_MISSES.WALK_COMPLETED
.Pq Event 85H , Umask 0EH
Completed page walks in ITLB of any page size.
.It Li ITLB_MISSES.WALK_DURATION
.Pq Event 85H , Umask 10H
Cycles PMH is busy with a walk.
.It Li ITLB_MISSES.STLB_HIT_4K
.Pq Event 85H , Umask 20H
ITLB misses that hit STLB (4K).
.It Li ITLB_MISSES.STLB_HIT_2M
.Pq Event 85H , Umask 40H
ITLB misses that hit STLB (2M).
.It Li ITLB_MISSES.STLB_HIT
.Pq Event 85H , Umask 60H
ITLB misses that hit STLB.
No page walk.
.It Li ILD_STALL.LCP
.Pq Event 87H , Umask 01H
Stalls caused by changing prefix length of the
instruction.
.It Li ILD_STALL.IQ_FULL
.Pq Event 87H , Umask 04H
Stall cycles because the IQ is full.
.It Li BR_INST_EXEC.NONTAKEN_COND
.Pq Event 88H , Umask 41H
Count conditional near branch instructions that were executed (but not
necessarily retired) and not taken.
.It Li BR_INST_EXEC.TAKEN_COND
.Pq Event 88H , Umask 81H
Count conditional near branch instructions that were executed (but not
necessarily retired) and taken.
.It Li BR_INST_EXEC.DIRECT_JMP
.Pq Event 88H , Umask 82H
Count all unconditional near branch instructions excluding calls and
indirect branches.
.It Li BR_INST_EXEC.INDIRECT_JMP_NON_CALL_RET
.Pq Event 88H , Umask 84H
Count executed indirect near branch instructions that are neither calls nor
returns.
.It Li BR_INST_EXEC.RETURN_NEAR
.Pq Event 88H , Umask 88H
Count indirect near branches that have a return mnemonic.
.It Li BR_INST_EXEC.DIRECT_NEAR_CALL
.Pq Event 88H , Umask 90H
Count unconditional near call branch instructions, excluding non call
branch, executed.
.It Li BR_INST_EXEC.INDIRECT_NEAR_CALL
.Pq Event 88H , Umask A0H
Count indirect near calls, including both register and memory indirect,
executed.
.It Li BR_INST_EXEC.ALL_BRANCHES
.Pq Event 88H , Umask FFH
Counts all near executed branches (not necessarily retired).
.It Li BR_MISP_EXEC.NONTAKEN_COND
.Pq Event 89H , Umask 41H
Count conditional near branch instructions mispredicted as nontaken.
.It Li BR_MISP_EXEC.TAKEN_COND
.Pq Event 89H , Umask 81H
Count conditional near branch instructions mispredicted as taken.
.It Li BR_MISP_EXEC.INDIRECT_JMP_NON_CALL_RET
.Pq Event 89H , Umask 84H
Count mispredicted indirect near branch instructions that are neither calls
nor returns.
.It Li BR_MISP_EXEC.RETURN_NEAR
.Pq Event 89H , Umask 88H
Count mispredicted indirect near branches that have a return mnemonic.
.It Li BR_MISP_EXEC.DIRECT_NEAR_CALL
.Pq Event 89H , Umask 90H
Count mispredicted unconditional near call branch instructions, excluding
non call branch, executed.
.It Li BR_MISP_EXEC.INDIRECT_NEAR_CALL
.Pq Event 89H , Umask A0H
Count mispredicted indirect near calls, including both register and memory
indirect, executed.
.It Li BR_MISP_EXEC.ALL_BRANCHES
.Pq Event 89H , Umask FFH
Counts all mispredicted near executed branches (not necessarily retired).
.It Li IDQ_UOPS_NOT_DELIVERED.CORE
.Pq Event 9CH , Umask 01H
Count number of non-delivered uops to RAT per
thread.
.It Li UOPS_EXECUTED_PORT.PORT_0
.Pq Event A1H , Umask 01H
Cycles in which a uop is dispatched on port 0 in this
thread.
.It Li UOPS_EXECUTED_PORT.PORT_1
.Pq Event A1H , Umask 02H
Cycles in which a uop is dispatched on port 1 in this
thread.
.It Li UOPS_EXECUTED_PORT.PORT_2
.Pq Event A1H , Umask 04H
Cycles in which a uop is dispatched on port 2 in this
thread.
.It Li UOPS_EXECUTED_PORT.PORT_3
.Pq Event A1H , Umask 08H
Cycles in which a uop is dispatched on port 3 in this
thread.
.It Li UOPS_EXECUTED_PORT.PORT_4
.Pq Event A1H , Umask 10H
Cycles in which a uop is dispatched on port 4 in this
thread.
.It Li UOPS_EXECUTED_PORT.PORT_5
.Pq Event A1H , Umask 20H
Cycles in which a uop is dispatched on port 5 in this
thread.
.It Li UOPS_EXECUTED_PORT.PORT_6
.Pq Event A1H , Umask 40H
Cycles in which a uop is dispatched on port 6 in this
thread.
.It Li UOPS_EXECUTED_PORT.PORT_7
.Pq Event A1H , Umask 80H
Cycles in which a uop is dispatched on port 7 in this
thread.
.It Li RESOURCE_STALLS.ANY
.Pq Event A2H , Umask 01H
Cycles Allocation is stalled due to Resource Related
reason.
.It Li RESOURCE_STALLS.RS
.Pq Event A2H , Umask 04H
Cycles stalled due to no eligible RS entry available.
.It Li RESOURCE_STALLS.SB
.Pq Event A2H , Umask 08H
Cycles stalled due to no store buffers available (not
including draining from sync).
.It Li RESOURCE_STALLS.ROB
.Pq Event A2H , Umask 10H
Cycles stalled due to re-order buffer full.
.It Li CYCLE_ACTIVITY.CYCLES_L2_PENDING
.Pq Event A3H , Umask 01H
Cycles with pending L2 miss loads.
Set Cmask = 2 to count cycles.
.It Li CYCLE_ACTIVITY.CYCLES_LDM_PENDING
.Pq Event A3H , Umask 02H
Cycles with pending memory loads.
Set Cmask = 2 to count cycles.
.It Li CYCLE_ACTIVITY.STALLS_L2_PENDING
.Pq Event A3H , Umask 05H
Number of loads missed L2.
.It Li CYCLE_ACTIVITY.CYCLES_L1D_PENDING
.Pq Event A3H , Umask 08H
Cycles with pending L1 cache miss loads.
Set Cmask = 8 to count cycles.
.It Li ITLB.ITLB_FLUSH
.Pq Event AEH , Umask 01H
Counts the number of ITLB flushes, includes
4k/2M/4M pages.
.It Li OFFCORE_REQUESTS.DEMAND_DATA_RD
.Pq Event B0H , Umask 01H
Demand data read requests sent to uncore.
.It Li OFFCORE_REQUESTS.DEMAND_CODE_RD
.Pq Event B0H , Umask 02H
Demand code read requests sent to uncore.
.It Li OFFCORE_REQUESTS.DEMAND_RFO
.Pq Event B0H , Umask 04H
Demand RFO read requests sent to uncore, including
regular RFOs, locks, ItoM.
.It Li OFFCORE_REQUESTS.ALL_DATA_RD
.Pq Event B0H , Umask 08H
Data read requests sent to uncore (demand and prefetch).
.It Li UOPS_EXECUTED.CORE
.Pq Event B1H , Umask 02H
Counts total number of uops to be executed per-core
each cycle.
.It Li OFF_CORE_RESPONSE_0
.Pq Event B7H , Umask 01H
Requires MSR 01A6H.
.It Li OFF_CORE_RESPONSE_1
.Pq Event BBH , Umask 01H
Requires MSR 01A7H.
.It Li PAGE_WALKER_LOADS.DTLB_L1
.Pq Event BCH , Umask 11H
Number of DTLB page walker loads that hit in the
L1+FB.
.It Li PAGE_WALKER_LOADS.ITLB_L1
.Pq Event BCH , Umask 21H
Number of ITLB page walker loads that hit in the
L1+FB.
.It Li PAGE_WALKER_LOADS.DTLB_L2
.Pq Event BCH , Umask 12H
Number of DTLB page walker loads that hit in the L2.
.It Li PAGE_WALKER_LOADS.ITLB_L2
.Pq Event BCH , Umask 22H
Number of ITLB page walker loads that hit in the L2.
.It Li PAGE_WALKER_LOADS.DTLB_L3
.Pq Event BCH , Umask 14H
Number of DTLB page walker loads that hit in the L3.
.It Li PAGE_WALKER_LOADS.ITLB_L3
.Pq Event BCH , Umask 24H
Number of ITLB page walker loads that hit in the L3.
.It Li PAGE_WALKER_LOADS.DTLB_MEMORY
.Pq Event BCH , Umask 18H
Number of DTLB page walker loads from memory.
.It Li PAGE_WALKER_LOADS.ITLB_MEMORY
.Pq Event BCH , Umask 28H
Number of ITLB page walker loads from memory.
.It Li TLB_FLUSH.DTLB_THREAD
.Pq Event BDH , Umask 01H
DTLB flush attempts of the thread-specific entries.
.It Li TLB_FLUSH.STLB_ANY
.Pq Event BDH , Umask 20H
Count number of STLB flush attempts.
.It Li INST_RETIRED.ANY_P
.Pq Event C0H , Umask 00H
Number of instructions at retirement.
.It Li INST_RETIRED.ALL
.Pq Event C0H , Umask 01H
Precise instruction retired event with HW to reduce
effect of PEBS shadow in IP distribution.
.It Li OTHER_ASSISTS.AVX_TO_SSE
.Pq Event C1H , Umask 08H
Number of transitions from AVX-256 to legacy SSE
when penalty applicable.
.It Li OTHER_ASSISTS.SSE_TO_AVX
.Pq Event C1H , Umask 10H
Number of transitions from SSE to AVX-256 when
penalty applicable.
.It Li OTHER_ASSISTS.ANY_WB_ASSIST
.Pq Event C1H , Umask 40H
Number of microcode assists invoked by HW upon
uop writeback.
.It Li UOPS_RETIRED.ALL
.Pq Event C2H , Umask 01H
Counts the number of micro-ops retired.
Use cmask=1 and invert to count active cycles or stalled
cycles.
.It Li UOPS_RETIRED.RETIRE_SLOTS
.Pq Event C2H , Umask 02H
Counts the number of retirement slots used each
cycle.
.It Li MACHINE_CLEARS.MEMORY_ORDERING
.Pq Event C3H , Umask 02H
Counts the number of machine clears due to memory
order conflicts.
.It Li MACHINE_CLEARS.SMC
.Pq Event C3H , Umask 04H
Number of self-modifying-code machine clears
detected.
.It Li MACHINE_CLEARS.MASKMOV
.Pq Event C3H , Umask 20H
Counts the number of executed AVX masked load
operations that refer to an illegal address range with
the mask bits set to 0.
.It Li BR_INST_RETIRED.ALL_BRANCHES
.Pq Event C4H , Umask 00H
Branch instructions at retirement.
.It Li BR_INST_RETIRED.CONDITIONAL
.Pq Event C4H , Umask 01H
Counts the number of conditional branch instructions
retired.
Supports PEBS.
.It Li BR_INST_RETIRED.NEAR_CALL
.Pq Event C4H , Umask 02H
Direct and indirect near call instructions retired.
.It Li BR_INST_RETIRED.ALL_BRANCHES
.Pq Event C4H , Umask 04H
Counts the number of branch instructions retired.
.It Li BR_INST_RETIRED.NEAR_RETURN
.Pq Event C4H , Umask 08H
Counts the number of near return instructions
retired.
.It Li BR_INST_RETIRED.NOT_TAKEN
.Pq Event C4H , Umask 10H
Counts the number of not taken branch instructions
retired.
.It Li BR_INST_RETIRED.NEAR_TAKEN
.Pq Event C4H , Umask 20H
Number of near taken branches retired.
.It Li BR_INST_RETIRED.FAR_BRANCH
.Pq Event C4H , Umask 40H
Number of far branches retired.
.It Li BR_MISP_RETIRED.ALL_BRANCHES
.Pq Event C5H , Umask 00H
Mispredicted branch instructions at retirement.
.It Li BR_MISP_RETIRED.CONDITIONAL
.Pq Event C5H , Umask 01H
Mispredicted conditional branch instructions retired.
.It Li BR_MISP_RETIRED.ALL_BRANCHES
.Pq Event C5H , Umask 04H
Mispredicted macro branch instructions retired.
.It Li FP_ASSIST.X87_OUTPUT
.Pq Event CAH , Umask 02H
Number of X87 FP assists due to output values.
.It Li FP_ASSIST.X87_INPUT
.Pq Event CAH , Umask 04H
Number of X87 FP assists due to input values.
.It Li FP_ASSIST.SIMD_OUTPUT
.Pq Event CAH , Umask 08H
Number of SIMD FP assists due to output values.
.It Li FP_ASSIST.SIMD_INPUT
.Pq Event CAH , Umask 10H
Number of SIMD FP assists due to input values.
.It Li FP_ASSIST.ANY
.Pq Event CAH , Umask 1EH
Cycles with any input/output SSE* or FP assists.
.It Li ROB_MISC_EVENTS.LBR_INSERTS
.Pq Event CCH , Umask 20H
Count cases of saving new LBR records by hardware.
.It Li MEM_TRANS_RETIRED.LOAD_LATENCY
.Pq Event CDH , Umask 01H
Randomly sampled loads whose latency is above a
user defined threshold.
A small fraction of the overall loads are sampled due to randomization.
.It Li MEM_UOPS_RETIRED.STLB_MISS_LOADS
.Pq Event D0H , Umask 11H
Count retired load uops that missed the STLB.
.It Li MEM_UOPS_RETIRED.STLB_MISS_STORES
.Pq Event D0H , Umask 12H
Count retired store uops that missed the STLB.
.It Li MEM_UOPS_RETIRED.SPLIT_LOADS
.Pq Event D0H , Umask 41H
Count retired load uops that were split across a cache line.
.It Li MEM_UOPS_RETIRED.SPLIT_STORES
.Pq Event D0H , Umask 42H
Count retired store uops that were split across a cache line.
.It Li MEM_UOPS_RETIRED.ALL_LOADS
.Pq Event D0H , Umask 81H
Count all retired load uops.
.It Li MEM_UOPS_RETIRED.ALL_STORES
.Pq Event D0H , Umask 82H
Count all retired store uops.
.It Li MEM_LOAD_UOPS_RETIRED.L1_HIT
.Pq Event D1H , Umask 01H
Retired load uops with L1 cache hits as data sources.
.It Li MEM_LOAD_UOPS_RETIRED.L2_HIT
.Pq Event D1H , Umask 02H
Retired load uops with L2 cache hits as data sources.
.It Li MEM_LOAD_UOPS_RETIRED.LLC_HIT
.Pq Event D1H , Umask 04H
Retired load uops with LLC cache hits as data
sources.
.It Li MEM_LOAD_UOPS_RETIRED.L2_MISS
.Pq Event D1H , Umask 10H
Retired load uops missed L2.
Unknown data source excluded.
.It Li MEM_LOAD_UOPS_RETIRED.HIT_LFB
.Pq Event D1H , Umask 40H
Retired load uops whose data sources were load uops that
missed L1 but hit FB due to a preceding miss to the
same cache line with data not ready.
.It Li MEM_LOAD_UOPS_LLC_HIT_RETIRED.XSNP_MISS
.Pq Event D2H , Umask 01H
Retired load uops whose data sources were LLC hit
and cross-core snoop missed in on-pkg core cache.
.It Li MEM_LOAD_UOPS_LLC_HIT_RETIRED.XSNP_HIT
.Pq Event D2H , Umask 02H
Retired load uops whose data sources were LLC and
cross-core snoop hits in on-pkg core cache.
.It Li MEM_LOAD_UOPS_LLC_HIT_RETIRED.XSNP_HITM
.Pq Event D2H , Umask 04H
Retired load uops whose data sources were HitM
responses from shared LLC.
.It Li MEM_LOAD_UOPS_LLC_HIT_RETIRED.XSNP_NONE
.Pq Event D2H , Umask 08H
Retired load uops whose data sources were hits in
LLC without snoops required.
.It Li MEM_LOAD_UOPS_LLC_MISS_RETIRED.LOCAL_DRAM
.Pq Event D3H , Umask 01H
Retired load uops whose data sources missed LLC but
were serviced from local DRAM.
.It Li BACLEARS.ANY
.Pq Event E6H , Umask 1FH
Number of front end re-steers due to BPU
misprediction.
.It Li L2_TRANS.DEMAND_DATA_RD
.Pq Event F0H , Umask 01H
Demand Data Read requests that access L2 cache.
.It Li L2_TRANS.RFO
.Pq Event F0H , Umask 02H
RFO requests that access L2 cache.
.It Li L2_TRANS.CODE_RD
.Pq Event F0H , Umask 04H
L2 cache accesses when fetching instructions.
.It Li L2_TRANS.ALL_PF
.Pq Event F0H , Umask 08H
Any MLC or LLC HW prefetch accessing L2, including
rejects.
.It Li L2_TRANS.L1D_WB
.Pq Event F0H , Umask 10H
L1D writebacks that access L2 cache.
.It Li L2_TRANS.L2_FILL
.Pq Event F0H , Umask 20H
L2 fill requests that access L2 cache.
.It Li L2_TRANS.L2_WB
.Pq Event F0H , Umask 40H
L2 writebacks that access L2 cache.
.It Li L2_TRANS.ALL_REQUESTS
.Pq Event F0H , Umask 80H
Transactions accessing L2 pipe.
.It Li L2_LINES_IN.I
.Pq Event F1H , Umask 01H
L2 cache lines in I state filling L2.
.It Li L2_LINES_IN.S
.Pq Event F1H , Umask 02H
L2 cache lines in S state filling L2.
.It Li L2_LINES_IN.E
.Pq Event F1H , Umask 04H
L2 cache lines in E state filling L2.
.It Li L2_LINES_IN.ALL
.Pq Event F1H , Umask 07H
L2 cache lines filling L2.
.It Li L2_LINES_OUT.DEMAND_CLEAN
.Pq Event F2H , Umask 05H
Clean L2 cache lines evicted by demand.
.It Li L2_LINES_OUT.DEMAND_DIRTY
.Pq Event F2H , Umask 06H
Dirty L2 cache lines evicted by demand.
.El
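.Pp
The Event and Umask values listed above, together with the qualifiers,
correspond to fields of the IA32_PERFEVTSELx register layout given in the
Intel manual.
The C sketch below packs them into a raw register value.
The names evtsel_encode and struct evt_qual are hypothetical and are not
part of libpmc; the enable and interrupt bits are deliberately left clear.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Bit positions of IA32_PERFEVTSELx per the Intel SDM. */
#define EVTSEL_USR	(UINT32_C(1) << 16)	/* privilege levels 1, 2, 3 */
#define EVTSEL_OS	(UINT32_C(1) << 17)	/* privilege level 0 */
#define EVTSEL_EDGE	(UINT32_C(1) << 18)
#define EVTSEL_INV	(UINT32_C(1) << 23)

/* Hypothetical container for the qualifiers; not a libpmc type. */
struct evt_qual {
	unsigned cmask;
	bool edge, inv, os, usr;
};

/* Pack event select, unit mask and qualifiers into a raw value. */
uint32_t
evtsel_encode(unsigned event, unsigned umask, struct evt_qual q)
{
	uint32_t v;

	/* Each of these fields is eight bits wide. */
	assert(event <= 0xFF && umask <= 0xFF && q.cmask <= 0xFF);

	v = (uint32_t)event | ((uint32_t)umask << 8);

	/* With neither os nor usr given, both are enabled. */
	if (!q.os && !q.usr)
		q.os = q.usr = true;
	if (q.usr)
		v |= EVTSEL_USR;
	if (q.os)
		v |= EVTSEL_OS;
	if (q.edge)
		v |= EVTSEL_EDGE;
	if (q.inv)
		v |= EVTSEL_INV;
	v |= (uint32_t)q.cmask << 24;	/* counter mask: bits 24-31 */
	return v;
}
```

For example, LONGEST_LAT_CACHE.MISS (Event 2EH, Umask 41H) with no
qualifiers encodes to 0003412EH under this sketch.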
.Sh SEE ALSO
.Xr pmc 3 ,
.Xr pmc.atom 3 ,
.Xr pmc.core 3 ,
.Xr pmc.corei7 3 ,
.Xr pmc.corei7uc 3 ,
.Xr pmc.haswell 3 ,
.Xr pmc.haswelluc 3 ,
.Xr pmc.iaf 3 ,
.Xr pmc.ivybridge 3 ,
.Xr pmc.ivybridgexeon 3 ,
.Xr pmc.k7 3 ,
.Xr pmc.k8 3 ,
.Xr pmc.sandybridge 3 ,
.Xr pmc.sandybridgeuc 3 ,
.Xr pmc.sandybridgexeon 3 ,
.Xr pmc.soft 3 ,
.Xr pmc.tsc 3 ,
.Xr pmc.ucf 3 ,
.Xr pmc.westmere 3 ,
.Xr pmc.westmereuc 3 ,
.Xr pmc_cpuinfo 3 ,
.Xr pmclog 3 ,
.Xr hwpmc 4
.Sh HISTORY
Support for the Haswell Xeon microarchitecture first appeared in
.Fx 10.2 .
.Sh AUTHORS
The
.Lb libpmc
library was written by
.An "Joseph Koshy"
.Aq jkoshy@FreeBSD.org .
The support for the Haswell Xeon
microarchitecture was written by
.An "Randall Stewart"
.Aq rrs@FreeBSD.org .