1.. _skiboot-6.4:
2
3skiboot-6.4
4===========
5
6skiboot v6.4 was released on Tuesday July 16th 2019. It is the first
7release of skiboot 6.4, which becomes the new stable release
8of skiboot following the 6.3 release, first released May 3rd 2019.
9
10Skiboot 6.4 will mark the basis for op-build v2.4.
11
12skiboot v6.4 contains all bug fixes as of :ref:`skiboot-6.0.20`,
13and :ref:`skiboot-6.3.2` (the currently maintained stable releases).
14
For details on how the skiboot stable releases work, see :ref:`stable-rules`.
16
17Over skiboot 6.3, we have the following changes:
18
19.. _skiboot-6.4-new-features:
20
21New features
22------------
23
24Since skiboot v6.4-rc1:
25
26- npu2-opencapi: Add opencapi support on ZZ
27
28  This patch adds opencapi support on ZZ. It hard-codes the required
29  device tree entries for the NPU and links. The alternative was to use
30  HDAT, but it somehow proved too painful to do.
31
32  The new device tree entries activate the npu2 init code on ZZ. On
33  systems with no opencapi adapters, it should go unnoticed, as presence
34  detection will skip link training.
35
36Since skiboot v6.3:
37
38- platforms/nicole: Add new platform
39
  Nicole is a new platform from YADRO. It is a storage controller for the
  TATLIN server and is based on the IBM Romulus reference design (POWER9).
42
43- platform/zz: Add new platform type
44
  We have new platform types under ZZ. Let's add them. With this fix
46- nvram: Flag dangerous NVRAM options
47
48  Most nvram options used by skiboot are just for debug or testing for
49  regressions. They should never be used long term.
50
51  We've hit a number of issues in testing and the field where nvram
52  options have been set "temporarily" but haven't been properly cleared
53  after, resulting in crashes or real bugs being masked.
54
55  This patch marks most nvram options used by skiboot as dangerous and
56  prints a chicken to remind users of the problem.
57
58- hw/phb3: Add verbose EEH output
59
60  Add support for the pci-eeh-verbose NVRAM flag on PHB3. We've had this
61  on PHB4 since forever and it has proven very useful when debugging EEH
62  issues. When testing changes to the Linux kernel's EEH implementation
63  it's fairly common for the kernel to crash before printing the EEH log
64  so it's helpful to have it in the OPAL log where it can be dumped from
65  XMON.
66
67  Note that unlike PHB4 we do not enable verbose mode by default. The
68  nvram option must be used to explicitly enable it.
69
70- Experimental support for building without FSP code
71
72  Now, with CONFIG_FSP=0/1 we have:
73
74  - 1.6M/1.4M skiboot.lid
75  - 323K/375K skiboot.lid.xz
76
77- doc: travis-ci deploy docs!
78
79  Documentation is now automatically deployed if you configure Travis CI
80  appropriately (we have done this for the open-power branch of skiboot)
81
82- Big OPAL API Documentation improvement
83
84  A lot more OPAL API calls are now (at least somewhat) documented.
85- opal/hmi: Report NPU2 checkstop reason
86
87  The NPU2 is currently not passing any information to linux to explain
88  the cause of an HMI. NPU2 has three Fault Isolation Registers and over
89  30 of those FIR bits are configured to raise an HMI by default. We
90  won't be able to fit all possible state in the 32-bit xstop_reason
91  field of the HMI event, but we can still try to encode up to 4 HMI
92  reasons.
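
  As a rough illustration of how several reasons could share the 32-bit
  field, the sketch below packs one 8-bit reason code per byte and drops
  anything beyond the fourth. The one-byte-per-reason layout and the name
  ``sketch_encode_xstop_reason`` are assumptions made for this example; they
  are not taken from the skiboot source.

  .. code-block:: c

     #include <stdbool.h>
     #include <stdint.h>

     /*
      * Hypothetical helper: store an 8-bit reason code in the first empty
      * byte of a 32-bit word. A byte value of 0 means "empty slot" here.
      * Returns false when all four slots are already in use.
      */
     static bool sketch_encode_xstop_reason(uint32_t *xstop_reason,
                                            uint8_t reason)
     {
         int shift;

         for (shift = 0; shift < 32; shift += 8) {
             if (((*xstop_reason >> shift) & 0xff) == 0) {
                 *xstop_reason |= (uint32_t)reason << shift;
                 return true;
             }
         }
         return false;   /* field is full, the reason is dropped */
     }
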
93- opal-msg: Enhance opal-get-msg API
94
  Linux uses the :ref:`OPAL_GET_MSG` API to get OPAL messages. This interface
  supports up to 8 params (64 bytes). We have a requirement to send bigger data to
  Linux. This patch enhances OPAL to send bigger data to Linux.

  - Linux will use the "opal-msg-size" device tree property to allocate memory for
    OPAL messages (the previous patch increased "opal-msg-size" to 64K).
  - Replaced the `reserved` field in "struct opal_msg" with `size`, so that the
    Linux-side opal_get_msg user can detect the actual data size.
  - If the buffer size is smaller than the actual message size, opal_get_msg
    copies partial data and returns OPAL_PARTIAL to Linux.
  - Add a new variable "extended" to the "opal_msg_entry" structure to keep track
    of messages that have more than 64 bytes of data. We allocate separate
    memory for these messages and release it once the kernel has consumed the
    message.
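
  The partial-copy behaviour described above can be pictured with the
  following stand-alone sketch. ``struct sketch_msg``, ``sketch_get_msg`` and
  the ``SKETCH_*`` return codes are placeholders invented for this example;
  the real definitions live in the OPAL API headers and may differ.

  .. code-block:: c

     #include <stddef.h>
     #include <stdint.h>
     #include <string.h>

     #define SKETCH_SUCCESS  0
     #define SKETCH_PARTIAL  (-4)   /* placeholder value for this sketch */

     struct sketch_msg {
         uint32_t msg_type;
         uint32_t size;             /* actual payload size in bytes */
         const uint8_t *payload;
     };

     /*
      * Copy as much of the payload as fits in the caller's buffer. The
      * full size is always reported through *actual so that the caller
      * can tell how much data it is missing.
      */
     static int sketch_get_msg(const struct sketch_msg *msg, uint8_t *buf,
                               size_t buf_size, uint32_t *actual)
     {
         size_t n = msg->size < buf_size ? msg->size : buf_size;

         memcpy(buf, msg->payload, n);
         *actual = msg->size;
         return n < msg->size ? SKETCH_PARTIAL : SKETCH_SUCCESS;
     }

  A caller that receives the partial return code can compare ``*actual``
  with its buffer size to decide how much larger the buffer needs to be.
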
109- core/opal: Increase opal-msg-size size
110
111  Kernel will use `opal-msg-size` property to allocate memory for opal_msg.
112  We want to send bigger data from OPAL to kernel. Hence increase
113  opal-msg-size to 64K.
114- hw/npu2-opencapi: Add initial support for allocating OpenCAPI LPC memory
115
116  Lowest Point of Coherency (LPC) memory allows the host to access memory on
117  an OpenCAPI device.
118
119  Define 2 OPAL calls, :ref:`OPAL_NPU_MEM_ALLOC` and :ref:`OPAL_NPU_MEM_RELEASE`, for
120  assigning and clearing the memory BAR. (We try to avoid using the term
121  "LPC" to avoid confusion with Low Pin Count.)
122
123  At present, we use a fixed location in the address space, which means we
124  are restricted to a single range of 4TB, on a single OpenCAPI device per
125  chip. In future, we'll use some chip ID extension magic to give us more
126  space, and some sort of allocator to assign ranges to more than one device.
127- core/fast-reboot: Add im-feeling-lucky option
128
129  Fast reboot gets disabled for a number of reasons e.g. the availability
130  of nvlink. However this doesn't actually affect the ability to perform fast
131  reboot if no nvlink device is actually present.
132
  Add an nvram option for fast-reset: if it is set to
  "im-feeling-lucky", perform the fast-reboot regardless of whether it has
  previously been disabled.
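
  The decision this option introduces can be summarised by the small sketch
  below. ``sketch_fast_reboot_allowed`` and its parameters are hypothetical:
  ``nvram_value`` stands for whatever the ``fast-reset`` option is set to
  (``NULL`` if unset) and ``disabled`` for the state left behind by an earlier
  disable request.

  .. code-block:: c

     #include <stdbool.h>
     #include <stddef.h>
     #include <string.h>

     static bool sketch_fast_reboot_allowed(const char *nvram_value,
                                            bool disabled)
     {
         /* The override wins over any earlier disable request. */
         if (nvram_value && strcmp(nvram_value, "im-feeling-lucky") == 0)
             return true;

         return !disabled;
     }
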
136
137- platforms/astbmc: Check for SBE validation step
138
139  On some POWER8 astbmc systems an update to the SBE requires pausing at
140  runtime to ensure integrity of the SBE. If this is required the BMC will
141  set a chassis boot option IPMI flag using the OEM parameter 0x62. If
142  Skiboot sees this flag is set it waits until the SBE update is complete
143  and the flag is cleared.
144
145  Unfortunately the mystery operation that validates the SBE also leaves
  it in a bad state and unable to be used for timer operations. To
  work around this, the flag is checked as soon as possible (i.e. when IPMI
  and the console are set up), and once complete the system is rebooted.
149- Add P9 DIO interrupt support
150
  On P9 there are GPIO ports 0, 1 and 2 for GPIO interrupts, and the DIO
  interrupt is used to handle them.

  Add support for the DIO interrupts:

  1. Add dio_interrupt_register(chip, port, callback) to register the
     interrupt.
  2. Add dio_interrupt_deregister(chip, port, callback) to deregister it.
  3. When an interrupt occurs on the port, the callback is invoked and the
     interrupt status is cleared.
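
  The register/invoke/clear flow listed above is sketched below for a single
  chip. Everything prefixed ``sketch_`` is hypothetical, including the
  ``sketch_pending`` word standing in for the hardware status register and
  the counting callback; the real functions also take a chip argument, which
  is left out here to keep the example short.

  .. code-block:: c

     #include <stdbool.h>
     #include <stddef.h>
     #include <stdint.h>

     #define SKETCH_NUM_PORTS 3          /* GPIO ports 0, 1 and 2 */

     typedef void (*sketch_dio_cb)(int port);

     static sketch_dio_cb sketch_callbacks[SKETCH_NUM_PORTS];
     static uint32_t sketch_pending;     /* stand-in for the status register */
     static int sketch_hits;             /* bumped by the example callback */

     /* Example callback: just count how often it was invoked. */
     static void sketch_count_cb(int port)
     {
         (void)port;
         sketch_hits++;
     }

     static bool sketch_dio_register(int port, sketch_dio_cb cb)
     {
         if (port < 0 || port >= SKETCH_NUM_PORTS || sketch_callbacks[port])
             return false;
         sketch_callbacks[port] = cb;
         return true;
     }

     static bool sketch_dio_deregister(int port, sketch_dio_cb cb)
     {
         if (port < 0 || port >= SKETCH_NUM_PORTS ||
             sketch_callbacks[port] != cb)
             return false;
         sketch_callbacks[port] = NULL;
         return true;
     }

     /* Run the callback of every pending port, then clear its status bit. */
     static void sketch_dio_handle(void)
     {
         int port;

         for (port = 0; port < SKETCH_NUM_PORTS; port++) {
             if (!(sketch_pending & (1u << port)))
                 continue;
             if (sketch_callbacks[port])
                 sketch_callbacks[port](port);
             sketch_pending &= ~(1u << port);
         }
     }
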
161
162
163Removed features
164----------------
165
166Since skiboot v6.3:
167
168- pci/iov: Remove skiboot VF tracking
169
170  This feature was added a few years ago in response to a request to make
171  the MaxPayloadSize (MPS) field of a Virtual Function match the MPS of the
172  Physical Function that hosts it.
173
  The SR-IOV specification states that the MPS field of the VF is "ResvP".
175  This indicates the VF will use whatever MPS is configured on the PF and
176  that the field should be treated as a reserved field in the config space
177  of the VF. In other words, a SR-IOV spec compliant VF should always return
178  zero in the MPS field.  Adding hacks in OPAL to make it non-zero is...
179  misguided at best.
180
181  Additionally, there is a bug in the way pci_device structures are handled
182  by VFs that results in a crash on fast-reboot that occurs if VFs are
183  enabled and then disabled prior to rebooting. This patch fixes the bug by
184  removing the code entirely. This patch has no impact on SR-IOV support on
185  the host operating system.
186- Remove POWER7 and POWER7+ support
187
188  It's been a good long while since either OPAL POWER7 user touched a
189  machine, and even longer since they'd have been okay using an old
190  version rather than tracking master.
191
192  There's also been no testing of OPAL on POWER7 systems for an awfully
193  long time, so it's pretty safe to assume that it's very much bitrotted.
194
195  It also saves a whole 14kb of xz compressed payload space.
196- Remove remnants of :ref:`OPAL_PCI_GET_PHB_DIAG_DATA`
197
198  Never present in a public OPAL release, and only kernels prior to 3.11
199  would ever attempt to call it.
200- Remove unused :ref:`OPAL_GET_XIVE_SOURCE`
201
202  While this call was technically implemented by skiboot, no code has ever called
203  it, and it was only ever implemented for the p7ioc-phb back-end (i.e. POWER7).
  Since this call was unused in Linux, and POWER7 with OPAL was only ever
  available internally, it should be safe to remove the call.
206- Remove unused :ref:`OPAL_PCI_GET_XIVE_REISSUE` and :ref:`OPAL_PCI_SET_XIVE_REISSUE`
207
208  These seem to be remnants of one of the OPAL incarnations prior to
209  OPALv3. These calls have never been implemented in skiboot, and never
210  used by an upstream kernel (nor a PowerKVM kernel).
211
212  It's rather safe to just document them as never existing.
213- Remove never implemented :ref:`OPAL_PCI_SET_PHB_TABLE_MEMORY` and document why
214
215  Not ever used by upstream linux or PowerKVM tree. Never implemented in
216  skiboot (not even in ancient internal only tree).
217
218  So, it's incredibly safe to remove.
219- Remove unused :ref:`OPAL_PCI_EEH_FREEZE_STATUS2`
220
221  This call was introduced all the way back at the end of 2012, before
222  OPAL was public. The #define for the OPAL call was introduced to the
223  Linux kernel in June 2013, and the call was never used in any kernel
224  tree ever (as far as we can find).
225
226  Thus, it's quite safe to remove this completely unused and completely
227  untested OPAL call.
228- Document the long removed :ref:`OPAL_REGISTER_OPAL_EXCEPTION_HANDLER` call
229
230  I'm pretty sure this was removed in one of our first ever service packs.
231
232  Fixes: https://github.com/open-power/skiboot/issues/98
233- Remove last remnants of :ref:`OPAL_PCI_SET_PHB_TCE_MEMORY` and :ref:`OPAL_PCI_SET_HUB_TCE_MEMORY`
234
235  Since we have not supported p5ioc systems since skiboot 5.2, it's pretty
236  safe to just wholesale remove these OPAL calls now.
237- Remove remnants of :ref:`OPAL_PCI_SET_PHB_TCE_MEMORY`
238
239  There's no reason we need remnants hanging around that aren't used, so
240  remove them and save a handful of bytes at runtime.
241
242  Simultaneously, document the OPAL call removal.
243
244
245Secure and Trusted Boot
246-----------------------
247
248Since skiboot v6.3:
249
250- trustedboot: Change PCR and event_type for the skiboot events
251
252  The existing skiboot events are being logged as EV_ACTION, however, the
253  TCG PC Client spec says that EV_ACTION events should have one of the
254  pre-defined strings in the event field recorded in the event log. For
255  instance:
256
257  - "Calling Ready to Boot",
258  - "Entering ROM Based Setup",
259  - "User Password Entered", and
  - "Start Option ROM Scan".
261
262  None of the EV_ACTION pre-defined strings are applicable to the existing
263  skiboot events. Based on recent discussions with other POWER teams, this
264  patch proposes a convention on what PCR and event types should be used
265  for skiboot events. This also changes the skiboot source code to follow
266  the convention.
267
268  The TCG PC Client spec defines several event types, other than
269  EV_ACTION. However, many of them are specific to UEFI events and some
270  others are related to platform or CRTM events, which is more applicable
271  to hostboot events.
272
273  Currently, most of the hostboot events are extended to PCR[0,1] and
274  logged as either EV_PLATFORM_CONFIG_FLAGS, EV_S_CRTM_CONTENTS or
275  EV_POST_CODE. The "Node Id" and "PAYLOAD" events, though, are extended
276  to PCR[4,5,6] and logged as EV_COMPACT_HASH.
277
278  For the lack of an event type that fits the specific purpose,
279  EV_COMPACT_HASH seems to be the most adequate one due to its
280  flexibility. According to the TCG PC Client spec:
281
282  - May be used for any PCR except 0, 1, 2 and 3.
283  - The event field may be informative or may be hashed to generate the
284    digest field, depending on the component recording the event.
285
286  Additionally, the PCR[4,5] seem to be the most adequate PCRs. They would
  be used for skiboot and some skiroot events. According to the TCG PC
  Client spec, PCR[4] is intended to represent the entity that manages the
289  transition between the pre-OS and OS-present state of the platform.
290  PCR[4], along with PCR[5], identifies the initial OS loader.
291
292  In summary, for skiboot events:
293
  - Events that represent data should be extended to PCR 4.
  - Events that represent config should be extended to PCR 5.
296  - For the lack of an event type that fits the specific purpose,
297    both data and config events should be logged as EV_COMPACT_HASH.
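
  The convention summarised above maps onto a very small lookup, sketched
  here. The ``sketch_*`` names and ``enum sketch_event_kind`` are made up for
  illustration; ``0x0000000C`` is the value the TCG PC Client specification
  assigns to ``EV_COMPACT_HASH``.

  .. code-block:: c

     #include <stdint.h>

     #define SKETCH_EV_COMPACT_HASH  0x0000000Cu

     enum sketch_event_kind {
         SKETCH_EVENT_DATA,
         SKETCH_EVENT_CONFIG,
     };

     /* Data events go to PCR 4, config events to PCR 5. */
     static int sketch_pcr_for(enum sketch_event_kind kind)
     {
         return kind == SKETCH_EVENT_DATA ? 4 : 5;
     }

     /* Both kinds are logged with the same event type. */
     static uint32_t sketch_event_type_for(enum sketch_event_kind kind)
     {
         (void)kind;
         return SKETCH_EV_COMPACT_HASH;
     }
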
298
299Sensors
300-------
301
302Since skiboot v6.3:
303
304- occ-sensors: Check if OCC is reset while reading inband sensors
305
306  OCC may not be able to mark the sensor buffer as invalid while going
307  down RESET. If OCC never comes back we will continue to read the stale
308  sensor data. So verify if OCC is reset while reading the sensor values
309  and propagate the appropriate error.
310
311IPMI
312----
313
314Since skiboot v6.3:
315
316- ipmi: ensure forward progress on ipmi_queue_msg_sync()
317
  BT responses are handled using a timer doing the polling. To have any hope
  of getting an answer to a synchronous IPMI message, the timer needs to run.
320
321  We can't just check all timers though as there may be a timer that
322  wants a lock that's held by a code path calling ipmi_queue_msg_sync(),
323  and if we did enforce that as a requirement, it's a pretty subtle
324  API that is asking to be broken.
325
326  So, if we just run a poll function to crank anything that the IPMI
327  backend needs, then we should be fine.
328
329  This issue shows up very quickly under QEMU when loading the first
330  flash resource with the IPMI HIOMAP backend.
331
332NPU2
333----
334
335Since skiboot v6.4-rc1:
336
337- witherspoon: Add nvlink peers in finalise_dt()
338
339  This information is consumed by Linux so it needs to be in the DT. Move
340  it to finalise_dt().
341
342Since skiboot v6.3:
343
344- npu2: Increase timeout for L2/L3 cache purging
345
346  On NVLink2 bridge reset, we purge all L2/L3 caches in the system.
  This is an asynchronous operation with a 2ms timeout. There are
348  reports that this is not enough and "PURGE L3 on core xxx timed out"
349  messages appear (for the reference: on the test setup this takes
350  280us..780us).
351
352  This defines the timeout as a macro and changes this from 2ms to 20ms.
353
354  This adds a tracepoint to tell how long it took to purge all the caches.
355- npu2: Purge cache when resetting a GPU
356
  After putting all of a GPU's links in reset, do a cache purge in case we
  have CPU cache lines belonging to the now-inaccessible GPU memory.
359- npu2-opencapi: Mask 2 XSL errors
360
361  Commit f8dfd699f584 ("hw/npu2: Setup an error interrupt on some
362  opencapi FIRs") converted some FIR bits default action from system
363  checkstop to raising an error interrupt. For 2 XSL error events that
364  can be triggered by a misbehaving AFU, the error interrupt is raised
365  twice, once for each link (the XSL logic in the NPU is shared between
366  2 links). So a badly behaving AFU could impact another, unsuspecting
367  opencapi adapter.
368
369  It doesn't look good and it turns out we can do better. We can mask
370  those 2 XSL errors. The error will also be picked up by the OTL logic,
371  which is per link. So we'll still get an error interrupt, but only on
372  the relevant link, and the other opencapi adapter can stay functional.
373- npu2: Clear fence state for a brick being reset
374
375  Resetting a GPU before resetting an NVLink leads to occasional HMIs
376  which fence some bricks and prevent the "reset_ntl" procedure from
377  succeeding at the "reset_ntl_release" step - the host system requires
378  reboot; there may be other cases like this as well.
379
380  This adds clearing of the fence bit in NPU.MISC.FENCE_STATE for
381  the NVLink which we are about to reset.
382- npu2: Fix clearing the FIR bits
383
  FIR registers are SCOM-only so they cannot be accessed with the indirect
385  write, and yet we use SCOM-based addresses for these; fix this.
386
387- npu2: Reset NVLinks when resetting a GPU
388
389  Resetting a V100 GPU brings its NVLinks down and if an NPU tries using
390  those, an HMI occurs. We were lucky not to observe this as the bare metal
391  does not normally reset a GPU and when passed through, GPUs are usually
392  before NPUs in QEMU command line or Libvirt XML and because of that NPUs
393  are naturally reset first. However simple change of the device order
394  brings HMIs.
395
396  This defines a bus control filter for a PCI slot with a GPU with NVLinks
397  so when the host system issues secondary bus reset to the slot, it resets
398  associated NVLinks.
399- npu2: Reset PID wildcard and refcounter when mapped to LPID
400
401  Since 105d80f85b "npu2: Use unfiltered mode in XTS tables" we do not
402  register every PID in the XTS table so the table has one entry per LPID.
403  Then we added a reference counter to keep track of the entry use when
404  switching GPU between the host and guest systems (the "Fixes:" tag below).
405
406  The POWERNV platform setup creates such entries and references them
407  at the boot time when initializing IOMMUs and only removes it when
408  a GPU is passed through to a guest. This creates a problem as POWERNV
  boots via kexec and no dereferencing happens; the XTS table state remains
410  undefined. So when the host kernel boots, skiboot thinks there are valid
411  XTS entries and does not update the XTS table which breaks ATS.
412
413  This adds the reference counter and the XTS entry reset when a GPU is
414  assigned to LPID and we cannot rely on the kernel to clean that up.
415
416PHB4
417----
418
419Since skiboot v6.3:
420
421- hw/phb4: Make phb4_training_trace() more general
422
423  phb4_training_trace() is used to monitor the Link Training Status
424  State Machine (LTSSM) of the PHB's data link layer. Currently it is only
425  used to observe the LTSSM while bringing up the link, but sometimes it's
426  useful to see what's occurring in other situations (e.g. link disable, or
427  secondary bus reset). This patch renames it to phb4_link_trace() and
428  allows the target LTSSM state and a flexible timeout to help in these
429  situations.
430- hw/phb4: Make pci-tracing print at PR_NOTICE
431
432  When pci-tracing is enabled we print each trace status message and the
433  final trace status at PR_ERROR. The final status messages are similar to
434  those printed when we fail to train in the non-pci-tracing path and this
435  has resulted in spurious op-test failures.
436
437  This patch reduces the log-level of the tracing message to PR_NOTICE so
  they're not accidentally interpreted as actual error messages. PR_NOTICE
439  messages are still printed to the console during boot.
440- hw/phb4: Use read/write_reg in assert_perst
441
442  While the PHB is fenced we can't use the MMIO interface to access PHB
443  registers. While processing a complete reset we inject a PHB fence to
444  isolate the PHB from the rest of the system because the PHB won't
445  respond to MMIOs from the rest of the system while being reset.
446
447  We assert PERST after the fence has been erected which requires us to
448  use the XSCOM indirect interface to access the PHB registers rather than
449  the MMIO interface. Previously we did that when asserting PERST in the
  CRESET path. However, in b8b4c79d4419 ("hw/phb4: Factor out PERST
  control") this was rewritten to use the raw in_be64() accessor. This
452  means that CRESET would not be asserted in the reset path. On some
453  Mellanox cards this would prevent them from re-loading their firmware
454  when the system was fast-reset.
455
456  This patch fixes the problem by replacing the raw {in|out}_be64()
457  accessors with the phb4_{read|write}_reg() functions.
458
459- hw/phb4: Assert Link Disable bit after ETU init
460
461  The cursed RAID card in ozrom1 has a bug where it ignores PERST being
462  asserted. The PCIe Base spec is a little vague about what happens
463  while PERST is asserted, but it does clearly specify that when
464  PERST is de-asserted the Link Training and Status State Machine
465  (LTSSM) of a device should return to the initial state (Detect)
466  defined in the spec and the link training process should restart.
467
468  This bug was worked around in 9078f8268922 ("phb4: Delay training till
469  after PERST is deasserted") by setting the link disable bit at the
470  start of the FRESET process and clearing it after PERST was
471  de-asserted. Although this fixed the bug, the patch offered no
  explanation of why the fix worked.
473
474  In b8b4c79d4419 ("hw/phb4: Factor out PERST control") the link disable
  workaround was moved into phb4_assert_perst(). This is always
  called in the CRESET case, but a following patch resulted in
477  assert_perst() not being called if phb4_freset() was entered following a
478  CRESET since p->skip_perst was set in the CRESET handler. This is bad
479  since a side-effect of the CRESET is that the Link Disable bit is
480  cleared.
481
482  This, combined with the RAID card ignoring PERST results in the PCIe
483  link being trained by the PHB while we're waiting out the 100ms
484  ETU reset time. If we hack skiboot to print a DLP trace after returning
  from phb4_init_hw() we get: ::
486
487     PHB#0001[0:1]: Initialization complete
488     PHB#0001[0:1]: TRACE:0x0000102101000000  0ms presence GEN1:x16:polling
489     PHB#0001[0:1]: TRACE:0x0000001101000000 23ms          GEN1:x16:detect
490     PHB#0001[0:1]: TRACE:0x0000102101000000 23ms presence GEN1:x16:polling
491     PHB#0001[0:1]: TRACE:0x0000183101000000 29ms training GEN1:x16:config
492     PHB#0001[0:1]: TRACE:0x00001c5881000000 30ms training GEN1:x08:recovery
493     PHB#0001[0:1]: TRACE:0x00001c5883000000 30ms training GEN3:x08:recovery
494     PHB#0001[0:1]: TRACE:0x0000144883000000 33ms presence GEN3:x08:L0
495     PHB#0001[0:1]: TRACE:0x0000154883000000 33ms trained  GEN3:x08:L0
496     PHB#0001[0:1]: CRESET: wait_time = 100
497     PHB#0001[0:1]: FRESET: Starts
498     PHB#0001[0:1]: FRESET: Prepare for link down
499     PHB#0001[0:1]: FRESET: Assert skipped
500     PHB#0001[0:1]: FRESET: Deassert
501     PHB#0001[0:1]: TRACE:0x0000154883000000  0ms trained  GEN3:x08:L0
502     PHB#0001[0:1]: TRACE: Reached target state
503     PHB#0001[0:1]: LINK: Start polling
504     PHB#0001[0:1]: LINK: Electrical link detected
505     PHB#0001[0:1]: LINK: Link is up
506     PHB#0001[0:1]: LINK: Went down waiting for stabilty
507     PHB#0001[0:1]: LINK: DLP train control: 0x0000105101000000
508     PHB#0001[0:1]: CRESET: Starts
509
510  What has happened here is that the link is trained to 8x Gen3 33ms after
  we return from phb4_init_hw(), and before we've waited the 100ms
  that we normally wait after re-initialising the ETU. When we "deassert"
  PERST later on in the FRESET handler the link is in L0 (normal) state. At
514  this point we try to read from the Vendor/Device ID register to verify
515  that the link is stable and immediately get a PHB fence due to a PCIe
516  Completion Timeout. Skiboot attempts to recover by doing another CRESET,
517  but this will encounter the same issue.
518
519  This patch fixes the problem by setting the Link Disable bit (by calling
520  phb4_assert_perst()) immediately after we return from phb4_init_hw().
521  This prevents the link from being trained while PERST is asserted which
522  seems to avoid the Completion Timeout. With the patch applied we get: ::
523
524     PHB#0001[0:1]: Initialization complete
525     PHB#0001[0:1]: TRACE:0x0000102101000000  0ms presence GEN1:x16:polling
526     PHB#0001[0:1]: TRACE:0x0000001101000000 23ms          GEN1:x16:detect
527     PHB#0001[0:1]: TRACE:0x0000102101000000 23ms presence GEN1:x16:polling
528     PHB#0001[0:1]: TRACE:0x0000909101000000 29ms presence GEN1:x16:disabled
529     PHB#0001[0:1]: CRESET: wait_time = 100
530     PHB#0001[0:1]: FRESET: Starts
531     PHB#0001[0:1]: FRESET: Prepare for link down
532     PHB#0001[0:1]: FRESET: Assert skipped
533     PHB#0001[0:1]: FRESET: Deassert
534     PHB#0001[0:1]: TRACE:0x0000001101000000  0ms          GEN1:x16:detect
535     PHB#0001[0:1]: TRACE:0x0000102101000000  0ms presence GEN1:x16:polling
536     PHB#0001[0:1]: TRACE:0x0000001101000000 24ms          GEN1:x16:detect
537     PHB#0001[0:1]: TRACE:0x0000102101000000 36ms presence GEN1:x16:polling
538     PHB#0001[0:1]: TRACE:0x0000183101000000 97ms training GEN1:x16:config
539     PHB#0001[0:1]: TRACE:0x00001c5881000000 97ms training GEN1:x08:recovery
540     PHB#0001[0:1]: TRACE:0x00001c5883000000 97ms training GEN3:x08:recovery
541     PHB#0001[0:1]: TRACE:0x0000144883000000 99ms presence GEN3:x08:L0
542     PHB#0001[0:1]: TRACE: Reached target state
543     PHB#0001[0:1]: LINK: Start polling
544     PHB#0001[0:1]: LINK: Electrical link detected
545     PHB#0001[0:1]: LINK: Link is up
546     PHB#0001[0:1]: LINK: Link is stable
547     PHB#0001[0:1]: LINK: Card [9005:028c] Optimal Retry:disabled
548     PHB#0001[0:1]: LINK: Speed Train:GEN3 PHB:GEN4 DEV:GEN3
549     PHB#0001[0:1]: LINK: Width Train:x08 PHB:x08 DEV:x08
550     PHB#0001[0:1]: LINK: RX Errors Now:0 Max:8 Lane:0x0000
551
552
553Simulators
554----------
555
556Since skiboot v6.3:
557
558- external/mambo: Bump default POWER9 to Nimbus DD2.3
559- external/mambo: fix tcl startup code for mambo bogus net (repost)
560
561  This fixes a couple issues with external/mambo/skiboot.tcl so I can use the
562  mambo bogus net.
563
564  * newer distros (ubuntu 18.04) allow tap device to have a user specified
565    name instead of just tapN so we need to pass in a name not a number.
566  * need some kind of default for net_mac, and need the mconfig for it
567    to be set from an env var.
568- skiboot.tcl: Add option to wait for GDB server connection
569
570  Add an environment variable which makes Mambo wait for a connection
571  from gdb prior to starting simulation.
572- mambo: Integrate addr2line into backtrace command
573
574  Gives nice output like this: ::
575
576       systemsim % bt
577       pc:                             0xC0000000002BF3D4      _savegpr0_28+0x0
578       lr:                             0xC00000000004E0F4      opal_call+0x10
579       stack:0x000000000041FAE0        0xC00000000004F054      opal_check_token+0x20
580       stack:0x000000000041FB50        0xC0000000000500CC      __opal_flush_console+0x88
581       stack:0x000000000041FBD0        0xC000000000050BF8      opal_flush_console+0x24
582       stack:0x000000000041FC00        0xC0000000001F9510      udbg_opal_putc+0x88
583       stack:0x000000000041FC40        0xC000000000020E78      udbg_write+0x7c
584       stack:0x000000000041FC80        0xC0000000000B1C44      console_unlock+0x47c
585       stack:0x000000000041FD80        0xC0000000000B2424      register_console+0x320
586       stack:0x000000000041FE10        0xC0000000003A5328      register_early_udbg_console+0x98
587       stack:0x000000000041FE80        0xC0000000003A4F14      setup_arch+0x68
588       stack:0x000000000041FEF0        0xC0000000003A0880      start_kernel+0x74
589       stack:0x000000000041FF90        0xC00000000000AC60      start_here_common+0x1c
590
591- mambo: Add addr2func for symbol resolution
592
593  If you supply a VMLINUX_MAP/SKIBOOT_MAP/USER_MAP addr2func can guess
594  at your symbol name. i.e. ::
595
596      systemsim % p pc
597      0xC0000000002A68F8
598      systemsim % addr2func [p pc]
599      fdt_offset_ptr+0x78
600
601- lpc-port80h: Don't write port 80h when running under Simics
602
603  Simics doesn't model LPC port 80h. Writing to it terminates the
604  simulation due to an invalid LPC memory access. This patch adds a
605  check to ensure port 80h isn't accessed if we are running under
606  Simics.
607- device-tree: speed up fdt building on slow simulators
608
609  Trade size for speed and avoid de-duplicating strings in the fdt.
610  This costs about 2kB in fdt size, and saves about 8 million instructions
611  (almost half of all instructions) booting skiboot in mambo.
- fast-reboot: skip read-only memory checksum for slow simulators
613
614  Skip the fast reboot checksum, which costs about 4 million cycles
615  booting skiboot in mambo.
- nx: remove check on the "qemu,powernv" property
617
618  commit 95f7b3b9698b ("nx: Don't abort on missing NX when using a QEMU
619  machine") introduced a check on the property "qemu,powernv" to skip NX
620  initialization when running under a QEMU machine.
621
622  The QEMU platforms now expose a QUIRK_NO_RNG in the chip. Testing the
623  "qemu,powernv" property is not necessary anymore.
624- plat/qemu: add a POWER8 and POWER9 platform
625
626  These new QEMU platforms have characteristics closer to real OpenPOWER
627  systems that we use today and define a different BMC depending on the
628  CPU type. New platform properties are introduced for each,
629  "qemu,powernv8", "qemu,powernv9" and these should be compatible with
630  existing QEMUs which only expose the "qemu,powernv" property
631- libc/string: speed up common string functions
632
633  Use compiler builtins for the string functions, and compile the
634  libc/string/ directory with -O2.
635
  This reduces instructions booting skiboot in mambo by 2.9 million in
  slow-sim mode, or 3.8 million in normal mode, for less than 1kB image size
638  increase.
639
640  This can result in the compiler warning more cases of string function
641  problems.
642- external/mambo: Add an option to exit Mambo when the system is shutdown
643
644  Automatically exiting can be convenient for scripting. Will also exit
645  due to a HW crash (eg. unhandled exception).
646
647VESNIN platform
648---------------
649
650Since skiboot v6.3:
651
652- platforms/vesnin: PCI inventory via IPMI OEM
653
654  Replace raw protocol with OEM message supported by OpenBMC's IPMI
655  plugins.
656
657  BMC-side implementation (IPMI plug-in):
658  https://github.com/YADRO-KNS/phosphor-pci-inventory
659
660Utilities
661---------
662
663Since skiboot v6.3:
664
665- opal-gard: Account for ECC size when clearing partition
666
667  When 'opal-gard clear all' is run, it works by erasing the GUARD then
  using blocklevel_smart_write() to write nothing to the partition. This
669  second write call is needed because we rely on libflash to set the ECC
670  bits appropriately when the partition contained ECCed data.
671
672  The API for this is a little odd with the caller specifying how much
673  actual data to write, and libflash writing size + size/8 bytes
674  since there is one additional ECC byte for every eight bytes of data.
675
  We currently do not account for the extra space consumed by the ECC data
  in reset_partition(), which is used to handle the 'clear all' command.
  This results in the partition following the GUARD partition being
  partially overwritten when the command is used. This patch fixes the
  problem by reducing the length we would normally write by the number
  of ECC bytes required.
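
  The arithmetic behind the fix: with one ECC byte per eight data bytes,
  writing ``size`` data bytes consumes ``size + size/8`` bytes of flash, so
  the data that fits in a partition is roughly eight ninths of its raw size.
  The helper names below are hypothetical and are not libflash functions.

  .. code-block:: c

     #include <stddef.h>

     #define SKETCH_BYTES_PER_ECC 8   /* data bytes covered by one ECC byte */

     /* Flash bytes consumed when writing data_len bytes of ECC'd data. */
     static size_t sketch_raw_len(size_t data_len)
     {
         return data_len + data_len / SKETCH_BYTES_PER_ECC;
     }

     /* Largest data length (in whole 8-byte groups) that fits in part_len. */
     static size_t sketch_data_len(size_t part_len)
     {
         size_t groups = part_len / (SKETCH_BYTES_PER_ECC + 1);

         return groups * SKETCH_BYTES_PER_ECC;
     }

  For an illustrative 20480-byte partition this gives 18200 data bytes, which
  occupy 20475 bytes once the ECC bytes are added. Writing 20480 data bytes
  instead would consume 23040 bytes and spill 2560 bytes into the following
  partition.
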
682
683
684Build and debugging
685-------------------
686
687Since skiboot v6.3:
688
689- Disable -Waddress-of-packed-member for GCC9
690
691  We throw a bunch of errors in errorlog code otherwise, which we should
692  fix, but we don't *have* to yet.
693
694- Fix a lot of sparse warnings
695- With new GCC comes larger GCOV binaries
696
697  So we need to change our heap size to make more room for data/bss
698  without having to change where the console is or have more fun moving
699  things about.
700- Intentionally discard fini_array sections
701
702  Produced in a SKIBOOT_GCOV=1 build, and never called by skiboot.
703- external/trace: Add follow option to dump_trace
704
705  When monitoring traces, an option like the tail command's '-f' (follow)
706  is very useful. This option continues to append to the output as more
707  data arrives. Add an '-f' option to allow dump_trace to operate
708  similarly.
709
710  Tail also provides a '-s' (sleep time) option that
711  accompanies '-f'.  This controls how often new input will be polled. Add
712  a '-s' option that will make dump_trace sleep for N milliseconds before
713  checking for new input.
714- external/trace: Add support for dumping multiple buffers
715
  dump_trace can only dump one trace buffer at a time. It would be handy
  to be able to dump multiple buffers and to see the entries from these
  buffers displayed in correct timestamp order. Each trace buffer is
  already sorted by timestamp, so use a heap to implement an efficient
  k-way merge. Use the CCAN heap to implement this sort. However, the CCAN
  heap does not have a 'heap_replace' operation, so we need to 'heap_pop'
  then 'heap_push' to replace the root, which means rebalancing twice
  instead of once.
724- external/trace: mmap trace buffers in dump_trace
725
726  The current lseek/read approach used in dump_trace does not correctly
727  handle certain aspects of the buffers. It does not use the start and end
728  position that is part of the buffer so it will not begin from the
729  correct location. It does not move back to the beginning of the trace
  buffer file as the buffer wraps around. It also does not handle the
  overflow case of the writer overwriting where the reader is up to.
732
733  Mmap the trace buffer file so that the existing reading functions in
734  extra/trace.c can be used. These functions already handle the cases of
735  wrapping and overflow.  This reduces code duplication and uses functions
736  that are already unit tested. However this requires a kernel where the
737  trace buffer sysfs nodes are able to be mmaped (see
738  https://patchwork.ozlabs.org/patch/1056786/)
739- core/trace: Export trace buffers to sysfs
740
741  Every property in the device-tree under /ibm,opal/firmware/exports has a
742  sysfs node created in /firmware/opal/exports. Add properties with the
743  physical address and size for each trace buffer so they are exported.
744- core/trace: Add pir number to debug_descriptor
745
  The names given to the trace buffers when exported to sysfs should show
  which CPU they are associated with, to make it easier to understand their
  output.  The debug_descriptor currently stores the address and length of
749  each trace buffer and this is used for adding properties to the device
750  tree. Extend debug_descriptor to include a cpu associated with each
751  trace. This will be used for creating properties in the device-tree
752  under /ibm,opal/firmware/exports/.
753- core/trace: Change trace buffer size
754
755  We want to be able to mmap the trace buffers to be used by the
756  dump_trace tool. As mmaping is done in terms of pages it makes sense
757  that the size of the trace buffers should be page aligned.  This is
758  slightly complicated by the space taken up by the header at the
759  beginning of the trace and the room left for an extra trace entry at the
760  end of the buffer. Change the size of the buffer itself so that the
761  entire trace buffer size will be page aligned.
762- core/trace: Change buffer alignment from 4K to 64K
763
  We want to be able to mmap the trace buffers to be used by the
  dump_trace tool. This means that the trace buffers must be page
  aligned.  Currently they are aligned to 4K. Most POWER systems have a
  64K page size. On systems with a 4K page size, 64K aligned will still be
  page aligned.  Change the allocation of the trace buffers to be 64K
  aligned.
770
  It is actually the trace_info struct containing the trace buffer that is
  allocated with aligned memory. This means the trace buffer itself is not
  actually aligned, and its address is what is currently exposed through
  sysfs.  To get around this, change the address that is exposed to sysfs
  to be that of the trace_info struct. This means the lock in trace_info is
  now visible too.
777- external/trace: Use correct width integer byte swapping
778
779  The trace_repeat struct uses be16 for storing the number of repeats.
780  Currently be32_to_cpu conversion is used to display this member. This
781  produces an incorrect value. Use be16_to_cpu instead.
782- core/trace: Put boot_tracebuf in correct location.
783
784  A position for the boot_tracebuf is allocated in skiboot.lds.S.
785  However, without a __section attribute the boot trace buffer is not
786  placed in the correct location, meaning that it also will not be
787  correctly aligned.  Add the __section attribute to ensure it will be
788  placed in its allocated position.
789- core/lock: Add debug options to store backtrace of where lock was taken
790
791  Contrary to popular belief, skiboot developers are imperfect and
792  occasionally write locking bugs. When we exit skiboot, we check if we're
793  still holding any locks, and if so, we print an error with a list of the
794  locks currently held and the locations where they were taken.
795
796  However, this only tells us the location where lock() was called, which may
797  not be enough to work out what's going on. To give us more to go on with,
798  we can store backtrace data in the lock and print that out when we
799  unexpectedly still hold locks.
800
801  Because the backtrace data is rather big, we only enable this if
802  DEBUG_LOCKS_BACKTRACE is defined, which in turn is switched on when
803  DEBUG=1.
804
805  (We disable DEBUG_LOCKS_BACKTRACE in some of the memory allocation tests
806  because the locks used by the memory allocator take up too much room in the
807  fake skiboot heap.)
808- libfdt: upgrade to upstream dtc.git 243176c
809
810  Upgrade libfdt/ to github.com/dgibson/dtc.git 243176c ("Fix bogus
811  error on rebuild")
812
813  This copies dtc/libfdt/ to skiboot/libfdt/, with the only change in
814  that directory being the addition of README.skiboot and Makefile.inc.
815
816  This adds about 14kB text, 2.5kB compressed xz. This could be reduced
817  or mostly eliminated by cutting out fdt version checks and unused
818  code, but tracking upstream is a bigger benefit at the moment.
819
820  This loses commits:
821
822  - 14ed2b842f61 ("libfdt: add basic sanity check to fdt_open_into")
823  - bc7bb3d12bc1 ("sparse: fix declaration of fdt_strerror")
824
825  As well as some prehistoric similar kinds of things, which is the
826  punishment for us not being good downstream citizens and sending
827  things upstream! Syncing to upstream will make that effort simpler
828  in future.
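
The k-way merge described in the "external/trace: Add support for dumping
multiple buffers" entry above can be sketched as follows. This is a minimal
Python illustration using the standard ``heapq`` module rather than the CCAN
heap that dump_trace uses; the function name and the ``(timestamp, label)``
entry layout are assumptions made for the example.

```python
import heapq


def merge_traces(buffers):
    """Yield entries from several timestamp-sorted buffers in global order.

    Each buffer is an iterable of (timestamp, label) tuples that is already
    sorted by timestamp, as each trace buffer is in the text above.
    """
    iters = [iter(buf) for buf in buffers]
    heap = []

    # Seed the heap with the first entry of every non-empty buffer.
    for idx, it in enumerate(iters):
        first = next(it, None)
        if first is not None:
            heap.append((first[0], idx, first))
    heapq.heapify(heap)

    while heap:
        # Pop the oldest entry, then push that buffer's next entry.
        # This mirrors the heap_pop + heap_push sequence from the text,
        # which rebalances the heap twice per emitted entry.
        _, idx, entry = heapq.heappop(heap)
        yield entry
        nxt = next(iters[idx], None)
        if nxt is not None:
            heapq.heappush(heap, (nxt[0], idx, nxt))


example_buffers = [
    [(1, "a"), (4, "d")],
    [(2, "b"), (3, "c"), (5, "e")],
    [],
]
merged = list(merge_traces(example_buffers))
```

Python's ``heapq.heapreplace()`` would combine the pop and push into a
single rebalance; the sketch deliberately avoids it to show the two-step
replacement the entry describes.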
829
830General Fixes
831-------------
832
833Since skiboot v6.4-rc1:
834
835- libflash: Fix broken continuations
836
  Some of the libflash debug messages don't print a newline at the end of
  the line and assume that the next print will be contiguous with the
  last. This isn't true in skiboot since log messages are prefixed with a
  timestamp. This results in funny looking output such as: ::
841
842    LIBFLASH: Verifying...
843    LIBFLASH:   reading page 0x01963000..0x01964000...[3.084846885,7]  same !
844    LIBFLASH:   reading page 0x01964000..0x01965000...[3.086164489,7]  same !
845
846  Fix this by moving the "same !" debug message to a new line with the
847  prefix "LIBFLASH:   ..." to indicate it's a continuation of the last
848  statement.
849
850  First reported in https://github.com/open-power/skiboot/issues/51
851