.. _skiboot-6.1-rc1:

skiboot-6.1-rc1
===============

skiboot v6.1-rc1 was released on Friday June 22nd 2018. It is the first
release candidate of skiboot 6.1, which will become the new stable release
of skiboot following the 6.0 release, first released May 11th 2018.

Skiboot 6.1 will mark the basis for op-build v2.1.

skiboot v6.1-rc1 contains all bug fixes as of :ref:`skiboot-6.0.4`,
and :ref:`skiboot-5.4.9` (the currently maintained
stable releases).

For details on how the skiboot stable releases work, see :ref:`stable-rules`.

This release contains a lot of small cleanups and fixes all over the place,
which is possibly a sign that we've shipped our big POWER9 GA release and
now get to breathe for a moment to look at what we ended up with.
Since this is a really small incremental release, there are unlikely to be
many release candidates.

Over skiboot 6.0, we have the following changes:

General changes and bug fixes
-----------------------------

- GCC8 build fixes
- Add prepare_hbrt_update to hbrt interfaces

  Add placeholder support for prepare_hbrt_update call into
  hostboot runtime (opal-prd) code.  This interface is only
  called as part of a concurrent code update on an FSP based
  system.
- cpu: Clear PCR SPR in opal_reinit_cpus()

  Currently if Linux boots with a non-zero PCR, things can go bad where
  some early userspace programs can take illegal instructions. This is
  being fixed in Linux, but in the meantime, we should clean up in
  skiboot as well.
- pci: Fix PCI_DEVICE_ID()

  The vendor ID is 16 bits not 8. This error leaves the top of the vendor
  ID in the bottom bits of the device ID, which resulted in e.g. a failure
  to run the PCI quirk for the AST VGA device.
- Quieten console output on boot

  We print out a whole bunch of things on boot, most of which aren't
  interesting, so we should *not* print them instead.

  Printing things like what CPUs we found and what PCI devices we found
  *are* useful, so continue to do that. But we don't need to splat out
  a bunch of things that are always going to be true.
- core/console: fix deadlock when printing with console lock held

  Some debugging options will print while the console lock is held,
  which is why the console lock is taken as a recursive lock.
  However console_write calls __flush_console, which will drop and
  re-take the lock non-recursively in some cases.

  Just set con_need_flush and return from __flush_console if we are
  holding the console lock already.

  Without this fix, printing the following stack usage message (captured
  with this patch applied) could lead to a deadlock: ::

    CPU 0000 lowest stack mark 11768 bytes left pc=300cb808 token=0
    CPU 0000 Backtrace:
    S: 0000000031c03370 R: 00000000300cb808   .list_check_node+0x1c
    S: 0000000031c03410 R: 00000000300cb910   .list_check+0x38
    S: 0000000031c034b0 R: 00000000300190ac   .try_lock_caller+0xb8
    S: 0000000031c03540 R: 00000000300192e0   .lock_caller+0x80
    S: 0000000031c03600 R: 0000000030012c70   .__flush_console+0x134
    S: 0000000031c036d0 R: 00000000300130cc   .console_write+0x68
    S: 0000000031c03780 R: 00000000300347bc   .vprlog+0xc8
    S: 0000000031c03970 R: 0000000030034844   ._prlog+0x50
    S: 0000000031c03a00 R: 00000000300364a4   .log_simple_error+0x74
    S: 0000000031c03b90 R: 000000003004ab48   .occ_pstates_init+0x184
    S: 0000000031c03d50 R: 000000003001480c   .load_and_boot_kernel+0x38c
    S: 0000000031c03e30 R: 000000003001571c   .main_cpu_entry+0x62c
    S: 0000000031c03f00 R: 0000000030002700   boot_entry+0x1c0
- opal-prd: Do not error out on first failure for soft/hard offline.

  The memory errors (CEs and UEs) that are detected as part of background
  memory scrubbing are reported by PRD asynchronously to opal-prd along with
  affected memory ranges. hservice_memory_error() converts these ranges into
  page granularity before hooking them up to the soft/hard offline-ing
  infrastructure.

  But the current implementation of hservice_memory_error() does not hook up
  all the pages to soft/hard offline-ing if any of the page offline actions
  fails. For example, hard offline can fail for:

  - Pages that are not part of buddy managed pool.
  - Pages that are reserved by kernel using memblock_reserved()
  - Pages that are in use by kernel.

  But for the pages that are in use by user space application, the hard
  offline marks the page as hwpoison, sends SIGBUS signal to kill the
  affected application as recovery action and returns success.

  Hence, it is possible that some of the pages in that memory range are in
  use by an application or free. By stopping on the first error we lose the
  opportunity to hwpoison the subsequent pages which may be free or in use by
  an application. This patch fixes this issue.
- libflash/blocklevel_write: Fix missing error handling

  Caught by scan-build: we trapped the errors in rc, but did
  not take any recovery action during blocklevel_write.
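
The ``PCI_DEVICE_ID()`` fix in the list above comes down to a shift width. The
sketch below models the combined 32-bit vendor/device ID word in Python rather
than quoting the skiboot macro. The helper names are made up for illustration,
the buggy variant assumes the old macro shifted by 8 bits, and the example
value assumes the commonly listed AST VGA IDs (vendor 0x1a03, device 0x2000).

```python
# Illustrative model only: these helpers are hypothetical stand-ins,
# not the skiboot macros themselves.

def pci_vendor_id(vdid):
    """The vendor ID occupies the low 16 bits of the combined word."""
    return vdid & 0xFFFF

def pci_device_id_buggy(vdid):
    """Assumed old behaviour: shift by 8, as if the vendor ID were 8 bits."""
    return (vdid >> 8) & 0xFFFF

def pci_device_id_fixed(vdid):
    """Shift by the full 16-bit width of the vendor ID."""
    return (vdid >> 16) & 0xFFFF

# Assumed example device: AST VGA, vendor 0x1a03, device 0x2000.
vdid = (0x2000 << 16) | 0x1A03

print(hex(pci_vendor_id(vdid)))        # 0x1a03
print(hex(pci_device_id_buggy(vdid)))  # 0x1a (top byte of the vendor ID)
print(hex(pci_device_id_fixed(vdid)))  # 0x2000
```

With the 8-bit shift, the value that comes back is the top byte of the vendor
ID, so a quirk keyed on the real device ID never matches.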

I2C
^^^
- p8-i2c: fix wrong request status when a reset is needed

  If the bus is found in error state when starting a new request, the
  engine is reset and we enter recovery. However, once complete, the
  reset operation shows a status of complete in the status register. So
  any badly-timed call to check_status() will think the current top
  request is complete, even though it hasn't run yet.

  So don't update any request status while we are in recovery, as
  nothing useful for the request is supposed to happen in that state.
- p8-i2c: Remove force reset

  Force reset was added as an attempt to work around some issues with TPM
  devices locking up their I2C bus. In that particular case the problem
  was that the device would hold the SCL line down permanently due to a
  device firmware bug. The force reset doesn't actually do anything to
  alleviate the situation here, it just happens to reset the internal
  master state enough to make the I2C driver appear to work until
  something tries to access the bus again.

  On P9 systems with secure boot enabled there is the added problem
  of the "diagnostic mode" not being supported on I2C masters A, B, C and
  D. Diagnostic mode allows the SCL and SDA lines to be driven directly
  by software. Without this, force reset is impossible to implement.

  This patch removes the force reset functionality entirely since:

  a) it doesn't do what it's supposed to, and
  b) it's butt ugly code

  Additionally, turn p8_i2c_reset_engine() into p8_i2c_reset_port().
  There's no need to reset every port on a master in response to an
  error that occurred on a specific port.
- libstb/i2c-driver: Bump max timeout

  We have observed some TPMs clock stretching the I2C bus for significant
  amounts of time when processing commands. The same TPMs also have
  errata that can result in permanently locking up a bus in response to
  an I2C transaction they don't understand. Use an excessively long
  timeout to prevent this in the field.
- hdata: Add TPM timeout workaround

  Set the default timeout for any bus containing a TPM to one second. This
  is needed to work around a bug in the firmware of certain TPMs that will
  clock stretch the I2C port for up to a second. Additionally, when the
  TPM is clock stretching it responds to a STOP condition on the bus by
  bricking itself. Clearing this error requires a hard power cycle of the
  system since the TPM is powered by standby power.
- p8-i2c: Allow a per-port default timeout

  Add support for setting a default timeout for the I2C port to the
  device-tree. This is consumed by skiboot.
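
The two timeout changes above work together: ports that host a TPM are given a
one-second default, and the driver prefers a per-port default when one is set.
The following is only a rough model of that lookup. The ``timeout-ms`` key,
the port dictionaries and the 100 ms fallback are all assumptions made for the
sketch, not the real device-tree property name or driver default.

```python
# Hypothetical model: the property name and fallback value are assumptions.
FALLBACK_TIMEOUT_MS = 100    # assumed driver-wide default
TPM_BUS_TIMEOUT_MS = 1000    # one second, per the TPM workaround

def annotate_tpm_ports(ports):
    """Give every port that hosts a TPM a one-second default timeout."""
    for port in ports:
        if "tpm" in port["devices"]:
            port["timeout-ms"] = TPM_BUS_TIMEOUT_MS
    return ports

def port_timeout_ms(port):
    """Prefer the per-port default when present, else the fallback."""
    return port.get("timeout-ms", FALLBACK_TIMEOUT_MS)

ports = annotate_tpm_ports([
    {"name": "port-0", "devices": ["eeprom"]},
    {"name": "port-1", "devices": ["tpm", "eeprom"]},
])
for port in ports:
    print(port["name"], port_timeout_ms(port))
```

Only the port with a TPM on it picks up the long timeout, so other buses keep
failing fast.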

IPMI Watchdog
^^^^^^^^^^^^^
- ipmi-watchdog: Support handling re-initialization

  Watchdog resets can return an error code from the BMC indicating that
  the BMC watchdog was not initialized. Currently we abort skiboot due to
  a missing error handler. This patch implements handling
  re-initialization for the watchdog, automatically saving the last
  watchdog set values and re-issuing them if needed.
- ipmi-watchdog: The stop action should disable reset

  Otherwise it is possible for the reset timer to elapse and trigger the
  watchdog to wake back up. This doesn't affect the behavior of the
  system since we are providing a NONE action to the BMC. However we would
  like to avoid the action from taking place if possible.
- ipmi-watchdog: Add a flag to determine if we are still ticking

  This makes it easier for future changes to ensure that the watchdog
  stops ticking and doesn't requeue itself for execution in the
  background. This way it is safe for resets to be performed after the
  ticks are assumed to be stopped and it won't start the timer again.
- ipmi-watchdog: (prepare for) not disabling at shutdown

  The op-build Linux kernel has been configured to support the ipmi
  watchdog. This driver will always handle the watchdog by either leaving
  it enabled if configured, or by disabling it during module load if no
  configuration is provided. This increases the coverage of the watchdog
  during the boot process. The watchdog should no longer be disabled at
  any point during skiboot execution.

  We're not enabling this by default yet as people can (and do, at least in
  development) mix and match old BOOTKERNEL with new skiboot and we don't
  want to break that too obviously.
- ipmi-watchdog: Don't reset the watchdog twice

  There is no clarification for why this change was needed, but presumably
  this is due to a buggy BMC implementation where the Watchdog Set command
  was processed concurrently or after the initial Watchdog Reset. This
  inversion would cause the watchdog to stop since the DONT_STOP bit was
  not set. Since we are now using the DONT_STOP bit during initialization,
  the watchdog should not be stopped even if an inversion occurs.
- ipmi-watchdog: Make it possible to set DONT_STOP

  The IPMI standard supports setting a DONT_STOP bit during a Watchdog
  Set operation. Most of the time we don't want to stop the Watchdog when
  updating the settings so we should be using this bit. This patch makes
  it possible for callers of set_wdt to prevent the watchdog from being
  stopped. This only changes the behavior of the watchdog during the
  initial settings update when initializing skiboot. The watchdog is no
  longer disabled and then immediately re-enabled.
- ipmi-watchdog: WD_POWER_CYCLE_ACTION -> WD_RESET_ACTION

  The IPMI specification denotes that action 0x1 is Host Reset and 0x3 is
  Host Power Cycle. Use the correct name for Reset in our watchdog code.
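
To make the DONT_STOP and action-code items above concrete, here is a small
sketch of how a Set Watchdog Timer request body could be packed. The byte
layout follows the IPMI specification as best understood here (timer use
byte with "don't stop" in bit 6, action in the low bits of the second byte,
countdown in 100 ms units, least significant byte first). The function and
constant names are invented for this example and are not skiboot's own.

```python
import struct

# Names are hypothetical; values follow the IPMI Set Watchdog Timer layout.
TIMER_USE_DONT_STOP = 0x40    # bit 6 of the timer use byte
TIMER_USE_POST = 0x02         # BIOS/POST timer use
WD_NO_ACTION = 0x00
WD_RESET_ACTION = 0x01        # Host Reset
WD_POWER_CYCLE_ACTION = 0x03  # Host Power Cycle

def set_wdt_request(action, timeout_s, timer_use=TIMER_USE_POST,
                    dont_stop=False):
    """Pack the 6-byte request body for a Set Watchdog Timer command."""
    timer_use_byte = timer_use | (TIMER_USE_DONT_STOP if dont_stop else 0)
    countdown = timeout_s * 10          # 100 ms per count
    if not 0 <= countdown <= 0xFFFF:
        raise ValueError("timeout does not fit in the 16-bit countdown")
    # Pre-timeout interval and expiration-flag-clear bytes are left at 0.
    return struct.pack("<BBBBH", timer_use_byte, action & 0x07, 0, 0,
                       countdown)

req = set_wdt_request(WD_RESET_ACTION, 30, dont_stop=True)
print(req.hex())  # 420100002c01
```

Passing ``dont_stop=True`` is what lets the settings be updated without the
running timer being stopped first.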


POWER8 platforms
----------------

- astbmc: Enable mbox depending on scratch reg

  P8 boxes can opt in for mbox pnor support if they set the scratch
  register bit to indicate it is supported.

Simulator platforms
-------------------
- plat/qemu: add PNOR support

  To access the PNOR, OPAL/skiboot drives the BMC SPI controller using
  the iLPC2AHB device of the BMC SuperIO controller and accesses the
  flash contents using the LPC FW address space on which the PNOR is
  remapped.

  The QEMU PowerNV machine now integrates such models (SuperIO
  controller, iLPC2AHB device) and also a pseudo Aspeed SoC AHB memory
  space populated with the SPI controller registers (same model as for
  ARM). The AHB window giving access to the contents of the BMC SPI
  controller flash modules is mapped on the LPC FW address space.

  The change should be compatible with machines without PNOR support.
- external/mambo: Add support for readline if it exists

  Add support for tclreadline package if it is present.
  This patch loads the package and uses it when the
  simulation stops for any reason.


FSP based platforms
-------------------

- Disable fast reboot on FSP IPL side change

  If the FSP changes the next IPL side, then disable fast reboot.

  Sample output: ::

      [  620.196442259,5] FSP: Got sysparam update, param ID 0xf0000007
      [  620.196444501,5] CUPD: FW IPL side changed. Disable fast reboot
      [  620.196445389,5] CUPD: Next IPL side : perm
- fsp/console: Always establish OPAL console API backend

  Currently we only call set_opal_console() to establish the backend
  used by the OPAL console API if we find at least one FSP serial
  port in HDAT.

  On systems where there is none (IPMI only), we fail to set it. The
  console code then tries to use the dummy console, which causes
  an assertion failure during boot due to clashing device-tree
  node names.

  So always set it if an FSP is present.

AST BMC based platforms
-----------------------

- AMI BMC: use 0x3a as OEM command

  The 0x3a OEM command is for IBM commands, while 0x32 was for AMI ones.
  Sometime in the P8 timeframe, AMI BMCs were changed to listen for our
  commands on either 0x32 or 0x3a. Since 0x3a is the direction forward,
  we'll use that, as P9 machines with AMI BMCs probably also want these
  to work, and let's not bet that 0x32 will continue to be okay.
- astbmc: Set romulus BMC type to OpenBMC
- platform/astbmc: Do not delete compatible property

  From P9 onwards, OPAL builds the device tree for BMC based systems using
  HDAT, and we populate bmc/compatible with the BMC version. Hence
  do not delete this property.

Utilities
---------
- external/xscom-utils: Add python library for xscom access

  This patch adds a simple python library module for xscom access.
  It directly manipulates the '/access' file for scom read
  and write from the debugfs 'scom' directory.

  Example of how to build a getscom using this module:

  .. code-block:: python

     from adu_scoms import *
     getscom = GetSCom()
     getscom.parse_args()
     getscom.run_command()

  Sample output for above getscom.py:

  .. code-block:: console

    # ./getscom.py -l
    Chip ID  | Rev   | Chip type
    ---------|-------|-----------
    00000008 | DD2.0 | P9 (Nimbus) processor
    00000000 | DD2.0 | P9 (Nimbus) processor
- ffspart: Don't require user to create blank partitions manually

  Add '--allow-empty' which allows the filename for a given partition to
  be blank. If set, ffspart will set that part of the PNOR file 'blank' and
  set ECC bits if required.
  Without this option behaviour is unchanged and ffspart will return an
  error if it cannot find the partition file.
- pflash: Use correct prefix when installing

  pflash uses lowercase prefix when running make install in its
  directory, but uppercase PREFIX when running it in shared. Use
  lowercase everywhere.

  With this the OpenBMC bitbake recipe can drop an out of tree patch it's
  been carrying for years.


POWER9
------

- occ-sensor: Avoid using uninitialised struct cpu_thread

  When adding the sensors in occ_sensors_init, if the type is not
  OCC_SENSOR_LOC_CORE, then the loop to find 'c' will not be executed.
  Then c->pir is used for both of the add_sensor_node calls below.

  This provides a default value of 0 instead.
- NX: Add NX coprocessor init opal call

  The read offset (4:11) in the Receive FIFO control register is incremented
  by FIFO size whenever a CRB is read by NX. But the index in RxFIFO has to
  match the corresponding entry in the FIFO maintained by VAS in the kernel.
  The VAS entry is reset to 0 when opening the receive window during driver
  initialization. So when NX842 is reloaded, or on kexec boot, there is a
  possibility of a mismatch between the RxFIFO control register and the VAS
  entries in the kernel. This could cause CRB failures or timeouts from NX.

  This patch adds the nx_coproc_init opal call for the kernel to initialize
  readOffset (4:11) and Queued (15:23) in the RxFIFO control register.
- SLW: Remove stop1_lite and stop2_lite

  stop1_lite has been removed since it adds no additional benefit
  over stop0_lite. stop2_lite has been removed since it currently adds
  only minimal benefit over stop2, and that benefit is eclipsed by the time
  required to ungate the clocks.

  Moreover, Lite states don't give up the SMT resources and can potentially
  have a performance impact on sibling threads.

  Since current OSs (Linux) aren't smart enough to make good decisions
  with these stop states, we're (temporarily) removing them from what
  we expose to the OS, the idea being to bring them back in a new
  DT representation so that only an OS that knows what to do will
  do things with them.
- cpu: Use STOP1 on POWER9 for idle/sleep inside OPAL

  The current code requests STOP3, which means it gets STOP2 in practice.

  STOP2 has proven to occasionally be unreliable depending on FW
  version and chip revision, it also requires a functional CME,
  so instead, let's use STOP1. The difference is rather minimal
  for something that is only used a few seconds during boot.
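
The bit ranges quoted in the NX item above use IBM bit numbering, where bit 0
is the most significant bit of the 64-bit register. The sketch below shows
which bits of a register value those ranges cover. It assumes that
"initialize" means clearing both fields to zero, and the helper and constant
names are invented for this example.

```python
# IBM (MSB-0) bit numbering helpers for a 64-bit register.
# Names are hypothetical; "initialize" is assumed to mean "clear to zero".

def ppc_bit(bit):
    """Mask for a single bit, with bit 0 as the most significant bit."""
    return 1 << (63 - bit)

def ppc_bitmask(start, end):
    """Mask covering bits start..end inclusive (start <= end)."""
    return (ppc_bit(start) - ppc_bit(end)) | ppc_bit(start)

RX_FIFO_READ_OFFSET = ppc_bitmask(4, 11)   # readOffset (4:11)
RX_FIFO_QUEUED = ppc_bitmask(15, 23)       # Queued (15:23)

def init_rx_fifo_ctrl(value):
    """Clear readOffset and Queued, leaving every other bit untouched."""
    return value & ~(RX_FIFO_READ_OFFSET | RX_FIFO_QUEUED) & (2**64 - 1)

print(hex(RX_FIFO_READ_OFFSET))                    # 0xff0000000000000
print(hex(RX_FIFO_QUEUED))                         # 0x1ff0000000000
print(hex(init_rx_fifo_ctrl(0xFFFFFFFFFFFFFFFF)))  # 0xf00e00ffffffffff
```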

NPU2 (NVLink2 and OpenCAPI)
^^^^^^^^^^^^^^^^^^^^^^^^^^^

- npu2: Reset NVLinks on hot reset

  This effectively fences GPU RAM on GPU reset so the host system
  does not have to crash every time we stop a KVM guest with a GPU
  passed through.
- npu2-opencapi: reduce number of retries to train the link

  We've been reliably training the opencapi link on the first attempt
  for quite a while. Furthermore, if it doesn't train on the first
  attempt, retries haven't been that useful. So let's reduce the number
  of attempts we do to train the link.

  2 retries = 3 attempts to train.

  Each (failed) training sequence costs about 3 seconds.
- opal/hmi: Display correct chip id while printing NPU FIRs.

  HMIs for NPU xstops are broadcast to all chips. All cores on all the
  chips receive the HMI. The HMI handler correctly identifies and extracts
  the NPU FIR details from the affected chip, but while printing FIR data it
  prints chip id and location code details of this_cpu()->chip_id which
  may not be correct. This patch fixes this issue.
- npu2-opencapi: Fix link state to report link down

  The PHB callback 'get_link_state' is always reporting the link width,
  irrespective of the link status and even when the link is down. It is
  causing too much work (and failures) when the PHB is probed during pci
  init.
  The fix is to look at the link status first and report the link as
  down when appropriate.
- npu2-opencapi: Cleanup traces printed during link training

  Now that links may train in parallel, traces shown during training can
  be all mixed up. So add a prefix to all the traces to clearly identify
  the chip and link the trace refers to: ::

    OCAPI[<chip id>:<link id>]: this is a very useful message

  The lower-level hardware procedures (npu2-hw-procedures.c) also print
  traces which would need work. But that code is being reworked to be
  better integrated with opencapi and nvidia, so leave it alone for now.
- npu2-opencapi: Train links on fundamental reset

  Reorder our link training steps so that they are executed on
  fundamental reset instead of during the initial setup. Skiboot always
  calls a fundamental reset on all the PHBs during pci init.

  It is done through a state machine, similarly to what is done for
  'real' PHBs.

  This is the first step for a longer term goal to be able to trigger an
  adapter reset from linux. We'll need the reset callbacks of the PHB to
  be defined. We have to handle the various delays differently, since a
  linux thread shouldn't stay stuck waiting in opal for too long.
- npu2-opencapi: Rework adapter reset

  Rework the code to reset the opencapi adapter a bit:

  - make clearer which i2c pin is resetting which device
  - break the reset operation into smaller chunks. This is really to
    prepare for a future patch.

  No functional changes.
- npu2-opencapi: Use presence detection

  Presence detection is not part of the opencapi specification. So each
  platform may choose to implement it the way it wants.

  All current platforms implement it through an i2c device where we can
  query a pin to know if a device is connected or not. ZZ and Zaius have
  a similar design and even use the same i2c information and pin
  numbers.
  However, presence detection on older ZZ planar (older than v4) doesn't
  work, so we don't activate it for now, until our lab systems are
  upgraded and it's better tested.

  Presence detection on witherspoon is still being worked on. It's
  shaping up to be quite different, so we may have to revisit the topic
  in a later patch.
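
As a quick check of the link training numbers quoted in the list above
(2 retries, about 3 seconds per failed sequence), the retry policy can be
modelled as a simple bounded loop. This is only an illustration: the function
name and the ``try_once`` callback are hypothetical and do not mirror the
skiboot state machine.

```python
TRAINING_RETRIES = 2         # from the release note: 2 retries
FAILED_ATTEMPT_COST_S = 3    # approximate cost of one failed sequence

def train_link(try_once, retries=TRAINING_RETRIES):
    """Return the 1-based attempt that trained the link, or None."""
    for attempt in range(1, retries + 2):   # first attempt plus retries
        if try_once():
            return attempt
    return None

# Worst case: every attempt fails before we give up.
worst_case_s = (TRAINING_RETRIES + 1) * FAILED_ATTEMPT_COST_S
print(worst_case_s)  # 9
```

With these figures, a link that never trains costs roughly 9 seconds of boot
time before skiboot gives up on it.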