1.. _skiboot-5.4.0-rc1:
2
3skiboot-5.4.0-rc1
4=================
5
6skiboot-5.4.0-rc1 was released on Monday October 17th 2016. It is the first
7release candidate of skiboot 5.4, which will become the new stable release
8of skiboot following the 5.3 release, first released August 2nd 2016.
9
10skiboot-5.4.0-rc1 contains all bug fixes as of :ref:`skiboot-5.3.7`
11and :ref:`skiboot-5.1.18` (the currently maintained stable releases).
12
For details on how the skiboot stable releases work, see :ref:`stable-rules`.
14
15The current plan is to release a new release candidate every week until we
16feel good about it. The aim is for skiboot-5.4.x to be in op-build v1.13, which
17is due by November 23rd 2016.
18
19Over skiboot-5.3, we have the following changes:
20
21New Features
22------------
23- Initial Trusted Boot support (see :ref:`stb-overview`).
24  There are several limitations with this initial release:
25
26    - CAPP partition is not measured correctly
27    - Only Nuvoton TPM 2.0 is supported
28    - Requires hardware rework on late revision Habanero or Firestone boards
29      in order to install TPM.
30
31  - Add i2c Nuvoton TPM 2.0 Driver
32  - romcode driver for POWER8 secure ROM
33  - See Device tree docs for tpm and ibm,secureboot nodes
34  - See main secure and trusted boot documentation.
35
36
37- Fast reboot for P8
38
39  This makes reboot take an *awful* lot less time, somewhere between four
40  and ten times faster than a full IPL. It is currently experimental and not
41  enabled by default.
  You can enable the experimental support via an nvram option: ::
43
44   # nvram -p ibm,skiboot --update-config experimental-fast-reset=feeling-lucky
45
46  **WARNING**: This has *known* bugs. For example, if you have used a device
47  in CAPI mode, we will currently *NOT* reset it back to plain PCI. There
48  are also some known issues in most simulators.
49
50- Support ``ibm,skiboot`` NVRAM partition with skiboot configuration options.
51
52  - These should generally only be used if you either completely know what
53    you are doing or need to work around a skiboot bug. They are **not**
54    intended for end users.
55  - Add support for supplying the kernel boot arguments from the ``bootargs``
56    configuration string in the ``ibm,skiboot`` NVRAM partition.
57  - Enabling the experimental fast reset feature is done via this method.
58
59- Add support for nap mode on P8 while in skiboot
60
61  - While nap has been exposed to the Operating System since day 1, we have
62    not utilized low power states when in skiboot itself, leading to higher
63    power consumption during boot.
64    We only enable the functionality after the 0x100 vector has been
65    patched, and we disable it before transferring control to Linux.
66
67- libflash: add 128MB MX66L1G45G part
68
69- Pointer validation of OPAL API call arguments.
70
  - If the kernel called an OPAL API with a vmalloc'd address
    or any other address range in real mode, we would hit
    a problem with aliasing. Since the top 4 bits are ignored
    in real mode, pointers from 0xc.. and 0xd.. (and other ranges)
    could collide and lead to hard-to-solve bugs. This patch
    adds the infrastructure for pointer validation and a simple
    test case for testing the API.
78  - The checks validate pointers sent in using ``opal_addr_valid()``
79
80Documentation
81-------------
82
There have been a number of documentation fixes this release. Most prominent
is the switch to Sphinx (from the Python project) and reStructuredText (RST)
as the documentation format. RST and Sphinx enable production of pretty
documentation in HTML and PDF formats while remaining readable in raw
form to those with no knowledge of RST.
88
You can build an HTML site by doing the following: ::
90
91 cd doc/
92 make html
93
94As always, documentation patches are very, *very* welcome as we attempt to
95document the OPAL API, the device tree bindings and important parts of
96OPAL internals.
97
98We would like the Device Tree documentation to follow the style that can be
99included in the Device Tree Specification.
100
101
102General
103-------
- Make console-log time more readable: seconds rather than timebase.
  Log format is now ``[SECONDS.(tb%512000000),LEVEL]``
106
107- Flash (PNOR) code improvements
108
  - flash: Make size 64 bit safe.
    This makes the size of flash 64 bit safe so that we can have flash
    devices greater than 4GB. This is especially useful for mambo disks
    passed through to Linux.
  - core/flash.c: load actual partition size.
    We were downloading 0x20000 bytes from PNOR for CAPP, but currently the
    CAPP lid is only 40K.
  - flash: Rework error paths and messages for multiple flash controllers.
    Now that we have mambo bogusdisk flash, we can have many flash chips.
    This was resulting in some confusing output messages.
119
120- core/init: Fix "failure of getting node in the free list" warning on boot.
121- slw: improve error message for SLW timer stuck
122
123- Centaur / XSCOM error handling
124
125  - print message on disabling xscoms to centaur due to many errors
126  - Mark centaur offline after 10 consecutive access errors
127
128- XSCOM improvements
129
130  - xscom: Map all HMER status codes to OPAL errors
131  - xscom: Initialize the data to a known value in ``xscom_read``
132    In case of error, don't leave the data random. It helps debugging when
133    the user fails to check the error code. This happens due to a bug in the
134    PRD wrapper app.
135  - chip: Add a quirk for when core direct control XSCOMs are missing
136
137- p8-i2c: Don't crash if a centaur errored out
138
139- cpu: Make endian switch message more informative
140- cpu: Display number of started CPUs during boot
141- core/init: ensure that HRMOR is zero at boot
142- asm: Fix backtrace for unexpected exception
143
- cpu: Remove pollers calling heuristics from ``cpu_wait_job``.
  This will be handled by ``time_wait_ms()``. Also remove a useless
  ``smt_medium()``.
  Note that this introduces a difference in behaviour: ``time_wait``
  will only call the pollers on the boot CPU while ``cpu_wait_job()``
  could call them on any. However, I can't think of a case where
  this is a problem.
151
- cpu: Remove global job queue.
  Instead, target a specific CPU for a global job at queuing time.
  This will allow us to wake up the target using an interrupt when
  implementing nap mode.
  The algorithm used is to look for idle primary threads first, then
  idle secondaries, and finally the least loaded thread. If nothing can
  be found, we fall back to a synchronous call.
159- lpc: Log LPC SYNC errors as unrecoverable ones for manufacturing
160- lpc: Optimize SerIRQ dispatch based on which PSI IRQ fired
- interrupts: Add new source ``->attributes()`` callback
    This allows a given source to provide per-interrupt attributes
    such as whether it targets OPAL or Linux and its estimated
    frequency.

    The former allows us to get rid of the double set of ops used to
    decide which interrupts go where on some modules like the PHBs
    and the latter will eventually be used to implement smart
    caching of the source lookups.
170- opal/hmi: Fix a TOD HMI failure during a race condition.
171- platform: Add BT to Generic platform
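
As a rough illustration of the new console-log format described above, the
following sketch splits a raw timebase value into the two fields. It assumes
the timebase ticks at 512 MHz, which is what the ``tb%512000000`` term implies.
``format_log_prefix()``, its field widths and the sample value are
hypothetical and not taken from skiboot:

.. code-block:: python

   TB_HZ = 512_000_000  # timebase ticks per second, per the documented format


   def format_log_prefix(tb, level):
       """Render [SECONDS.(tb % 512000000),LEVEL] for a raw timebase value."""
       seconds, remainder = divmod(tb, TB_HZ)
       # Field widths are an assumption. Note that the remainder is a tick
       # count, not a decimal fraction of a second.
       return "[%5d.%09d,%d]" % (seconds, remainder, level)


   prefix = format_log_prefix(724_366_090, 3)
   print(prefix)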
172
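The target-selection order for global jobs described above (idle primary
threads, then idle secondaries, then the least loaded thread, otherwise a
synchronous call) can be sketched as follows. The ``Thread`` record, its
fields and ``pick_job_target()`` are hypothetical stand-ins for illustration,
and "idle" is assumed to mean "no queued jobs"; the real scheduler is written
in C and may differ in detail:

.. code-block:: python

   from dataclasses import dataclass


   @dataclass
   class Thread:
       pir: int                # hypothetical CPU identifier
       is_primary: bool        # first thread of its core
       queued_jobs: int        # jobs already waiting on this thread
       available: bool = True  # able to accept jobs at all


   def pick_job_target(threads):
       """Return the thread that should run a global job, or None.

       None means no candidate was found and the caller should fall back
       to running the job synchronously.
       """
       candidates = [t for t in threads if t.available]
       idle = [t for t in candidates if t.queued_jobs == 0]
       for t in idle:
           if t.is_primary:      # 1. idle primary threads first
               return t
       if idle:                  # 2. then idle secondaries
           return idle[0]
       if candidates:            # 3. finally the least loaded thread
           return min(candidates, key=lambda t: t.queued_jobs)
       return None               # 4. nothing found: synchronous fallback


   threads = [
       Thread(pir=0, is_primary=True, queued_jobs=2),
       Thread(pir=1, is_primary=False, queued_jobs=0),
       Thread(pir=8, is_primary=True, queued_jobs=1),
   ]
   target = pick_job_target(threads)
   print(target.pir)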
173
174NVRAM
175-----
176- Support ``ibm,skiboot`` partition for skiboot specific configuration options
177- flash: Size NVRAM based on ECC for OpenPOWER platforms
    If NVRAM has ECC (as per the ffs header in the PNOR), then the actual
    size of the partition is less than that reported by the ffs header.
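
To make the size difference concrete, here is a small sketch. It assumes a
layout of one ECC byte stored after every 8 data bytes. That ratio, the
``usable_size()`` helper and the example partition size are assumptions for
illustration and are not stated in these notes:

.. code-block:: python

   DATA_BYTES_PER_ECC_BYTE = 8  # assumed layout: 8 data bytes + 1 ECC byte


   def usable_size(partition_size, has_ecc):
       """Bytes available for data in a partition of the given raw size."""
       if not has_ecc:
           return partition_size
       block = DATA_BYTES_PER_ECC_BYTE + 1
       return (partition_size // block) * DATA_BYTES_PER_ECC_BYTE


   raw = 0x90000  # hypothetical size reported by the ffs header
   print(hex(usable_size(raw, has_ecc=True)))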
181
182NVLink/NPU
183----------
184
185- Fix reserved PE#
186- NPU bdfn allocation bugfix
187- Fix bad PE number check
    NPUs have 4 PEs which are zero indexed, so {0, 1, 2, 3}.  A bad PE number
    check in npu_err_inject checks if the PE number is greater than 4 as a
    fail case, so it would wrongly perform operations on a non-existent PE 4.
191- Use PCI virtual device
192- assert the NPU irq min is aligned.
193- program NPU BUID reg properly
194- npu: reword "error" to indicate it's actually a warning
195   Incorrect FWTS annotation.
196   Without this patch, you get spurious FirmWare Test Suite (FWTS) warnings
197   about NVLink not working on machines that aren't fully populated with
198   GPUs.
199- external: NPU hardware procedure script
200   Performing NPU hardware procedures requires some config space magic.
201   Put all that magic into a script, so you can just specify the target
202   device and the procedure number.
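
The off-by-one in the PE number check can be shown in a few lines.
``NPU_NUM_PES`` and both helper functions are hypothetical names; the sketch
only mirrors the comparison described above, not the actual
``npu_err_inject`` code:

.. code-block:: python

   NPU_NUM_PES = 4  # valid PE numbers are 0, 1, 2 and 3


   def pe_is_invalid_buggy(pe_number):
       # Original comparison: PE 4 slips through as if it were valid.
       return pe_number > NPU_NUM_PES


   def pe_is_invalid_fixed(pe_number):
       # Corrected comparison: anything at or beyond the PE count is rejected.
       return pe_number >= NPU_NUM_PES


   for pe in (3, 4, 5):
       print(pe, pe_is_invalid_buggy(pe), pe_is_invalid_fixed(pe))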
203
204PCI
205---
206
207- Generic fixes
208
209  - Claim surprise hotplug capability
210  - Reserve PCI buses for RC's slot
211  - Update PCI topology after power change
212  - Return slot cached power state
213  - Cache power state on slot without power control
214  - Avoid hot resets at boot time
215  - Fix initial PCIe slot power state
  - Print CRS retry times.
    It's useful to know the CRS retry times before the PCI device is
    detected successfully. In the PCI hot-add case, it usually indicates
    the time consumed for the adapter's firmware to be partially ready
    (responsive PCI config space).
  - core/pci: Fix the power-off timeout in ``pci_slot_power_off()``.
    The timeout should be 1000ms instead of 1000 ticks while powering
    off a PCI slot in ``pci_slot_power_off()``. Otherwise, it's likely to
    hit a timeout powering off the PCI slot, as the skiboot log below
    reveals: ::
225
226      [5399576870,5] PHB#0005:02:11.0 Timeout powering off slot
227
228- PHB3
229
230  - Override root slot's ``prepare_link_change()`` with PHB's
231  - Disable surprise link down event on PCI slots
232  - Disable ECRC on Broadcom adapter behind PMC switch
233
234- astbmc platforms
235
  - Support dynamic PCI slots. A PCIe switch may be inserted into a PHB
    direct slot, and the downstream ports of that switch support PCI hotplug.
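
The gap between 1000 ticks and 1000ms in the power-off timeout fix is large.
The sketch below assumes a 512 MHz timebase (the figure used by the
console-log format in these notes). ``msecs_to_tb()`` as written here is a
hypothetical helper for this example:

.. code-block:: python

   TB_HZ = 512_000_000  # assumed timebase frequency in ticks per second


   def msecs_to_tb(msecs):
       """Convert milliseconds to timebase ticks."""
       return msecs * TB_HZ // 1000


   buggy_timeout_ticks = 1000                # 1000 ticks, as before the fix
   fixed_timeout_ticks = msecs_to_tb(1000)   # 1000ms expressed in ticks

   # 1000 ticks is under two microseconds of real time at this frequency.
   buggy_timeout_us = buggy_timeout_ticks * 1_000_000 / TB_HZ
   print(fixed_timeout_ticks, buggy_timeout_us)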
238
239
240CAPI
241----
242
243- hw/phb3: Update capi initialization sequence
244    The capi initialization sequence was revised in a circumvention
245    document when a 'link down' error was converted from fatal to Endpoint
246    Recoverable. Other, non-capi, register setup was corrected even before
247    the initial open-source release of skiboot, but a few capi-related
248    registers were not updated then, so this patch fixes it.
249
250IPMI
251----
252
253- core/ipmi: Set interrupt-parent property
254    This allows ipmi-opal to properly use the OPAL irqchip rather than
255    falling back to the event interface in Linux.
256
257Mambo Simulator
258---------------
259
260- Helpers for POWER9 Mambo.
261- mambo: Advertise available RADIX page sizes
262- mambo: Add section for kernel command line boot args
263  Users can set kernel command line boot arguments for Mambo in a tcl
264  script.
265- mambo: add exception and qtrace helpers
266- external/mambo: Update skiboot.tcl to add page-sizes nodes to device tree
267
268Simics Simulator
269----------------
270
271- chiptod: Enable ChipTOD in SIMICS
272
273Utilities
274---------
275
276- pflash
277
278  - fix harmless buffer overflow: ``fl_total_size`` was ``uint32_t`` not ``uint64_t``.
279  - Don't try to write protect when writing to flash file
280  - Misc small improvements to code and code style
281  - makefile bug fixes
282
283
284- external/boot_tests
285
286  - remove lid from the BMC after flashing
287  - add the nobooting option -N
288  - add arbitrary lid option -F
289
290- ``getscom`` / ``getsram`` / ``putscom``: Parse chip-id as hex
291    We print the chip-id in hex (without a leading 0x), but we fail to
292    parse that same value correctly in ``getscom`` / ``getsram`` / ``putscom`` ::
293
294     # getscom -l
295     ...
296     80000000 | DD2.0 | Centaur memory buffer
297     # getscom -c 80000000 201140a
298     Error -19 reading XSCOM
299
300    Fix this by assuming base 16 when parsing chip-id.
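
The parsing problem is easy to reproduce outside the tools. The sketch below
is an illustration in Python rather than the C used by ``getscom``.
``parse_chip_id()`` is a hypothetical helper, and it assumes only that the
chip-id string is printed in hex without a ``0x`` prefix, as shown above:

.. code-block:: python

   def parse_chip_id(text):
       """Parse a chip-id the way it is printed: hex, with or without 0x."""
       return int(text, 16)


   printed = "80000000"  # chip-id as listed by "getscom -l"

   # A base-10 parse reads the same digits as eighty million.
   as_decimal = int(printed, 10)
   # A base-16 parse recovers the value that was printed.
   as_hex = parse_chip_id(printed)

   print(hex(as_decimal), hex(as_hex))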
301
302PRD
303---
304
305- opal-prd: Fix error code from ``scom_read`` and ``scom_write``
306- opal-prd: Add get_interface_capabilities to host interfaces
307- opal-prd: fix for 64-bit pnor sizes
308- occ/prd/opal-prd: Queue OCC_RESET event message to host in OpenPOWER
    During an OCC reset cycle the system is forced to the Psafe pstate.
    When the OCC becomes active, the system has to be restored to its
    last pstate as requested by the host. So the host needs to be notified
    of the OCC_RESET event, or else the system will continue to remain in
    the Psafe state until the host requests a new pstate after the OCC
    reset cycle.
315
316IBM FSP Based Platforms
317-----------------------
318
319- fsp/console: Allocate irq for each hvc console
320    Allocate an irq number for each hvc console and set its interrupt-parent
321    property so that Linux can use the opal irqchip instead of the
322    OPAL_EVENT_CONSOLE_INPUT interface.
323- platforms/firenze: Fix clock frequency dt property: ::
324
325    [ 1.212366090,3] DT: Unexpected property length /xscom@3fc0000000000/i2cm@a0020/clock-frequency
326
327- HDAT: Fix typo in nest-frequency property
328    nest-frquency -> nest-frequency
329- platforms/ibm-fsp: Use power_ctl bit when determining slot reset method
330    The power_ctl bit is used to represent if power management is available.
331    If power_ctl is set to true, then the I2C based external power management
332    functionality will be populated on the PCI slot. Otherwise we will try to
333    use the inband PERST as the fundamental reset, as before.
334- FSP/ELOG: Fix elog timeout issue
    Previously, we set the timeout value as soon as we added an elog to the
    queue. With multiple elogs to write, this did not account for queue wait
    time. Instead, set the timeout value when we are actually sending the
    elog to the FSP.
338- FSP/ELOG: elog_enable flag should be false by default
    This is a corner case, related to a recent upstream change and only
    observed at the petitboot prompt, where we see only one error log
    instead of all error logs in ``/sys/firmware/opal/elog``.
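
The effect of moving the elog timeout can be sketched with a toy queue.
Everything here is hypothetical (the ``TIMEOUT`` value, the per-elog send time
and the function itself); it only models the difference between stamping a
deadline at enqueue time and stamping it at send time:

.. code-block:: python

   TIMEOUT = 5      # hypothetical per-elog timeout, in arbitrary time units
   SEND_TIME = 3    # hypothetical time the FSP takes to accept one elog


   def timed_out_count(num_elogs, deadline_at_enqueue):
       """Count elogs that miss their deadline when sent one after another."""
       now = 0
       timed_out = 0
       for _ in range(num_elogs):
           # All elogs are queued at time 0; each one is sent only once the
           # previous one has finished.
           start = 0 if deadline_at_enqueue else now
           deadline = start + TIMEOUT
           now += SEND_TIME
           if now > deadline:
               timed_out += 1
       return timed_out


   print(timed_out_count(4, deadline_at_enqueue=True))
   print(timed_out_count(4, deadline_at_enqueue=False))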
343
344
345
346POWER9
347------
348
349- mambo: Make POWER9 look like DD2
350- flash: Move flash node under ``ibm,opal/flash/``
351    This changes the boot ABI, so it's only active for P9 and later systems,
352    even though it's unrelated to hardware changes. There is an associated
353    Linux change to properly search for this node as well.
354- core/cpu.c: Add OPAL call to setup Nest MMU
355- psi: On p9, create an interrupt-map for routing PSI interrupts
356- lpc: Add P9 LPC interrupts support
357- chiptod: Basic P9 support
358- psi: Add P9 support
359
360Testing and Debugging
361---------------------
362
363- test/qemu: bump qemu version used in CI, adds IPMI support
364- platform/qemu: add BT and IPMI support
365  Enables testing BT and IPMI functionality in the Qemu simulator
366- init: In debug builds, enable debug output to console
367- mem_region: Be a bit smarter about poisoning
368    Don't poison chunks that are already free and poison regions on
369    first allocation. This speeds things up dramatically.
- libc: Use 8-byte stores for non-0 memset too
    Memory poisoning hammers this, so let's be a bit smart about it and
    avoid falling back to byte stores when the data is not 0.
373- fwts: add annotation for manufacturing mode
374- check: Fix bugs in mem region tests
375- Don't set -fstack-protector-all unconditionally
376    We set it already in DEBUG builds and we use -fstack-protector-strong
377    in release builds which provides most of the benefits and is more
378    efficient.
379- Build host programs (and checks) with debug enabled
380    This enables memory poisoning in allocations and list checking
381    among other things.
382- Add global DEBUG make flag
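
The idea behind using 8-byte stores for non-zero fills can be sketched as
follows: replicate the fill byte across a 64-bit word, write whole words, and
finish any tail with single bytes. This is a Python illustration with
hypothetical names, not the libc implementation, and it ignores alignment
handling:

.. code-block:: python

   def memset_words(buf, value):
       """Fill a bytearray with `value`, 8 bytes at a time where possible."""
       value &= 0xFF
       # Replicate the byte into all 8 lanes of a 64-bit word.
       word = (value * 0x0101010101010101).to_bytes(8, "little")
       whole = len(buf) - (len(buf) % 8)
       for offset in range(0, whole, 8):
           buf[offset:offset + 8] = word   # one 8-byte store
       for offset in range(whole, len(buf)):
           buf[offset] = value             # byte stores for the tail
       return buf


   poisoned = memset_words(bytearray(19), 0x99)
   print(poisoned.hex())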
383
384
385Contributors
386------------
387
388Extending the analysis done for the last few releases, we can see our trends
389in code review across versions:
390
======== ====== ======= ======= ======  ========
Release  csets  Ack     Reviews Tested  Reported
======== ====== ======= ======= ======  ========
5.0      329    15      20      1       0
5.1      372    13      38      1       4
5.2-rc1  334    20      34      6       11
5.3-rc1  302    36      53      4       5
5.4-rc1  278    8       19      0       4
======== ====== ======= ======= ======  ========
400
This release has fewer changesets than previous 5.x first release candidates,
but that is not indicative of the size or complexity of these changes.
403
404
Processed 278 csets from 31 developers.
A total of 17052 lines added, 4745 removed (delta 12307).
407
408Developers with the most changesets
409
410=========================== == =======
411=========================== == =======
412Stewart Smith               71 (25.5%)
413Benjamin Herrenschmidt      50 (18.0%)
414Claudio Carvalho            38 (13.7%)
415Gavin Shan                  20 (7.2%)
416Oliver O'Halloran           18 (6.5%)
417Mukesh Ojha                  9 (3.2%)
418Cyril Bur                    7 (2.5%)
419Russell Currey               7 (2.5%)
420Vasant Hegde                 7 (2.5%)
421Pridhiviraj Paidipeddi       6 (2.2%)
422Michael Neuling              6 (2.2%)
423Alistair Popple              4 (1.4%)
424Sam Mendoza-Jonas            3 (1.1%)
425Vipin K Parashar             3 (1.1%)
426Balbir Singh                 3 (1.1%)
427Mahesh Salgaonkar            3 (1.1%)
428Frederic Barrat              3 (1.1%)
429Chris Smart                  2 (0.7%)
430Jack Miller                  2 (0.7%)
431Patrick Williams             2 (0.7%)
432Jeremy Kerr                  2 (0.7%)
433Suraj Jitindar Singh         2 (0.7%)
434Milton Miller                2 (0.7%)
435Shilpasri G Bhat             1 (0.4%)
436Frederic Bonnard             1 (0.4%)
437Joel Stanley                 1 (0.4%)
438Breno Leitao                 1 (0.4%)
439Anton Blanchard              1 (0.4%)
440Nicholas Piggin              1 (0.4%)
441Nageswara R Sastry           1 (0.4%)
442Cédric Le Goater             1 (0.4%)
443=========================== == =======
444
445Developers with the most changed lines
446
447========================= ==== =======
448========================= ==== =======
449Claudio Carvalho          6817 (38.2%)
450Stewart Smith             4677 (26.2%)
451Benjamin Herrenschmidt    2586 (14.5%)
452Gavin Shan                1005 (5.6%)
453Cyril Bur                  509 (2.9%)
454Mukesh Ojha                361 (2.0%)
455Oliver O'Halloran          343 (1.9%)
456Russell Currey             343 (1.9%)
457Balbir Singh               227 (1.3%)
458Pridhiviraj Paidipeddi     194 (1.1%)
459Michael Neuling            121 (0.7%)
460Cédric Le Goater           115 (0.6%)
461Vipin K Parashar            68 (0.4%)
462Alistair Popple             66 (0.4%)
463Vasant Hegde                65 (0.4%)
464Shilpasri G Bhat            45 (0.3%)
465Suraj Jitindar Singh        41 (0.2%)
466Nicholas Piggin             34 (0.2%)
467Sam Mendoza-Jonas           33 (0.2%)
468Jack Miller                 32 (0.2%)
469Nageswara R Sastry          32 (0.2%)
470Jeremy Kerr                 23 (0.1%)
471Mahesh Salgaonkar           21 (0.1%)
472Chris Smart                 20 (0.1%)
473Milton Miller               19 (0.1%)
474Patrick Williams            11 (0.1%)
475Frederic Barrat              6 (0.0%)
476Anton Blanchard              3 (0.0%)
477Frederic Bonnard             2 (0.0%)
478Joel Stanley                 2 (0.0%)
479Breno Leitao                 2 (0.0%)
480========================= ==== =======
481
482Developers with the most lines removed
483
484========================= ==== =======
485========================= ==== =======
486Cyril Bur                  299 (6.3%)
487========================= ==== =======
488
489Developers with the most signoffs (total 226)
490
491========================= ==== =======
492========================= ==== =======
493Stewart Smith              219 (96.9%)
494Alistair Popple              4 (1.8%)
495Cyril Bur                    1 (0.4%)
496Jeremy Kerr                  1 (0.4%)
497Benjamin Herrenschmidt       1 (0.4%)
498========================= ==== =======
499
500Developers with the most reviews (total 19)
501
502========================= ==== =======
503========================= ==== =======
504Mukesh Ojha                  5 (26.3%)
505Andrew Donnellan             4 (21.1%)
506Vasant Hegde                 3 (15.8%)
507Russell Currey               3 (15.8%)
508Balbir Singh                 2 (10.5%)
509Cyril Bur                    1 (5.3%)
510Vaidyanathan Srinivasan      1 (5.3%)
511========================= ==== =======
512
513Developers with the most test credits (total 0)
514
515Developers who gave the most tested-by credits (total 0)
516
517Developers with the most report credits (total 4)
518
519========================= ==== =======
520========================= ==== =======
521Benjamin Herrenschmidt       1 (25.0%)
522Li Meng                      1 (25.0%)
523Pridhiviraj Paidipeddi       1 (25.0%)
524Gavin Shan                   1 (25.0%)
525========================= ==== =======
526
527Developers who gave the most report credits (total 4)
528
529========================= ==== =======
530========================= ==== =======
531Gavin Shan                   1 (25.0%)
532Vasant Hegde                 1 (25.0%)
533Russell Currey               1 (25.0%)
534Stewart Smith                1 (25.0%)
535========================= ==== =======
536