.. _skiboot-5.4.0-rc1:

skiboot-5.4.0-rc1
=================

skiboot-5.4.0-rc1 was released on Monday October 17th 2016. It is the first
release candidate of skiboot 5.4, which will become the new stable release
of skiboot following the 5.3 release, first released August 2nd 2016.

skiboot-5.4.0-rc1 contains all bug fixes as of :ref:`skiboot-5.3.7`
and :ref:`skiboot-5.1.18` (the currently maintained stable releases).

For details on how the skiboot stable releases work, see :ref:`stable-rules`.

The current plan is to release a new release candidate every week until we
feel good about it. The aim is for skiboot-5.4.x to be in op-build v1.13, which
is due by November 23rd 2016.

Over skiboot-5.3, we have the following changes:

New Features
------------
- Initial Trusted Boot support (see :ref:`stb-overview`).
  There are several limitations with this initial release:

  - CAPP partition is not measured correctly
  - Only Nuvoton TPM 2.0 is supported
  - Requires hardware rework on late revision Habanero or Firestone boards
    in order to install TPM.

  - Add i2c Nuvoton TPM 2.0 Driver
  - romcode driver for POWER8 secure ROM
  - See Device tree docs for tpm and ibm,secureboot nodes
  - See main secure and trusted boot documentation.


- Fast reboot for P8

  This makes reboot take an *awful* lot less time, somewhere between four
  and ten times faster than a full IPL. It is currently experimental and not
  enabled by default.
  You can enable the experimental support via an nvram option: ::

    # nvram -p ibm,skiboot --update-config experimental-fast-reset=feeling-lucky

  **WARNING**: This has *known* bugs. For example, if you have used a device
  in CAPI mode, we will currently *NOT* reset it back to plain PCI. There
  are also some known issues in most simulators.

- Support ``ibm,skiboot`` NVRAM partition with skiboot configuration options.

  - These should generally only be used if you either completely know what
    you are doing or need to work around a skiboot bug. They are **not**
    intended for end users.
  - Add support for supplying the kernel boot arguments from the ``bootargs``
    configuration string in the ``ibm,skiboot`` NVRAM partition.
  - Enabling the experimental fast reset feature is done via this method.

- Add support for nap mode on P8 while in skiboot

  - While nap has been exposed to the Operating System since day 1, we have
    not utilized low power states when in skiboot itself, leading to higher
    power consumption during boot.
    We only enable the functionality after the 0x100 vector has been
    patched, and we disable it before transferring control to Linux.

- libflash: add 128MB MX66L1G45G part

- Pointer validation of OPAL API call arguments.

  - If the kernel called an OPAL API with a vmalloc'd address
    or any other address range in real mode, we would hit
    a problem with aliasing. Since the top 4 bits are ignored
    in real mode, pointers from 0xc.. and 0xd.. (and other ranges)
    could collide and lead to hard to solve bugs. This patch
    adds the infrastructure for pointer validation and a simple
    test case for testing the API.
  - The checks validate pointers sent in using ``opal_addr_valid()``

Documentation
-------------

There have been a number of documentation fixes this release. Most prominent
is the switch to Sphinx (from the Python project) and ReStructured Text (RST)
as the documentation format. RST and Sphinx enable both production of pretty
documentation in HTML and PDF formats while remaining readable in their raw
form to those with no knowledge of RST.

You can build an HTML site by doing the following: ::

  cd doc/
  make html

As always, documentation patches are very, *very* welcome as we attempt to
document the OPAL API, the device tree bindings and important parts of
OPAL internals.

We would like the Device Tree documentation to follow the style that can be
included in the Device Tree Specification.


General
-------
- Make console-log time more readable: seconds rather than timebase
  Log format is now ``[SECONDS.(tb%512000000),LEVEL]``

- Flash (PNOR) code improvements

  - flash: Make size 64 bit safe
    This makes the size of flash 64 bit safe so that we can have flash
    devices greater than 4GB. This is especially useful for mambo disks
    passed through to Linux.
  - core/flash.c: load actual partition size
    We are downloading 0x20000 bytes from PNOR for CAPP, but currently the
    CAPP lid is only 40K.
  - flash: Rework error paths and messages for multiple flash controllers
    Now that we have mambo bogusdisk flash, we can have many flash chips.
    This is resulting in some confusing output messages.

- core/init: Fix "failure of getting node in the free list" warning on boot.
- slw: improve error message for SLW timer stuck

- Centaur / XSCOM error handling

  - print message on disabling xscoms to centaur due to many errors
  - Mark centaur offline after 10 consecutive access errors

- XSCOM improvements

  - xscom: Map all HMER status codes to OPAL errors
  - xscom: Initialize the data to a known value in ``xscom_read``
    In case of error, don't leave the data random. It helps debugging when
    the user fails to check the error code. This happens due to a bug in the
    PRD wrapper app.
  - chip: Add a quirk for when core direct control XSCOMs are missing

- p8-i2c: Don't crash if a centaur errored out

- cpu: Make endian switch message more informative
- cpu: Display number of started CPUs during boot
- core/init: ensure that HRMOR is zero at boot
- asm: Fix backtrace for unexpected exception

- cpu: Remove pollers calling heuristics from ``cpu_wait_job``
  This will be handled by ``time_wait_ms()``. Also remove a useless
  ``smt_medium()``.
  Note that this introduces a difference in behaviour: time_wait
  will only call the pollers on the boot CPU while ``cpu_wait_job()``
  could call them on any. However, I can't think of a case where
  this is a problem.

- cpu: Remove global job queue
  Instead, target a specific CPU for a global job at queuing time.
  This will allow us to wake up the target using an interrupt when
  implementing nap mode.
  The algorithm used is to look for idle primary threads first, then
  idle secondaries, and finally the least loaded thread. If nothing can
  be found, we fall back to a synchronous call.
- lpc: Log LPC SYNC errors as unrecoverable ones for manufacturing
- lpc: Optimize SerIRQ dispatch based on which PSI IRQ fired
- interrupts: Add new source ``->attributes()`` callback
  This allows a given source to provide per-interrupt attributes
  such as whether it targets OPAL or Linux and its estimated
  frequency.

  The former allows us to get rid of the double set of ops used to
  decide which interrupts go where on some modules like the PHBs
  and the latter will eventually be used to implement smart
  caching of the source lookups.
- opal/hmi: Fix a TOD HMI failure during a race condition.
- platform: Add BT to Generic platform


NVRAM
-----
- Support ``ibm,skiboot`` partition for skiboot specific configuration options
- flash: Size NVRAM based on ECC for OpenPOWER platforms
  If NVRAM has ECC (as per the ffs header in the PNOR) then the actual size
  of the partition is less than reported by the ffs header.

NVLink/NPU
----------

- Fix reserved PE#
- NPU bdfn allocation bugfix
- Fix bad PE number check
  NPUs have 4 PEs which are zero indexed, so {0, 1, 2, 3}. The PE number
  check in npu_err_inject treated only a PE number greater than 4 as a
  fail case, so it would wrongly perform operations on a non-existent PE 4.
- Use PCI virtual device
- assert the NPU irq min is aligned.
- program NPU BUID reg properly
- npu: reword "error" to indicate it's actually a warning
  Incorrect FWTS annotation.
  Without this patch, you get spurious FirmWare Test Suite (FWTS) warnings
  about NVLink not working on machines that aren't fully populated with
  GPUs.
- external: NPU hardware procedure script
  Performing NPU hardware procedures requires some config space magic.
  Put all that magic into a script, so you can just specify the target
  device and the procedure number.

PCI
---

- Generic fixes

  - Claim surprise hotplug capability
  - Reserve PCI buses for RC's slot
  - Update PCI topology after power change
  - Return slot cached power state
  - Cache power state on slot without power control
  - Avoid hot resets at boot time
  - Fix initial PCIe slot power state
  - Print CRS retry times
    It's useful to know the CRS retry times before the PCI device is
    detected successfully. In the PCI hot add case, it usually indicates
    the time consumed for the adapter's firmware to be partially ready
    (responsive PCI config space).
  - core/pci: Fix the power-off timeout in ``pci_slot_power_off()``
    The timeout should be 1000ms instead of 1000 ticks while powering
    off a PCI slot in ``pci_slot_power_off()``. Otherwise, it's likely to
    hit a timeout powering off the PCI slot, as the skiboot log below
    reveals: ::

      [5399576870,5] PHB#0005:02:11.0 Timeout powering off slot

- PHB3

  - Override root slot's ``prepare_link_change()`` with PHB's
  - Disable surprise link down event on PCI slots
  - Disable ECRC on Broadcom adapter behind PMC switch

- astbmc platforms

  - Support dynamic PCI slot. We might insert a PCIe switch into a PHB
    direct slot, and the downstream ports of the PCIe switch support
    PCI hotplug.


CAPI
----

- hw/phb3: Update capi initialization sequence
  The capi initialization sequence was revised in a circumvention
  document when a 'link down' error was converted from fatal to Endpoint
  Recoverable. Other, non-capi, register setup was corrected even before
  the initial open-source release of skiboot, but a few capi-related
  registers were not updated then, so this patch fixes it.

IPMI
----

- core/ipmi: Set interrupt-parent property
  This allows ipmi-opal to properly use the OPAL irqchip rather than
  falling back to the event interface in Linux.

Mambo Simulator
---------------

- Helpers for POWER9 Mambo.
- mambo: Advertise available RADIX page sizes
- mambo: Add section for kernel command line boot args
  Users can set kernel command line boot arguments for Mambo in a tcl
  script.
- mambo: add exception and qtrace helpers
- external/mambo: Update skiboot.tcl to add page-sizes nodes to device tree

Simics Simulator
----------------

- chiptod: Enable ChipTOD in SIMICS

Utilities
---------

- pflash

  - fix harmless buffer overflow: ``fl_total_size`` was ``uint32_t`` not ``uint64_t``.
  - Don't try to write protect when writing to flash file
  - Misc small improvements to code and code style
  - makefile bug fixes


- external/boot_tests

  - remove lid from the BMC after flashing
  - add the nobooting option -N
  - add arbitrary lid option -F

- ``getscom`` / ``getsram`` / ``putscom``: Parse chip-id as hex
  We print the chip-id in hex (without a leading 0x), but we fail to
  parse that same value correctly in ``getscom`` / ``getsram`` / ``putscom`` ::

    # getscom -l
    ...
    80000000 | DD2.0 | Centaur memory buffer
    # getscom -c 80000000 201140a
    Error -19 reading XSCOM

  Fix this by assuming base 16 when parsing chip-id.

PRD
---

- opal-prd: Fix error code from ``scom_read`` and ``scom_write``
- opal-prd: Add get_interface_capabilities to host interfaces
- opal-prd: fix for 64-bit pnor sizes
- occ/prd/opal-prd: Queue OCC_RESET event message to host in OpenPOWER
  During an OCC reset cycle the system is forced to Psafe pstate.
  When OCC becomes active, the system has to be restored to its
  last pstate as requested by host. So host needs to be notified
  of OCC_RESET event or else system will continue to remain in
  Psafe state until host requests a new pstate after the OCC
  reset cycle.

IBM FSP Based Platforms
-----------------------

- fsp/console: Allocate irq for each hvc console
  Allocate an irq number for each hvc console and set its interrupt-parent
  property so that Linux can use the opal irqchip instead of the
  OPAL_EVENT_CONSOLE_INPUT interface.
- platforms/firenze: Fix clock frequency dt property: ::

    [ 1.212366090,3] DT: Unexpected property length /xscom@3fc0000000000/i2cm@a0020/clock-frequency

- HDAT: Fix typo in nest-frequency property
  nest-frquency -> nest-frequency
- platforms/ibm-fsp: Use power_ctl bit when determining slot reset method
  The power_ctl bit is used to represent if power management is available.
  If power_ctl is set to true, then the I2C based external power management
  functionality will be populated on the PCI slot. Otherwise we will try to
  use the inband PERST as the fundamental reset, as before.
- FSP/ELOG: Fix elog timeout issue
  Presently we set the timeout value as soon as we add an elog to the queue.
  If we have multiple elogs to write, this doesn't account for queue wait
  time. Instead, set the timeout value when we actually send the elog to FSP.
- FSP/ELOG: elog_enable flag should be false by default
  This is a corner case related to a recent upstream change. It has only
  been observed at the petitboot prompt, where we see only one error log
  instead of all error logs in ``/sys/firmware/opal/elog``.



POWER9
------

- mambo: Make POWER9 look like DD2
- flash: Move flash node under ``ibm,opal/flash/``
  This changes the boot ABI, so it's only active for P9 and later systems,
  even though it's unrelated to hardware changes. There is an associated
  Linux change to properly search for this node as well.
- core/cpu.c: Add OPAL call to setup Nest MMU
- psi: On p9, create an interrupt-map for routing PSI interrupts
- lpc: Add P9 LPC interrupts support
- chiptod: Basic P9 support
- psi: Add P9 support

Testing and Debugging
---------------------

- test/qemu: bump qemu version used in CI, adds IPMI support
- platform/qemu: add BT and IPMI support
  Enables testing BT and IPMI functionality in the Qemu simulator
- init: In debug builds, enable debug output to console
- mem_region: Be a bit smarter about poisoning
  Don't poison chunks that are already free and poison regions on
  first allocation. This speeds things up dramatically.
- libc: Use 8-bytes stores for non-0 memset too
  Memory poisoning hammers this, so let's be a bit smart about it and
  avoid falling back to byte stores when the data is not 0
- fwts: add annotation for manufacturing mode
- check: Fix bugs in mem region tests
- Don't set -fstack-protector-all unconditionally
  We set it already in DEBUG builds and we use -fstack-protector-strong
  in release builds which provides most of the benefits and is more
  efficient.
- Build host programs (and checks) with debug enabled
  This enables memory poisoning in allocations and list checking
  among other things.
- Add global DEBUG make flag


Contributors
------------

Extending the analysis done for the last few releases, we can see our trends
in code review across versions:

======== ====== ======= ======= ====== ========
Release  csets  Ack     Reviews Tested Reported
======== ====== ======= ======= ====== ========
5.0      329    15      20      1      0
5.1      372    13      38      1      4
5.2-rc1  334    20      34      6      11
5.3-rc1  302    36      53      4      5
5.4-rc1  278    8       19      0      4
======== ====== ======= ======= ====== ========

This release has fewer changesets than previous 5.x first release candidates,
but that is not indicative of the size or complexity of these changes.


Processed 278 csets from 31 developers
A total of 17052 lines added, 4745 removed (delta 12307)

Developers with the most changesets

=========================== == =======
=========================== == =======
Stewart Smith               71 (25.5%)
Benjamin Herrenschmidt      50 (18.0%)
Claudio Carvalho            38 (13.7%)
Gavin Shan                  20 (7.2%)
Oliver O'Halloran           18 (6.5%)
Mukesh Ojha                  9 (3.2%)
Cyril Bur                    7 (2.5%)
Russell Currey               7 (2.5%)
Vasant Hegde                 7 (2.5%)
Pridhiviraj Paidipeddi       6 (2.2%)
Michael Neuling              6 (2.2%)
Alistair Popple              4 (1.4%)
Sam Mendoza-Jonas            3 (1.1%)
Vipin K Parashar             3 (1.1%)
Balbir Singh                 3 (1.1%)
Mahesh Salgaonkar            3 (1.1%)
Frederic Barrat              3 (1.1%)
Chris Smart                  2 (0.7%)
Jack Miller                  2 (0.7%)
Patrick Williams             2 (0.7%)
Jeremy Kerr                  2 (0.7%)
Suraj Jitindar Singh         2 (0.7%)
Milton Miller                2 (0.7%)
Shilpasri G Bhat             1 (0.4%)
Frederic Bonnard             1 (0.4%)
Joel Stanley                 1 (0.4%)
Breno Leitao                 1 (0.4%)
Anton Blanchard              1 (0.4%)
Nicholas Piggin              1 (0.4%)
Nageswara R Sastry           1 (0.4%)
Cédric Le Goater             1 (0.4%)
=========================== == =======

Developers with the most changed lines

========================= ==== =======
========================= ==== =======
Claudio Carvalho          6817 (38.2%)
Stewart Smith             4677 (26.2%)
Benjamin Herrenschmidt    2586 (14.5%)
Gavin Shan                1005 (5.6%)
Cyril Bur                  509 (2.9%)
Mukesh Ojha                361 (2.0%)
Oliver O'Halloran          343 (1.9%)
Russell Currey             343 (1.9%)
Balbir Singh               227 (1.3%)
Pridhiviraj Paidipeddi     194 (1.1%)
Michael Neuling            121 (0.7%)
Cédric Le Goater           115 (0.6%)
Vipin K Parashar            68 (0.4%)
Alistair Popple             66 (0.4%)
Vasant Hegde                65 (0.4%)
Shilpasri G Bhat            45 (0.3%)
Suraj Jitindar Singh        41 (0.2%)
Nicholas Piggin             34 (0.2%)
Sam Mendoza-Jonas           33 (0.2%)
Jack Miller                 32 (0.2%)
Nageswara R Sastry          32 (0.2%)
Jeremy Kerr                 23 (0.1%)
Mahesh Salgaonkar           21 (0.1%)
Chris Smart                 20 (0.1%)
Milton Miller               19 (0.1%)
Patrick Williams            11 (0.1%)
Frederic Barrat              6 (0.0%)
Anton Blanchard              3 (0.0%)
Frederic Bonnard             2 (0.0%)
Joel Stanley                 2 (0.0%)
Breno Leitao                 2 (0.0%)
========================= ==== =======

Developers with the most lines removed

========================= ==== =======
========================= ==== =======
Cyril Bur                  299 (6.3%)
========================= ==== =======

Developers with the most signoffs (total 226)

========================= ==== =======
========================= ==== =======
Stewart Smith              219 (96.9%)
Alistair Popple              4 (1.8%)
Cyril Bur                    1 (0.4%)
Jeremy Kerr                  1 (0.4%)
Benjamin Herrenschmidt       1 (0.4%)
========================= ==== =======

Developers with the most reviews (total 19)

========================= ==== =======
========================= ==== =======
Mukesh Ojha                  5 (26.3%)
Andrew Donnellan             4 (21.1%)
Vasant Hegde                 3 (15.8%)
Russell Currey               3 (15.8%)
Balbir Singh                 2 (10.5%)
Cyril Bur                    1 (5.3%)
Vaidyanathan Srinivasan      1 (5.3%)
========================= ==== =======

Developers with the most test credits (total 0)

Developers who gave the most tested-by credits (total 0)

Developers with the most report credits (total 4)

========================= ==== =======
========================= ==== =======
Benjamin Herrenschmidt       1 (25.0%)
Li Meng                      1 (25.0%)
Pridhiviraj Paidipeddi       1 (25.0%)
Gavin Shan                   1 (25.0%)
========================= ==== =======

Developers who gave the most report credits (total 4)

========================= ==== =======
========================= ==== =======
Gavin Shan                   1 (25.0%)
Vasant Hegde                 1 (25.0%)
Russell Currey               1 (25.0%)
Stewart Smith                1 (25.0%)
========================= ==== =======