1.. _skiboot-5.7:
2
3skiboot-5.7
4===========
5
6skiboot v5.7 was released on Tuesday July 25th 2017. It follows two
7release candidates of skiboot 5.7, and is now the new stable release
8of skiboot following the 5.6 release, first released 24th May 2017.
9
10skiboot v5.7 contains all bug fixes as of :ref:`skiboot-5.4.6`
11and :ref:`skiboot-5.1.19` (the currently maintained stable releases). We
12do not currently expect to do any 5.6.x stable releases.
13
For details on how the skiboot stable releases work, see :ref:`stable-rules`.
15
16POWER9 is still in development, and thus all POWER9 users must upgrade
17to skiboot v5.7.
18
19This is the second release using the new regular six week release cycle,
20similar to op-build, but slightly offset to allow for a short stabilisation
period. Expected release dates and contents are tracked using GitHub milestones
and issues: https://github.com/open-power/skiboot/milestones
23
24New Features
25------------
26
27Since :ref:`skiboot-5.6.0`, we have a few new features:
28
29New features in this release for POWER9 systems:
30
31- In Memory Counters (IMC) (See :ref:`imc` for details)
32- phb4: Activate shared PCI slot on witherspoon (see :ref:`Shared Slot <shared-slot-rn>`)
33- phb4 capi (i.e. CAPI2): Enable capi mode for PHB4 (see :ref:`CAPI on PHB4 <capi2-rn>`)
34
35New feature for IBM FSP based systems:
36
37- fsp/tpo: Provide support for disabling TPO alarm
38
  This patch adds support for disabling a preconfigured
  Timed-Power-On (TPO) alarm on FSP based systems. Presently, once a TPO alarm
  is configured from the kernel it will be triggered even if it is
  subsequently disabled.

  With this patch a TPO alarm can be disabled by passing
  y_m_d==hr_min==0 to fsp_opal_tpo_write(). A branch is added to the
  function to handle this case by sending an FSP_CMD_TPO_DISABLE message to
  the FSP instead of the usual FSP_CMD_TPO_WRITE message. The kernel is
  expected to call opal_tpo_write() with y_m_d==hr_min==0 to request
  that OPAL disable the TPO alarm.
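
As a rough illustration of the branch described above, the following sketch
picks which FSP command to send for a TPO request. It is not the skiboot
implementation: the enum values and the helper name ``tpo_select_cmd()`` are
invented for this example, and only the command names and the
``y_m_d``/``hr_min`` convention come from the text.

.. code-block:: c

    #include <stdint.h>

    /* Placeholder command identifiers; the real FSP command words differ. */
    enum tpo_cmd {
        FSP_CMD_TPO_WRITE,
        FSP_CMD_TPO_DISABLE,
    };

    /*
     * Hypothetical helper: choose the FSP command for a TPO request.
     * A request with both y_m_d and hr_min equal to zero means
     * "disable the alarm"; anything else programs a new alarm.
     */
    static enum tpo_cmd tpo_select_cmd(uint32_t y_m_d, uint32_t hr_min)
    {
        if (y_m_d == 0 && hr_min == 0)
            return FSP_CMD_TPO_DISABLE;
        return FSP_CMD_TPO_WRITE;
    }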
50
51
52POWER9
53------
54There are many important changes for POWER9 DD1 and DD2 systems. POWER9 support
55should be considered in development and skiboot 5.7 is certainly **NOT**
56suitable for POWER9 production environments.
57
58Since :ref:`skiboot-5.7-rc2`:
59
60- platform/witherspoon: Enable eSEL logging
61
  The OpenBMC stack added an IPMI OEM extension to log eSEL events.
  Let's enable eSEL logging from the OPAL side.
64
65  See: https://github.com/openbmc/openpower-host-ipmi-oem/blob/d9296050bcece5c2eca5ede0932d944b0ced66c9/oemhandler.cpp#L142
66  (yes, that is the documentation)
67- hdat/i2c: Fix array version check
68- mem_region: Check for no-map in reserved nodes
69
  Regions with the no-map property should be handled separately to
71  "normal" firmware reservations. When creating mem_region regions
72  from a reserved-memory DT node use the no-map property to select
73  the right reservation type.
74
75- hdata/memory: Add memory reservations to the DT
76
77  Currently we just add these to a list of pre-boot reserved regions
  which is then converted into the contents of the /reserved-memory/
  node just before Skiboot jumps into the firmware kernel.

  This approach is insufficient because we need to add the ibm,prd-instance
  labels to the various hostboot reserved regions. To do this we want to
  create these reservation nodes inside the HDAT parser rather than having
84  the mem_region flattening code handle it. On P8 systems Hostboot placed
85  its memory reservations under the /ibm,hostboot/ node and this patch
86  makes the HDAT parser do the same.
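
The no-map handling in the mem_region entry above can be pictured with the
simplified sketch below. It is not skiboot code: ``struct resv_node``, the
enum names and ``resv_type_for_node()`` are invented for the example, and a
boolean stands in for looking up the ``no-map`` property on a real device
tree node.

.. code-block:: c

    #include <stdbool.h>

    /* Illustrative reservation kinds (names are assumptions). */
    enum resv_type {
        RESV_FIRMWARE,   /* "normal" firmware reservation */
        RESV_NO_MAP,     /* reservation flagged with no-map */
    };

    /* Minimal stand-in for a reserved-memory device tree node. */
    struct resv_node {
        const char *name;
        bool has_no_map;   /* true if the node carries a no-map property */
    };

    /* Select the reservation type from the presence of no-map. */
    static enum resv_type resv_type_for_node(const struct resv_node *n)
    {
        return n->has_no_map ? RESV_NO_MAP : RESV_FIRMWARE;
    }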
87
Since :ref:`skiboot-5.7-rc1`:
89
90- HDAT: Add IPMI sensor data under /bmc node
- numa/associativity: Add a new level of NUMA for GPUs

  Today we have an issue where the NUMA nodes corresponding
  to GPUs have the same affinity/distance as normal memory
  nodes. Our reference-points today support two levels
  [0x4, 0x4] for normal systems and [0x4, 0x3] for Power8E
  systems. This patch adds a new level [0x4, X, 0x2] and
  uses node-id at all levels for the GPU.
99- xive: Enable memory backing of queues
100
101  This dedicates 6x64k pages of memory permanently for the XIVE to
102  use for internal queue overflow. This allows the XIVE to deal with
103  some corner cases where the internal queues might prove insufficient.
104
105- xive: Properly get rid of donated indirect pages during reset
106
  Otherwise they keep being used across kexec causing memory
108  corruption in subsequent kernels once KVM has been used.
109
110- cpu: Better handle unknown flags in opal_reinit_cpus()
111
  At the moment, if we get passed flags we don't know about, we
  return OPAL_UNSUPPORTED but we still perform whatever actions
  were required by the flags we do support. Additionally, on P8,
  we attempt a SLW re-init which hasn't been supported since
  Murano DD2.0 and will crash your system.

  It's too late to fix on existing systems so Linux will have to
  be careful at least on P8, but to avoid future issues let's clean
  that up and make sure we only use slw_reinit() when HILE isn't
  supported.
122- cpu: Unconditionally cleanup TLBs on P9 in opal_reinit_cpus()
123
  This can work around problems where Linux fails to properly
  clean up part or all of the TLB on kexec.
126
127- Fix scom addresses for power9 nx checkstop hmi handling.
128
  Scom addresses for NX status, DMA & ENGINE FIR and PBI FIR have changed
  for Power9. Fix up those while handling nx checkstop for Power9.
131- Fix scom addresses for power9 core checkstop hmi handling.
132
  Scom addresses for CORE FIR (Fault Isolation Register) and Malfunction
  Alert Register have changed for Power9. Fix up those while handling core
  checkstop for Power9.

  Without this change the HMI handler fails to check for the correct reason
  for a core checkstop on Power9.
139
140- core/mem_region: check return value of add_region
141
142  The only sensible thing to do if this fails is to abort() as we've
143  likely just failed reserving reserved memory regions, and nothing
144  good comes from that.
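
The unknown-flag problem described for ``opal_reinit_cpus()`` above is an
instance of a general hazard: acting on part of a request and then reporting
it as unsupported. One generic way to avoid partial side effects is to
validate the whole flag word before doing anything, as sketched below. This is
an illustration only and the actual skiboot change may be structured
differently; the flag names, their values and the return codes are
placeholders, not the real OPAL definitions.

.. code-block:: c

    #include <stdint.h>

    /* Placeholder flag bits and return codes, not the real OPAL values. */
    #define REINIT_FLAG_A      (1ull << 0)
    #define REINIT_FLAG_B      (1ull << 1)
    #define REINIT_KNOWN_FLAGS (REINIT_FLAG_A | REINIT_FLAG_B)

    #define RC_SUCCESS      0
    #define RC_UNSUPPORTED  (-1)

    /* Records which actions ran, so the behaviour can be observed. */
    static uint64_t actions_done;

    static int reinit_cpus_sketch(uint64_t flags)
    {
        /* Reject unknown flags up front, before any side effects. */
        if (flags & ~REINIT_KNOWN_FLAGS)
            return RC_UNSUPPORTED;

        if (flags & REINIT_FLAG_A)
            actions_done |= REINIT_FLAG_A;
        if (flags & REINIT_FLAG_B)
            actions_done |= REINIT_FLAG_B;
        return RC_SUCCESS;
    }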
145
Since :ref:`skiboot-5.6.0`:
147
148- hdata: Reserve Trace Areas
149
  When hostboot is configured to set up in-memory tracing it will reserve
151  some memory for use by the hardware tracing facility. We need to mark
152  these areas as off limits to the operating system and firmware.
153- hdata: Make out-of-range idata print at PR_DEBUG
154
155  Some fields just aren't populated on some systems.
156
157- hdata: Ignore unnamed memory reservations.
158
159  Hostboot should name any and all memory reservations that it provides.
160  Currently some hostboots export a broken reservation covering the first
161  256MB of memory and this causes the system to crash at boot due to an
162  invalid free because this overlaps with the static "ibm,os-reserve"
163  region (which covers the first 768MB of memory).
164
165  According to the hostboot team unnamed reservations are invalid and can
166  be ignored.
167
168- hdata: Check the Host I2C devices array version
169
170  Currently this is not populated on FSP machines which causes some
171  obnoxious errors to appear in the boot log. We also only want to
172  parse version 1 of this structure since future versions will completely
173  change the array item format.
174
175- Ensure P9 DD1 workarounds apply only to Nimbus
176
  The workarounds for P9 DD1 are only needed for Nimbus. P9 Cumulus will
  be DD1 but doesn't need these same workarounds.
179
180  This patch ensures the P9 DD1 workarounds only apply to Nimbus. It
181  also renames some things to make clear what's what.
182
183- cpu: Cleanup AMR and IAMR when re-initializing CPUs
184
185  There's a bug in current Linux kernels leaving crap in those registers
  across kexec and not sanitizing them on boot. This breaks kexec under
187  some circumstances (such as booting a hash kernel from a radix one
188  on P9 DD2.0).
189
190  The long term fix is in Linux, but this workaround is a reasonable
191  way of "sanitizing" those SPRs when Linux calls opal_reinit_cpus()
192  and shouldn't have adverse effects.
193
194  We could also use that same mechanism to cleanup other things as
195  well such as restoring some other SPRs to their default value in
196  the future.
197
198- Set POWER9 RPR SPR to 0x00000103070F1F3F.  Same value as P8.
199
200  Without this, thread priorities inside a core don't work.
201
202- cpu: Support setting HID[RADIX] and set it by default on P9
203
204  This adds new opal_reinit_cpus() flags to setup radix or hash
205  mode in HID[8] on POWER9.
206
207  By default HID[8] will be set. On P9 DD1.0, Linux will change
208  it as needed. On P9 DD2.0 hash works in radix mode (radix is
209  really "dual" mode) so KVM won't break and existing kernels
210  will work.
211
212  Newer kernels built for hash will call this to clear the HID bit
213  and thus get the full size of the TLB as an optimization.
214
215- Add "cleanup_global_tlb" for P9 and later
216
  Uses broadcast TLBIEs to clean up the TLB on all cores and on
  the nest MMU.
219
220- xive: DD2.0 updates
221
222  Add support for StoreEOI, fix StoreEOI MMIO offset in ESB page,
223  and other cleanups
224
225- Update default TSCR value for P9 as recommended by HW folk.
226
227- xive: Fix initialisation of xive_cpu_state struct
228
229  When using XIVE emulation with DEBUG=1, we run into crashes in log_add()
230  due to the xive_cpu_state->log_pos being uninitialised (and thus, with
231  DEBUG enabled, initialised to the poison value of 0x99999999).
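
To see why an uninitialised ``log_pos`` is fatal once allocations are
poisoned, consider the sketch below with a small ring log. The struct layout,
``LOG_SIZE`` and the function names are invented for this example; only the
0x99 poison pattern and the ``log_pos`` field name are taken from the entry
above. The bounds check in ``log_add_sketch()`` exists so the failure can be
observed safely; an unchecked implementation would index far past the array.

.. code-block:: c

    #include <stdint.h>
    #include <string.h>

    #define LOG_SIZE 16   /* hypothetical number of log slots */

    struct cpu_state_sketch {
        uint32_t log_pos;            /* next slot to write */
        uint16_t log[LOG_SIZE];
    };

    /* Mimic a debug allocator that fills fresh memory with 0x99. */
    static void poison(struct cpu_state_sketch *s)
    {
        memset(s, 0x99, sizeof(*s));
    }

    /* Explicitly reset the state before first use. */
    static void state_init(struct cpu_state_sketch *s)
    {
        memset(s, 0, sizeof(*s));
    }

    /* Returns 0 on success, -1 if log_pos is out of range. */
    static int log_add_sketch(struct cpu_state_sketch *s, uint16_t v)
    {
        if (s->log_pos >= LOG_SIZE)
            return -1;
        s->log[s->log_pos] = v;
        s->log_pos = (s->log_pos + 1) % LOG_SIZE;
        return 0;
    }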
232
233
234PHB4
235^^^^
236
237Since :ref:`skiboot-5.7-rc2`:
238
239- phb4: Add link training trace mode
240
241  Add a mode to PHB4 to trace training process closely. This activates
242  as soon as PERST is deasserted and produces human readable output of
243  the process.
244
245  This may increase training times since it duplicates some of the
  training code.  This code has its own simple checks for fence and
247  timeout but will fall through to the default training code once done.
248
  The output produced looks like the "TRACE:" lines below: ::
250
251      [    3.410799664,7] PHB#0001[0:1]: FRESET: Starts
252      [    3.410802000,7] PHB#0001[0:1]: FRESET: Prepare for link down
253      [    3.410806624,7] PHB#0001[0:1]: FRESET: Assert skipped
254      [    3.410808848,7] PHB#0001[0:1]: FRESET: Deassert
255      [    3.410812176,3] PHB#0001[0:1]: TRACE: 0x0000000101000000  0ms
256      [    3.417170176,3] PHB#0001[0:1]: TRACE: 0x0000100101000000 12ms presence
257      [    3.436289104,3] PHB#0001[0:1]: TRACE: 0x0000180101000000 49ms training
258      [    3.436373312,3] PHB#0001[0:1]: TRACE: 0x00001d0811000000 49ms trained
259      [    3.436420752,3] PHB#0001[0:1]: TRACE: Link trained.
260      [    3.436967856,7] PHB#0001[0:1]: LINK: Start polling
261      [    3.437482240,7] PHB#0001[0:1]: LINK: Electrical link detected
262      [    3.437996864,7] PHB#0001[0:1]: LINK: Link is up
263      [    4.438000048,7] PHB#0001[0:1]: LINK: Link is stable
264
265  Enabled via nvram using: ::
266
267      nvram -p ibm,skiboot --update-config pci-tracing=true
268
269- phb4: Improve reset and link training timing
270
271  This improves PHB reset and link training timing.
272
273- phb4: Add phb4_check_reg() to sanity check failures
274
275  This adds a function phb4_check_reg() to sanity check when we do MMIO
276  reads from the PHB to make sure it's not fenced.
277
278- phb4: Remove retry on electrical link timeout
279
280  Currently we retry if we don't detect an electrical link. This is
281  pointless as all devices should respond in the given time.
282
  This patch removes this retry and just returns OPAL_HARDWARE if we
284  don't detect an electrical link.
285
286  This has the additional benefit of improving boot times on machines
  that have badly wired presence detect (i.e. it says a device is present
  when there isn't one).
289
290- phb4: Read PERST signal rather than assuming it's asserted
291
292  Currently we assume on boot that PERST is asserted so that we can skip
293  having to assert it ourselves.
294
295  This instead reads the PERST status and determines if we need to
296  assert it based on that.
297
298- phb4: Fix endian of TLP headers print
299
300  Byte swap TLP headers so they are the same as the PCIe spec.
301- phb4: Change timeouts prints to error level
302
  If the link doesn't have an electrical link or the link doesn't train
304  we should make that more obvious to the user.
305- phb4: Better logs why the slot didn't work
306
  Improve the logging of why the slot didn't work and make it a PR_ERR so
  users see it by default.
309
310- phb4: Force verbose EEH logging
311
  Force verbose EEH. This is heavy handed and we should turn it off
  later as things stabilise, but it is useful for now.
314- phb4: Initialization sequence updates
315
316  Mostly errata workarounds, some DD1 specific.
317
318  The step Init_5 was moved to Init_16, so the numbering was updated to
319  reflect this.
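
The idea behind the ``phb4_check_reg()`` entry above can be sketched with a
toy model. The sketch assumes that a fenced PHB answers MMIO reads with all
ones, so an all-ones value is a hint that the fence state should be checked.
That convention is an assumption made for the illustration, and all names
below are invented.

.. code-block:: c

    #include <stdbool.h>
    #include <stdint.h>

    /* Toy PHB model; the real struct phb4 is far larger. */
    struct phb_sketch {
        bool fenced;       /* pretend hardware fence state */
        uint64_t reg;      /* a single pretend register */
        bool flagged;      /* set once we noticed the fence */
    };

    /* Pretend MMIO read: a fenced PHB answers with all ones. */
    static uint64_t phb_read(const struct phb_sketch *p)
    {
        return p->fenced ? UINT64_MAX : p->reg;
    }

    /* Sanity check a value just read: all ones may mean "fenced". */
    static void check_reg_sketch(struct phb_sketch *p, uint64_t val)
    {
        if (val == UINT64_MAX && p->fenced)
            p->flagged = true;
    }

A register that legitimately reads as all ones on a healthy PHB is not flagged,
because the fence state is consulted before drawing any conclusion.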
320
321Since :ref:`skiboot-5.7-rc1`:
322
- phb4: Do more retries on link training failures

  Currently we only retry once when we have a link training failure.
  This changes this to be 3 retries as 1 retry is not giving us enough
  reliability.
327
328  This will increase the boot time, especially on systems where we
329  incorrectly detect a link presence when there really is nothing
330  present. I'll post a followup patch to optimise our timings to help
331  mitigate this later.
332
333- phb4: Workaround phy lockup by doing full PHB reset on retry
334
335  For PHB4 it's possible that the phy may end up in a bad state where it
  can no longer receive data. This can manifest as the link not
337  retraining. A simple PERST will not clear this. The PHB must be
338  completely reset.
339
340  This changes the retry state to CRESET to do this.
341
342  This issue may also manifest itself as the link training in a degraded
343  state (lower speed or narrower width). This patch doesn't attempt to
344  fix that (will come later).
345- pci: Add ability to trace timing
346
347  PCI link training is responsible for a huge chunk of the skiboot boot
348  time, so add the ability to trace it waiting in the main state
349  machine.
350- pci: Print resetting PHB notice at higher log level
351
  Currently during boot there is a long delay while we wait for the PHBs to
353  be reset and train. During this time, there is no output from skiboot
354  and the last message doesn't give an indication of what's happening.
355
356  This boosts the PHB reset message from info to notice so users can see
357  what's happening during this long period of waiting.
358- phb4: Only set one bit in nfir
359
360  The MPIPL procedure says to only set bit 26 when forcing the PEC into
361  freeze mode. Currently we set bits 24-27.
362
363  This changes the code to follow spec and only set bit 26.
364- phb4: Fix order of pfir/nfir clearing in CRESET
365
366  According to the workbook, pfir must be cleared before the nfir.
367  The way we have it now causes the nfir to not clear properly in some
368  error circumstances.
369
370  This swaps the order to match the workbook.
371- phb4: Remove incorrect state transition
372
  When waiting in PHB4_SLOT_CRESET_WAIT_CQ for transactions to end, we
374  incorrectly move onto the next state.  Generally we don't hit this as
375  the transactions have ended already anyway.
376
377  This removes the incorrect state transition.
378- phb4: Set default lane equalisation
379
380  Set default lane equalisation if there is nothing in the device-tree.
381
382  Default value taken from hdat and confirmed by hardware team. Neatens
383  the code up a bit too.
384- hdata: Fix phb4 lane-eq property generation
385
  The lane-eq data we get from hdat is all 7s but what we end up with in the
387  device tree is: ::
388
389    xscom@603fc00000000/pbcq@4010c00/stack@0/ibm,lane-eq
390                     00000000 31c339e0 00000000 0000000c
391                     00000000 00000000 00000000 00000000
392                     00000000 31c30000 77777777 77777777
393                     77777777 77777777 77777777 77777777
394
395  This fixes grabbing the properties from hdat and fixes the call to put
396  them in the device tree.
397- phb4: Fix PHB4 fence recovery.
398
399  We had a few problems:
400
401  - We used the wrong register to trigger the reset (spec bug)
402  - We should clear the PFIR and NFIR while the reset is asserted
403  - ... and in the right order !
404  - We should only apply the DD1 workaround after the reset has
405    been lifted.
406  - We should ensure we use ASB whenever we are fenced or doing a
407    CRESET
408  - Make config ops write with ASB
409- phb4: Verbose EEH options
410
  Enabled via nvram with pci-eeh-verbose=true, i.e. ::
412
413    nvram -p ibm,skiboot --update-config pci-eeh-verbose=true
414- phb4: Print more info when PHB fences
415
416  For now at PHBERR level. We don't have room in the diags data
417  passed to Linux for these unfortunately.
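
The difference between setting bits 24-27 and setting only bit 26 in the nfir
entry above can be made concrete. The sketch assumes a 64-bit register with
IBM bit numbering, where bit 0 is the most significant bit; the macro and
function names are local to this example.

.. code-block:: c

    #include <stdint.h>

    /* IBM bit numbering: bit 0 is the most significant bit of 64. */
    #define BIT_MSB0(n)        (0x8000000000000000ull >> (n))
    /* Inclusive mask from bit s (more significant) to bit e. */
    #define BITMASK_MSB0(s, e) ((BIT_MSB0(s) - BIT_MSB0(e)) | BIT_MSB0(s))

    /* Previous behaviour as described: bits 24-27 set. */
    static uint64_t nfir_freeze_old(void)
    {
        return BITMASK_MSB0(24, 27);
    }

    /* Behaviour after the fix: only bit 26 set. */
    static uint64_t nfir_freeze_new(void)
    {
        return BIT_MSB0(26);
    }

Under that numbering assumption the old mask is ``0x000000F000000000`` and the
new one is ``0x0000002000000000``.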
418
419Since :ref:`skiboot-5.6.0`:
420
421- phb4: Fix number of index bits in IODA tables
422
423  On PHB4 the number of index bits in the IODA table address register
  was bumped to 10 bits to accommodate 1024 MSIs and 1024 TVEs (DD2).
425
426  However our macro only defined the field to be 9 bits, thus causing
427  "interesting" behaviours on some systems.
428
429- phb4: Harden init with bad PHBs
430
431  Currently if we read all 1's from the EEH or IRQ capabilities, we end
432  up train wrecking on some other random code (eg. an assert() in xive).
433
434  This hardens the PHB4 code to look for these bad reads and more
435  gracefully fails the init for that PHB alone.  This allows the rest of
436  the system to boot and ignore those bad PHBs.
437
438- phb4 capi (i.e. CAPI2): Handle HMI events
439
440  Find the CAPP on the chip associated with the HMI event for PHB4.
441  The recovery mode (re-initialization of the capp, resume of functional
442  operations) is only available with P9 DD2. A new patch will be provided
443  to support this feature.
444
445.. _capi2-rn:
446
447- phb4 capi (i.e. CAPI2): Enable capi mode for PHB4
448
  Enable the Coherent Accelerator Processor Interface (CAPI). The PHB is
  used as a CAPI interface.
  CAPI adapters can be connected to either PEC0 or PEC2. A single-port
  CAPI adapter can be connected to either PEC0 or PEC2, but a dual-port
  adapter can only be connected to PEC2:

  * CAPP0 attached to PHB0 (PEC0 - single port)
  * CAPP1 attached to PHB3 (PEC2 - single or dual port)
456
457- hw/phb4: Rework phb4_get_presence_state()
458
  There are two issues in the current implementation: it should return an
  error code visible to Linux, which has the prefix OPAL_*, and the code
  isn't very obvious.

  This returns OPAL_HARDWARE when the PHB is broken. Otherwise, OPAL_SUCCESS
  is always returned. It also refactors the code to make it
  obvious: OPAL_PCI_SLOT_PRESENT is returned when the presence signal (low active)
  or PCIe link is active. Otherwise, OPAL_PCI_SLOT_EMPTY is returned.
466
467- phb4: Error injection for config space
468
469  Implement CFG (config space) error injection.
470
471  This works the same as PHB3.  MMIO and DMA error injection require a
472  rewrite, so they're unsupported for now.
473
474  While it's not feature complete, this at least provides an easy way to
475  inject an error that will trigger EEH.
476
477- phb4: Error clear implementation
478- phb4: Mask link down errors during reset
479
480  During a hot reset the PCI link will drop, so we need to mask link down
481  events to prevent unnecessary errors.
482- phb4: Implement root port initialization
483
484  phb4_root_port_init() was a NOP before, so fix that.
485- phb4: Complete reset implementation
486
487  This implements complete reset (creset) functionality for POWER9 DD1.
488
489  Only partially tested and contends with some DD1 errata, but it's a start.
490
491.. _shared-slot-rn:
492
493- phb4: Activate shared PCI slot on witherspoon
494
495  Witherspoon systems come with a 'shared' PCI slot: physically, it
496  looks like a x16 slot, but it's actually two x8 slots connected to two
497  PHBs of two different chips. Taking advantage of it requires some
498  logic on the PCI adapter. Only the Mellanox CX5 adapter is known to
499  support it at the time of this writing.
500
501  This patch enables support for the shared slot on witherspoon if a x16
502  adapter is detected. Each x8 slot has a presence bit, so both bits
503  need to be set for the activation to take place. Slot sharing is
504  activated through a gpio.
505
506  Note that there's no easy way to be sure that the card is indeed a
507  shared-slot compatible PCI adapter and not a normal x16 card. Plugging
508  a normal x16 adapter on the shared slot should be avoided on
509  witherspoon, as the link won't train on the second slot, resulting in
510  a timeout and a longer boot time. Only the first slot is usable and
511  the x16 adapter will end up using only half the lines.
512
513  If the PCI card plugged on the physical slot is only x8 (or less),
514  then the presence bit of the second slot is not set, so this patch
515  does nothing. The x8 (or less) adapter should work like on any other
516  physical slot.
517
518- phb4: Block D-state power management on direct slots
519
520  As current revisions of PHB4 don't properly handle the resulting
521  L1 link transition.
522
523- phb4: Call pci config filters
524
525- phb4: Mask out write-1-to-clear registers in RC cfg
526
527  The root complex config space only supports 4-byte accesses. Thus, when
528  the client requests a smaller size write, we do a read-modify-write to
529  the register.
530
  However, some registers have bits defined as "write 1 to clear".

  If we do an RMW cycle on such a register and such bits are 1 in the
  part that the client doesn't intend to modify, we will accidentally
  write back those 1's and clear the corresponding bit.
536
537  This avoids it by masking out those magic bits from the "old" value
538  read from the register.
539
540- phb4: Properly mask out link down errors during reset
541- phb3/4: Silence a useless warning
542
  PHBs don't have base location codes on non-FSP systems and it's
544  normal.
545
546- phb4: Workaround bug in spec 053
547
548  Wait for DLP PGRESET to clear *after* lifting the PCIe core reset
549
550- phb4: DD2.0 updates
551
552  Support StoreEOI, full complements of PEs (twice as big TVT)
553  and other updates.
554
555  Also renumber init steps to match spec 063
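
The write-1-to-clear hazard described in the RC cfg entry above is easy to
reproduce with a toy register model. In the sketch below the upper 16 bits
are treated as write-1-to-clear status bits; that layout, the mask and all
function names are assumptions chosen for illustration, not the PHB4 register
definitions.

.. code-block:: c

    #include <stdint.h>

    /* Toy 32-bit register: bits in W1C_MASK clear when written as 1. */
    #define W1C_MASK 0xffff0000u   /* hypothetical status half */

    static uint32_t hw_reg;

    /* Model of the hardware's 4-byte write semantics. */
    static void hw_write32(uint32_t v)
    {
        uint32_t w1c = v & W1C_MASK;

        /* Plain bits take the new value; W1C bits written as 1 clear. */
        hw_reg = (v & ~W1C_MASK) | (hw_reg & W1C_MASK & ~w1c);
    }

    /* Emulate a 16-bit write to the low half via read-modify-write. */
    static void cfg_write16_low(uint16_t v, int mask_w1c)
    {
        uint32_t old = hw_reg;

        if (mask_w1c)
            old &= ~W1C_MASK;   /* never write back 1s to W1C bits */
        hw_write32((old & 0xffff0000u) | v);
    }

With ``mask_w1c`` set to 0 a pending status bit is silently cleared by an
unrelated 16-bit write; with it set to 1 the status bit survives.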
556
557NPU2
558^^^^
559
560Note that currently NPU2 support is limited to POWER9 DD1 hardware.
561
562Since :ref:`skiboot-5.6.0`:
563
564- platforms/astbmc/witherspoon.c: Add NPU2 slot mappings
565
566  For NVLink2 to function PCIe devices need to be associated with the right
567  NVLinks. This association is supposed to be passed down to Skiboot via HDAT but
568  those fields are still not correctly filled out. To work around this we add slot
569  tables for the NVLinks similar to what we have for P8+.
570
571- hw/npu2.c: Fix device aperture calculation
572
573  The POWER9 NPU2 implements an address compression scheme to compress 56-bit P9
574  physical addresses to 47-bit GPU addresses. System software needs to know both
575  addresses, unfortunately the calculation of the compressed address was
576  incorrect. Fix it here.
577
578- hw/npu2.c: Change MCD BAR allocation order
579
580  MCD BARs need to be correctly aligned to the size of the region. As GPU
581  memory is allocated from the top of memory down we should start allocating
582  from the highest GPU memory address to the lowest to ensure correct
583  alignment.
584
585- NPU2: Add flag to nvlink config space indicating DL reset state
586
587  Device drivers need to be able to determine if the DL is out of reset or
588  not so they can safely probe to see if links have already been trained.
589  This patch adds a flag to the vendor specific config space indicating if
590  the DL is out of reset.
591
592- hw/npu2.c: Hardcode MSR_SF when setting up npu XTS contexts
593
594  We don't support anything other than 64-bit mode for address translations so we
595  can safely hardcode it.
596
597- hw/npu2-hw-procedures.c: Add nvram option to override zcal calculations
598
599  In some rare cases the zcal state machine may fail and flag an error. According
600  to hardware designers it is sometimes ok to ignore this failure and use nominal
  values for the calculations. In this case we add an nvram variable
602  (nv_zcal_override) which will cause skiboot to ignore the failure and use the
603  nominal value specified in nvram.
604- npu2: Fix npu2_{read,write}_4b()
605
606  When writing or reading 4-byte values, we need to use the upper half of
607  the 64-bit SCOM register.
608
609  Fix npu2_{read,write}_4b() and their callers to use uint32_t, and
610  appropriately shift the value being written or returned.
611
612
613- hw/npu2.c: Fix opal_npu_map_lpar to search for existing BDF
614- hw/npu2-hw-procedures.c: Fix running of zcal procedure
615
  The zcal procedure should only be run once per obus (ie. once per group of 3
  links). Clean up the code and fix the potential buffer overflow due to a typo.
  Also updates the zcal settings to their proper values.
619- hw/npu2.c: Add memory coherence directory programming
620
621  The memory coherence directory (MCD) needs to know which system memory addresses
622  belong to the GPU. This amounts to setting a BAR and a size in the MCD to cover
623  the addresses assigned to each of the GPUs. To ease assignment we assume GPUs
624  are assigned memory in a contiguous block per chip.
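
The ``npu2_{read,write}_4b()`` fix above concerns where a 32-bit value lives
inside a 64-bit SCOM register. A minimal sketch of the shifting, with a plain
variable standing in for the register and invented function names, could look
like this:

.. code-block:: c

    #include <stdint.h>

    /* Stand-in for one 64-bit SCOM register. */
    static uint64_t scom_reg;

    /* A 4-byte write places the value in the upper half. */
    static void write_4b_sketch(uint32_t val)
    {
        scom_reg = (uint64_t)val << 32;
    }

    /* A 4-byte read returns the upper half. */
    static uint32_t read_4b_sketch(void)
    {
        return (uint32_t)(scom_reg >> 32);
    }

Using ``uint32_t`` in the signatures keeps callers from passing or expecting a
full 64-bit value by accident.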
625
626OCC/Power Management
627^^^^^^^^^^^^^^^^^^^^
628
629With this release, it's possible to boot POWER9 systems with the OCC
630enabled and change CPU frequencies. Doing so does require other firmware
631components to also support this (otherwise the frequency will not be set).
632
633Since :ref:`skiboot-5.6.0`:
634
635- occ: Skip setting cores to nominal frequency in P9
636
  In P9, once OCC is up, it is supposed to set up the cores to nominal
638  frequency. So skip this step in OPAL.
639- occ: Fix Pstate ordering for P9
640
  In P9 the pstate values are positive. They are a continuous set of
  unsigned integers [0 to +N] where Pmax is 0 and Pmin is N. The
  linear ordering of pstates for P9 has changed compared to P8.
  P8 has negative pstate values advertised as [0 to -N] where Pmax
  is 0 and Pmin is -N. This patch adds helper routines to abstract
  pstate comparison with pmax and adds sanity pstate limit checks.
  This patch also fixes pstate arithmetic by using labs().
648- p8-i2c: occ: Add support for OCC to use I2C engines
649
  This patch adds support for sharing the I2C engines between the host and
  the OCC. The OCC uses I2C engines to read DIMM temperatures and to
  communicate with the GPU. The OCC Flag register is used for locking between
  the host and the OCC. The host requests the bus by setting a bit in the OCC
  Flag register. The OCC sends an interrupt to indicate the change in ownership.
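
The two pstate numbering schemes in the "Fix Pstate ordering" entry above
lend themselves to small helpers that hide the sign convention. The following
is a sketch under the conventions stated there (Pmax is 0 on both, Pmin is -N
on P8 and +N on P9); the enum and function names are invented and are not the
helper routines the patch adds.

.. code-block:: c

    #include <stdbool.h>
    #include <stdlib.h>

    enum pstate_style { PSTATE_P8_NEGATIVE, PSTATE_P9_POSITIVE };

    /* True if pstate a is a higher-performance state than b. */
    static bool pstate_faster(enum pstate_style st, int a, int b)
    {
        return st == PSTATE_P9_POSITIVE ? a < b : a > b;
    }

    /* Is pstate p within the advertised limits for this scheme? */
    static bool pstate_valid(enum pstate_style st, int p, int pmax, int pmin)
    {
        if (st == PSTATE_P9_POSITIVE)
            return p >= pmax && p <= pmin;
        return p <= pmax && p >= pmin;
    }

    /* Number of steps between two pstates, independent of sign. */
    static long pstate_distance(int a, int b)
    {
        return labs((long)a - (long)b);
    }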
655
656opal-prd/PRD
657^^^^^^^^^^^^
658
659Since :ref:`skiboot-5.6.0`:
660
661- opal-prd: Handle SBE passthrough message passing
662
  This patch adds support for sending SBE passthrough commands to HBRT.
664- SBE: Add passthrough command support
665
  The SBE sends a passthrough command. We have to capture this interrupt and
  send an event to HBRT via opal-prd (user space daemon).
668- opal-prd: hook up reset_pm_complex
669
  This change provides the facility to invoke HBRT's reset_pm_complex, in
  the same manner as was done with process_occ_reset previously.
672
673  We add a control command for `opal-prd pm-complex reset`, which is just
674  an alias for occ_reset at this stage.
675
676- prd: Implement firmware side of opaque PRD channel
677
678  This change introduces the firmware side of the opaque HBRT <--> OPAL
679  message channel. We define a base message format to be shared with HBRT
680  (in include/prd-fw-msg.h), and allow firmware requests and responses to
681  be sent over this channel.
682
683  We don't currently have any notifications defined, so have nothing to do
684  for firmware_notify() at this stage.
685
686- opal-prd: Add firmware_request & firmware_notify implementations
687
688  This change adds the implementation of firmware_request() and
689  firmware_notify(). To do this, we need to add a message queue, so that
690  we can properly handle out-of-order messages coming from firmware.
691
692- opal-prd: Add support for variable-sized messages
693
  With the introduction of the opaque firmware channel, we want to
  support variable-sized messages. Rather than expecting to read an
  entire 'struct opal_prd_msg' in one read() call, we can split this
  over multiple reads, potentially expanding our message buffer.
698
699- opal-prd: Sync hostboot interfaces with HBRT
700
701  This change adds new callbacks defined for p9, and the base thunks for
702  the added calls.
703
704- opal-prd: interpret log level prefixes from HBRT
705
706  Interpret the (optional) \*_MRK log prefixes on HBRT messages, and set
707  the syslog log priority to suit.
708
709- opal-prd: Add occ reset to usage text
710- opal-prd: allow different chips for occ control actions
711
712  The `occ reset` and `occ error` actions can both take a chip id
713  argument, but we're currently just using zero. This change changes the
714  control message format to pass the chip ID from the control process to
715  the opal-prd daemon.
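
The variable-sized message handling described above follows a common pattern:
read a fixed header, learn the total size from it, then read the remainder
into a buffer of that size. The sketch below shows one way to do this with
POSIX ``read()``. The header layout (``struct msg_hdr_sketch`` with a
host-endian ``size`` covering the whole message) and the function names are
assumptions for illustration and do not reproduce ``struct opal_prd_msg``.
For brevity it treats any ``read()`` error, including an interrupted call, as
fatal.

.. code-block:: c

    #include <stdint.h>
    #include <stdlib.h>
    #include <string.h>
    #include <unistd.h>

    /* Hypothetical header: size covers header plus payload. */
    struct msg_hdr_sketch {
        uint8_t type;
        uint8_t pad;
        uint16_t size;
    };

    /* Read exactly len bytes, looping over short reads. 0 on success. */
    static int read_full(int fd, void *buf, size_t len)
    {
        uint8_t *p = buf;

        while (len) {
            ssize_t n = read(fd, p, len);

            if (n <= 0)
                return -1;
            p += n;
            len -= (size_t)n;
        }
        return 0;
    }

    /*
     * Read the fixed header first, then allocate a buffer of the size
     * it announces and read the remainder. The caller frees the result.
     */
    static struct msg_hdr_sketch *read_msg(int fd)
    {
        struct msg_hdr_sketch hdr, *msg;

        if (read_full(fd, &hdr, sizeof(hdr)) || hdr.size < sizeof(hdr))
            return NULL;
        msg = malloc(hdr.size);
        if (!msg)
            return NULL;
        memcpy(msg, &hdr, sizeof(hdr));
        if (read_full(fd, msg + 1, hdr.size - sizeof(hdr))) {
            free(msg);
            return NULL;
        }
        return msg;
    }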
716
717
718IBM FSP based platforms
719-----------------------
720
721Since :ref:`skiboot-5.7-rc2`:
722
723- FSP/CONSOLE: Do not enable input irq in write path
724
  We use an irq for reading input from the console, but not in the output
  path. Hence do not enable the input irq in the write path.

  Fixes: 583c8203 (fsp/console: Allocate irq for each hvc console)
729
730Since :ref:`skiboot-5.6.0`:
731
732- FSP/CONSOLE: Fix possible NULL dereference
733- platforms/ibm-fsp/firenze: Fix PCI slot power-off pattern
734
735  When powering off the PCI slot, the corresponding bits should
736  be set to 0bxx00xx00 instead of 0bxx11xx11. Otherwise, the
737  specified PCI slot can't be put into power-off state. Fortunately,
738  it didn't introduce any side-effects so far.
739- FSP/CONSOLE: Workaround for unresponsive ipmi daemon
740
  We use a TCE mapped area to write data to the console. The console header
  (fsp_serbuf_hdr) is modified by both FSP and OPAL (OPAL updates the
  next_in pointer in fsp_serbuf_hdr and FSP updates the next_out pointer).

  The kernel makes the opal_console_write() OPAL call to write data to the
  console. OPAL writes the data to the TCE mapped area and sends an MBOX
  command to the FSP. If our console becomes full and we have data to write
  to the console, we keep on waiting until the FSP reads the data.

  In some corner cases, where the FSP is active but not responding to
  console MBOX messages (due to a buggy IPMI daemon) and we have heavy
  console writes happening from the kernel, our console buffer eventually
  becomes full. At this point OPAL starts sending OPAL_BUSY_EVENT to
  the kernel, which keeps on retrying. This creates kernel soft
  lockups. In some extreme cases, when every CPU is trying to write to the
  console, the user will not be able to ssh in and will think the system
  has hung.

  If we reset the FSP or restart the IPMI daemon on the FSP, the system
  recovers and everything returns to normal.

  This patch adds a workaround for the above issue by returning OPAL_HARDWARE
  when the console is full. A side effect of this patch is that we may end up
  dropping the latest console data, but it is better to drop console data
  than to hang the system.
764
765- FSP: Set status field in response message for timed out message
766
767  For timed out FSP messages, we set message status as "fsp_msg_timeout".
  But most FSP driver users (like surveillance) are ignoring this field.
  They always look for FSP returned status value in callback function
  (second byte in word1). So we end up treating a timed out message as a
  success response from FSP.
772
773  Sample output: ::
774
775    [69902.432509048,7] SURV: Sending the heartbeat command to FSP
776    [70023.226860117,4] FSP: Response from FSP timed out, word0 = d66a00d7, word1 = 0 state: 3
777    ....
778    [70023.226901445,7] SURV: Received heartbeat acknowledge from FSP
779    [70023.226903251,3] FSP: fsp_trigger_reset() entry
780
781  Here SURV code thought it got valid response from FSP. But actually we didn't
782  receive response from FSP.
783
784  This patch fixes above issue by updating status field in response structure.
785
786- FSP: Improve timeout message
787
788- FSP/RTC: Fix possible FSP R/R issue in rtc write path
789- hw/fsp/rtc: read/write cached rtc tod on fsp hir.
790
791  Currently fsp-rtc reads/writes the cached RTC TOD on an fsp
792  reset. Use latest fsp_in_rr() function to properly read the cached rtc
  value when an fsp reset is initiated by the hir.
794
795  Below is the kernel trace when we set hw clock, when hir process starts. ::
796
797    [ 1727.775824] NMI watchdog: BUG: soft lockup - CPU#57 stuck for 23s! [hwclock:7688]
798    [ 1727.775856] Modules linked in: vmx_crypto ibmpowernv ipmi_powernv uio_pdrv_genirq ipmi_devintf powernv_op_panel uio ipmi_msghandler powernv_rng leds_powernv ip_tables x_tables autofs4 ses enclosure scsi_transport_sas crc32c_vpmsum lpfc ipr tg3 scsi_transport_fc
799    [ 1727.775883] CPU: 57 PID: 7688 Comm: hwclock Not tainted 4.10.0-14-generic #16-Ubuntu
800    [ 1727.775883] task: c000000fdfdc8400 task.stack: c000000fdfef4000
801    [ 1727.775884] NIP: c00000000090540c LR: c0000000000846f4 CTR: 000000003006dd70
802    [ 1727.775885] REGS: c000000fdfef79a0 TRAP: 0901   Not tainted  (4.10.0-14-generic)
803    [ 1727.775886] MSR: 9000000000009033 <SF,HV,EE,ME,IR,DR,RI,LE>
804    [ 1727.775889]   CR: 28024442  XER: 20000000
805    [ 1727.775890] CFAR: c00000000008472c SOFTE: 1
806                   GPR00: 0000000030005128 c000000fdfef7c20 c00000000144c900 fffffffffffffff4
807                   GPR04: 0000000028024442 c00000000090540c 9000000000009033 0000000000000000
808                   GPR08: 0000000000000000 0000000031fc4000 c000000000084710 9000000000001003
809                   GPR12: c0000000000846e8 c00000000fba0100
810    [ 1727.775897] NIP [c00000000090540c] opal_set_rtc_time+0x4c/0xb0
811    [ 1727.775899] LR [c0000000000846f4] opal_return+0xc/0x48
812    [ 1727.775899] Call Trace:
813    [ 1727.775900] [c000000fdfef7c20] [c00000000090540c] opal_set_rtc_time+0x4c/0xb0 (unreliable)
814    [ 1727.775901] [c000000fdfef7c60] [c000000000900828] rtc_set_time+0xb8/0x1b0
815    [ 1727.775903] [c000000fdfef7ca0] [c000000000902364] rtc_dev_ioctl+0x454/0x630
816    [ 1727.775904] [c000000fdfef7d40] [c00000000035b1f4] do_vfs_ioctl+0xd4/0x8c0
817    [ 1727.775906] [c000000fdfef7de0] [c00000000035bab4] SyS_ioctl+0xd4/0xf0
818    [ 1727.775907] [c000000fdfef7e30] [c00000000000b184] system_call+0x38/0xe0
819    [ 1727.775908] Instruction dump:
820    [ 1727.775909] f821ffc1 39200000 7c832378 91210028 38a10020 39200000 38810028 f9210020
821    [ 1727.775911] 4bfffe6d e8810020 80610028 4b77f61d <60000000> 7c7f1b78 3860000a 2fbffff4
822
823  This is found when executing the testcase
824  https://github.com/open-power/op-test-framework/blob/master/testcases/fspresetReload.py
825
  With this fix, the fsp hir torture testcase in the above test
  runs without problems.
828- occ: Set return variable to correct value
829
  When entering this section of code rc will be zero. If fsp_mkmsg() fails,
  rc is not updated, so the code responsible for printing an error message
  is never triggered. Resetting rc allows the error case to trigger if
  fsp_mkmsg() fails.
834- capp: Fix hang when CAPP microcode LID is missing on FSP machine
835
836  When the LID is absent, we fail early with an error from
837  start_preload_resource. In that case, capp_ucode_info.load_result
838  isn't set properly causing a subsequent capp_lid_download() to
839  call wait_for_resource_loaded() on something that isn't being
840  loaded, thus hanging.
841
842- FSP: Add check to detect FSP R/R inside fsp_sync_msg()
843
844  OPAL sends MBOX message to FSP and updates message state from fsp_msg_queued
845  -> fsp_msg_sent. fsp_sync_msg() queues message and waits until we get response
846  from FSP. During FSP R/R we move outstanding MBOX messages from msgq to rr_queue
847  including inflight message (fsp_reset_cmdclass()). But we are not resetting
848  inflight message state.
849
  In an extreme corner case where we sent a message to FSP via the
  fsp_sync_msg() path and FSP R/R happens before getting a response from
  FSP, we will end up waiting in fsp_sync_msg() until everything becomes
  normal.

  This patch adds an fsp_in_rr() check to fsp_sync_msg() and returns an
  error to the caller if FSP is in R/R.
877- FSP/CONSOLE: Do not free fsp_msg in error path
878
879  as we reuse same msg to send next output message.
880
881- platform/zz: Acknowledge OCC_LOAD mbox message in ZZ
882
883  In P9 FSP box, OCC image is pre-loaded. So do not handle the load
  command and send SUCCESS to FSP on receiving OCC_LOAD mbox message.
885
886- FSP/RTC: Improve error log
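
The console workaround listed above (return OPAL_HARDWARE instead of
asking the kernel to retry when the buffer is full) can be modelled with a
small ring buffer. This is only a Python sketch under stated assumptions,
not skiboot code: the class, the buffer size and the string return codes
are hypothetical, and only the names next_in, next_out and OPAL_HARDWARE
come from the text.

```python
OPAL_SUCCESS = "OPAL_SUCCESS"    # symbolic stand-ins, not the numeric codes
OPAL_HARDWARE = "OPAL_HARDWARE"


class SerialBuffer:
    """Toy model of the shared console buffer (not the real fsp_serbuf_hdr)."""

    def __init__(self, size):
        self.data = bytearray(size)
        self.size = size
        self.next_in = 0   # advanced by the writer (OPAL)
        self.next_out = 0  # advanced by the reader (FSP)

    def free_space(self):
        # One slot stays empty so that "full" and "empty" can be told apart.
        return (self.next_out - self.next_in - 1) % self.size

    def write(self, payload):
        """Return (status, bytes_written) without ever asking for a retry."""
        free = self.free_space()
        if free == 0:
            # Workaround: report a hardware error and drop the data rather
            # than make the caller spin while the FSP is not reading.
            return OPAL_HARDWARE, 0
        count = min(len(payload), free)
        for byte in payload[:count]:
            self.data[self.next_in] = byte
            self.next_in = (self.next_in + 1) % self.size
        return OPAL_SUCCESS, count

    def fsp_read(self, count):
        """Model the FSP consuming up to count bytes."""
        out = bytearray()
        while count > 0 and self.next_out != self.next_in:
            out.append(self.data[self.next_out])
            self.next_out = (self.next_out + 1) % self.size
            count -= 1
        return bytes(out)
```

In this model a writer that hits a full buffer gets an immediate error and
moves on, and writes succeed again as soon as the reader drains some data.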
887
888astbmc systems
889--------------
890
891Since :ref:`skiboot-5.6.0`:
892
893- platforms/astbmc: Don't validate model on palmetto
894
  The platform isn't considered compatible with palmetto unless the root
  device-tree node's "model" property is NULL or "palmetto". However, we
  could have "TN71-BP012" for the property on palmetto. ::
898
899       linux# cat /proc/device-tree/model
900       TN71-BP012
901
902  This skips the validation on root device-tree node's "model" property
903  on palmetto, meaning we check the "compatible" property only.
904
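The change in the palmetto probe can be illustrated with a short Python
sketch. It is a model only: the dict standing in for the root device-tree
node and both function names are hypothetical, and the "tyan,palmetto"
compatible string is an assumption that is not stated in these notes.

```python
PALMETTO_COMPATIBLE = "tyan,palmetto"  # assumed compatible string


def palmetto_probe_old(root):
    """Previous behaviour as described above: also validate "model"."""
    if PALMETTO_COMPATIBLE not in root.get("compatible", []):
        return False
    model = root.get("model")
    return model is None or model == "palmetto"


def palmetto_probe(root):
    """New behaviour: only the "compatible" property is consulted."""
    return PALMETTO_COMPATIBLE in root.get("compatible", [])
```

A root node carrying the model string "TN71-BP012" is rejected by the old
check and accepted by the new one, as long as "compatible" matches.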
905
906General
907-------
908
909Since :ref:`skiboot-5.7-rc2`:
910
911- core/pci: Fix mem-leak on fast-reboot
912
913  Fast-reboot has a memory leak which causes the system to crash after about
914  250 fast-reboots. The patch fixes the memory leak.
  The cause of the leak was pci_device structures being freed without
  freeing the pci_slot within them.
917
918- gcov: properly handle gard and pflash code coverage
919
920Since :ref:`skiboot-5.6.0`:
921
922- Reduce log level on non-error log messages
923
924  90% of what we print isn't useful to a normal user. This
925  dramatically reduces the amount of messages printed by
926  OPAL in normal circumstances.
927
928- init: Silence messages and call ourselves "OPAL"
929- psi: Switch to ESB mode later
930
  There's an erratum: if we switch to ESB mode before setting up
  the various ESB mode related registers, a pending interrupt
  can go wrong.
934
935- lpc: Enable "new" SerIRQ mode
936- hw/ipmi/ipmi-sel: missing newline in prlog warning
937
938- p8-i2c OCC lock: fix locking in p9_i2c_bus_owner_change
939- Convert important polling loops to spin at lowest SMT priority
940
941  The pattern of calling cpu_relax() inside a polling loop does
  not suit the powerpc SMT priority instructions. The preferred approach
  is to set a low priority, spin until the break condition is reached,
  then restore the priority.
945
946- Improve cpu_idle when PM is disabled
947
948  Split cpu_idle() into cpu_idle_delay() and cpu_idle_job() rather than
949  requesting the idle type as a function argument. Have those functions
  provide a default polling (non-PM) implementation which spins at the
  lowest SMT priority.
952
953- core/fdt: Always add a reserve map
954
955  Currently we skip adding the reserved ranges block to the generated
956  FDT blob if we are excluding the root node. This can result in a DTB
957  that dtc will barf on because the reserved memory ranges overlap with
958  the start of the dt_struct block. As an example: ::
959
960    $ fdtdump broken.dtb -d
961    /dts-v1/;
962    // magic:               0xd00dfeed
963    // totalsize:           0x7f3 (2035)
964    // off_dt_struct:       0x30  <----\
965    // off_dt_strings:      0x7b8       | this is bad!
966    // off_mem_rsvmap:      0x30  <----/
967    // version:             17
968    // last_comp_version:   16
969    // boot_cpuid_phys:     0x0
970    // size_dt_strings:     0x3b
971    // size_dt_struct:      0x788
972
973    /memreserve/ 0x100000000 0x300000004;
974    /memreserve/ 0x3300000001 0x169626d2c;
975    /memreserve/ 0x706369652d736c6f 0x7473000000000003;
976            *continues*
977
978  With this patch: ::
979
980    $ fdtdump working.dtb -d
981    /dts-v1/;
982    // magic:               0xd00dfeed
983    // totalsize:           0x803 (2051)
984    // off_dt_struct:       0x40
985    // off_dt_strings:      0x7c8
986    // off_mem_rsvmap:      0x30
987    // version:             17
988    // last_comp_version:   16
989    // boot_cpuid_phys:     0x0
990    // size_dt_strings:     0x3b
991    // size_dt_struct:      0x788
992
993    // 0040: tag: 0x00000001 (FDT_BEGIN_NODE)
994    / {
995    // 0048: tag: 0x00000003 (FDT_PROP)
996    // 07fb: string: phandle
997    // 0054: value
998        phandle = <0x00000001>;
999            *continues*
1000
1001- hw/lpc-mbox: Use message registers for interrupts
1002
1003  Currently the BMC raises the interrupt using the BMC control register.
1004  It does so on all accesses to the 16 'data' registers meaning that when
1005  the BMC only wants to set the ATTN (on which we have interrupts enabled)
1006  bit we will also get a control register based interrupt.
1007
  The solution here is to mask that interrupt permanently and enable
  interrupts on the protocol defined 'response' data byte.
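
The broken and working fdtdump headers shown for "core/fdt: Always add a
reserve map" differ in whether any space exists between off_mem_rsvmap
and off_dt_struct. A minimal Python sketch of that check follows. The
function names are hypothetical, and the sketch assumes the reserve map is
placed directly before the struct block, as in both dumps above.

```python
import struct

FDT_MAGIC = 0xD00DFEED
# Ten big-endian 32-bit fields, in the order of the flattened device tree
# header.
FDT_HEADER = struct.Struct(">10I")
FDT_FIELDS = (
    "magic", "totalsize", "off_dt_struct", "off_dt_strings",
    "off_mem_rsvmap", "version", "last_comp_version",
    "boot_cpuid_phys", "size_dt_strings", "size_dt_struct",
)
# A reserve map entry is two 64-bit values (address, size). The map ends
# with an all-zero entry, so at least one entry must always be present.
RSV_ENTRY_SIZE = 16


def parse_header(blob):
    """Decode the FDT header at the start of blob into a dict."""
    header = dict(zip(FDT_FIELDS, FDT_HEADER.unpack_from(blob, 0)))
    if header["magic"] != FDT_MAGIC:
        raise ValueError("not a flattened device tree")
    return header


def has_room_for_rsvmap(header):
    """True if a terminating reserve entry fits before the struct block."""
    gap = header["off_dt_struct"] - header["off_mem_rsvmap"]
    return gap >= RSV_ENTRY_SIZE
```

With the values from the dumps, the broken blob (both offsets 0x30) fails
this check, while the fixed blob (0x30 and 0x40) leaves exactly one
16 byte terminator entry.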
1010
1011PCI
1012---
1013
1014Since :ref:`skiboot-5.6.0`:
1015
1016- pci: Wait 20ms before checking presence detect on PCIe
1017
1018  As the PHB presence logic has a debounce timer that can take
1019  a while to settle.
1020
1021- phb3+iov: Fixup support for config space filters
1022
  The filter should be called before the HW access and its
  return value controls whether to perform the access or not.
- core/pci: Use PCI slot's power facility in pci_enable_bridge()

  The current implementation has incorrect assumptions: there is
  always a PCI slot associated with root port and PCIe switch
  downstream port and all of them are capable of changing their
  power state by register PCICAP_EXP_SLOTCTL. Firstly, there
  might not be a PCI slot associated with the root port or PCIe
  switch downstream port. Secondly, the power isn't necessarily controlled
  by standard config register (PCICAP_EXP_SLOTCTL). There are
  I2C slave devices used to control the power states on Tuleta.
1035
1036  In order to use the PCI slot's methods to manage the power
1037  states, this does:
1038
  * Introduce PCI_SLOT_FLAG_ENFORCE, which indicates that the requested
    operation must be applied.
  * pci_enable_bridge() is split into 3 functions: pci_bridge_power_on()
    to power it on; pci_enable_bridge() as a placeholder and
    pci_bridge_wait_link() to wait for the downstream link to come up.
1044  * In pci_bridge_power_on(), the PCI slot's specific power management
1045    methods are used if there is a PCI slot associated with the PCIe
1046    switch downstream port or root port.
1047- platforms/astbmc/slots.c: Allow comparison of bus numbers when matching slots
1048
  When matching devices on multiple downstream PLX buses we need to compare more
  than just the device-id of the PCIe BDFN, so increase the mask to do so.
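
Why the slot matching mask had to grow can be seen from the layout of a
PCIe BDFN (8 bits of bus, 5 bits of device, 3 bits of function). The
following Python sketch is illustrative only: the two mask constants and
the helper names are hypothetical and are not copied from
platforms/astbmc/slots.c.

```python
def bdfn(bus, dev, fn):
    """Pack bus (8 bits), device (5 bits) and function (3 bits)."""
    return ((bus & 0xFF) << 8) | ((dev & 0x1F) << 3) | (fn & 0x7)


DEV_MASK = 0x00F8      # hypothetical: compare the device number only
BUS_DEV_MASK = 0xFFF8  # hypothetical: compare bus and device number


def slot_matches(entry_bdfn, device_bdfn, mask):
    """Compare only the BDFN bits selected by mask."""
    return (entry_bdfn & mask) == (device_bdfn & mask)
```

Two devices with the same device number behind different downstream buses
compare equal under the device-only mask, and are told apart once the bus
bits are included.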
1051
1052Debugging, Tests and simulators
1053-------------------------------
1054
1055Since :ref:`skiboot-5.7-rc2`:
1056
1057- boot_tests: add PFLASH_TO_COPY for OpenBMC
1058- travis: Add debian stretch and unstable
1059
1060  At the moment, we mark them both as being able to fail, as we're
1061  hitting an assert in one of the unit tests on debian stretch, and
1062  that hasn't yet been chased down.
1063
1064- core/backtrace: Serialise printing backtraces
1065
1066  Add a lock so that only one thread can print a backtrace at a time.
  This should prevent multiple threads from garbling each other's
  backtraces.
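
The idea behind serialising backtraces can be sketched in a few lines of
Python. This is an analogy using the standard threading module, not the
skiboot implementation, and all names in it are hypothetical.

```python
import threading

# One lock shared by every thread that wants to print a backtrace.
backtrace_lock = threading.Lock()


def print_backtrace(cpu, frames, out):
    """Append a whole backtrace to out while holding the lock."""
    with backtrace_lock:
        out.append("CPU %d backtrace:" % cpu)
        for frame in frames:
            out.append("  CPU %d: %s" % (cpu, frame))


def run_demo(n_threads=4, frames=("fn_a", "fn_b", "fn_c")):
    """Print from several threads and return the combined log lines."""
    out = []
    threads = [
        threading.Thread(target=print_backtrace, args=(cpu, frames, out))
        for cpu in range(n_threads)
    ]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return out
```

Because each thread holds the lock for its whole backtrace, the lines of
one backtrace always stay together, whatever order the threads run in.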
1069
1070Since :ref:`skiboot-5.7-rc1`:
1071
1072- lpc: remove double LPC prefix from messages
- opal-ci/fetch-debian-jessie-installer: follow redirects

  Fixes some CI failures.
1075- test/qemu-jessie: bail out fast on kernel panic
1076- test/qemu-jessie: dump boot log on failure
1077- travis: add fedora26
1078- xz: add fallthrough annotations to silence GCC7 warning
1079
1080Since :ref:`skiboot-5.6.0`:
1081
1082- boot-tests: add OpenBMC support
1083- boot_test.sh: Add SMC BMC support
1084
1085  Your BMC needs a special debug image flashed to use this, the exact
1086  image and methods aren't something I can publish here, but if you work
1087  for IBM or SMC you can find out from the right sources.
1088
  A few things need to be worked around to be able to flash to an SMC BMC.
1090
1091  For a start, the SSH daemon will only accept connections after a special
1092  incantation (which I also can't share), but you should put that in the
1093  ~/.skiboot_boot_tests file along with some other default login information
1094  we don't publicise too broadly (because Security Through Obscurity is
1095  *obviously* a good idea....)
1096
1097  We also can't just directly "ssh /bin/true", we need an expect script,
1098  and we can't scp, but we can anonymous rsync!
1099
1100  You also need a pflash binary to copy over.
1101- hdata_to_dt: Add PVR overrides to the usage text
1102- mambo: Add a reservation for the initramfs
1103
1104  On most systems the initramfs is loaded inside the part of memory
1105  reserved for the OS [0x0-0x30000000] and skiboot will never touch it.
1106  On mambo it's loaded at 0x80000000 and if you're unlucky skiboot can
1107  allocate over the top of it and corrupt the initramfs blob.
1108
1109  There might be the downside that the kernel cannot re-use the initramfs
1110  memory since it's marked as reserved, but the kernel might also free it
1111  anyway.
1112- mambo: Update P9 PVR to reflect Scale out 24 core chips
1113
1114  The P9 PVR bits 48:51 don't indicate a revision but instead different
1115  configurations.  From BookIV we have:
1116
1117  ==== ===================
1118  Bits Configuration
1119  ==== ===================
1120     0 Scale out 12 cores
1121     1 Scale out 24 cores
1122     2 Scale up 12 cores
1123     3 Scale up 24 cores
1124  ==== ===================
1125
  Skiboot will mostly use the "Scale out 24 core" configuration
  (ie. SMT4 not SMT8) so reflect this in mambo.
1128- core: Move enable_mambo_console() into chip initialisation
1129
1130  Rather than having a wart in main_cpu_entry() that initialises the mambo
1131  console, we can move it into init_chips() which is where we discover that we're
1132  on mambo.
1133
1134- mambo: Create multiple chips when we have multiple CPUs
1135
1136  Currently when we boot mambo with multiple CPUs, we create multiple CPU nodes in
1137  the device tree, and each claims to be on a separate chip.
1138
1139  However we don't create multiple xscom nodes, which means skiboot only knows
1140  about a single chip, and all CPUs end up on it. At the moment mambo is not able
1141  to create multiple xscom controllers. We can create fake ones, just by faking
1142  the device tree up, but that seems uglier than this solution.
1143
1144  So create a mambo-chip for each CPU other than 0, to tell skiboot we want a
1145  separate chip created. This then enables Linux to see multiple chips: ::
1146
1147      smp: Brought up 2 nodes, 2 CPUs
1148      numa: Node 0 CPUs: 0
1149      numa: Node 1 CPUs: 1
1150
1151- chip: Add support for discovering chips on mambo
1152
1153  Currently the only way for skiboot to discover chips is by looking for xscom
1154  nodes. But on mambo it's currently not possible to create multiple xscom nodes,
1155  which means we can only simulate a single chip system.
1156
1157  However it seems we can fairly cleanly add support for a special mambo chip
1158  node, and use that to instantiate multiple chips.
1159
1160  Add a check in init_chip() that we're not clobbering an already initialised
1161  chip, now that we have two places that initialise chips.
1162- mambo: Make xscom claim to be DD 2.0
1163
1164  In the mambo tcl we set the CPU version to DD 2.0, because mambo is not
1165  bug compatible with DD 1.
1166
1167  But in xscom_read_cfam_chipid() we have a hard coded value, to work
1168  around the lack of the f000f register, which claims to be P9 DD 1.0.
1169
1170  This doesn't seem to cause crashes or anything, but at boot we do see: ::
1171
1172      [    0.003893084,5] XSCOM: chip 0x0 at 0x1a0000000000 [P9N DD1.0]
1173
1174  So fix it to claim that the xscom is also DD 2.0 to match the CPU.
1175
1176- mambo: Match whole string when looking up symbols with linsym/skisym
1177
1178  linsym/skisym use a regex to match the symbol name, and accepts a
1179  partial match against the entry in the symbol map, which can lead to
1180  somewhat confusing results, eg: ::
1181
1182      systemsim % linsym early_setup
1183      0xc000000000027890
1184      systemsim % linsym early_setup$
1185      0xc000000000aa8054
1186      systemsim % linsym early_setup_secondary
1187      0xc000000000027890
1188
1189  I don't think that's the behaviour we want, so append a $ to the name so
1190  that the symbol has to match against the whole entry, eg: ::
1191
1192      systemsim % linsym early_setup
1193      0xc000000000aa8054
1194
1195- Disable nap on P8 Mambo, public release has bugs
1196- mambo: Allow loading multiple CPIOs
1197
1198  Currently we have support for loading a single CPIO and telling Linux to
1199  use it as the initrd. But the Linux code actually supports having
1200  multiple CPIOs contiguously in memory, between initrd-start and end, and
1201  will unpack them all in order. That is a really nice feature as it means
1202  you can have a base CPIO with your root filesystem, and then tack on
1203  others as you need for various tests etc.
1204
1205  So expand the logic to handle SKIBOOT_INITRD, and treat it as a comma
1206  separated list of CPIOs to load. I chose comma as it's fairly rare in
1207  filenames, but we could make it space, colon, whatever. Or we could add
1208  a new environment variable entirely. The code also supports trimming
1209  whitespace from the values, so you can have "cpio1, cpio2".
1210- hdata/test: Add memory reservations to hdata_to_dt
1211
1212  Currently memory reservations are parsed, but since they are not
1213  processed until mem_region_init() they don't appear in the output
1214  device tree blob. Several bugs have been found with memory reservations
1215  so we want them to be part of the test output.
1216
1217  Add them and clean up several usages of printf() since we want only the
1218  dtb to appear in standard out.
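
The linsym/skisym change above comes down to anchoring the regex at the
end of the symbol map entry. The Python sketch below models that with a
hypothetical two-line symbol map in "address type name" form. The two
addresses are the ones from the example output, but the map format and
the lookup function are assumptions, not the mambo Tcl code.

```python
import re

# Hypothetical symbol map; the addresses are taken from the example above.
SYMBOL_MAP = [
    "c000000000027890 T early_setup_secondary",
    "c000000000aa8054 T early_setup",
]


def lookup(name, whole=True):
    """Return the address of the first map entry matching name."""
    # With whole=True a "$" is appended, so the name must end the entry.
    pattern = " " + re.escape(name) + ("$" if whole else "")
    for line in SYMBOL_MAP:
        if re.search(pattern, line):
            return "0x" + line.split()[0]
    return None
```

Without the anchor, a lookup of early_setup stops at the earlier
early_setup_secondary entry. With it, only the exact name matches.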
1219
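The SKIBOOT_INITRD handling described under "mambo: Allow loading multiple
CPIOs" can be sketched as follows. This is a Python model under stated
assumptions rather than the mambo Tcl code: both function names are
hypothetical, and the images are placed strictly back to back with no
padding, which may not match what the simulator script does.

```python
def parse_initrd_list(value):
    """Split a comma separated SKIBOOT_INITRD value into file names."""
    # Whitespace around each name is trimmed, so "cpio1, cpio2" works.
    return [name.strip() for name in value.split(",") if name.strip()]


def layout(images, start):
    """Place images contiguously from start.

    images is a list of (name, size_in_bytes) pairs in load order. The
    result is a list of (name, begin, end) ranges plus the final end
    address, which together would describe initrd-start and initrd-end.
    """
    ranges = []
    addr = start
    for name, size in images:
        ranges.append((name, addr, addr + size))
        addr += size
    return ranges, addr
```

For example, parse_initrd_list("cpio1, cpio2") yields the two names in
order, and layout() then gives each image the address range directly
after the previous one.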
1220
1221pflash/libffs
1222-------------
1223
1224Since :ref:`skiboot-5.7-rc2`:
1225
1226- pflash option to retrieve PNOR partition flags
1227
1228  This commit extends pflash with an option to retrieve and print
1229  information for a particular partition, including the content from
1230  "pflash -i" and a verbose list of set miscellaneous flags. -i option
1231  is also updated to print a short list of flags in addition to the
1232  ECC flag, with one character per flag. A test of the new option is
1233  included in libflash/test.
1234
1235Since :ref:`skiboot-5.6.0`:
1236
1237- libflash/libffs: Zero checksum words
1238
1239  On writing ffs entries to flash libffs doesn't zero checksum words
1240  before calculating the checksum across the entire structure. This causes
1241  an inaccurate calculation of the checksum as it may calculate a checksum
1242  on non-zero checksum bytes.
1243
1244- libffs: Fix ffs_lookup_part() return value
1245
  It would return success when the part wasn't found.
1247- libflash/libffs: Correctly update the actual size of the partition
1248
1249  libffs has been updating FFS partition information in the wrong place
1250  which leads to incomplete erases and corruption.
1251- libflash: Initialise entries list earlier
1252
1253  In the bail-out path we call ffs_close() to tear down the partially
1254  initialised ffs_handle. ffs_close() expects the entries list to be
1255  initialised so we need to do that earlier to prevent a null pointer
1256  dereference.
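
The "Zero checksum words" fix above can be illustrated with a small Python
sketch. It assumes a checksum formed by XOR-ing 32-bit big-endian words
and stored inside the structure it covers. The function names and the
toy 16 byte entry are hypothetical and do not reflect the real FFS
layout.

```python
import struct


def xor_checksum(data):
    """XOR all big-endian 32-bit words of data together."""
    csum = 0
    for (word,) in struct.iter_unpack(">I", data):
        csum ^= word
    return csum


def seal(entry, csum_offset):
    """Return entry with a freshly computed checksum at csum_offset."""
    buf = bytearray(entry)
    # Zero the checksum word first so stale bytes do not leak into the sum.
    buf[csum_offset:csum_offset + 4] = b"\x00" * 4
    struct.pack_into(">I", buf, csum_offset, xor_checksum(buf))
    return bytes(buf)


def verify(entry):
    """In this sketch a sealed entry XORs to zero, checksum included."""
    return xor_checksum(entry) == 0
```

If the zeroing step is skipped, whatever was previously in the checksum
word is folded into the new checksum and verification fails.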
1257
1258mbox-flash
1259----------
1260
1261mbox-flash is the emerging standard way of talking to host PNOR flash
1262on POWER9 systems.
1263
1264- libflash/mbox-flash: Implement MARK_WRITE_ERASED mbox call
1265
1266  Version two of the mbox-flash protocol defines a new command:
1267  MARK_WRITE_ERASED.
1268
  This command provides a simple way to mark a region of flash as all 0xff
  without the need to go and write all 0xff. This is an optimisation, as
  there is no need for an erase before a write; it is the responsibility of
  the BMC to deal with the flash correctly. However, in v1 it was ambiguous
  what a client should do if the flash should be erased but not actually
  written to. This allows for an optimal path to resolve this problem.
1275
1276- libflash/mbox-flash: Update to V2 of the protocol
1277
1278  Updated version 2 of the protocol can be found at:
1279  https://github.com/openbmc/mboxbridge/blob/master/Documentation/mbox_protocol.md
1280
1281  This commit changes mbox-flash such that it will preferentially talk
1282  version 2 to any capable daemon but still remain capable of talking to
1283  v1 daemons.
1284
1285  Version two changes some of the command definitions for increased
1286  consistency and usability.
1287  Version two includes more attention bits - these are now dealt with at a
1288  simple level.
1314
1315- hw/lpc-mbox: Use message registers for interrupts
1316
1317  Currently the BMC raises the interrupt using the BMC control register.
1318  It does so on all accesses to the 16 'data' registers meaning that when
1319  the BMC only wants to set the ATTN (on which we have interrupts enabled)
1320  bit we will also get a control register based interrupt.
1321
  The solution here is to mask that interrupt permanently and enable
  interrupts on the protocol defined 'response' data byte.
1324
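How a client might prefer version 2 of the mbox-flash protocol while
still working with v1 daemons can be sketched as below. This Python model
is an assumption-laden illustration, not libflash code: the function
names are hypothetical, the wire format is ignored, and the v1 fallback
(fill the window with 0xff, then mark it dirty) is one plausible reading
of the ambiguity mentioned above.

```python
HOST_VERSIONS = (1, 2)  # versions this hypothetical client can speak


def negotiate(daemon_max):
    """Prefer the newest version both sides understand."""
    usable = [v for v in HOST_VERSIONS if v <= daemon_max]
    if not usable:
        raise ValueError("no common protocol version")
    return max(usable)


def erase_commands(version, offset, length):
    """List the steps a client might take to erase a flash region."""
    if version >= 2:
        # v2: a single call marks the region as erased (all 0xff).
        return [("MARK_WRITE_ERASED", offset, length)]
    # v1 fallback (assumption): write 0xff into the window, then mark the
    # region dirty so the BMC writes it back.
    return [("fill_window_0xff", offset, length),
            ("MARK_WRITE_DIRTY", offset, length)]
```

The point of the sketch is only the shape of the decision: negotiate once,
then pick the cheaper erase path when the daemon supports it.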
1325
1326Contributors
1327------------
1328
1329* Processed 232 csets from 29 developers.
1330* 1 employer found
1331* A total of 13043 lines added, 2517 removed (delta 10526)
1332
1333Extending the analysis done for some previous releases, we can see our trends
1334in code review across versions:
1335
======= ====== ======== ========= ========= ===========
Release csets  Ack %    Reviews % Tested %  Reported %
======= ====== ======== ========= ========= ===========
5.0     329    15 (5%)  20 (6%)   1 (0%)    0 (0%)
5.1     372    13 (3%)  38 (10%)  1 (0%)    4 (1%)
5.2-rc1 334    20 (6%)  34 (10%)  6 (2%)    11 (3%)
5.3-rc1 302    36 (12%) 53 (18%)  4 (1%)    5 (2%)
5.4     361    16 (4%)  28 (8%)   1 (0%)    9 (2%)
5.5     408    11 (3%)  48 (12%)  14 (3%)   10 (2%)
5.6     87     12 (14%) 6 (7%)    5 (6%)    2 (2%)
5.7     232    30 (13%) 32 (14%)  5 (2%)    2 (1%)
======= ====== ======== ========= ========= ===========
1348
1349This cycle has been good for reviews/acks, scoring second highest percentage
1350ever on both, as well as being right up there on absolute numbers.
1351
1352
1353Developers with the most changesets
1354^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
1355
1356========================= ==== =======
1357Developer                    #       %
1358========================= ==== =======
1359Benjamin Herrenschmidt      41 (17.7%)
1360Stewart Smith               31 (13.4%)
1361Michael Neuling             28 (12.1%)
1362Oliver O'Halloran           18 (7.8%)
1363Vasant Hegde                18 (7.8%)
1364Jeremy Kerr                 12 (5.2%)
1365Alistair Popple             11 (4.7%)
1366Gavin Shan                  10 (4.3%)
1367Russell Currey               9 (3.9%)
1368Michael Ellerman             9 (3.9%)
1369Madhavan Srinivasan          7 (3.0%)
1370Cyril Bur                    6 (2.6%)
1371Christophe Lombard           5 (2.2%)
1372Shilpasri G Bhat             5 (2.2%)
1373Andrew Donnellan             3 (1.3%)
1374Nicholas Piggin              3 (1.3%)
1375Mahesh Salgaonkar            2 (0.9%)
1376Anju T Sudhakar              2 (0.9%)
1377Hemant Kumar                 2 (0.9%)
1378Matt Brown                   1 (0.4%)
1379Michael Tritz                1 (0.4%)
1380Joel Stanley                 1 (0.4%)
1381Balbir Singh                 1 (0.4%)
1382Frederic Barrat              1 (0.4%)
1383Andrew Jeffery               1 (0.4%)
1384Pridhiviraj Paidipeddi       1 (0.4%)
1385Reza Arbab                   1 (0.4%)
1386Suraj Jitindar Singh         1 (0.4%)
1387Vaibhav Jain                 1 (0.4%)
1388========================= ==== =======
1389
1390
1391Developers with the most changed lines
1392^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
1393
1394========================= ==== =======
1395Developer                    #       %
1396========================= ==== =======
1397Hemant Kumar              3056 (23.0%)
1398Stewart Smith             1826 (13.7%)
1399Benjamin Herrenschmidt    1348 (10.1%)
1400Christophe Lombard         937 (7.0%)
1401Shilpasri G Bhat           770 (5.8%)
1402Madhavan Srinivasan        755 (5.7%)
1403Jeremy Kerr                731 (5.5%)
1404Cyril Bur                  674 (5.1%)
1405Alistair Popple            477 (3.6%)
1406Gavin Shan                 414 (3.1%)
1407Russell Currey             396 (3.0%)
1408Michael Neuling            336 (2.5%)
1409Vasant Hegde               308 (2.3%)
1410Oliver O'Halloran          300 (2.3%)
1411Anju T Sudhakar            300 (2.3%)
1412Michael Tritz              167 (1.3%)
1413Frederic Barrat            113 (0.8%)
1414Nicholas Piggin             93 (0.7%)
1415Mahesh Salgaonkar           76 (0.6%)
1416Michael Ellerman            66 (0.5%)
1417Suraj Jitindar Singh        59 (0.4%)
1418Andrew Donnellan            53 (0.4%)
1419Joel Stanley                20 (0.2%)
1420Balbir Singh                12 (0.1%)
1421Reza Arbab                  10 (0.1%)
1422Vaibhav Jain                 9 (0.1%)
1423Pridhiviraj Paidipeddi       2 (0.0%)
1424Matt Brown                   1 (0.0%)
1425Andrew Jeffery               1 (0.0%)
1426========================= ==== =======
1427
1428Developers with the most signoffs
1429^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
1430(total 242)
1431
1432========================= ==== =======
1433Developer                    #       %
1434========================= ==== =======
1435Stewart Smith              201 (83.1%)
1436Michael Neuling             29 (12.0%)
1437Madhavan Srinivasan          4 (1.7%)
1438Suraj Jitindar Singh         3 (1.2%)
1439Anju T Sudhakar              2 (0.8%)
1440Hemant Kumar                 2 (0.8%)
1441Cyril Bur                    1 (0.4%)
1442========================= ==== =======
1443
1444Developers with the most reviews
1445^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
1446(total 32)
1447
1448========================= ==== =======
1449Developer                    #       %
1450========================= ==== =======
1451Vasant Hegde                 8 (25.0%)
1452Cyril Bur                    7 (21.9%)
1453Andrew Donnellan             5 (15.6%)
1454Frederic Barrat              5 (15.6%)
1455Andrew Jeffery               2 (6.2%)
1456Gavin Shan                   2 (6.2%)
1457Joel Stanley                 1 (3.1%)
1458Oliver O'Halloran            1 (3.1%)
1459Alistair Popple              1 (3.1%)
1460========================= ==== =======
1461
1462Developers with the most test credits
1463^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
1464(total 5)
1465
1466========================== ==== =======
1467Developer                    #       %
1468========================== ==== =======
1469Vasant Hegde                  2 (40.0%)
1470Oliver O'Halloran             1 (20.0%)
1471Ananth N Mavinakayanahalli    1 (20.0%)
1472Michael Ellerman              1 (20.0%)
1473========================== ==== =======
1474
1475Developers who gave the most tested-by credits
1476^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
1477(total 5)
1478
1479========================= ==== =======
1480Developer                    #       %
1481========================= ==== =======
1482Jeremy Kerr                  2 (40.0%)
1483Vasant Hegde                 1 (20.0%)
1484Oliver O'Halloran            1 (20.0%)
1485Michael Ellerman             1 (20.0%)
1486========================= ==== =======
1487
1488Developers with the most report credits
1489^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
1490(total 2)
1491
1492========================= ==== =======
1493Developer                    #       %
1494========================= ==== =======
1495Oliver O'Halloran            1 (50.0%)
1496Alastair D'Silva             1 (50.0%)
1497========================= ==== =======
1498
1499Developers who gave the most report credits
1500^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
1501(total 2)
1502
1503========================= ==== =======
1504Developer                    #       %
1505========================= ==== =======
1506Andrew Donnellan             1 (50.0%)
1507Stewart Smith                1 (50.0%)
1508========================= ==== =======
1509