==============================================
Ordering I/O writes to memory-mapped addresses
==============================================

On some platforms, so-called memory-mapped I/O is weakly ordered.  On such
platforms, driver writers are responsible for ensuring that I/O writes to
memory-mapped addresses on their device arrive in the order intended.  This is
typically done by reading a 'safe' device or bridge register, which causes the
I/O chipset to flush pending writes to the device before the read is posted.  A
driver would usually use this technique immediately before leaving a critical
section protected by a spinlock.  This ensures that subsequent writes to I/O
space arrive only after all prior writes (much like a memory barrier op, mb(),
only with respect to I/O).

A more concrete example from a hypothetical device driver::

		...
	CPU A:  spin_lock_irqsave(&dev_lock, flags);
	CPU A:  val = readl(my_status);
	CPU A:  ...
	CPU A:  writel(newval, ring_ptr);
	CPU A:  spin_unlock_irqrestore(&dev_lock, flags);
		...
	CPU B:  spin_lock_irqsave(&dev_lock, flags);
	CPU B:  val = readl(my_status);
	CPU B:  ...
	CPU B:  writel(newval2, ring_ptr);
	CPU B:  spin_unlock_irqrestore(&dev_lock, flags);
		...

In the case above, the device may receive newval2 before it receives newval,
which could cause problems.  Fixing it is easy enough though::

		...
	CPU A:  spin_lock_irqsave(&dev_lock, flags);
	CPU A:  val = readl(my_status);
	CPU A:  ...
	CPU A:  writel(newval, ring_ptr);
	CPU A:  (void)readl(safe_register); /* maybe a config register? */
	CPU A:  spin_unlock_irqrestore(&dev_lock, flags);
		...
	CPU B:  spin_lock_irqsave(&dev_lock, flags);
	CPU B:  val = readl(my_status);
	CPU B:  ...
	CPU B:  writel(newval2, ring_ptr);
	CPU B:  (void)readl(safe_register); /* maybe a config register? */
	CPU B:  spin_unlock_irqrestore(&dev_lock, flags);

Here, each read from safe_register causes the I/O chipset to flush any pending
writes before the read itself is posted, preventing possible data corruption.

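
The effect of the flushing read can be illustrated with a small user-space toy
model.  This is only a sketch under stated assumptions, not kernel code: struct
mock_dev, mock_writel(), mock_readl_safe() and ring_updates_in_order() are
hypothetical names invented for this illustration, and the model assumes each
CPU holds at most one posted write, which the interconnect may deliver later
than a write from another CPU.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define NR_CPUS  2
#define MAX_SEEN 8

/* Hypothetical toy device: records the order in which writes arrive. */
struct mock_dev {
	uint32_t seen[MAX_SEEN];        /* values in arrival order */
	size_t   nr_seen;
	uint32_t pending[NR_CPUS];      /* one posted write per CPU */
	int      has_pending[NR_CPUS];  /* non-zero while a write is in flight */
};

/* Deliver a CPU's posted write to the device, if it has one. */
static void mock_drain(struct mock_dev *dev, int cpu)
{
	if (!dev->has_pending[cpu])
		return;
	assert(dev->nr_seen < MAX_SEEN);
	dev->seen[dev->nr_seen++] = dev->pending[cpu];
	dev->has_pending[cpu] = 0;
}

/* Stand-in for writel(): the write is posted, not yet delivered. */
static void mock_writel(struct mock_dev *dev, int cpu, uint32_t val)
{
	mock_drain(dev, cpu);           /* writes from one CPU stay in order */
	dev->pending[cpu] = val;
	dev->has_pending[cpu] = 1;
}

/*
 * Stand-in for readl(safe_register): in this model the read cannot
 * complete until the same CPU's posted write has reached the device.
 */
static uint32_t mock_readl_safe(struct mock_dev *dev, int cpu)
{
	mock_drain(dev, cpu);
	return 0;
}

/*
 * Replay the two critical sections from the example: CPU A (0) writes
 * newval, then CPU B (1) writes newval2.  Afterwards the model drains
 * CPU B's buffer before CPU A's, which is the worst case for ordering.
 * Returns 1 if the device saw newval before newval2, 0 otherwise.
 */
int ring_updates_in_order(int flush)
{
	struct mock_dev dev = {0};
	const uint32_t newval = 1, newval2 = 2;

	mock_writel(&dev, 0, newval);
	if (flush)
		(void)mock_readl_safe(&dev, 0);

	mock_writel(&dev, 1, newval2);
	if (flush)
		(void)mock_readl_safe(&dev, 1);

	/* Adversarial delivery order for anything still in flight. */
	mock_drain(&dev, 1);
	mock_drain(&dev, 0);

	return dev.nr_seen == 2 &&
	       dev.seen[0] == newval && dev.seen[1] == newval2;
}
```

In this model, ring_updates_in_order(0) returns 0 because newval2 reaches the
device first, while ring_updates_in_order(1) returns 1 because each write has
already been delivered by the time its critical section ends.  Real hardware
is considerably more involved; the model only captures the reordering described
above.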