<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE section PUBLIC "-//Boost//DTD BoostBook XML V1.1//EN"
"http://www.boost.org/tools/boostbook/dtd/boostbook.dtd">
<section id="safe_numerics.safety_critical_embedded_controller">
  <title>Safety Critical Embedded Controller</title>
6
7  <?dbhtml stop-chunking?>
8
  <para>Suppose we are given the task of creating stepper motor driver
  software to drive a robotic hand to be used in robotic microsurgery. The
  processor that has been selected by the engineers is the <ulink
  url="http://www.microchip.com/wwwproducts/en/PIC18F2520">PIC18F2520</ulink>
  manufactured by <ulink url="http://www.microchip.com">Microchip
  Corporation</ulink>. This processor has 32KB of program memory and is
  programmed in C. On a processor this small, it's common to use a mixture
  of 8, 16, and 32 bit data types in order to minimize memory footprint and
  program run time. The type <code>int</code> has 16 bits. Since this
  program will perform a life critical function, it must be demonstrably
  correct. This implies that it needs to be verifiable and testable. Since
  the target microprocessor is inconvenient for accomplishing these goals,
  we will build and test the code on the desktop.</para>
22
23  <section>
24    <title>How a Stepper Motor Works</title>
25
26    <figure float="0">
27      <title>Stepper Motor</title>
28
29      <mediaobject>
30        <imageobject>
31          <imagedata align="left" contentwidth="216"
32                     fileref="StepperMotor.gif" format="GIF" width="50%"/>
33        </imageobject>
34      </mediaobject>
35    </figure>
36
    <para>A stepper motor controller emits a pulse which causes the motor to
    move one step. It seems simple, but in practice it turns out to be quite
    intricate to get right, as one has to time the pulses individually to
    smoothly accelerate the rotation of the motor from a standing start until
    it reaches some maximum velocity. Failure to do this will either limit
    the stepper motor to very low speed or result in skipped steps when the
    motor is under load. Similarly, a loaded motor must be slowly decelerated
    down to a stop.</para>
45
46    <para><figure>
47        <title>Motion Profile</title>
48
49        <mediaobject>
50          <imageobject>
51            <imagedata fileref="stepper_profile.png" format="PNG" width="100%"/>
52          </imageobject>
53        </mediaobject>
54      </figure></para>
55
    <para>This implies that the width of the pulses must decrease as the
    motor accelerates. That is, the pulse width has to be computed while the
    motor is in motion. This is illustrated in the above drawing. A program
    to accomplish this might look something like the following:</para>
60
    <literallayout class="normal" linenumbering="unnumbered">setup registers and step to zero position

specify target position
set initial time to interrupt
enable interrupts

On interrupt
    if at target position
        disable interrupts and return
    calculate width of next step
    change current winding according to motor direction
    set delay time to next interrupt to width of next step</literallayout>
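
    <para>The outline above can be sketched in C roughly as follows. This is
    a minimal sketch and not the code from the article: the
    <code>stepper_t</code> structure, the function names and the
    acceleration rule c(n) = c(n-1) - 2*c(n-1)/(4n + 1) are assumptions made
    for illustration, and deceleration and the speed limit are left
    out.</para>

    <programlisting><![CDATA[#include <stdbool.h>
#include <stdint.h>

/* Hypothetical driver state; the names are illustrative only. */
typedef struct {
    int32_t  position;      /* current position in steps */
    int32_t  target;        /* requested position in steps */
    uint32_t step_no;       /* steps taken since the standing start */
    uint32_t width;         /* delay until the next pulse, in timer ticks */
    bool     interrupts_on; /* stands in for the interrupt enable bit */
} stepper_t;

/* Assumed acceleration rule: c(n) = c(n-1) - 2*c(n-1)/(4n + 1). */
static uint32_t next_width(uint32_t width, uint32_t step_no)
{
    return width - (2u * width) / (4u * step_no + 1u);
}

/* Body of the timer interrupt from the outline above. */
static void on_interrupt(stepper_t *m)
{
    if (m->position == m->target) {
        m->interrupts_on = false;   /* disable interrupts and return */
        return;
    }
    m->step_no += 1;
    m->width = next_width(m->width, m->step_no);  /* width of next step */
    /* stand-in for switching the winding: move one step toward target */
    m->position += (m->target > m->position) ? 1 : -1;
    /* on real hardware the timer would now be reloaded with m->width */
}]]></programlisting>

    <para>Calling <code>on_interrupt</code> repeatedly on a
    <code>stepper_t</code> that starts at position 0 with a width of 1000
    ticks shortens the width on every step until the target is reached, at
    which point the interrupt flag is cleared.</para>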
73
    <para>Already, this is turning into a much more complex project than it
    first seemed. Searching around the net, we find a popular <ulink
    url="../../example/stepper-motor.pdf">article</ulink> on the operation of
    stepper motors using simple micro controllers. The algorithm is very well
    explained and it includes complete <ulink url="../../example/motor.c">code
    we can test</ulink>. The engineers are still debugging the prototype
    boards and hope to have them ready before the product actually ships. But
    this doesn't have to keep us from working on our code.</para>
82  </section>
83
84  <section>
85    <title>Updating the Code</title>
86
87    <para>Inspecting this <ulink url="../../example/motor.c">code</ulink>, we
88    find that it is written in a dialect of C rather than C itself. At the
89    time this code was written, conforming versions of the C compiler were not
90    available for PIC processors. We want to compile this code on the <ulink
91    url="http://ww1.microchip.com/downloads/en/DeviceDoc/50002053G.pdf">Microchip
92    XC8 compiler</ulink> which, for the most part, is standards conforming. So
93    we made the following minimal changes:</para>
94
95    <para><itemizedlist>
96        <listitem>
97          <para>Factor into <ulink
98          url="../../example/motor1.c">motor1.c</ulink> which contains the
99          motor driving code and <ulink
100          url="../../example/motor_test1.c">motor_test1.c</ulink> which tests
101          that code.</para>
102        </listitem>
103
104        <listitem>
105          <para>Include header <code>&lt;xc.h&gt;</code> which contains
106          constants for the <ulink
107          url="http://www.microchip.com/wwwproducts/en/PIC18F2520">PIC18F2520</ulink>
108          processor</para>
109        </listitem>
110
111        <listitem>
          <para>Include header <code>&lt;stdint.h&gt;</code> to provide the
          standard fixed width integer types.</para>
114        </listitem>
115
116        <listitem>
          <para>Include header <code>&lt;stdbool.h&gt;</code> to provide
          <code>bool</code>, <code>true</code> and <code>false</code> in a C
          program.</para>
119        </listitem>
120
121        <listitem>
122          <para>The original has some anomalies in the names of types. For
123          example, int16 is assumed to be unsigned. This is an artifact of the
124          original C compiler being used. So type names in the code were
125          altered to standard ones while retaining the intent of the original
126          code.</para>
127        </listitem>
128
129        <listitem>
130          <para>Add in missing <code>make16</code> function.</para>
131        </listitem>
132
133        <listitem>
134          <para>Format code to personal taste.</para>
135        </listitem>
136
137        <listitem>
          <para>Replace the <code>enable_interrupts</code> and
          <code>disable_interrupts</code> functions with the appropriate PIC
          commands.</para>
140        </listitem>
141      </itemizedlist></para>
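
    <para>As an illustration of the kind of helper that had to be added, a
    <code>make16</code> function could look like the sketch below. The
    signature (high byte first, then low byte) is an assumption based on how
    such helpers are commonly defined. It is not copied from
    <code>motor1.c</code>.</para>

    <programlisting><![CDATA[#include <stdint.h>

/* Combine a high byte and a low byte into one 16 bit value.
   The argument order is an assumption made for this sketch. */
static uint16_t make16(uint8_t high, uint8_t low)
{
    /* widen before shifting so the high byte is not lost */
    return (uint16_t)(((uint16_t)high << 8) | low);
}]]></programlisting>

    <para>For example, <code>make16(0x12, 0x34)</code> yields
    <code>0x1234</code>.</para>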
142
    <para>The resulting program can be checked to be identical in behavior to
    the original, but it compiles with the Microchip XC8 compiler. Given a
    development board, we could hook it up to a stepper motor, download and
    boot the code and verify that the motor rotates 5 revolutions in each
    direction with smooth acceleration and deceleration. We don't have such a
    board yet, but the engineers have promised a working board real soon
    now.</para>
149  </section>
150
151  <section>
152    <title>Refactor for Testing</title>
153
    <para>In order to develop our test suite and execute the same code on the
    desktop and the target system, we factor out the shared code as a
    separate module which will be used in both environments without change.
    The shared module <ulink
    url="../../example/motor2.c"><code>motor2.c</code></ulink> contains the
    algorithm for handling the interrupts in such a way as to create the
    smooth acceleration we require.</para>
161
    <literallayout>    <ulink url="../../example/motor_test2.c"><code>motor_test2.c</code></ulink>        <ulink url="../../example/example92.cpp"><code>example92.cpp</code></ulink>

    #include ...         #include ...
    PIC typedefs ...     desktop types ...
            \               /
             \             /
            #include <ulink url="../../example/motor2.c"><code>motor2.c</code></ulink>
             /             \
            /               \
    PIC test code        desktop test code</literallayout>
176  </section>
177
178  <section>
179    <title>Compiling on the Desktop</title>
180
    <para>Running tests in the target environment is often very difficult or
    impossible due to limited resources, so software unit testing for
    embedded systems is problematic and often skipped. However, the C
    language on our desktop is the same one used for the <ulink
    url="http://www.microchip.com/wwwproducts/en/PIC18F2520">PIC18F2520</ulink>,
    so we can also run and debug the code on our desktop machine. Once our
    code passes all our tests, we can download it to the embedded hardware
    and run it natively. Here is a program we use on the desktop to do
    that:</para>
190
191    <programlisting><xi:include href="../../example/example92.cpp"
192        parse="text" xmlns:xi="http://www.w3.org/2001/XInclude"/></programlisting>
193
194    <para>Here are the essential features of the desktop version of the test
195    program.<orderedlist>
196        <listitem>
197          <para>Include headers required to support safe integers.</para>
198        </listitem>
199
200        <listitem>
201          <para>Specify a <link
202          linkend="safe_numerics.promotion_policy">promotion policy</link> to
203          support proper emulation of PIC types on the desktop.</para>
204
          <para>The C language standard specifies only minimum sizes for
          primitive data types like <code>int</code>. The actual sizes can
          and do differ between environments. Hence, the characterization of
          C/C++ as "portable" languages is not strictly true. Here we choose
          aliases for data types so that they can be defined to be the same
          in both environments. But this is not enough to emulate the <ulink
          url="http://www.microchip.com/wwwproducts/en/PIC18F2520">PIC18F2520</ulink>
          on the desktop. The problem is that compilers implicitly convert
          the operands of C expressions to some common type before performing
          arithmetic operations. Often, this common type is the native
          <code>int</code> and the size of this native type is different in
          the desktop and embedded environments. Thus, many arithmetic
          results would be different in the two environments.</para>
218
219          <para>But now we can specify our own implicit promotion rules for
220          test programs on the development platform that are identical to
221          those on the target environment! So unit testing executed in the
222          development environment can now provide results relevant to the
223          target environment.</para>
224        </listitem>
225
226        <listitem>
          <para>Define PIC integer type aliases to be safe integer types of
          the same size.</para>

          <para>Code tested in the development environment will use safe
          numerics to detect errors. We need these aliases to permit the code
          in <ulink url="../../example/motor2.c">motor2.c</ulink> to be tested
          in the desktop environment. The same code runs on the target system
          without change.</para>
235        </listitem>
236
237        <listitem>
238          <para>Emulate PIC features on the desktop.</para>
239
          <para>The PIC processor, in common with most micro controllers these
          days, includes a myriad of special purpose peripherals to handle
          things like interrupts, USB, timers, SPI bus,
          I<superscript>2</superscript>C bus, etc. These peripherals are
          configured using special 8 bit words in reserved memory locations.
          Configuration consists of setting particular bits in these words. To
          facilitate configuration operations, the XC8 compiler includes a
          special syntax for setting and accessing bits in these locations.
          One of our goals is to permit testing with our desktop C++ compiler
          of the identical code that will run on the micro controller. To
          realize this goal, we create some C++ code which implements the XC8
          C syntax for setting bits in particular memory locations.</para>
252        </listitem>
253
254        <listitem>
          <para>Include <ulink
          url="../../example/motor1.c">motor1.c</ulink>.</para>
257        </listitem>
258
259        <listitem>
          <para>Add a test to verify that the motor will be able to keep
          track of a position from 0 to 50000 steps. This will be needed to
          maintain the position of our linear stage across a range from 0 to
          500 mm.</para>
264        </listitem>
265      </orderedlist></para>
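
    <para>The effect of the native <code>int</code> size described in the
    second item can be sketched in plain C without the library. The two
    functions below are hypothetical and are not part of the example
    programs. The sketch assumes a desktop compiler whose <code>int</code>
    has at least 32 bits, and uses a cast to imitate what a compiler with a
    16 bit <code>int</code> would compute for the same expression.</para>

    <programlisting><![CDATA[#include <stdint.h>

/* Midpoint as a desktop compiler with 32 bit int evaluates it:
   both operands are promoted to int, so the sum cannot wrap. */
static uint16_t midpoint_desktop(uint16_t a, uint16_t b)
{
    return (uint16_t)((a + b) / 2);
}

/* The same expression as a compiler with 16 bit int evaluates it:
   the sum is reduced modulo 65536 before the division. The inner
   cast emulates that reduction on the desktop. */
static uint16_t midpoint_pic(uint16_t a, uint16_t b)
{
    return (uint16_t)((uint16_t)(a + b) / 2);
}]]></programlisting>

    <para>With both arguments equal to 40000, the first function returns
    40000 while the second returns 7232. The source expression is the same
    in both cases, and this is the kind of discrepancy that a promotion
    policy matching the target is meant to expose during desktop
    testing.</para>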
266
    <para>Our first attempt to run this program fails by throwing an exception
    from <ulink url="../../example/motor1.c">motor1.c</ulink> indicating that
    the code attempts to left shift a negative number at the
    statement:</para>
271
272    <programlisting>denom = ((step_no - move) &lt;&lt; 2) + 1;</programlisting>
273
    <para>Strictly speaking, the C standard leaves the result of left shifting
    a negative value undefined, and the library reports it in its
    implementation defined behavior category. In practice, on all modern
    platforms (as far as I know), this will be equivalent to a multiplication
    by 4. Clearly the intent of the original author is to "micro optimize" the
    operation by substituting a cheap left shift for a potentially expensive
    integer multiplication. But modern optimizing compilers perform this
    substitution automatically. So we have three alternatives here:</para>
281
282    <itemizedlist>
283      <listitem>
284        <para>Just ignore the issue.</para>
285
286        <para>This will work when the code is run on the PIC. But, in order to
287        permit testing on the desktop, we need to inhibit the error detection
288        in that environment. With safe numerics, error handling is determined
289        by specifying an <link
290        linkend="safe_numerics.exception_policy">exception policy</link>. In
291        this example, we've used the default exception policy which traps
292        implementation defined behavior. To ignore this kind of behavior we
293        could define our own custom <link
294        linkend="safe_numerics.exception_policy">exception
295        policy</link>.</para>
296      </listitem>
297
298      <listitem>
        <para>Change the <code>&lt;&lt; 2</code> to <code>* 4</code>. This
        will produce the intended result in an unambiguous, portable way. For
        all known compilers, this change should not affect runtime performance
        in any way.</para>
303      </listitem>
304
305      <listitem>
        <para>Alter the code so that the expression in question is never
        negative. Depending on the sizes of the operands and the size of the
        native integer, the original expression might either convert the
        operands to <code>int</code> or produce an invalid result.</para>
310      </listitem>
311    </itemizedlist>
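
    <para>The second and third alternatives can be sketched as follows. The
    function names are hypothetical and the code is only an illustration of
    the idea, not the change actually made in <code>motor2.c</code>.
    Overflow for very large arguments is not addressed here.</para>

    <programlisting><![CDATA[#include <stdint.h>

/* Second alternative: multiply instead of shifting.
   The result is well defined for negative n as well. */
static int32_t denom_mul(int32_t n)
{
    return n * 4 + 1;
}

/* Third alternative: keep the shift but never apply it to a
   negative value. *ok reports whether the shift was performed. */
static int32_t denom_shift(int32_t n, int *ok)
{
    *ok = (n >= 0);
    return *ok ? (n << 2) + 1 : 0;
}]]></programlisting>

    <para>For non-negative arguments both functions agree, for example both
    give 21 for an argument of 5. Only <code>denom_mul</code> accepts a
    negative argument, giving -27 for -7.</para>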
312
    <para>Of these alternatives, the third seems the most definitive fix so
    we'll choose that one. We also decide to make a couple of minor changes to
    simplify the code and make the mapping of the algorithm in the article to
    the code more transparent. With these changes, our test program runs to
    the end with no errors or exceptions. In addition, I made a minor change
    which simplifies the handling of fixed point values in 24.8 format. The
    result is <ulink url="../../example/motor2.c">motor2.c</ulink>, which
    incorporates the above changes. It should be easy to see that these two
    versions are otherwise identical.</para>
322
    <para>Finally, our range test fails. In order to handle the full range we
    need, we'll have to change some data types used for holding step count and
    position. We won't do that here as it would make our example too complex.
    We'll deal with this in the next version.</para>
327  </section>
328
329  <section>
330    <title>Trapping Errors at Compile Time</title>
331
    <para>We can test the same code we're going to load into our target system
    on the desktop. We could build and execute a complete unit test suite. We
    could capture the output and graph it. We have the ability to make our
    code much more likely to be bug free. But:</para>
336
337    <itemizedlist>
338      <listitem>
        <para>This system detects errors and exceptions on the test machine -
        but it fails to detect or address such problems on the target system.
        Since the compiler for the target system accepts only C code, we can't
        use the exception/error facilities of this library at runtime.</para>
343      </listitem>
344
345      <listitem>
346        <para><ulink
347        url="https://en.wikiquote.org/wiki/Edsger_W._Dijkstra">Testing shows
348        the presence, not the absence of bugs</ulink>. Can we not prove that
349        all integer arithmetic is correct?</para>
350      </listitem>
351
352      <listitem>
353        <para>For at least some operations on safe integers there is runtime
354        cost in checking for errors. In this example, this is not really a
355        problem as the safe integer code is not included when the code is run
356        on the target - it's only a C compiler after all. But more generally,
357        using safe integers might incur an undesired runtime cost.</para>
358      </listitem>
359    </itemizedlist>
360
    <para>Can we catch all potential problems at compile time and therefore
    eliminate all runtime cost?</para>
363
    <para>Our first attempt consists of simply changing the exception policy
    from the default runtime checking one to the compile time trapping one.
    Then we redefine the aliases for the types used by the PIC to use this
    exception policy.</para>
368
    <programlisting>// generate compile time errors if operation could fail
using trap_policy = boost::numeric::loose_trap_policy;
...
typedef safe_t&lt;int8_t, trap_policy&gt; int8;
...
</programlisting>
375
    <para>When we compile now, any expression which could possibly fail will
    be flagged as a compile time error. This occurs 11 times when compiling
    the <ulink url="../../example/motor2.c">motor2.c</ulink> program. This is
    fewer than one might expect. To understand why, consider the following
    example:</para>
381
    <para><programlisting>safe&lt;std::int8_t&gt; x, y;
...
safe&lt;std::int16_t&gt; z = x + y;
</programlisting>C promotion rules and arithmetic are such that z will
    always contain an arithmetically correct result regardless of what values
    are assigned to x and y. Hence there is no need for any kind of checking
    of the arithmetic or result. The Safe Numerics library uses compile time
    range arithmetic, C++ template metaprogramming and other techniques to
    restrict invocation of checking code to only those operations which could
    possibly fail. So the above code incurs no runtime overhead.</para>
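
    <para>The claim about the sum of two 8 bit values can be checked by brute
    force in plain C. The function below is a hypothetical helper written
    for this illustration. It does not use the library and simply counts the
    pairs of <code>int8_t</code> values whose sum would not fit in an
    <code>int16_t</code>.</para>

    <programlisting><![CDATA[#include <stdint.h>

/* Count the int8_t pairs whose sum would not fit in an int16_t. */
static long count_unrepresentable_sums(void)
{
    long failures = 0;
    for (int x = INT8_MIN; x <= INT8_MAX; ++x) {
        for (int y = INT8_MIN; y <= INT8_MAX; ++y) {
            int sum = x + y;  /* operands are promoted to int, as in C */
            if (sum < INT16_MIN || sum > INT16_MAX)
                ++failures;
        }
    }
    return failures;
}]]></programlisting>

    <para>The count is zero, since the extreme sums are -256 and 254. The
    library reaches the same conclusion at compile time from the ranges of
    the types alone, without enumerating the values.</para>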
392
    <para>Now we have 11 cases to consider. Our goal is to modify the program
    so that this number of cases is reduced - hopefully to zero. Initially I
    wanted to just make a few tweaks in the versions of
    <code>example92.cpp</code>, <code>motor2.c</code> and
    <code>motor_test2.c</code> above without actually having to understand the
    code. It turns out that one needs to carefully consider what the various
    types and variables are used for. This can be a good thing or a bad thing
    depending on one's circumstances, goals and personality. The programs
    above evolved into <ulink
    url="../../example/example93.c"><code>example93.cpp</code></ulink>,
    <code><ulink url="../../example/motor3.c">motor3.c</ulink></code> and
    <ulink
    url="../../example/motor_test3.c"><code>motor_test3.c</code></ulink>.
    First we'll look at <code>example93.cpp</code>:</para>
407
408    <programlisting><xi:include href="../../example/example93.cpp"
409        parse="text" xmlns:xi="http://www.w3.org/2001/XInclude"/></programlisting>
410
    <para>Here are the changes we've made in the desktop test
    program:<orderedlist>
413        <listitem>
          <para>Specify exception policies so we can generate a compile time
          error whenever an operation MIGHT fail. We've aliased this policy
          with the name <code>trap_policy</code>. The default policy, which
          throws a runtime exception when an error is encountered, is aliased
          as <code>exception_policy</code>. When creating safe types, we'll
          now specify which type of checking, compile time or runtime, we want
          done.</para>
421        </listitem>
422
423        <listitem>
          <para>Create a macro named "literal" which yields an integral value
          that can be evaluated at compile time.</para>
426
427          <para>"literal" values are instances of safe numeric types which are
428          determined at compile time. They are <code>constexpr</code> values.
429          When used along with other instances of safe numeric types, the
430          compiler can calculate the range of the result and verify whether or
431          not it can be contained in the result type. To create "literal"
432          types we use the macro <code><link
433          linkend="safe_numerics.safe_literal.make_safe_literal">make_safe_literal</link>(n,
434          p, e)</code> where n is the value, p is the <link
435          linkend="safe_numerics.promotion_policy">promotion policy</link> and
436          e is the <link linkend="safe_numerics.exception_policy">exception
437          policy</link>.</para>
438
439          <para>When all the values in an expression are safe numeric values,
440          the compiler can calculate the narrowest range of the result. If all
441          the values in this range can be represented by the result type, then
442          it can be guaranteed that an invalid result cannot be produced at
443          runtime and no runtime checking is required.</para>
444
          <para>Make sure that every literal value x is replaced with the
          macro invocation "literal(x)".</para>

          <para>It's unfortunate that the "literal" macro is required as it
          clutters the code. The good news is that in some future version of
          C++, expansion of <code>constexpr</code> facilities may result in
          elimination of this requirement.</para>
452        </listitem>
453
454        <listitem>
          <para>Create special types for the motor program. These will
          guarantee that values are in the expected ranges and permit compile
          time determination of when exceptional conditions might occur. In
          this example we create a special type <code>c_t</code> to hold the
          width of the pulse applied to the motor. Engineering constraints
          (motor load inertia) limit this value to the range of C0 to C_MIN.
          So we create a type with those limits. By using limits no larger
          than necessary, we supply enough information for the compiler to
          determine that the result of a calculation cannot fall outside the
          range of the result type. So less runtime checking is required. In
          addition, we get extra verification at compile time that values are
          in reasonable ranges for the quantity being modeled.</para>
467
468          <para>We call these types "strong types".</para>
469        </listitem>
470      </orderedlist></para>
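
    <para>The idea behind such a range limited type can be imitated in plain
    C11, which may help in reading the listing above. The sketch below is
    not how the library implements it: the limits are invented for the
    illustration, <code>c_t_make</code> is a hypothetical helper, and the
    range check happens at runtime rather than in the type system. Only the
    final <code>_Static_assert</code> is evaluated by the compiler.</para>

    <programlisting><![CDATA[#include <stdbool.h>
#include <stdint.h>

/* Hypothetical limits in timer ticks, chosen only for this sketch. */
#define C_MIN 2500L     /* shortest pulse width, at top speed */
#define C0    50000L    /* longest pulse width, at the standing start */

/* A pulse width that is only ever constructed inside [C_MIN, C0]. */
typedef struct { int32_t ticks; } c_t;

/* Store value in *out and return true only when it is within range. */
static bool c_t_make(long value, c_t *out)
{
    if (value < C_MIN || value > C0)
        return false;
    out->ticks = (int32_t)value;
    return true;
}

/* Because the range is known, a bound on derived values can be
   checked when the program is compiled: the sum of any two valid
   pulse widths fits in 32 bits. */
_Static_assert(2 * C0 <= INT32_MAX, "sum of two pulse widths must fit");]]></programlisting>

    <para>A value such as 3000 is accepted, while 100 and 60000 are
    rejected. The narrower the declared range, the more such bounds can be
    established before the program ever runs.</para>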
471
472    <para>And we've made changes consistent with the above to <ulink
473    url="../../example/motor3.c">motor3.c</ulink> as well<programlisting><xi:include
474          href="../../example/motor3.c" parse="text"
475          xmlns:xi="http://www.w3.org/2001/XInclude"/></programlisting><orderedlist>
476        <listitem>
477          <para>Define variables using strong types</para>
478        </listitem>
479
480        <listitem>
          <para>Surround all literal values with the "literal" macro</para>
482        </listitem>
483
484        <listitem>
485          <para>Re-factor code to make it easier to understand and compare
486          with the algorithm as described in the original <ulink
487          url="../../example/stepper-motor.pdf">article</ulink>.</para>
488        </listitem>
489
490        <listitem>
          <para>Rewrite the interrupt handler in a way which mirrors the
          original description of the algorithm and minimizes usage of state
          variables, accumulated values, etc.</para>
494        </listitem>
495
496        <listitem>
497          <para>Distinguish all the statements which might invoke a runtime
498          exception with a comment. There are 12 such instances.</para>
499        </listitem>
500      </orderedlist></para>
501
    <para>Finally we make a couple of minor changes in <ulink
    url="../../example/motor_test3.c">motor_test3.c</ulink> to verify that we
    can compile the exact same version of motor3.c on the PIC as well as on
    the desktop.</para>
506  </section>
507
508  <section>
509    <title>Summary</title>
510
511    <para>The intent of this case study is to show that the Safe Numerics
512    Library can be an essential tool in validating the correctness of C/C++
513    programs in all environments - including the most restricted.<itemizedlist>
514        <listitem>
          <para>We started with a program written for a tiny micro controller
          for controlling the acceleration and deceleration of a stepper
          motor. The algorithm for doing this is far from trivial and it is
          difficult to prove that it is correct.</para>
519        </listitem>
520
521        <listitem>
          <para>We used the type promotion policies of the Safe Numerics
          Library to test and validate this algorithm on the desktop. The
          tested code is also compiled for the target micro controller.</para>
525        </listitem>
526
527        <listitem>
528          <para>We used <emphasis>strong typing</emphasis> features of Safe
529          Numerics to check that all types hold the values expected and invoke
530          no invalid implicit conversions. Again the tested code is compiled
531          for the target processor.</para>
532        </listitem>
533      </itemizedlist></para>
534
    <para>What we failed to do is to create a version of the program which
    uses the type system to prove that no results can be invalid. It turns out
    that statements such as</para>
538
539    <programlisting>++i;
540c = f(c);</programlisting>
541
542    <para>can't be proved not to overflow with this system. So we're left with
543    having to depend upon exhaustive testing. It's not what we hoped, but it's
544    the best we can do.</para>
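
    <para>A rough way to see why an increment resists this kind of proof is
    to carry out the range arithmetic by hand. The sketch below is a
    simplified model written for this illustration and is not the library's
    implementation. The names <code>range_t</code>, <code>range_add</code>
    and <code>fits_int16</code> are hypothetical.</para>

    <programlisting><![CDATA[#include <stdbool.h>
#include <stdint.h>

/* Closed interval of values an expression can take. */
typedef struct { long lo; long hi; } range_t;

/* Range of a + b given the ranges of a and b. */
static range_t range_add(range_t a, range_t b)
{
    range_t r = { a.lo + b.lo, a.hi + b.hi };
    return r;
}

/* True when every value in r is representable as int16_t. */
static bool fits_int16(range_t r)
{
    return r.lo >= INT16_MIN && r.hi <= INT16_MAX;
}]]></programlisting>

    <para>If <code>i</code> may hold any 16 bit value, the range of
    <code>i + 1</code> extends to 32768, which no longer fits in the type of
    <code>i</code>. So storing the result back into <code>i</code> cannot be
    shown safe from the types alone, whereas the sum of two 8 bit ranges
    passes the same test.</para>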
545  </section>
546</section>
547