1Internals — Warts 2================= 3This document covers some design flaws in Rumur that it would be nice to find a 4solution to at some point. 5 6Non-minimal traces when running multithreaded 7--------------------------------------------- 8* `Github issue #131 “minimal trace mode”`_ 9 10Rumur defaults to generating a multithreaded verifier (``--threads 0``) that 11aborts as soon as it finds an error (``--max-errors 1``). This was seen to be 12pragmatic and aligning with what users typically want to do when bringing up a 13new model. That is, run as fast as possible and stop as soon as we know an 14exhaustive proof is impossible. 15 16When examining counterexample traces, you generally want to be presented with 17the shortest possible example. This happens naturally in a single-threaded 18verifier as it performs a breadth-first search. The multithreaded verifier is 19only approximately breadth-first because all the threads are essentially racing 20with each other in their exploration. As a result, the first problem found may 21not be the shallowest and may result in a non-minimal counterexample trace. 22 23Each thread encounters states in greater-than-or-equal-depth order, so a naive 24attempt at forcing a minimal trace is to temporarily stall a thread when it 25finds an error. We can unblock this thread and allow it to report its error when 26all other threads have either (a) finished exploration or (b) exceeded the depth 27of the error location found by the first thread. If a second thread finds an 28error at a shallower depth than the first, it can simply kill the first thread 29and then itself stall as described. 30 31Alas, this is insufficient. Taking the example from the Github issue, the state 32machine may look as follows: 33 34.. code-block:: 35 36 A -> B -> C -> ... 37 \ ^ 38 \______/ 39 40Thread 0 might discover ``C`` first via the path ``A -> B -> C``. Later thread 1 41might rediscover ``C`` via the path ``A -> C``. Thread 1 will notice its ``C`` 42is a duplicate and discard it. Now suppose one of the threads finds a deeper 43error whose counterexample traces back to ``C``. This counterexample will trace 44back through the ``A -> B`` prefix rather than the shorter ``A``. 45 46So far I have not come up with a design that would result in a minimal 47counterexample trace in a multithreaded verifier without degrading current 48performance. Solving this would, I believe, be of significant value to users. 49 50.. _`Github issue #131 “minimal trace mode”`: https://github.com/Smattr/rumur/issues/131 51 52Incomplete AST in recursive functions 53------------------------------------- 54The librumur Abstract Syntax Tree (AST) represents child nodes using smart 55pointers, ``rumur::Ptr``. This has constructor and assignment operator 56implementations that allow the programmer to avoid thinking too much about 57memory management. The general pattern is to freely copy the pointed-to object 58into a new ``rumur::Ptr``. This isn’t very efficient, but has the advantage of 59being simple. Moreover, there is little point tightly optimising AST 60manipulation when any AST that is large enough to see benefit from this most 61likely results from a model that is too large to verify. 62 63A side-effect of this “deep copy” pointer design is that the AST is inherently a 64Directed Acyclic Graph (DAG). It is not possible for an AST node to contain a 65reference to something above itself in the AST because assigning this reference 66takes a copy of the parent node. 67 68This means that during symbol resolution, a recursive function call has no way 69to store a fully resolved reference to its callee. To see why this is, follow 70through the symbol resolution process. To begin with, the containing function 71has a statement that includes an unresolved function call. The symbol resolution 72traversal finds this unresolved function call and resolves it to the containing 73function by copying it. However, this copy still contains the unresolved version 74of the function call as a child. We can descend into the callee function 75(``rumur::FunctionCall::function``) but we’ll merely repeat the process and have 76a deeper unresolved function call. 77 78The current design merely avoids descending into the callees of function calls. 79This is acceptable, but it leaves an AST containing unresolved function calls 80that may come as a surprise to later consumers. 81 82Obtaining a fully resolved AST while supporting recursion seems inherently at 83odds with the current design. It seems like it would be a net loss to permit 84cycles in the AST (we would have to manually manage pointers and memory) but 85perhaps there is another compromise we could find. 86