1/*! 2 3\page theory_api_howto HOWTO Write a Decision Procedure 4 5Note: This document is under construction. Some newer aspects of adding a 6decision procedure are not yet documented here. Please let us know if you have 7a question that is not answered here. 8 9\section theory_api_contents Contents 10 11<ul> 12<li>\ref theory_api_prelim 13 <ul> 14 <li>\ref theory_api_files 15 <li>\ref theory_api_naming 16 <li>\ref theory_api_general 17 </ul> 18<li>\ref theory_api_addnew 19 <ul> 20 <li>\ref theory_api_start 21 <li>\ref theory_api_config 22 <li>\ref theory_api_header 23 <li>\ref theory_api_expr 24 <ul> 25 <li>\ref theory_api_kind_decl 26 <li>\ref theory_api_kind_reg 27 <li>\ref theory_api_expr_subclass 28 <li>\ref theory_api_methods 29 </ul> 30 </ul> 31<li>\ref theory_api_invars 32 <ul> 33 <li>\ref theory_api_flow 34 <li>\ref theory_api_backtrack 35 <li>\ref theory_api_ileaves 36 <li>\ref theory_api_inputs 37 <ul> 38 <li>\ref theory_api_assertFact 39 <li>\ref theory_api_checkSat 40 <li>\ref theory_api_setup 41 <li>\ref theory_api_update 42 <li>\ref theory_api_addSharedTerm 43 <li>\ref theory_api_rewrite 44 <li>\ref theory_api_solve 45 <li>\ref theory_api_computeType 46 <li>\ref theory_api_computeTCC 47 <li>\ref theory_api_notifyInconsistent 48 <li>\ref theory_api_print 49 <li>\ref theory_api_parseExprOp 50 </ul> 51 <li>\ref theory_api_outputs 52 <ul> 53 <li>\ref theory_api_enqueueFact 54 <li>\ref theory_api_setInconsistent 55 <li>\ref theory_api_simplifyThm 56 <li>\ref theory_api_enqueueEquality 57 <li>\ref theory_api_inconsistentThm 58 <li>\ref theory_api_isTerm 59 <li>\ref theory_api_isLiteral 60 <li>\ref theory_api_isAtomic 61 <li>\ref theory_api_updateHelper 62 <li>\ref theory_api_getType 63 <li>\ref theory_api_getTCC 64 <li>\ref theory_api_parseExpr 65 </ul> 66 </ul> 67<li>\ref theory_api_proofs 68 <ul> 69 <li>\ref theory_api_proof_classes 70 <li>\ref theory_api_proof_rule 71 </ul> 72</ul> 73 74\section theory_api_prelim Preliminaries and Coding Guidelines 75 76\subsection theory_api_files Directory Structure and Files 77 78Each theory (a.k.a. decision procedure, or DP) must reside in a 79directory <tt>src/theory_foo</tt> (where <tt>foo</tt> is the theory 80name). All the sounce files and <tt>Makefile.in</tt> must be in that 81directory, except for some shared header files, which must reside in 82the common directory <tt>src/include</tt>. In particular, 83<tt>theory_foo.h</tt> must be placed in <tt>src/include</tt>. 84 85The master source file for a theory should be called 86<tt>theory_foo.cpp</tt>. Typically, a new theory needs its own set of 87proof rules, which are implemented as three files: 88<tt>foo_proof_rules.h</tt>, <tt>foo_theorem_producer.h</tt>, and 89<tt>foo_theorem_producer.cpp</tt>. The first one is the abstract 90interface to the rules, which the untrusted DP code can 91<tt>\#include</tt>. The other two files implement this abstract API, 92and comprise the trusted part of the code. All these files should be 93placed in <tt>src/theory_foo</tt>. 94 95\subsection theory_api_naming Naming Convensions 96 97Each individual theory is a subclass of a Theory class, and therefore, 98its name should be <tt>class TheoryFoo</tt>. 99 100A theory may define several methods for creating expressions in 101<tt>theory_foo.h</tt>. Typically, they are named as 102<tt>barExpr()</tt>, <tt>bazExpr()</tt>, etc. For instance, 103<tt>theory_arith.h</tt> defines plusExpr(), minusExpr(), etc. These 104functions are most useful if they are declared as non-member functions 105(not part of the class <tt>TheoryFoo</tt>), since then they can be 106called directly by the CVC3 library users. 107 108It is also convenient to define testers like <tt>isBar()</tt> and 109<tt>isBaz()</tt>, which return true for the corresponding expressions. 110For example, <tt>theory_arith.h</tt> defines <tt>isPlus()</tt>, 111<tt>isMinus()</tt>, etc. 112 113Of course, the rest of the code should follow our general naming and 114coding guidelines described at length in the project FAQ: 115http://verify.stanford.edu/savannah/faq/?group_id=1&question=How_Do_I_Become_A_CVC3_Developer.txt . 116 117\subsection theory_api_general General Guidelines 118 119The Theory API is designed to minimize the changes in the central 120(core) files when adding a new theory. Ideally, only 121<tt>src/vcl/vcl.cpp</tt> file needs to be modified to add a 122constructor call to the new theory. Everything else --- kinds, 123special expression subclasses, types, etc. --- can (and should) be 124defined locally in the theory-specific files. 125 126There are several important guidelines for writing 127proof rules specific to your theory: 128 129<ul> 130<li>Each proof rule must be mathematically sound. 131 132<li>It is a mortal sin to introduce a proof rule without a proper 133doxygen comment concisely and clearly describing what it does. 134 135<li>The code must contain enough <tt>CHECK_SOUND</tt> checks to 136guarantee that all the premises are of the right form, and all the 137side conditions of the rule are satisfied. That is, if these 138soundness checks pass, then the rule application is guaranteed to be 139sound mathematically. 140 141<li>Keep the code in each rule as clean and simple as possible, so 142that the correctness of the implementation w.r.t. the mathematical 143formulation can be easily checked by manual inspection. That is, many 144simple and independent rules are strongly preferred to a few big 145mega-rules. 146 147<li>Avoid calling other proof rules from within a proof rule. Rule of 148thumb: if your rule has to use other rules, most likely, it is a 149<em>strategy</em> (a derived rule), and should be implemented in the 150untrusted part of the code (e.g. in <tt>theory_foo.cpp</tt>). 151 152</ul> 153 154More details are given in the separate 155\ref theory_api_proofs "section on proof rules". 156 157Finally, and <em>most importantly</em>: <strong>document your 158code!!</strong> 159 160Every method, variable, class, or type should have a doxygen-style 161comment with <em>at least</em> a brief description: 162 163\verbatim 164//! Convert Blah expression to Baz value 165Baz Blah2Baz(const Blah& b); 166\endverbatim 167 168Imagine telling such a description to your colleague who has very 169faint idea of what you are coding. If a brief description does not 170make sense to him/her, or the function does something non-trivial or 171non-obvious, add a longer description: 172 173\verbatim 174//! Convert Blah expression to Baz value 175/*! 176 * Recursively descend into Blah, collect every Foo-leaf, order them in 177 * the descending order of Baz-ability, and wrap into the Baz expression. 178 * 179 * \param b is a non-trivial Blah value (otherwise we assert-fail). 180 */ 181Baz Blah2Baz(const Blah& b); 182\endverbatim 183 184Often it is convenient to keep the brief description in the 185<tt>*.h</tt> file, and the long description in the corresponding 186<tt>*.cpp</tt> file. 187 188The only exceptions are the \ref theory_api "Theory API methods", for 189which no documentation is necessary, since these are already 190documented in <tt>theory.h</tt>. 191 192\section theory_api_addnew Adding New Theory 193 194\subsection theory_api_start Before You Start... 195 196...Make sure you have a very good idea about what your new decision 197procedure is supposed to do, why it is useful, and what exactly it 198decides. 199 200Next, write down all the definitions mathematically and phrase the 201scope of the new theory in terms of a logic with a particular 202signature (the set of predefined interpreted and uninterpreted symbols 203that belong to this theory). Make sure this signature is disjoint 204from all the other theories (this is <strong>very important</strong>, 205since CVC3 is based on Nelson-Oppen combination result, and 206requires signatures to be disjoint). 207 208Your decision procedure should decide inconsistency of a set of 209formulas in the above logic. Make sure you actually know an algorithm 210for that. While it is possible to add undecidable or incomplete 211theories to CVC3 (and we already have some), those should only be 212used for a very good reason. Otherwise, make sure you know that your 213algorithm is sound, complete, and terminating. 214 215Take the above algorithm and make it an <em>online</em> one. That is, 216your algorithm should be able to accept a new formula at any point in 217time, and perform some incremental computation to take that formula 218into account. This is usually the most complicated step as far as the 219math is concerned. I will explain this step in detail in section \ref 220theory_api_invars "Theory API" below. 221 222If you are adding your theory to the official CVC3 code base, 223<strong>discuss your ideas on the mailing 224list!</strong> Yes, <strong>do</strong> it, I mean it. If you do not, 225we will notice your sneaky additions right away, and will ask many 226difficult questions in public. So, you may as well just announce it 227yourself. 228 229\subsection theory_api_config Configure and Makefiles 230 231Now you are ready to begin. Pick a name for your new theory, say, 232<em>Foo</em>, and do the following steps. 233 234- Create a new CVS branch. You already know how to do this, right? 235If not, read the FAQ: 236http://verify.stanford.edu/savannah/faq/?group=vergrp . 237Read it <strong>NOW</strong>. 238 239- Create a directory <tt>src/theory_foo</tt>, add it to CVS 240 241- Create the following files (either empty or copy them from an 242 existing theory), and add them <strong>to your new CVS branch</strong>: 243 <dl> 244 <dt>In <tt>src/include/</tt></dt> 245 <dd><tt>theory_foo.h</tt></dd> 246 <dt>In <tt>src/theory_foo/</tt></dt> 247 <dd><tt>theory_foo.cpp</tt><br> 248 <tt>Makefile.in</tt></dd> 249 </dl> 250 If your new theory needs to create new custom proof rules, also 251 create 252 <dl> 253 <dt>in <tt>src/theory_foo/</tt></dt> 254 <dd><tt>foo_proof_rules.h</tt><br> 255 <tt>foo_theorem_producer.h</tt><br> 256 <tt>foo_theorem_producer.cpp</tt></dd> 257 </dl> 258 While you at it, add the <strong>required header comment</strong> to 259 each source file (*.h and *.cpp), with the appropriately modified 260 file name, author, date, and an optional description <em>after</em> 261 the license excerpt. Look at any existing source file for a 262 template. 263- Edit <tt>src/theory_foo/Makefile.in</tt> appropriately (look for a 264 template from some other theory, for example, 265 <tt>theory_array</tt>). You need to modify the values of: 266 - <tt>LIBRARY</tt> 267 - <tt>LOCAL_OBJ_DIR</tt> and <tt>LOCAL_SRC_DIR</tt> 268 - <tt>SRC</tt> 269 - <tt>HEADERS</tt> (if you created any *.h files in 270 <tt>src/theory_foo/</tt>, such as <tt>foo_proof_rules.h</tt>). 271- Edit <tt>src/Makefile.in</tt> (it is a common Makefile for all theories): 272 - Add <tt>theory_foo.h</tt> to <tt>HEADER_NAMES</tt> 273 - Add <tt>cd theory_foo \&\& \$(MAKE) \$(TARGET)</tt> to 274 <tt>build:</tt> target 275 - Add <tt>theory_foo</tt> entry to <tt>CVC_LIB_NAMES</tt> 276- Edit <tt>configure.in</tt>: 277 - Add an entry <tt>\$CVC_SRC_SUBDIR/theory_foo/Makefile</tt> to 278 <tt>CVC_OUTPUT_FILES</tt> variable 279- Run <tt>autoheader</tt>, <tt>autoconf</tt>, <tt>./configure</tt>. 280- Edit <tt>src/vcl/vcl.cpp</tt> to add a call to your theory's constructor: 281 - Add <tt>\#include "theory_foo.h"</tt> at the top; 282 - Add <tt>d_theories.push_back(new TheoryFoo(this));</tt> to 283 the end of VCL::VCL() constructor. 284 285File <tt>src/vcl/vcl.cpp</tt> is <strong>the only source file</strong> 286in the core of CVC3 that you really need to touch in order to add 287a new theory. If you have to do something else, you are either doing it 288wrong, or there better be a very good reason for it. 289 290Now you are all set for compiling your new code, except that there is 291no code to compile yet... 292 293\subsection theory_api_header Header File: theory_foo.h 294 295The easiest way to start is to use an existing 296<tt>theory_whatever.h</tt> file as a template. This header file 297should declare the following elements: 298 299<ul> 300<li><tt>class TheoryFoo</tt> as a subclass of <tt>Theory</tt>; 301<li>Non-member (global) functions for creating expressions in your 302 theory, and possibly, testers for those expressions; for instance: 303 <tt>plusExpr()</tt>, <tt>multExpr()</tt>, ..., <tt>isPlus()</tt>, 304 <tt>isMult()</tt>, ... 305<li>Optionally, declaration of new kinds (an <tt>enum</tt> type) used 306in the above expressions. 307</ul> 308 309Declaration of <tt>TheoryFoo</tt> class is required. Kinds and 310functions for creating new expressions are only needed if you want the 311end-user of the CVC3 library to be able to create expressions from your 312theory and/or refer to their kinds directly. This is not always 313desirable. For instance, you may want to generate special terms or 314predicates that only make sense as temporary storage of information 315for <em>your</em> theory. The <tt>DARK_SHADOW</tt> and 316<tt>GRAY_SHADOW</tt> operators are good examples from the arithmetic 317theory. You have no clue what I'm talking about? That's right, you 318get it. 319 320\subsection theory_api_expr Kinds, Expressions, and Types 321 322The purpose of a new theory (or a <em>decision procedure</em>) is to 323extend the existing logic of CVC3 with new interpreted, and 324sometimes uninterpreted, operators and symbols. These new symbols 325comprise a <em>signature</em> of the theory, and it must be disjoint 326from the signatures of the other theories. 327 328In code, the new symbols and operators translate into new kinds of 329expressions. That is, new <em>kinds</em> and new values of the 330<tt>Expr</tt> class. 331 332A <em>kind</em> is a natural number which uniquely determines what 333sort of expression it is (variable, uninterpreted function symbol, 334specific operator like plus, minus, a type expression, etc.). Kinds 335are also used to define your <em>theory's signature</em>, and hence, 336they must be unique to your theory. 337 338In the simplest case, a kind represents the entire expression or type. 339Examples are simple types (like <tt>REAL</tt> and <tt>INT</tt>) and 340some constants (e.g. <tt>TRUE</tt> and <tt>FALSE</tt>). 341 342More typically, however, a kind represents an operator (<tt>PLUS</tt>, 343<tt>MINUS</tt>, ...), and children of an expression of that kind are 344the arguments of that operator. 345 346Finally, some kinds are used by more complex expressions with 347additional non-term attributes. For example, <tt>REC_LITERAL</tt> is 348used by record expressions { f1=e1, f2=e2, ... } whose children are 349the values of the fields (e1, e2, ...), and an additional attribute is 350the vector of field names: (f1, f2, ...). Other examples are 351quantifiers and lambda-terms. Such expressions require not only a new 352kind declaration, but also a special subclass of <tt>ExprValue</tt> 353(see below). 354 355\subsection theory_api_kind_decl Kind Declaration 356 357Kinds are declared as elements of an <tt>enum</tt> type (with doxygen 358comments): 359 360\verbatim 361//! Local kinds of TheoryFoo 362typedef enum { 363 FOO_TYPE = 3500, //!< Type FOO for the elements of this theory 364 BAR, //!< The binary (x | y) operator, where x,y: FOO 365 BAZ, //!< Unary buzz(x) predicate 366 BLEAH //!< An internal auxiliary term 367} FooKind; 368\endverbatim 369 370Notice, that the numbering in this example starts from 3500. This can 371be any integer which ensures that the new kinds do not clash with the 372existing kinds in other theories. Pick a random one, see if your 373kinds don't clash with others. You can check this either by 374inspecting <tt>src/include/kinds.h</tt> (the cental declaration of 375core kinds) and all the other theories, or compile and run 376<tt>cvc3</tt> to see if you get an error message about kinds 377registered twice. 378 379The above type declaration of <tt>FooKind</tt> can be either in 380<tt>src/include/theory_foo.h</tt> (usually the best place), or in some 381header file in <tt>src/theory_foo/</tt> directory, depending on 382whether you want to expose your kinds to the end library user or not. 383In the latter case, do not forget to add that header file to the 384<tt>HEADERS</tt> variable in your <tt>Makefile.in</tt> (see the \ref 385theory_api_config "previous section"). 386 387\subsection theory_api_kind_reg Kind Registration 388 389New kinds must be registered with the expression manager, which is 390done in your theory's constructor (see the 391\ref theory_api_methods "next section"). 392Registration is done by calling <tt>newKind()</tt> method of 393<tt>ExprManager</tt> for each kind: 394 395\verbatim 396getEM()->newKind(FOO_TYPE, "FOO"); 397getEM()->newKind(BAR, "||"); 398.... 399\endverbatim 400 401The string in the second argument is for printing the kind by the 402pretty-printer, and also for parsing, that is, turning a string back 403into a number (the kind). 404 405\subsection theory_api_expr_subclass New Expression Subclasses 406 407Sometimes an expression has a more complicated structure than a fixed 408operator with arguments, and more sophisticated data structures are 409necessary to represent it. CVC3 has an API for declaring and 410registering a new <em>expression subclass</em> just for that purpose. 411 412An expression subclass is a subclass of \ref ExprValue "ExprValue". 413You can declare it where appropriate in your theory 414files, using the same judgement as for the kinds. The new subclass 415needs to be registered with \c ExprManager by calling 416<tt>registerSubclass()</tt> method. 417 418A subclass of <tt>ExprValue</tt> <strong>must</strong> implement the 419following methods: 420 421\verbatim 422size_t computeHash() const; 423size_t getMMIndex() const; 424int isGeneric() const { return getMMIndex(); } 425bool isGeneric(int idx) const { return (idx == getMMIndex()); } 426// Syntactic equality of two expressions 427bool operator==(const ExprValue& ev2) const; 428// Make a clean copy of itself using the given memory manager 429ExprValue* copy(MemoryManager* mm, ExprIndex idx = 0); 430\endverbatim 431 432Also, each subclass must overload <tt>new()</tt> and <tt>delete()</tt> 433as follows: 434 435\verbatim 436void* operator new(size_t, MemoryManager* mm) { 437 return mm->newData(); 438} 439void operator delete(void*) { } 440\endverbatim 441 442Subclasses may overload other virtual methods of <tt>ExprValue</tt> as 443needed. For instance, <tt>arity()</tt>, <tt>getKids()</tt>, 444<tt>ExprValueGenericAttr "getXXXAttr"()</tt>, etc. Some standard 445attribute access functions can also be overloaded, 446e.g. <tt>getString()</tt>, <tt>getRational()</tt>. 447 448However, you should <strong>not</strong> overload testers such as 449<tt>isRecord()</tt>, <tt>isVar()</tt> and such (unless you really know 450what you are doing), since the core relies on specific properties of 451the corresponding subclasses. 452 453For example, let us say that the auxiliary term <tt>BLEAH</tt> in 454<tt>TheoryFoo</tt> needs a string attribute, which is a part of 455expression, but is not a child (and not even a term in the logic), and 456an integer attribute which is not part of the expression (e.g. it is 457needed for marking expressions during the algorithm run). An example 458of a new subclass for this expression is the following: 459 460\verbatim 461 class BleahExpr: public ExprValue { 462 private: 463 string d_str; //!< Data field; defines the value of expression 464 int d_int; //!< An attribute 465 size_t d_MMIndex; //!< The registration index for ExprManager 466 public: 467 //! Constructor 468 BleahExpr(ExprManager* em, const string& str, 469 size_t mmIndex, ExprIndex idx = 0) 470 : ExprValue(em, ARRAY_VAR, idx), d_str(str), 471 d_int(0), d_MMIndex(mmIndex) { } 472 473 //! String attribute (part of expression) 474 const std::string& getString() const { return d_str; } 475 476 //! Integer attribute (not part of expression) 477 int& getIntAttr(int idx) { return d_int; } 478 size_t getIntAttrSize() const { return 1; } 479 480 ExprValue* copy(MemoryManager* mm, ExprIndex idx = 0) const { 481 return new(mm) BleahExpr(d_em, d_str, d_MMIndex, idx); 482 } 483 484 size_t computeHash() const { 485 return PRIME*ExprValue::hash(BLEAH)+s_charHash(d_str.c_str()); 486 } 487 488 size_t getMMIndex() const { return d_MMIndex; } 489 size_t isGeneric() const { return getMMIndex(); } 490 bool isGeneric(size_t idx) const { return (idx == getMMIndex()); } 491 492 // Only compare the string, not the integer attribute 493 bool operator==(const ExprValue& ev2) const { 494 if(ev2.getMMIndex() != d_MMIndex) return false; 495 return (d_str == ev2.getString()); 496 } 497 498 void* operator new(size_t, MemoryManager* mm) { 499 return mm->newData(); 500 } 501 void operator delete(void*) { } 502 }; 503\endverbatim 504 505Registration of this subclass generates a memory manager index, which 506must be stored somewhere, usually in a class variable of 507<tt>TheoryFoo</tt>, say <tt>d_bleahIdx</tt> of type size_t. Registration in this 508case should be done in the constructor of <tt>TheoryFoo</tt>: 509 510\verbatim 511d_bleahIndex = getEM()->registerSubclass(sizeof(BleahExpr)); 512\endverbatim 513 514The approved and recommended way of <strong>creating 515expressions</strong> of <tt>BLEAH</tt> kind is the following: 516 517\verbatim 518Expr bleahExpr(const string& s) { 519 BleahExpr av(d_em, s, d_BleahIndex); 520 return newExpr(&av); 521} 522\endverbatim 523 524Using the new expression is now very easy: 525 526\verbatim 527Expr bleah = bleahExpr("cow moo"), b2 = bleahExpr("asdfqwerty"); 528bleah.getIntAttr(0) = 42; 529if(bleah != b2) 530 cout << bleah.getString() << b2.getIntAttr(0) << endl; 531\endverbatim 532 533Note that it is impossible to define a non-member expression creation 534function for <tt>BleahExpr</tt>, since the memory manager index is 535stored in the class-local variable of <tt>TheoryFoo</tt>. And 536<strong>don't even think</strong> of using a static variable to work 537around this limitation. 538 539In general, you should <strong>NEVER</strong> store anything in a 540static variable, since it may violate thread-safety of the library. 541Before you ever think of using a static variable for anything, think 542what happens if someone creates two copies of the system (with two 543different sets of expression and memory managers). Is it safe to 544share this variable among several system instances? In the case of 545the memory manager index, the answer is <em>definitely not</em>. 546 547One possible fix for this problem is to bind the memory manager to the 548kind(s) that the subclass uses. Let us know if you really need this 549feature, and it will be implemented. 550 551 552\subsection theory_api_methods Constructor and API Methods 553 554The only constructor for your new theory should have the following 555declaration: 556 557\verbatim 558TheoryFoo(VCL* vcl); 559\endverbatim 560 561The constructor has to: 562 563- Initialize the base class, 564- Register new kinds and expression subclasses (if any) with 565<tt>ExprManager</tt>, 566- Collect the kinds which belong to the new theory in a vector, and 567register the theory (method <tt>registerTheory()</tt>) with these 568kinds, 569- Initialize theory-specific data structures, if needed. 570 571Here is a typical example of the constructor implementation: 572 573\verbatim 574TheoryFoo::TheoryFoo(VCL *vcl): Theory(vcl, "Foo") { 575 d_rules = createProofRules(vcl); // instantiate our own rules 576 577 // Register new local kinds with ExprManager 578 getEM()->newKind(FOO_TYPE, "FOO"); 579 getEM()->newKind(BAR, "||"); 580 getEM()->newKind(BAZ, "BAZ"); 581 getEM()->newKind(BLEAH, "BLEAH"); 582 583 // Register our expression subclass 584 d_bleahIndex = getEM()->registerSubclass(sizeof(BleahExpr)); 585 586 vector<int> kinds; 587 kinds.push_back(FOO_TYPE); 588 kinds.push_back(BAR); 589 kinds.push_back(BAZ); 590 kinds.push_back(BLEAH); 591 592 registerTheory(this, kinds); 593} 594\endverbatim 595 596The following \ref theory_api "Theory API methods" are 597<strong>required</strong> in the subclass: 598 599\verbatim 600 void assertFact(const Theorem& e); 601 void checkSat(bool fullEffort); 602 void computeType(const Expr& e); 603\endverbatim 604 605Other methods are <strong>optional</strong>, but often needed: 606 607\verbatim 608 Theorem rewrite(const Expr& e); 609 void setup(const Expr& e); 610 void update(const Theorem& e, const Expr& d); 611 ExprStream& print(ExprStream& os, const Expr& e); 612 Expr parseExprOp(const Expr& e); 613\endverbatim 614 615Finally, a few other methods are rarely needed in practice: 616 617\verbatim 618 void addSharedTerm(const Expr& e); 619 Theorem solve(const Theorem& e); 620 void notifyInconsistent(const Theorem& thm); 621 Expr computeTCC(const Expr& e); 622\endverbatim 623 624The next section describes these API methods in more detail. You are 625also strongly encouraged to \ref theory_api "read the documentation" 626on each of these methods. 627 628\section theory_api_invars Theory API: The Very Important Invariants 629 630\subsection theory_api_flow High-Level Information Flow 631 632Every decision procedure communicates with the CVC3 Core 633interactively through several methods, most of which belong to the 634Theory class. Generally, those methods that your theory class 635re-implements carry information <em>from</em> the core, and others add 636new information generated in the DP <em>to</em> the core. Some 637methods (like <tt>rewrite()</tt> and <tt>solve()</tt>) return 638information through their return values. 639 640The chart below shows the flow of information to and from the decision 641procedure, and which parts of the core are responsible for collecting 642and generating it. Thick lines represent the most important methods, 643and dashed lines represent methods used only for very special 644occasions. 645 646\image html theory_api_flow.jpg 647\image latex theory_api_flow.eps "Flow of facts in CVC3" width=6in 648 649The most straightforward path of information, once it gets to the core 650through the external user input, is the following. First, all the 651relevant terms are typechecked (<tt>computeType()</tt> method). This 652method implements a step in the recursive typechecking algorithm, 653where the current expression is typechecked based on its structure and 654the types of its children. Typechecking of children is done by 655calling <tt>getType()</tt> or <tt>getBaseType()</tt>. The latter 656computes the largest supertype of the expression; for instance, the 657exact type of <tt>x</tt> may be a subrange <tt>0..5</tt>, which is a 658subtype of <tt>INT</tt>, but its base type is <tt>REAL</tt>. Exact 659and base types are cached on expressions, and are computed on demand. 660 661An important property of <tt>computeType()</tt> is that it must not 662only compute the exact type of the expression, but also verify that 663all subexpressions are type-correct, <em>relative to their base 664types</em>. For instance, <tt>x = y</tt> has type <tt>BOOLEAN</tt>, 665and it is only type-correct if <tt>getBaseType(x) == 666getBaseType(y)</tt>. If this property is violated, 667<tt>TypecheckException</tt> must be thrown with the appropriate 668message. Note, that the exact types of <tt>x</tt> and <tt>y</tt> may 669be different, and even disjoint. 670 671After the type checking, <em>Type Correctness Conditions</em> (TCCs) 672are generated and checked. TCC is a formula which is true if and only 673if any partial function in the original formula is used safely 674according to the Kleene semantics. That is, every partial function is 675either applied only to the arguments in its domain, or its value does 676not influence the value of the formula. 677 678TCCs are computed recursively by <tt>computeTCC()</tt> and 679<tt>getTCC()</tt> methods, very similar to computing the types. If 680your theory does not introduce partial functions explicitly (like 681division in arithmetic), then you do not need to re-implement 682<tt>computeTCC()</tt> in your theory; the default implementation will 683do the job. 684 685TCCs have a nice property that if they are true in the current 686context, then the corresponding user formulas can be safely 687interpreted by the total 2-valued models. Hence, 688<tt>computeTCC()</tt> is <strong>the only</strong> Theory API method 689that deals with partiality. All other methods consider any formula to 690be total (no partial functions) and 2-valued (only <tt>TRUE</tt> or 691<tt>FALSE</tt>, no undefined values. 692 693Once TCC has been proven valid in the current context, the new fact 694(formula) goes to the SAT solver, and if it is a literal (atomic 695formula or its negation), it is submitted to the decision procedure 696through <tt>assertFact()</tt>. The decision procedure processes this 697fact, updates its internal data structures, and possibly reports a 698contradiction (<tt>setInconsistent()</tt>) or new facts 699(<tt>enqueueFact()</tt>, and in special cases, 700<tt>enqueueEquality()</tt>). This completes the main loop of 701information flow. 702 703Note, that both <tt>enqueueFact()</tt> and <tt>setInconsistent()</tt> 704deliver information to the same place in the core, except that 705reporting the conflict bypasses the queue, and is taken care of 706immediately, rather than after all the previously enqueued facts are 707processed. This is the primary reason for having these two functions 708separated (as opposed to having only <tt>enqueueFact()</tt>). 709 710Inside this main loop, <tt>rewrite()</tt> and <tt>solve()</tt> are 711called to transform (or simplify) the facts before they reach the rest 712of the core. Normally, these functions do not have side-effects 713(except for caching results), and return new (simplified) facts 714through their return values. 715 716When the SAT solver runs out of facts, and the context is still 717satisfiable, it calls <tt>checkSat()</tt> with 718<tt>fullEffort==true</tt>. At this point, the decision procedure must 719determine whether all the information it has seen so far makes the 720context satisfiable or not w.r.t. its theory. Just like in the case 721of <tt>assertFact()</tt>, it may either report a contradiction, or 722enqueue a new fact. If any new fact is enqueued, it starts the main 723loop again. If <tt>checkSat()</tt> does not generate any new facts 724and does not find a contradiction, the core stops and reports the 725context to be satisfiable. 726 727Method <tt>checkSat()</tt> is also called every time the fact queue 728becomes empty, before the SAT solver asserts a new splitter. In this 729case, the <tt>fullEffort</tt> argument is set to <tt>false</tt>, and 730the decision procedure is not required to do anything. Many DPs, 731however, choose to perform some relatively inexpensive checks to 732detect inconsistencies and/or new facts, which increases performance. 733Similarly, if a new fact is enqueued, the main loop continues (without 734the SAT solver asserting new splitters) until the queue is empty. 735 736In CVC3, every term (non-formula expression) has a <em>canonical 737representative</em> in the union-find database. This database 738represents the equivalence classes of terms w.r.t. logical equality. 739All the terms in the formulas passing through the core are 740<em>simplified</em> by replacing them with their canonical 741representatives. 742 743Often, a decision procedure wants to be notified when a subexpression 744changes its canonical representative. For instance, if the DP has 745seen an term <tt>2*x+3*y</tt>, and <tt>x</tt> has changed its 746representative to <tt>y+2</tt>, then it is important to conclude that 747<tt>2*x+3*y == 2*(y+2) + 3*y == 5*y+4</tt>. For this purpose, the 748core maintains the <em>notify list</em> data structure, which is 749interfaced through <tt>setup()</tt> and <tt>update()</tt> methods. 750 751Every term in the core must be <em>setup</em>, and as a part of that 752process, the method <tt>setup()</tt> of the appropriate DP is called. 753Here the decision procedure has a chance to register notification 754requests related to the given expression. These requests are added to 755the <em>notify lists</em> of the relevant expressions using 756<tt>Expr::addToNotify()</tt> member method. 757 758Normally, a DP wants to be notified when immediate children of the 759expression change. For instance, for an expression <tt>2*x</tt>, if 760the variable <tt>x</tt> changes, the arithmetic DP wants to be 761notified about it. Therefore, in the <tt>setup(2*x)</tt> call, it 762adds <tt>2*x</tt> to the notify list of <tt>x</tt> by calling 763<tt>x.addToNotify(this, 2*x)</tt>. The first argument (<tt>this</tt>) 764is the reference to the current Theory subclass. 765 766Later, when <tt>x</tt> changes its canonical representative, say, to 767<tt>y+2</tt>, its notify list is consulted, and the <tt>update(x==y+2, 7682*x)</tt> call is made. The first argument is a directed equality 769informing the DP of what has changed, and the second is the expression 770for which this change is relevant. In this particular example, 771<tt>update()</tt> will enqueue a new fact: <tt>2*x==2*y+4</tt>. 772 773Similarly, when <tt>2*x+3*y</tt> is being setup, its immediate 774children (<tt>2*x</tt> and <tt>3*y</tt>) get the entire expression 775added to their notify lists. Later, when <tt>2*x</tt> changes its 776canonical representative to <tt>2*y+4</tt> due to the previous 777<tt>update()</tt> call, another <tt>update()</tt> call is made with 778<tt>2*x==2*y+4</tt> for <tt>2*x+3*y</tt>, and a new fact is 779enqueued: <tt>2*x+3*y==5*y+4</tt>, and so on. 780 781Note, that the notify list mechanism is not restricted to only 782immediate children. For instance, for high-degree monomials in 783non-linear arithmetics (e.g. <tt>x^2*y</tt>) it makes sense to 784register them with all factors (in this case, <tt>x</tt>, 785<tt>x^2</tt>, <tt>y</tt>, and <tt>x*y</tt>) which are not necessarily 786subexpressions of the original monomial (<tt>x^2*y</tt>). 787 788Finally, sometimes a decision procedure may want to know that the 789current context has become inconsistent, and this what 790<tt>notifyInconsistent()</tt> call is for. To date, only the 791quantifier theory uses it to find out which instantiations were useful 792in producing a conflict. Most likely, you do not need it. 793 794\subsection theory_api_backtrack Backtracking Data Structures 795 796Up to this point, the description assumed that new facts are always 797<em>added</em> to the current logical context, and <em>never 798removed</em>. This, of course, is not true in reality, since when a 799conflict is found, the SAT solver will <em>backtrack</em>, and 800<em>restore the context</em> to what it was before the assertion of a 801splitter. In particular, this means that each decision procedure 802needs to restore all its internal state to the same point. You may 803have noticed that there is no Theory API call to signal such 804backtracking. How the heck can a DP restore the state if it does not 805know when the core backtracks? 806 807The trick used in CVC3 is actually quite simple and elegant: DP 808<em>does not have to know about backtracking</em>, it indeed works 809under the assumption that facts can only be added to the context. 810However, all of its internal state must be stored in <em>backtracking 811data structures</em>, which backtrack automatically with the core. 812 813Such backtracking data structures are called <em>context-dependent 814objects</em> (CDO). There are currently three pre-defined 815context-dependent data structures: <em>CDO</em> (context-dependent 816object, <tt>cdo.h</tt>), <em>CDList</em> (backtracking stack, 817<tt>cdlist.h</tt>), and <em>CDMap</em> (backtracking map, similar to 818STL map, <tt>cdmap.h</tt>). 819 820Class CDO is a templated class for any C++ class which can be cleanly 821copied with <tt>operator=()</tt> and copy constructor, and which have 822the default constructor (this is how these objects are saved and 823restored on backtracking). CDO is best suited for individual 824variables (array indices, <tt>Expr</tt> or <tt>Theorem</tt> variables, 825etc.). 826 827Class CDList is a backtracking stack, and its API is very similar to 828that of STL vector. You can <tt>push_back()</tt> elements onto the 829stack, check the <tt>size()</tt> of the stack, and look up individual 830elements with <tt>operator[]</tt>. You <em>cannot</em>, however, 831modify or remove elements from the list. Keep in mind, that the size 832of the list may change between the API method calls, which means, you 833should keep any persistent indices to the list in backtracking 834variables (CDO). 835 836Class CDMap is a templated class very similar to STL <tt>map</tt>. 837You can add new key-value pairs to it, you can modify the value under 838a key, but you cannot remove a pair from the map. 839 840Let me repeat this again: <strong>all persistent data in a decision 841procedure MUST be stored in backtracking data structures!</strong> 842There are some rare exceptions to the rule (like storing the 843<tt>Expr</tt> representing the value "0" to avoid re-building it), but 844generally, you do not even want to know about backtracking. It is all 845done under the hood, and you should not care. 846 847\subsection theory_api_ileaves Variables and Foreign Terms (i-Leaves) 848 849Any subterm that your decision procedure cannot recognize must be 850treated as a variable. This is a very important point, and it may 851cause a long-lasting confusion for the beginning developers if not 852understood from the start. Read it carefully, several times, until 853you are sure you never forget it, even if I wake you up in the middle 854of the night. 855 856What CVC3 knows as a "variable" has nothing to do with what a 857decision procedure considers a "variable." These two are not very 858much related. 859 860A variable (or an <em>i-leaf</em>) from the DP point of view is either 861a CVC3 variable, or a <em>shared term</em> from some other theory. 862For instance, in <tt>2*arr[idx]-3*y</tt>, the subterm 863<tt>arr[idx]</tt> belongs to the theory of arrays, and therefore, is a 864variable (an i-leaf) as far as the theory of arithmetic is concerned. 865Similarly, <tt>y</tt> is a variable in the theory of arithmetic, 866because it is also a CVC3 variable. 867 868Such a definition does not provide a direct test for an i-leaf. 869Instead, you have to check whether this term is one of "yours" (one 870that your theory knows about), which is usually determined by the 871expression kind. If not, then it is a variable, as far as your theory 872is concerned. 873 874But never make any assumptions about an i-leaf; it can be any 875expression whatsoever, and <tt>Expr::isVar()</tt> tester will 876<em>not</em> necessarily return <tt>true</tt> for it. In other words, 877<em>there is no such thing as a variable</em> in your theory. There 878are only terms you cannot recognize, which you treat as variables. 879 880\subsection theory_api_inputs Methods Reimplemented in a Subclass 881 882\subsection theory_api_assertFact assertFact() 883 884There are no tricky invariants for this method. The only important 885property is that all the facts that are submitted to the DP through 886this call become part of the logical context which the DP must check 887for satisfiability. 888 889Mathematically, the asserted fact \f$\phi\f$ is added to the logical 890context \f$\Gamma\f$, and the job of the decision procedure is to 891check whether \f$\Gamma\f$ is satisfiable or not; in other words, 892we are solving the problem \f$\Gamma\models\bot\f$. 893 894When the decision procedure receives a new fact \f$\phi\f$, it may 895either save this fact in its internal database for later processing, 896or may immediately process it, and possibly derive new facts 897\f$\{\psi_1,\ldots,\psi_k\}\f$ from \f$\phi\f$ and submit them back to 898the core (<tt>enqueueFact()</tt>). 899 900In the case when the set of derived facts is equisatisfiable with the 901original fact \f$\phi\f$, the decision procedure does not need to keep 902\f$\phi\f$ in its database; the completeness will still be preserved. 903 904For instance, if the DP receives <tt>r1=r2</tt>, where <tt>r1</tt> and 905<tt>r2</tt> are records with fields <tt>f1</tt> and <tt>f2</tt>, then 906the two facts <tt>r1.f1=r2.f1</tt> and <tt>r1.f2=r2.f2</tt> 907(equalities of the individual fields) together are logically 908equivalent to the original fact <tt>r1=r2</tt>, and therefore, 909enqueuing them is sufficient for preserving completeness. The 910original fact need not be saved. 911 912\subsection theory_api_checkSat checkSat(bool fullEffort) 913 914The most important invariant (for completeness) is that when 915<tt>fullEffort</tt> is <tt>true</tt>, the DP must do all the work that 916it has postponed to find out if the current context is indeed 917satisfiable. In particular, if satisfiability can be achieved by 918making some of the shared terms equal, it must be done at this time 919(see \ref theory_api_addSharedTerm for more info on shared terms). 920 921This call is your last warning: if you do not act now, the whole 922system will stop and notify the user. However, the worst that can 923happen is that CVC3 becomes incomplete (it may report 924<tt>InValid</tt> when the query is actually valid). It still remains 925sound, however. That is, the <tt>Valid</tt> answer will still be 926correct. 927 928When <tt>fullEffort</tt> is <tt>false</tt>, the DP may choose to do as 929much or as little work as it wants. 930 931\subsection theory_api_setup setup() 932 933Add the given expression <tt>e</tt> to the notify list of all the 934expressions <tt>t<sub>1</sub>...t<sub>n</sub></tt> whose change would 935affect the value of <tt>e</tt>. Normally, such expressions are the 936immediate children of <tt>e</tt>. 937 938Whenever the canonical representative of any <tt>t<sub>i</sub></tt> 939changes in the union-find database, a corresponding call to 940<tt>update()</tt> will be made, and the DP will have a chance to 941re-process the expression <tt>e</tt> to keep it up-to-date. 942 943\subsection theory_api_update update() 944 945The property of this call is similar to <tt>assertFact()</tt>: the new 946fact becomes part of the logical context. However, the facts it 947receives do not necessarily belong to your theory, and are only 948reported because you asked the core to do so in <tt>setup()</tt>. 949 950Also, the new equalities that <tt>update()</tt> derives must be 951submitted through <tt>enqueueEquality()</tt> call. This also means 952that the right-hand side of the submitted equalities must be fully 953simplified. See <tt>\ref theory_api_enqueueEquality</tt> for more 954information. 955 956\subsection theory_api_addSharedTerm addSharedTerm() 957 958A term is called <em>shared</em> if it belongs to theory X, and appears 959as an i-leaf in a term from theory Y (that is, it's a Y-leaf). In 960this case, the term is shared by theories X and Y. For example, in 961<tt>2*arr[idx]-3</tt> the subterm <tt>arr[idx]</tt> belongs to the 962theory of arrays, but the entire term is an arithmetic expression; 963hence, <tt>arr[idx]</tt> is a shared term. 964 965When such a term appears in the system, the core notifies both 966theories about the term. 967 968Completeness of the CVC3 framework relies on the invariant that 969decision procedures propagate all the equalities between shared terms 970that can be derived in the current logical context. 971 972Often, the algorithms in DPs are designed to propagate all the 973equalities automatically (over all terms, including shared). In this 974case, <tt>addSharedTerm()</tt> need not be re-implemented. 975 976In some cases, however, the DP has to take extra effort to satisfy the 977above invariant, and it is more efficient to restrict this extra 978effort only to the set of shared terms. In this case, 979<tt>addSharedTerm()</tt> needs to collect the set of shared terms in a 980database 981(which, of course, has to be \ref theory_api_backtrack "backtrackable"), 982and use it in the <tt>checkSat()</tt> call. 983 984\subsection theory_api_rewrite rewrite(Expr e) 985 986This function must return a <em>rewrite Theorem</em> of the form 987<tt>e==e1</tt> (or <tt>e<=>e1</tt> if <tt>e</tt> is a formula), where 988<tt>e1</tt> is a logically equivalent term or formula. 989 990This function can assume that all the immediate children of <tt>e</tt> 991are already completely simplified and rewritten. The same property 992must hold for the result of the rewrite. 993 994Another invariant that <tt>rewrite()</tt> has to preserve is that if 995the result of a rewrite is an equality (you return 996<tt>e<=>(e1==e2)</tt>), then in the resulting equality <tt>e1 >= 997e2</tt> w.r.t. <tt>operator>=(Expr, Expr)</tt>, the fixed total 998ordering on all expressions given by the expression manager. This 999invariant is important for termination of the simplifier, since 1000equalities in CVC3 are used as (directed) rewrite rules, replacing 1001the left-hand side (<tt>e1</tt>) with the right-hand side 1002(<tt>e2</tt>). 1003 1004The core will call the <tt>rewrite()</tt> function iteratively on the 1005right-hand side of the result, until the expression does not change. 1006However, if the rewriting algorithm can guarantee that in a particular 1007case no further rewrites from this theory will change the expression, 1008the result can be flagged as a <em>normal rewrite</em>. In this case, 1009the core will not call <tt>rewrite()</tt> again, resulting in better 1010performance. The property that the expression indeed does not change 1011with further rewrites is checked in the "debug" build, and any 1012violation triggers assertion failures with ``<em>Simplify Error 10131</em>'' and ``<em>Simplify Error 2</em>'' messages. 1014 1015It is important to understand that the iterative call to 1016<tt>rewrite()</tt> only applies to the top-level node, and 1017<em>not</em> to subexpressions. That is, if <tt>rewrite()</tt> 1018changes the subexpressions (and not only the top-level operator), then 1019it may violate another invariant that all the children of the result 1020are completely rewritten and simplified. If this invariant cannot be 1021guaranteed, then <tt>rewrite()</tt> needs to call 1022<tt>simplifyThm()</tt> method explicitly. 1023 1024Here is an example of a rewrite function: 1025 1026\verbatim 1027Theorem TheoryFoo::rewrite(const Expr& e) { 1028 Theorem res; 1029 if(isBar(e)) { 1030 res = reflexivityRule(e); 1031 res.getRHS().setRewriteNormal(); // No more rewrites needed 1032 } else { 1033 // May need to rewrite several times 1034 res = < do real work > 1035 } 1036 return res; 1037} 1038\endverbatim 1039 1040\subsection theory_api_solve solve(Theorem e) 1041 1042This method takes an equality <tt>e</tt> (as a <tt>Theorem</tt> 1043object) and turns it into a logically equivalent <em>solved form</em>: 1044a conjunction of fully simplified equalities, possibly existentially 1045quantified. The terms on the left-hand sides cannot appear on any of 1046the right-hand side terms, and every free variable in the solved form 1047is also a free variable of <tt>e</tt>. (New variables in the solved 1048form must be existentially quantified). 1049 1050According to Clark Barrett's Ph.D. thesis, only one theory is allowed 1051to have a solver. In CVC3, such theory is the theory of 1052arithmetic. The restriction to a single solver in CVC3 is 1053somewhat relaxed, and several theories can have their own solvers, 1054provided that the solved form that such a secondary solver generates 1055is also a solved form w.r.t. the theory of arithmetic. This is the 1056only asymmetric and non-local invariant in the core of Theory API. 1057 1058 1059 1060\subsection theory_api_computeType computeType() 1061 1062The basic CVC3 type checking mechanism is a simple recursive 1063descent into the term structure, and it is implemented as a 1064<tt>getType()</tt> method in the base <tt>Theory</tt> class. 1065 1066When computing a type of an expression <tt>e</tt>, this method 1067determines which DP owns the expression, and calls the appropriate 1068<tt>computeType()</tt> method, which is expected to check the 1069expression for type consistency, and return the exact type of the 1070expression. The return type is then cached as an attribute on the 1071expression <tt>e</tt> for a fast look-up in the subsequent calls to 1072<tt>getType()</tt>. 1073 1074Each decision procedure must implement <tt>computeType()</tt> method 1075for all of its operators. For example, the theory of records has an 1076operator for constructing record literals, for extracting a field of a 1077record, and for updating a field of a record. This means that 1078<tt>computeType()</tt> needs to be able to compute the types for these 1079three operators, and verify that all subexpressions are of expected 1080types. 1081 1082Since subtypes in CVC3 are handled by TCCs, type consistency at 1083this stage is only checked with respect to the <em>base types</em>, 1084which is returned by <tt>getBaseType()</tt> method provided by the 1085base <tt>Theory</tt> class. For example, if a record expression 1086<tt>e</tt> has a field <tt>foo</tt> of type <tt>INT</tt>, and the 1087expression is a record update <tt>e WITH .foo := t</tt>, where 1088<tt>t</tt> is of type <tt>REAL</tt>, then this expression is 1089considered well-typed, since the base types of both <tt>INT</tt> and 1090<tt>REAL</tt> is <tt>REAL</tt>. 1091 1092 1093An important property of <tt>computeType()</tt> is that it must not 1094only compute the exact type of the expression, but also verify that 1095all subexpressions are type-correct, <em>relative to their base 1096types</em>. For instance, <tt>x = y</tt> has type <tt>BOOLEAN</tt>, 1097and it is only type-correct if <tt>getBaseType(x) == 1098getBaseType(y)</tt>. If this property is violated, 1099<tt>TypecheckException</tt> must be thrown with the appropriate 1100message. Note, that the exact types of <tt>x</tt> and <tt>y</tt> may 1101be different, and even disjoint. 1102 1103 1104 1105\subsection theory_api_computeTCC computeTCC() 1106 1107Type Correctness Condition (TCC) for an expression <em>e</em> (which 1108can be either a term or a formula) is a formula <em>D<sub>e</sub></em> 1109such that <em>D<sub>e</sub></em> is true if and only if <em>e</em> is 1110defined (or <em>denoting</em>) in the current logical context. 1111 1112For example, an expression <tt>x/y</tt> is undefined when 1113<tt>y=0</tt>, and is defined otherwise. Therefore, \f$D_{x/y}\equiv 1114y\ne 0\f$. 1115 1116\subsection theory_api_notifyInconsistent notifyInconsistent() 1117 1118 1119 1120\subsection theory_api_print print() 1121 1122The most important property of this method is that the printed 1123expressions have to be parsable by the appropriate CVC3 parser. 1124That is, CVC3 must be able to read what it prints. 1125 1126The recursive call to the global pretty-printer is implemented through 1127the overloaded <tt>operator<<</tt> for <tt>ExprStream</tt>. Read the 1128documentation on <tt>ExprStream</tt> class before coding. 1129 1130Once coded, <strong>test your printer code!</strong> Print all the 1131kinds of expressions from your theory, make the expressions large and 1132complex, interspersed with terms from other theories, etc. Make sure 1133it both looks good, and CVC3 can read every term it prints. 1134 1135Here's an example of the <tt>print()</tt> method: 1136 1137\verbatim 1138ExprStream& 1139TheoryFoo::print(ExprStream& os, const Expr& e) { 1140 switch(os.lang()) { 1141 case PRESENTATION_LANG: 1142 switch(e.getKind()) { 1143 case ARROW: 1144 os << "[" << push << e[0] << space << "-> " << e[1] << push << "]"; 1145 break; 1146 case EQ: 1147 os << "(" << push << e[0] << space << "= " << e[1] << push << ")"; 1148 break; 1149 case NOT: os << "NOT " << e[0]; break; 1150 ................. 1151 default: 1152 // Print the top node in the default LISP format, continue with 1153 // pretty-printing for children. 1154 e.print(os); 1155 } 1156 break; // end of case PRESENTATION_LANGUAGE 1157 case INTERNAL_LANG: 1158 .................. 1159 break; // end of case INTERNAL_LANG 1160 default: 1161 // Print the top node in the default LISP format, continue with 1162 // pretty-printing for children. 1163 e.print(os); 1164 } 1165 return os; 1166} 1167\endverbatim 1168 1169\subsection theory_api_parseExprOp parseExprOp() 1170 1171This method is not used yet, and is likely to change in the near 1172future. 1173 1174\subsection theory_api_outputs Methods Provided by Theory API 1175 1176The Theory API consists not only of methods to re-implement in the 1177subclass, but it also provides convenient methods in the base class 1178which subclasses can readily use. These methods subdivide into the 1179following categories: 1180 1181<ul> 1182<li>Methods for sending information to the core from a theory (a 1183decision procedure), 1184<li>\ref theory_api_core_proof_rules "Common proof rules", 1185<li>Other convenient methods. 1186</ul> 1187 1188\subsection theory_api_enqueueFact enqueueFact() 1189 1190Normally, a decision procedure should use this <em>and only this</em> 1191method for reporting newly derived facts back to the core. The only 1192exception is a contradiction (a <tt>FALSE</tt> Theorem), which should 1193be reported through <tt>setInconsistent()</tt> method for efficiency. 1194 1195Other exceptions include facts derived by the <tt>update()</tt> 1196function, which may be reported through 1197<tt>\ref theory_api_enqueueEquality</tt>. 1198 1199\subsection theory_api_setInconsistent setInconsistent() 1200 1201Similar to <tt>\ref theory_api_enqueueFact</tt>, except that it 1202requires the new Theorem to be <tt>FALSE</tt> (a contradiction). 1203 1204This method is used for more efficient processing of the derived 1205contradiction, bypassing the fact queue. 1206 1207\subsection theory_api_simplifyThm simplifyThm() and simplify() 1208 1209In rare cases, a decision procedure may want to simplify a given 1210expression w.r.t. the current context, and this is the function to 1211call. 1212 1213<strong>Be careful!</strong> This method may call your own 1214<tt>\ref theory_api_rewrite</tt> method recursively. 1215 1216It is also a relatively expensive function to call, so avoid it if 1217possible. 1218 1219\subsection theory_api_enqueueEquality enqueueEquality() 1220 1221This function can be used in <tt>\ref theory_api_update</tt> to 1222propagate the equalities induced by the given equality <tt>e1==e2</tt> 1223as the argument to <tt>update()</tt>. 1224 1225In CVC3, equalities are treated as <em>directional</em>, meaning 1226that left-hand side is always being replaced by the right-hand side. 1227This means that if in the expression <tt>d</tt> there is a 1228subexpression <tt>e1</tt>, then it must be replaced by <tt>e2</tt>. 1229It also means that the resulting expression <tt>d2</tt> must be 1230reported to be equal to the original one as <tt>d==d2</tt>, and 1231<em>not</em> the other way around. 1232 1233Since the core may occasionally swap equalities submitted through 1234<tt>\ref theory_api_enqueueFact</tt> (for termination reasons), it is 1235important to submit the above equality by-passing the swapping engine. 1236This is where <tt>enqueueEquality()</tt> is useful. 1237 1238<strong>Invariant:</strong> <tt>enqueueEquality()</tt> expects the 1239argument to be a Theorem of the form <tt>e1==e2</tt>, where 1240<tt>e2</tt> is a <em>fully simplified expression</em> in the current 1241context. That is, <tt>e2 == simplify(e2)</tt>. You are responsible 1242for maintaining this invariant in your decision procedure. 1243 1244\subsection theory_api_inconsistentThm inconsistentThm() and inconsistent() 1245 1246If the context is inconsistent, the <tt>inconsistent()</tt> method 1247returns <tt>true</tt>, and <tt>inconsistentThm()</tt> returns the 1248Theorem of <tt>FALSE</tt> (a proof of the contradiction). 1249 1250\subsection theory_api_isTerm isTerm() 1251 1252Tests whether the given expression is a term (as opposed to a formula). 1253 1254\subsection theory_api_isLiteral isLiteral() 1255 1256Tests whether the given formula is a literal (is an atomic formula or 1257its negation). 1258 1259\subsection theory_api_isAtomic isAtomic() 1260 1261Tests whether the given term or formula is atomic. In CVC3 it 1262means that the expression does not contain formulas as subexpressions 1263(for instance, as conditions in the <tt>IF-THEN-ELSE</tt> operator). 1264 1265This method is relatively expensive when called for the first time, 1266but it caches the result in the Expr attributes, so the amortized 1267complexity tends to be rather low. 1268 1269\subsection theory_api_updateHelper updateHelper() 1270 1271This method replaces all the immediate children of the given 1272expression by their canonical representatives w.r.t. the union-find 1273database, and returns the corresponding Theorem <tt>e==e1</tt>. 1274 1275This function is convenient to use inside <tt>\ref theory_api_update</tt> 1276for rewriting <tt>d</tt>, the expression being updated. However, it 1277only works when the changed subexpression in <tt>d</tt> is its 1278immediate child. 1279 1280\subsection theory_api_getType getType() 1281 1282\subsection theory_api_getTCC getTCC() 1283 1284\subsection theory_api_parseExpr parseExpr() 1285 1286 1287\section theory_api_proofs Proof Rules: The Trusted Core 1288 1289Every proven formula (or <em>fact</em>) in CVC3 appears in the 1290form of a <tt>Theorem</tt>. Values of type <tt>Theorem</tt> have a 1291special property: they cannot be constructed in any way but through 1292the <em>proof rules</em>. This is implemented by making all the 1293constructors of this class private. The only exception is the default 1294constructor, which creates a null theorem, and it can only be used to 1295create uninitialized variables of type <tt>Theorem</tt>, and assign 1296them later. 1297 1298A proof rule is a function which takes <em>premises</em> (previously 1299generated theorems) and other parameters, and generates a new theorem. 1300The implementation of proof rules comprises the <em>trusted core</em> 1301of CVC3, and the soundness of the tool relies entirely on the 1302soundness of this core. In other words, no matter what the bulk of 1303the code does, if CVC3 derives the validity of a particular fact, 1304it is guaranteed that that theorem is indeed valid, provided the 1305trusted implementation is correct and sound. 1306 1307For this reason, it is prudent to keep the trusted core reasonably 1308small, and more importantly, keep each proof rule clean and simple, so 1309that the correctness of the rule itself (mathematically) and its 1310implementation can be easily verified by manual inspection. 1311 1312For the same reason, <strong>every rule must be thoroughly 1313documented</strong>. It's a very good idea to include a LaTeX formula 1314for the proof rule that a function implements. Keep in mind that 1315reverse-engineering the mathematical meaning of a proof rule is a 1316daunting task, especially if the code is rather long and complex. 1317 1318Check out <tt>src/include/common_proof_rules.h</tt> for examples. 1319 1320\subsection theory_api_proof_classes Hierarchy of Classes 1321 1322Like anything else in CVC3, there is an API for implementing proof 1323rules defined by the class <tt>TheoremProducer</tt>. This class 1324provides two protected methods, <tt>newTheorem()</tt> and 1325<tt>newRWTheorem()</tt>, to its subclasses, which can create arbitrary 1326<tt>Theorem</tt> values. Therefore, a part of the code is considered 1327<em>trusted</em> whenever the file contains <tt>\#include 1328"theorem_producer.h"</tt> statement in it. To enforce this, 1329<tt>theorem_producer.h</tt> requires a macro symbol 1330<tt>_CVC3_TRUSTED_</tt> to be defined (otherwise, a compiler warning 1331is generated). 1332 1333<strong>Important:</strong> the <tt>_CVC3_TRUSTED_</tt> symbol must be 1334defined only in *.cpp files, and <strong>never</strong> in *.h, to 1335prevent accidental inclusion of <tt>theorem_producer.h</tt>, and thus, 1336inadvertently making large portions of code trusted. 1337 1338Exporting the proof rules to the untrusted code (class 1339<tt>TheoryFoo</tt>) is implemented through the custom API in 1340<tt>foo_proof_rules.h</tt> (abstract class <tt>FooProofRules</tt>), 1341whose pure methods are the proof rules. This header file does not 1342include <tt>theorem_producer.h</tt>, and therefore, is suitable for 1343inclusion by untrusted code. 1344 1345The implementation of <tt>FooProofRules</tt> consists of the 1346implementation API: <tt>foo_theorem_producer.h</tt> (class 1347<tt>FooTheoremProducer</tt>, inherits from <tt>FooProofRules</tt> and 1348<tt>TheoremProducer</tt>), and the implementation proper in 1349<tt>foo_theorem_producer.cpp</tt>. 1350 1351Normally, <tt>theorem_producer.h</tt> is included from 1352<tt>foo_theorem_producer.h</tt> (to declare 1353<tt>FooTheoremProducer</tt> as a subclass of 1354<tt>TheoremProducer</tt>), and <tt>foo_theorem_producer.h</tt> is 1355included from <tt>foo_theorem_producer.cpp</tt>: 1356 1357\verbatim 1358// File foo_proof_rules.h 1359class FooProofRules { 1360.... 1361}; 1362\endverbatim 1363 1364\verbatim 1365// File foo_theorem_producer.h 1366#include "theorem_producer.h" 1367 1368class FooTheoremProducer: public FooProofRules, public TheoremProducer { 1369.... 1370}; 1371\endverbatim 1372 1373\verbatim 1374// File foo_theorem_producer.cpp 1375#define _CVC3_TRUSTED_ 1376#include "foo_theorem_producer.h" 1377.... 1378\endverbatim 1379 1380In order for the theory code to use the proof rules, a pointer to 1381<tt>FooProofRules</tt> is declared in the <tt>TheoryFoo</tt> class: 1382 1383\verbatim 1384// File theory_foo.h 1385 1386class FooProofRules; 1387class TheoryFoo: public Theory { 1388...... 1389 FooProofRules* d_rules; 1390 //! Create an instance of FooProofRules class 1391 FooProofRules* createProofRules(VCL* vcl); 1392..... 1393}; 1394\endverbatim 1395 1396Since instantiating <tt>FooProofRules</tt> requires creating a new 1397object of class <tt>FooTheoremProducer</tt>, which belongs to trusted 1398part of the code, the implementation of <tt>createProofRules()</tt> 1399method needs to be in <tt>foo_theorem_producer.cpp</tt>, rather than 1400in <tt>theory_foo.cpp</tt>. This is the only exception to the rule 1401that everything declared in a <tt>X.h</tt> file must be implemented in 1402the corresponding <tt>X.cpp</tt> file in CVC3. 1403 1404\verbatim 1405// File foo_theorem_producer.cpp 1406...... 1407FooProofRules* TheoryFoo::createProofRules(VCL* vcl) { 1408 return new FooTheoremProducer(vcl); 1409} 1410..... 1411\endverbatim 1412 1413In the <tt>TheoryFoo()</tt> constructor, the class variable 1414<tt>d_rules</tt> is initialized by calling <tt>createProofRules()</tt>: 1415 1416\verbatim 1417// File theory_foo.cpp 1418..... 1419TheoryFoo::TheoryFoo(VCL* vcl): Theory(vcl, "Foo") { 1420..... 1421 d_rules = createProofRules(vcl); 1422..... 1423} 1424 1425// Destructor: destroy the proof rules class 1426TheoryFoo::~TheoryFoo() { delete d_rules; } 1427..... 1428\endverbatim 1429 1430\subsection theory_api_proof_rule Implementing a Proof Rule 1431 1432Each function implementing a proof rule has several components: 1433 1434<ul> 1435<li>Soundness check(s), 1436<li>The actual computation of the result, 1437<li>Building assumptions, 1438<li>Building a proof, 1439<li>Constructing a new <tt>Theorem</tt> object. 1440</ul> 1441 1442There is a special macro <tt>CHECK_SOUND(cond, message)</tt> for 1443checking the soundness conditions. If the condition <tt>cond</tt> 1444does not evaluate to <tt>true</tt>, then a <tt>SoundException</tt> is 1445thrown with the <tt>message</tt> string. 1446 1447For efficiency, the user may decide to skip the soundness checks. In 1448order to honor this decision, all soundness checks must be supressed 1449when the <tt>CHECK_PROOFS</tt> macro evaluates to <tt>false</tt>: 1450 1451\verbatim 1452if(CHECK_PROOFS) { 1453 CHECK_SOUND(denominator != 0, 1454 "TheoryArith: Division rule: denominator == 0"); 1455} 1456\endverbatim 1457 1458Soundness checks play <strong>the most important role</strong> in 1459making CVC3 sound. Your implementation must guarantee that if all 1460soundness checks pass, then the rule is indeed sound to apply, and the 1461theorem you generate at the end is indeed a theorem. Soundness checks 1462include verifying that the premises (Theorems given as arguments) are 1463of the expected format, and all the additional parameters satisfy all 1464the side conditions of the proof rule. 1465 1466Soundness checks <strong>must be complete and self-sufficient</strong> 1467(bullet-proof) within the rule; that is, no matter how the rule is 1468called and with which arguments, there should be no way for the rule 1469to generate an invalid theorem. Even if the untrusted code which 1470calls the rule does all the necessary checks, you have to do them 1471again inside the rule. This is the whole point of the code being 1472trusted: <em>it cannot go wrong, no matter what happens outside</em>. 1473In case of CVC3, this means that <em>any non-null</em> 1474<tt>Theorem</tt> <em>object represents a valid theorem</em>, no matter 1475how this theorem was generated. 1476 1477The main component of a <tt>Theorem</tt> object is a formula (returned 1478by <tt>Theorem::getExpr()</tt>), which is valid in the appropriate 1479<em>logical context</em>. The logical context is defined by the set 1480of <em>assumptions</em> carried along in the <tt>Theorem</tt> object. 1481Mathematically, a theorem object represents a <em>sequent</em> 1482\f$\Gamma\vdash\phi\f$, where \f$\Gamma\f$ is the set of assumptions 1483(formulas assumed to be true), and \f$\phi\f$ is the theorem itself, 1484the formula which logically follows from \f$\Gamma\f$. 1485 1486Typically, an inference rule has the following format: 1487\f[\frac{\Gamma_1\vdash\phi_1\quad\cdots\quad\Gamma_n\vdash\phi_n} 1488 {\Gamma_1,\ldots,\Gamma_n\vdash\psi}\f] 1489where the assumptions of the conclusion are the union of all the 1490assumptions from the premises. 1491 1492The easiest way to compute the set of assumptions for the conclusion 1493is to use the overloaded method <tt>merge()</tt> provided by the 1494<tt>theorem.h</tt> API. This way you do not have to bother about the 1495internal representation of assumptions. Keep in mind, that CVC3 1496may be running in the mode without assumptions (for efficiency), which 1497can be queried by <tt>withAssumptions()</tt> method: 1498 1499\verbatim 1500Theorem fooRule(const Theorem& prem1, const Theorem& prem2) { 1501..... 1502Assumptions a; 1503if(withAssumptions()) 1504 a = merge(prem1, prem2); 1505.... 1506} 1507\endverbatim 1508 1509If the rule accepts more than two premises, you can merge assumptions 1510by passing the vector of all premises to the <tt>merge()</tt> method. 1511 1512When there is only one premis, the simplest way is to make a clean 1513copy of the assumptions from the premis: 1514 1515\verbatim 1516if(withAssumptions()) 1517 a = premis.getAssumptionsCopy(); 1518\endverbatim 1519 1520Occasionally, one needs to remove assumptions from the set, as in the 1521following rule: 1522\f[\frac{\Gamma_1\vdash\alpha\quad \Gamma_2, \alpha\vdash\phi} 1523 {\Gamma_1, \Gamma_2\vdash\phi}\mbox{Cut}\f] 1524 1525In this case, you can use the overloaded <tt>operator-()</tt> for 1526class <tt>Assumptions</tt>: 1527 1528\verbatim 1529Theorem cutRule(const Theorem& alpha, const Theorem& phi) { 1530..... 1531Assumptions a; 1532if(withAssumptions()) 1533 a = (phi.getAssumptions() - alpha.getExpr()) + alpha.getAssumptions(); 1534.... 1535} 1536\endverbatim 1537 1538Remember, however, that due to the internal representation used in CVC3, 1539removing an assumption is quite expensive, while merging is very 1540cheap. 1541 1542Also, if the soundness of your rule relies on the presence (or 1543absence) of certain assumptions in premises, the first thing you need 1544to check for is that <tt>withAssumptions()</tt> returns <tt>true</tt> 1545(otherwise there is no way to determine the soundness of the rule, so 1546it should not be called in the mode without assumptions). 1547 1548In the above rule, if the assumption \f$\alpha\f$ were required to be 1549present for soundness of the rule, one could check it as follows: 1550 1551\verbatim 1552const Expr& alphaExpr = alpha.getExpr(); 1553const Assumptions& phiAssump = phi.getAssumptionsRef(); 1554if(CHECK_PROOFS) { 1555 CHECK_SOUND(withAssumptions(), 1556 "TheoryFoo::cutRule: called without assumptions!"); 1557 CHECK_SOUND(!phiAssump[alphaExpr].isNull(), 1558 "TheoryFoo::cutRule: alpha is not an assumption of phi"); 1559} 1560\endverbatim 1561 1562It is <strong>extremely important</strong> that assumptions are 1563computed correctly when <tt>withAssumptions()</tt> returns 1564<tt>true</tt>, since assumptions are used by the SAT solver in the 1565core framework, and are absolutely crucial for the results to be 1566correct (or <em>sound</em>). Remember, that assumptions represent the 1567logical context where the theorem is true, and if they are not 1568computed properly, the entire theorem may become invalid. 1569 1570Each <tt>Theorem</tt> object carries a <em>proof</em> of itself in the 1571form of a <tt>Proof</tt> object. 1572 1573Proofs are relatively expensive to generate (they take up extra space 1574and somewhat slow down the rule execution), and therefore, it is the 1575user's privilege to turn them off. For this reason, all proof 1576generation code must be guarded by the method <tt>withProof()</tt>, 1577similarly to <tt>withAssumptions()</tt>. 1578 1579CVC3 uses a version of Natural Deduction as its logical basis, and 1580exploits the idea of Curry-Howard isomorphism to represent proofs as 1581terms over function symbols representing proof rules. A 1582``<em>type</em>'' of a proof term is the formula (theorem) derived by 1583the corresponding proof. The correctness of a proof in this framework 1584corresponds to the proof term being ``well-typed.'' 1585 1586The <tt>TheoremProducer</tt> API provides an overloaded method 1587<tt>newPf()</tt> for building proof terms. The first argument of 1588(almost) any <tt>newPf()</tt> version is the name of the proof rule 1589(<tt>string</tt>), and the rest are the arguments (parameters as 1590<tt>Expr</tt> values, and the proofs of the rule's premises). 1591 1592The key idea in building a proof term for the rule is to provide 1593enough information in the proof term to be able to re-run the rule 1594again with exactly the same arguments. 1595 1596For instance, a proof term for the following rule: 1597\f[\frac{\Gamma_1\vdash x < y \quad \Gamma_2\vdash y < z} 1598 {\Gamma_1, \Gamma_2\vdash x < z}\mbox{project} 1599\f] 1600can be constructed as follows: 1601 1602\verbatim 1603Theorem projectRule(const Theorem& xLTy, const Theorem& yLTz) { 1604..... 1605Proof pf; 1606if(withProof()) { 1607 vector<Expr> exprs; 1608 vector<Proof> pfs; 1609 exprs.push_back(xLTy.getExpr()); exprs.push_back(yLTz.getExpr()); 1610 pfs.push_back(xLTy.getProof()); pfs.push_back(yLTz.getProof()); 1611 pf = newPf("project", exprs, pfs); 1612} 1613..... 1614} 1615\endverbatim 1616 1617It is a very good idea to describe the proof object arguments in the 1618proof rule documentation (doxygen comments) in 1619<tt>foo_proof_rules.h</tt> file. 1620 1621Note, that the proof object does <em>not</em> carry around any 1622information about the assumptions. This is because all the 1623assumptions are present implicitly as ``types'' of bound proof 1624variables in LAMBDA-terms. I skip the description of this issue in 1625the current version of this document, as it is rather subtle, and 1626there is a 99\% chance that you do not need to know that for your DP. 1627 1628Finally, creating the resulting theorem in the proof rule is usually 1629done by calling <tt>newTheorem(conclusionExpr, a, pf)</tt>, where 1630<tt>a</tt> is the <tt>Assumptions</tt> variable, and <tt>pf</tt> is 1631the proof term. 1632 1633There is a special class of proof rules called <em>rewrite rules</em> 1634in CVC3. These are proof rules without premises (<em>axioms</em>) 1635whose conclusion is of the form <tt>expr1 = expr2</tt> or <tt>frm1 1636<=> frm2</tt>. These rules are so ubiquitous in the system that there 1637is a special optimized constructor for the corresponding theorems: 1638<tt>newRWTheorem(expr1, expr2, a, pf)</tt>. 1639 1640<strong>Note:</strong> the <tt>Theorem</tt> object constructed by the 1641rewrite rule has a different internal representation from the "normal" 1642<tt>Theorem</tt> object. Therefore, constructing a rewrite theorem 1643with <tt>newTheorem(expr1.eqExpr(expr2), a, pf)</tt> will result in a 1644run-time error. Make sure that if your theorem is a rewrite theorem 1645(an equality <tt>=</tt> or equivalence <tt><=></tt>), then it must be 1646constructed using the <tt>newRWTheorem()</tt> method. 1647 1648I did not mention anything about computing the expression for the 1649conclusion of the rule, but it should be fairly obvious how to do 1650this. Remember, that each proof rule should be coded as concisely and 1651as cleanly as possible, to ensure the effectiveness of manual 1652inspection. Remember, this is a trusted part of the code. 1653<strong>K</strong>eep <strong>I</strong>t <strong>S</strong>imple, 1654<strong>S</strong>tupid. :-) 1655 1656<em>Sergey Berezin / berezin AT stanford DOT edu</em> 1657 1658*/ 1659