Lionheart
A Refinement Logic Theorem Prover for Propositional Logic

In this paper, I propose a simple heuristic algorithm for deciding logical propositions. The correctness and completeness of the system are discussed.

II. Formula Representation

I use Smullyan's definition of propositional formulas as the basis of the data structure. Smullyan defines propositional formulas as follows:

A. Every propositional variable is a formula.
B. If X is a formula, so is ~X.
C. If X, Y are formulas, then for each of the binary connectives b, the expression (X b Y) is a formula.

The conditions A, B, and C correspond roughly to the classes Var, Not, and BinOp, three subclasses of PropFormula in Lionheart's C++ representation. In addition, I introduce the ConstFormula type, corresponding to the following rule:

D. true is a formula and false is a formula.

In the prover, however, this additional type (and then only false) serves solely as a placeholder on the right-hand side of the turnstile and is never manipulated. Thus the additional condition does not affect the completeness or correctness of the system.

Each of the formula classes includes a print method, which displays the representation of that formula. Formulas of type B and C are printed recursively. The printed representation follows the rules stated above.

The formula classes also support an equals method, which determines the structural equivalence of two formulas X and Y using the following rules:

1. The formulas must be of the same type.
2. Any immediate subformulas must be equivalent.

III. Parsing of Formulas

The C++ source file parser.cc contains a recursive-descent parser for the formulas described above. The parser recognizes the following operators, listed in order of decreasing precedence:

A. ~   : NOT
B. &   : AND
C. |   : OR
D. ==> : IMPLIES

The last three are binary operators and are left associative.
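
As an illustration of how such an operator table maps onto a recursive-descent parser, here is a minimal sketch with one function per precedence level. It is not the code in parser.cc: the class name Parser, the helper parseFormula, and the choice to return a fully parenthesized string instead of a PropFormula tree are all assumptions made for brevity. It assumes ~ binds tightest and ==> loosest, which is what the worked example in this section implies.

```cpp
#include <cctype>
#include <cstddef>
#include <stdexcept>
#include <string>

// Hypothetical sketch: parses a formula and returns it fully parenthesized.
// Names and structure are illustrative, not taken from parser.cc.
class Parser {
public:
    explicit Parser(const std::string& text) {
        // Spaces are ignored entirely, so they are dropped up front.
        for (char c : text)
            if (!std::isspace(static_cast<unsigned char>(c))) input_ += c;
    }

    std::string parse() {
        std::string result = parseImplies();
        if (pos_ != input_.size()) throw std::runtime_error("unexpected input");
        return result;
    }

private:
    std::string input_;
    std::size_t pos_ = 0;

    // Consumes `token` if it appears at the current position.
    bool accept(const std::string& token) {
        if (input_.compare(pos_, token.size(), token) != 0) return false;
        pos_ += token.size();
        return true;
    }

    // Lowest precedence: ==>. The loop makes the operator left associative.
    std::string parseImplies() {
        std::string left = parseOr();
        while (accept("==>")) left = "(" + left + " ==> " + parseOr() + ")";
        return left;
    }

    std::string parseOr() {
        std::string left = parseAnd();
        while (accept("|")) left = "(" + left + " | " + parseAnd() + ")";
        return left;
    }

    std::string parseAnd() {
        std::string left = parseNot();
        while (accept("&")) left = "(" + left + " & " + parseNot() + ")";
        return left;
    }

    // Highest precedence: the unary ~.
    std::string parseNot() {
        if (accept("~")) return "~" + parseNot();
        return parseAtom();
    }

    // A parenthesized subformula or a run of alphabetic characters.
    std::string parseAtom() {
        if (accept("(")) {
            std::string inner = parseImplies();
            if (!accept(")")) throw std::runtime_error("expected ')'");
            return inner;
        }
        std::string name;
        while (pos_ < input_.size() &&
               std::isalpha(static_cast<unsigned char>(input_[pos_])))
            name += input_[pos_++];
        if (name.empty()) throw std::runtime_error("expected variable");
        return name;
    }
};

std::string parseFormula(const std::string& text) { return Parser(text).parse(); }
```

With this sketch, parseFormula("p | q | r & t") yields "((p | q) | (r & t))". Because spaces are stripped before parsing, two adjacent names separated only by a space would merge into one variable; whether Lionheart behaves the same way is not stated here.
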
The parser does not require parentheses around a subexpression containing a binary operator; the precedence and associativity rules determine the grouping. Thus the input

    p | q | r & t

corresponds to the following Smullyan-style formula:

    ((p v q) v (r ^ t))

The parser ignores any spaces and encodes any contiguous alphabetic text as a variable. Parentheses are supported and function as in the Smullyan definition, overriding precedence.

IV. Decision Procedure

The automatic prover attempts to construct a proof of the given formula using the rules of refinement logic. A refinement logic proof step, or sequent, has the following form:

    H1, H2, H3, ... , Hn |- G

where each Hi is a formula and G is a formula. If a refinement-logic proof of the sequent exists, then we say G is true given H1, ... , Hn. The refinement logic system is isomorphic and equivalent to the analytic tableau system presented in Smullyan, and is therefore correct and complete. A propositional formula X is a tautology if and only if the sequent |- X has a refinement-logic proof. The rules of refinement logic are summarized in the appendix.

One difficulty with refinement logic is that many rules may be legally applied to a given sequent. For example, one might repeatedly apply notR and notL, moving one formula from right to left and back, without making any progress in the proof. For this reason, the prover uses a set of heuristics to determine when to apply a rule:

A. If the formula on the right of the turnstile is a variable and is not equal to any formula on the left side, notR.
B. If the formula on the right of the turnstile is a variable and is equal to a formula on the left side, hyp.
C. If the formula on the right of the turnstile is a binary formula with &, andR.
D. If the formula on the right of the turnstile is a binary formula with ==>, impR.
E. If the formula on the right of the turnstile is a binary formula with |, and one of its immediate subformulas is the logical conjugate of the other, magic.
F. If the formula on the right of the turnstile is a logical conjugate, notR.

Let a formula be called basic if it is either a variable or the logical conjugate of a binary formula with |.

G. If the formula on the right of the turnstile is a binary formula with | and all formulas on the left are basic, then attempt a proof after applying orR1. If unsuccessful, attempt a proof after applying orR2.
H. If the formula on the right of the turnstile is a binary formula with | and there are non-basic formulas on the left, notR.
I. If there is some non-basic formula on the left, apply one of orL, andL, impL, notL, in that order of precedence.
J. If a variable and its logical conjugate are on the left-hand side, apply notL to the conjugate. (Note that this is followed immediately by rule B.)
K. Give up and fail to find a proof.

The prover attempts to use A through K in that order, so the earlier rules take precedence over the later rules.

Claim: (Correctness) Any formula proved by this system is a tautology.

Proof: The system is constrained by the rules of refinement logic, which has been proved correct.

For the proof of completeness I must introduce a measure of the progress of the prover. Construct a function height which maps the propositional formulas to the natural numbers. Define

    height(p) = 0, for all propositional variables p
    height(~X) = height(X), for all propositional formulas X
    height(X b Y) = height(X) + height(Y) + 1, for all formulas X, Y, and all binary connectives b.

Further, construct seqheight, mapping refinement logic sequents to the natural numbers, such that if S is the sequent H1, ... , Hn |- G, then

    seqheight(S) = sum(i=1..n) height(Hi) + height(G)

Let us say that sequent S dominates T if T is generated by one of the subgoals of S.
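
The height and seqheight measures translate directly into a pair of recursive functions. The following is a minimal sketch only: the Formula and Sequent structs and the helpers var, neg, and bin are hypothetical stand-ins for Lionheart's classes, variable names are omitted because height does not depend on them, and constants are left out because the definition above does not assign them a height.

```cpp
#include <memory>
#include <utility>
#include <vector>

// Minimal stand-in for the formula classes (an assumption, not Lionheart's types).
struct Formula {
    enum Kind { Var, Not, BinOp } kind;
    std::shared_ptr<Formula> left;   // operand of Not, or left operand of BinOp
    std::shared_ptr<Formula> right;  // right operand of BinOp only
};
using FormulaPtr = std::shared_ptr<Formula>;

// Hypothetical constructors for the three kinds of formula.
FormulaPtr var() {
    return std::make_shared<Formula>(Formula{Formula::Var, nullptr, nullptr});
}
FormulaPtr neg(FormulaPtr x) {
    return std::make_shared<Formula>(Formula{Formula::Not, std::move(x), nullptr});
}
FormulaPtr bin(FormulaPtr x, FormulaPtr y) {
    return std::make_shared<Formula>(
        Formula{Formula::BinOp, std::move(x), std::move(y)});
}

// height counts the binary connectives in a formula; negation adds nothing.
unsigned height(const Formula& f) {
    switch (f.kind) {
        case Formula::Var:   return 0;
        case Formula::Not:   return height(*f.left);
        case Formula::BinOp: return height(*f.left) + height(*f.right) + 1;
    }
    return 0;  // not reached; keeps compilers quiet
}

// A sequent H1, ..., Hn |- G.
struct Sequent {
    std::vector<FormulaPtr> hyps;
    FormulaPtr goal;
};

// seqheight sums the heights of all hypotheses and of the goal.
unsigned seqheight(const Sequent& s) {
    unsigned total = height(*s.goal);
    for (const auto& h : s.hyps) total += height(*h);
    return total;
}
```

For example, the sequent ((p ==> q) & (q ==> r)) |- (p ==> r) has seqheight 4, and the subgoal ((p ==> q) & (q ==> r)), p |- r produced by impR has seqheight 3.
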
Note that this is not a strict ordering, since S may have up to two subgoals.

Lemma: Let the sequent S contain a formula with a binary connective. Then each subgoal of S contains a sequent T generated by the system such that seqheight(T) < seqheight(S).

Proof:

If the binary formula is on the right-hand side of S:

    If it is an AND formula, apply andR, decreasing the height of the right side for both subgoals.

    If it is an IMPLIES formula, apply impR on X==>Y. This decreases the height of the sequent by one, since Y remains on the right side and X joins the left side.

    If it is an OR formula, assume there are no binary formulas on the left side (since this case is handled separately). Then apply orR1/orR2, decreasing the height of the right side.

    If it is the logical conjugate of a binary formula, apply notR. The binary formula is then on the left side. (Note that the system does not generate formulas of the form ~~X, although this is legal by Smullyan's definition.) The system now applies left-hand-side rules.

If the binary formula is on the left-hand side of S:

    If it is an AND or OR formula, apply andL or orL. The loss of the connective decreases seqheight by at least 1.

    If it is an IMPLIES formula, apply impL. The two subgoals have height decreased by at least 1 for loss of the connective.

    If it is the logical conjugate of a binary formula, apply notL to bring a binary formula to the right side. The system now applies right-hand-side rules.

Claim: (Completeness) Every tautology is proved by the system.

Proof: From the lemma above we obtain the result that the system decreases the height of sequents until there are no more binary connectives. So the only formulas left are of the form p and ~p, where p is a propositional variable. Consider the 3 cases:

1) There is some variable p such that p and ~p are on the left side of the sequent. In this case, rule J is employed to apply notL to ~p.
This yields case 2:

2) There is some variable p such that p is on both sides of the sequent. In this case, the hyp rule is used, so the proof of the subgoal succeeds.

3) Neither of the above conditions holds. The pseudo-rule giveup is used, so the proof of the subgoal fails.

By the completeness and correctness result of refinement logic, a sequent is true iff its subgoals are true. The only means of losing information from a sequent is by applying orR1 or orR2, but the above system tries both cases. Therefore, the Lionheart system is complete.

V. Testing

A number of propositional logic tautologies from Smullyan were used to test the Lionheart prover. A portion of the results are found below:

|- (((p ==> q) & (q ==> r)) ==> (p ==> r)) by impR
((p ==> q) & (q ==> r)) |- (p ==> r) by impR
((p ==> q) & (q ==> r)), p |- r by notR
((p ==> q) & (q ==> r)), p, ~r |- false by andL
p, ~r, (p ==> q), (q ==> r) |- false by impL
p, ~r, (q ==> r) |- p by hyp
p, ~r, (q ==> r), q |- false by impL
p, ~r, q |- q by hyp
p, ~r, q, r |- false by notL
p, q, r |- r by hyp

|- ((((p ==> r) & (q ==> r)) & (p | q)) ==> r) by impR
(((p ==> r) & (q ==> r)) & (p | q)) |- r by notR
(((p ==> r) & (q ==> r)) & (p | q)), ~r |- false by andL
~r, ((p ==> r) & (q ==> r)), (p | q) |- false by andL
~r, (p | q), (p ==> r), (q ==> r) |- false by orL
~r, (p ==> r), (q ==> r), p |- false by impL
~r, (q ==> r), p |- p by hyp
~r, (q ==> r), p, r |- false by impL
~r, p, r |- q by notR
~r, p, r, ~q |- false by notL
p, r, ~q |- r by hyp
~r, p, r, r |- false by notL
p, r, r |- r by hyp
~r, (p ==> r), (q ==> r), q |- false by impL
~r, (q ==> r), q |- p by notR
~r, (q ==> r), q, ~p |- false by impL
~r, q, ~p |- q by hyp
~r, q, ~p, r |- false by notL
q, ~p, r |- r by hyp
~r, (q ==> r), q, r |- false by impL
~r, q, r |- q by hyp
~r, q, r, r |- false by notL
q, r, r |- r by hyp
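
One way to cross-check results like these independently of the prover is a brute-force truth table, since a formula is a tautology exactly when it is true under every assignment. This is not part of Lionheart as described; it is a minimal sketch of a test oracle, and the names Assignment, Formula, implies, and isTautology are assumptions. Formulas are written as plain C++ predicates over a vector of truth values.

```cpp
#include <cstddef>
#include <functional>
#include <vector>

// Hypothetical test oracle, not part of the prover itself.
using Assignment = std::vector<bool>;
using Formula = std::function<bool(const Assignment&)>;

// Material implication, matching the ==> connective.
bool implies(bool a, bool b) { return !a || b; }

// Returns true iff f holds under all 2^numVars assignments.
bool isTautology(const Formula& f, std::size_t numVars) {
    for (unsigned long mask = 0; mask < (1UL << numVars); ++mask) {
        Assignment a(numVars);
        for (std::size_t i = 0; i < numVars; ++i)
            a[i] = ((mask >> i) & 1UL) != 0;  // bit i gives the value of variable i
        if (!f(a)) return false;              // found a falsifying assignment
    }
    return true;
}
```

The two formulas proved above can be encoded with p, q, r as indices 0, 1, 2 and passed to isTautology with numVars set to 3. The check is exponential in the number of variables, so it suits only small test formulas.
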