Lionheart

A Refinement Logic Theorem Prover for Propositional Logic


In this paper, I propose a simple heuristic algorithm for deciding
logical propositions.  The correctness and completeness of the system
are discussed.

II. Formula Representation

I use Smullyan's definition of propositional formulas as the basis of
the data structure.  Smullyan defines propositional formulas as
follows:

A. Every propositional variable is a formula
B. If X is a formula so is ~X.
C. If X, Y are formulas, then for each of the binary connectives b,
the expression (X b Y) is a formula.

The conditions A, B, and C correspond roughly to the classes Var, Not,
and BinOp, three subclasses of PropFormula in Lionheart's C++
representation.  In addition, I introduce the ConstFormula type,
corresponding to the following rule:

D. true is a formula and false is a formula.

In the prover's structure, however, this additional type, and then
only F, serves as a placeholder on the right-hand side of the
turnstile and is not manipulated at all.  Thus the additional
condition does not affect the completeness or correctness of the
system.

Each of the formula classes includes a print method, which displays
the representation of that formula.  Formulas of type B and C are
printed recursively.  The representation of formulas is as stated in
the rules above. 

The formula classes also support an equals method to determine the
equivalence of two formulas X and Y using the following rules:

1. The formulas must be of the same type.
2. Any immediate subformulas must be equivalent.

III. Parsing of Formulas

The C++ source file parser.cc contains a recursive-descent parser for
the formulas described above.  The parser recognizes the following
operators, listed in order of increasing precedence:

A. ~   : NOT
B. &   : AND
C. |   : OR
D. ==> : IMPLIES

The last three are binary operators and are left associative.  The
parser does not require parentheses around a subexpression containing
a binary operator--the precedence and associativity rules are used to
decide semantic meaning.  Thus the input

p | q | r & t

corresponds to the following Smullyan-style formula:

((p v q) v (r ^ t))

The parser ignores any spaces and encodes any contiguous alphabetic
text as a variable.  Parentheses are supported and function as in the
Smullyan definition, overriding precedence.

IV. Decision Procedure

The automatic prover attempts to achieve a proof of the given formula
using the rules of refinement logic.  A refinement logic proof step,
or sequent, has the following form:

H1, H2, H3, ... , Hn |- G

where Hi is a formula, G is a formula.  If a refinement-logic proof of
the sequent exists, then we say G is true given H1, ... , Hn.

The refinement logic system is isomorphic and equivalent to the
analytic tableau system presented in Smullyan, and is therefore
correct and complete.  A propositional formula X is a tautology if and
only if the sequent

   |- X

has a refinement-logic proof.  The rules of refinement logic are
summarized in the appendix.

One difficulty with refinement logic is that many possible rules may
be legally applied to a given sequent.  For example, one might
repeatedly apply notR and notL, moving one formula from right to left,
without making any progress in the proof.  For this reason, the prover
makes use of a set of heuristics to determine when to apply a rule:

A. If the formula on the right of the turnstile is a variable and is
not equal to any formula on the left side, notR.

B. If the formula on the right of the turnstile is a variable and is
equal to a formula on the left side, hyp.

C. If the formula on the right of the turnstile is a binary formula
with &, andR.

D. If the formula on the right of the turnstile is a binary formula
with ==>, impL.

E. If the formula on the right of the turnstile is a binary formula
with |, and one of its immediate subformulas is the logical conjugate 
of the other, magic.

F. If the formula on the right of the turnstile is a logical conjugate,
notR.

Let a formula be called basic if it is either a variable or the
logical conjugate of a binary formula with |.

G. If the formula on the right of the turnstile is a binary formula
with | and all formulas on the left are basic, then attempt a proof after
applying an orR1.  If unsuccessful, attempt a proof after applying an 
orR2.

H. If the formula on the right of the turnstile is a binary formula
with | and there are non-basic formulas on the left, notR.

I. If there is some non-basic formula on the left, apply one of orL,
andL, impL, notL, in that order of precedence.

J. If a variable and its logical conjugate are on the left hand side,
apply notL to the conjugate. (Note that this is followed immediately by
rule B).

K. Give up and fail to find a proof.

The prover attempts to use A through K in that order, so the earlier
rules take precedence over the later rules.

Claim: (Correctness) Any formula proved by this system is a tautology.

Proof: The system is constrained by the rules of refinement logic,
which has been proved correct.

For the proof of correctness I must introduce a measure of the
progress of the prover.  Construct a function height which maps the
propositional formulas to the natural numbers.  Define

height(p) = 0, for all propositional variables p
height(~X) = height(X), for all propositional formulas X
height(X b Y) = height(X) + height(Y) + 1, for all formulas X, Y, and
all binary connectives b.

Further, construct seqheight, mapping refinement logic sequents to the
natural numbers, such that if S is the sequent

H1, ... , Hn |- G

seqheight(S) = sum(i=1..n)height(Hi) + height(G)

Let us say that sequent S dominates T if T is generated by one of the
subgoals of S.  Note that this is not a strict ordering, since S may
have up to two subgoals.

Lemma: Let the sequent S contain a formula with a binary connective.
Then each subgoal of S contains a sequent T generated by the system
such that seqheight(T) < seqheight(S).

Proof: 

If the binary formula is on the right-hand side of S:

   If it is an AND formula, apply andR, decreasing the height of the
   right side for both subgoals.

   If it is an IMPL formula, apply impR on X==>Y.  This decreases the
   height of the sequent by one, since Y remains on the right side and
   X joins the left side.

   If it is an OR formula, assume there are no binary formulas on the
   left side (since this case is handled separately).  Then apply
   orR1/orR2, decreasing the height of the right side.

   If it is the logical conjugate of a binary formula, apply notR.
   The binary formula is then on the left side.  (Note that the system
   does not generate formulas of the form ~~X, although this is legal
   by Smullyan's definition).  The system now applies left-hand-side
   rules.

If the binary formula is on the left-hand side of S:

   If it is an AND or OR formula, apply andL or orL.  The loss of the
   connective decreases seqheight by at least 1.

   If it is an IMPL formula, apply impL.  The two subgoals have height
   decreased by at least 1 for loss of the connective.

   If it is the logical conjugate of a binary formula, apply notL to
   bring a binary formula to the right side.  The system now applies
   right-hand-side rules.

Claim: (Completeness) Every tautology is proved by the system.

Proof: From the lemma above we obtain the result that the system
decreases the height of sequents until there are no more binary
connectives.  So the only formulas left are of the form p and ~p,
where p is a propositional variable.  Consider the 3 cases:

1) There is some variable p such that p and ~p are on the left side of
   the sequent.  In this case, rule A is employed to apply notL to
   ~p.  This yields case 2:

2) There is some variable p such that p is on both sides of the
   sequent.  In this case, the hyp rule is used, so the proof of the
   subgoal succeeds.

3) Neither of the above conditions holds.  The pseudo-rule giveup is
   used, so the proof of the subgoal fails.

By the completeness and correctness result of refinement logic, a
sequent is true iff its subgoals are true.

The only means of losing information from a sequent is by applying
orR1 or orR2, but the above system tries both cases.  Therefore, the
Lionheart system is complete.

V. Testing

A number of propositional logic tautologies from Smullyan were used to
test the Lionheart prover.  A portion of the results are found below:

 |- (((p ==> q) & (q ==> r)) ==> (p ==> r)) by impR
((p ==> q) & (q ==> r)) |- (p ==> r) by impR
((p ==> q) & (q ==> r)), p |- r by notR
((p ==> q) & (q ==> r)), p, ~r |- false by andL
p, ~r, (p ==> q), (q ==> r) |- false by impL
   p, ~r, (q ==> r) |- p by hyp

   p, ~r, (q ==> r), q |- false by impL
      p, ~r, q |- q by hyp

      p, ~r, q, r |- false by notL
      p, q, r |- r by hyp


 |- ((((p ==> r) & (q ==> r)) & (p | q)) ==> r) by impR
(((p ==> r) & (q ==> r)) & (p | q)) |- r by notR
(((p ==> r) & (q ==> r)) & (p | q)), ~r |- false by andL
~r, ((p ==> r) & (q ==> r)), (p | q) |- false by andL
~r, (p | q), (p ==> r), (q ==> r) |- false by orL
   ~r, (p ==> r), (q ==> r), p |- false by impL
      ~r, (q ==> r), p |- p by hyp

      ~r, (q ==> r), p, r |- false by impL
         ~r, p, r |- q by notR
         ~r, p, r, ~q |- false by notL
         p, r, ~q |- r by hyp

         ~r, p, r, r |- false by notL
         p, r, r |- r by hyp

   ~r, (p ==> r), (q ==> r), q |- false by impL
      ~r, (q ==> r), q |- p by notR
      ~r, (q ==> r), q, ~p |- false by impL
         ~r, q, ~p |- q by hyp

         ~r, q, ~p, r |- false by notL
         q, ~p, r |- r by hyp

      ~r, (q ==> r), q, r |- false by impL
         ~r, q, r |- q by hyp

         ~r, q, r, r |- false by notL
         q, r, r |- r by hyp