Equiv: Program Equivalence

Some Advice for Working on Exercises:

  • Most of the Coq proofs we ask you to do are similar to proofs that we've provided. Before starting to work on exercises problems, take the time to work through our proofs (both informally and in Coq) and make sure you understand them in detail. This will save you a lot of time.

  • The Coq proofs we're doing now are sufficiently complicated that it is more or less impossible to complete them by random experimentation or following your nose. You need to start with an idea about why the property is true and how the proof is going to go. The best way to do this is to write out at least a sketch of an informal proof on paper -- one that intuitively convinces you of the truth of the theorem -- before starting to work on the formal one. Alternately, grab a friend and try to convince them that the theorem is true; then try to formalize your explanation.

  • Use automation to save work! The proofs in this chapter can get pretty long if you try to write out all the cases explicitly.

Behavioral Equivalence

In an earlier chapter, we investigated the correctness of a very simple program transformation: the optimize_0plus function. The programming language we were considering was the first version of the language of arithmetic expressions -- with no variables -- so in that setting it was very easy to define what it means for a program transformation to be correct: it should always yield a program that evaluates to the same number as the original.

To talk about the correctness of program transformations for the full Imp language, in particular assignment, we need to consider the role of variables and state.

Definitions

For aexps and bexps with variables, the definition we want is clear: Two aexps or bexps are behaviorally equivalent if they evaluate to the same result in every state.

Here are some simple examples of equivalences of arithmetic and boolean expressions.

For commands, the situation is a little more subtle. We can't simply say two commands are behaviorally equivalent if they evaluate to the same ending state whenever they are started in the same initial state, because some commands, when run in some starting states, don't terminate in any final state at all! What we need instead is this: two commands are behaviorally equivalent if, for any given starting state, they either (1) both diverge or (2) both terminate in the same final state. A compact way to express this is if the first one terminates in a particular state then so does the second, and vice versa.

Simple Examples

For examples of command equivalence, let's start by looking at some trivial program transformations involving SKIP:

Exercise: 2 stars, standard (skip_right)

Prove that adding a SKIP after a command results in an equivalent program

Similarly, here is a simple transformation that optimizes TEST commands:

Of course, no (human) programmer would write a conditional whose guard is literally true. But they might write one whose guard is equivalent to true:

Theorem: If b is equivalent to BTrue, then TEST b THEN c1 ELSE c2 FI is equivalent to c1. Proof:

  • (->) We must show, for all st and st', that if st =[ TEST b THEN c1 ELSE c2 FI ]=> st' then st =[ c1 ]=> st'.

    Proceed by cases on the rules that could possibly have been used to show st =[ TEST b THEN c1 ELSE c2 FI ]=> st', namely E_IfTrue and E_IfFalse.

    • Suppose the final rule in the derivation of st =[ TEST b THEN c1 ELSE c2 FI ]=> st' was E_IfTrue. We then have, by the premises of E_IfTrue, that st =[ c1 ]=> st'. This is exactly what we set out to prove.

    • On the other hand, suppose the final rule in the derivation of st =[ TEST b THEN c1 ELSE c2 FI ]=> st' was E_IfFalse. We then know that beval st b = false and st =[ c2 ]=> st'.

      Recall that b is equivalent to BTrue, i.e., forall st, beval st b = beval st BTrue. In particular, this means that beval st b = true, since beval st BTrue = true. But this is a contradiction, since E_IfFalse requires that beval st b = false. Thus, the final rule could not have been E_IfFalse.

  • (<-) We must show, for all st and st', that if st =[ c1 ]=> st' then st =[ TEST b THEN c1 ELSE c2 FI ]=> st'.

    Since b is equivalent to BTrue, we know that beval st b = beval st BTrue = true. Together with the assumption that st =[ c1 ]=> st', we can apply E_IfTrue to derive st =[ TEST b THEN c1 ELSE c2 FI ]=> st'.

  • Here is the formal version of this proof:

    Exercise: 2 stars, standard, recommended (TEST_false)

    Exercise: 3 stars, standard (swap_if_branches)

    Show that we can swap the branches of an IF if we also negate its guard.

    For WHILE loops, we can give a similar pair of theorems. A loop whose guard is equivalent to BFalse is equivalent to SKIP, while a loop whose guard is equivalent to BTrue is equivalent to WHILE BTrue DO SKIP END (or any other non-terminating program). The first of these facts is easy.

    Exercise: 2 stars, advanced, optional (WHILE_false_informal)

    Write an informal proof of WHILE_false.

    (* FILL IN HERE *)

    To prove the second fact, we need an auxiliary lemma stating that WHILE loops whose guards are equivalent to BTrue never terminate.

    Lemma: If b is equivalent to BTrue, then it cannot be the case that st =[ WHILE b DO c END ]=> st'.

    Proof: Suppose that st =[ WHILE b DO c END ]=> st'. We show, by induction on a derivation of st =[ WHILE b DO c END ]=> st', that this assumption leads to a contradiction. The only two cases to consider are E_WhileFalse and E_WhileTrue, the others are contradictory.

    • Suppose st =[ WHILE b DO c END ]=> st' is proved using rule E_WhileFalse. Then by assumption beval st b = false. But this contradicts the assumption that b is equivalent to BTrue.

    • Suppose st =[ WHILE b DO c END ]=> st' is proved using rule E_WhileTrue. We must have that:

      1. beval st b = true, 2. there is some st0 such that st =[ c ]=> st0 and st0 =[ WHILE b DO c END ]=> st', 3. and we are given the induction hypothesis that st0 =[ WHILE b DO c END ]=> st' leads to a contradiction,

      We obtain a contradiction by 2 and 3.

    Exercise: 2 stars, standard, optional (WHILE_true_nonterm_informal)

    Explain what the lemma WHILE_true_nonterm means in English.

    (* FILL IN HERE *)

    Exercise: 2 stars, standard, recommended (WHILE_true)

    Prove the following theorem. Hint: You'll want to use WHILE_true_nonterm here.

    A more interesting fact about WHILE commands is that any number of copies of the body can be unrolled without changing meaning. Loop unrolling is a common transformation in real compilers.

    Exercise: 2 stars, standard, optional (seq_assoc)

    Proving program properties involving assignments is one place where the fact that program states are treated extensionally (e.g., x !-> m x ; m and m are equal maps) comes in handy.

    Exercise: 2 stars, standard, recommended (assign_aequiv)

    Exercise: 2 stars, standard (equiv_classes)

    Given the following programs, group together those that are equivalent in Imp. Your answer should be given as a list of lists, where each sub-list represents a group of equivalent programs. For example, if you think programs (a) through (h) are all equivalent to each other, but not to (i), your answer should look like this:

    [prog_a;prog_b;prog_c;prog_d;prog_e;prog_f;prog_g;prog_h] ; [prog_i]

    Write down your answer below in the definition of equiv_classes.

    Properties of Behavioral Equivalence

    We next consider some fundamental properties of program equivalence.

    Behavioral Equivalence Is an Equivalence

    First, we verify that the equivalences on aexps, bexps, and coms really are equivalences -- i.e., that they are reflexive, symmetric, and transitive. The proofs are all easy.

    Behavioral Equivalence Is a Congruence

    Less obviously, behavioral equivalence is also a congruence. That is, the equivalence of two subprograms implies the equivalence of the larger programs in which they are embedded:

    aequiv a1 a1'


    cequiv (x ::= a1) (x ::= a1')

    cequiv c1 c1' cequiv c2 c2'


    cequiv (c1;;c2) (c1';;c2')

    ... and so on for the other forms of commands.

    (Note that we are using the inference rule notation here not as part of a definition, but simply to write down some valid implications in a readable format. We prove these implications below.)

    We will see a concrete example of why these congruence properties are important in the following section (in the proof of fold_constants_com_sound), but the main idea is that they allow us to replace a small part of a large program with an equivalent small part and know that the whole large programs are equivalent without doing an explicit proof about the non-varying parts -- i.e., the proof burden of a small change to a large program is proportional to the size of the change, not the program.

    The congruence property for loops is a little more interesting, since it requires induction.

    Theorem: Equivalence is a congruence for WHILE -- that is, if b1 is equivalent to b1' and c1 is equivalent to c1', then WHILE b1 DO c1 END is equivalent to WHILE b1' DO c1' END.

    Proof: Suppose b1 is equivalent to b1' and c1 is equivalent to c1'. We must show, for every st and st', that st =[ WHILE b1 DO c1 END ]=> st' iff st =[ WHILE b1' DO c1' END ]=> st'. We consider the two directions separately.

  • (->) We show that st =[ WHILE b1 DO c1 END ]=> st' implies st =[ WHILE b1' DO c1' END ]=> st', by induction on a derivation of st =[ WHILE b1 DO c1 END ]=> st'. The only nontrivial cases are when the final rule in the derivation is E_WhileFalse or E_WhileTrue.

    • E_WhileFalse: In this case, the form of the rule gives us beval st b1 = false and st = st'. But then, since b1 and b1' are equivalent, we have beval st b1' = false, and E_WhileFalse applies, giving us st =[ WHILE b1' DO c1' END ]=> st', as required.

    • E_WhileTrue: The form of the rule now gives us beval st b1 = true, with st =[ c1 ]=> st'0 and st'0 =[ WHILE b1 DO c1 END ]=> st' for some state st'0, with the induction hypothesis st'0 =[ WHILE b1' DO c1' END ]=> st'.

      Since c1 and c1' are equivalent, we know that st =[ c1' ]=> st'0. And since b1 and b1' are equivalent, we have beval st b1' = true. Now E_WhileTrue applies, giving us st =[ WHILE b1' DO c1' END ]=> st', as required.

  • (<-) Similar.
  • Exercise: 3 stars, standard, optional (CSeq_congruence)

    Exercise: 3 stars, standard (CIf_congruence)

    For example, here are two equivalent programs and a proof of their equivalence...

    Exercise: 3 stars, advanced, optional (not_congr)

    We've shown that the cequiv relation is both an equivalence and a congruence on commands. Can you think of a relation on commands that is an equivalence but not a congruence?

    Program Transformations

    A program transformation is a function that takes a program as input and produces some variant of the program as output. Compiler optimizations such as constant folding are a canonical example, but there are many others.

    A program transformation is sound if it preserves the behavior of the original program.

    The Constant-Folding Transformation

    An expression is constant when it contains no variable references.

    Constant folding is an optimization that finds constant expressions and replaces them by their values.

    Note that this version of constant folding doesn't eliminate trivial additions, etc. -- we are focusing attention on a single optimization for the sake of simplicity. It is not hard to incorporate other ways of simplifying expressions; the definitions and proofs just get longer.

    Not only can we lift fold_constants_aexp to bexps (in the BEq and BLe cases); we can also look for constant boolean expressions and evaluate them in-place.

    To fold constants in a command, we apply the appropriate folding functions on all embedded expressions.

    Soundness of Constant Folding

    Now we need to show that what we've done is correct.

    Here's the proof for arithmetic expressions:

    Exercise: 3 stars, standard, optional (fold_bexp_Eq_informal)

    Here is an informal proof of the BEq case of the soundness argument for boolean expression constant folding. Read it carefully and compare it to the formal proof that follows. Then fill in the BLe case of the formal proof (without looking at the BEq case, if possible).

    Theorem: The constant folding function for booleans, fold_constants_bexp, is sound.

    Proof: We must show that b is equivalent to fold_constants_bexp b, for all boolean expressions b. Proceed by induction on b. We show just the case where b has the form BEq a1 a2.

    In this case, we must show

    beval st (BEq a1 a2) = beval st (fold_constants_bexp (BEq a1 a2)).

    There are two cases to consider:

  • First, suppose fold_constants_aexp a1 = ANum n1 and fold_constants_aexp a2 = ANum n2 for some n1 and n2.

    In this case, we have

    fold_constants_bexp (BEq a1 a2) = if n1 =? n2 then BTrue else BFalse

    and

    beval st (BEq a1 a2) = (aeval st a1) =? (aeval st a2).

    By the soundness of constant folding for arithmetic expressions (Lemma fold_constants_aexp_sound), we know

    aeval st a1 = aeval st (fold_constants_aexp a1) = aeval st (ANum n1) = n1

    and

    aeval st a2 = aeval st (fold_constants_aexp a2) = aeval st (ANum n2) = n2,

    so

    beval st (BEq a1 a2) = (aeval a1) =? (aeval a2) = n1 =? n2.

    Also, it is easy to see (by considering the cases n1 = n2 and n1 <> n2 separately) that

    beval st (if n1 =? n2 then BTrue else BFalse) = if n1 =? n2 then beval st BTrue else beval st BFalse = if n1 =? n2 then true else false = n1 =? n2.

    So

    beval st (BEq a1 a2) = n1 =? n2. = beval st (if n1 =? n2 then BTrue else BFalse),

    as required.

  • Otherwise, one of fold_constants_aexp a1 and fold_constants_aexp a2 is not a constant. In this case, we must show

    beval st (BEq a1 a2) = beval st (BEq (fold_constants_aexp a1) (fold_constants_aexp a2)),

    which, by the definition of beval, is the same as showing

    (aeval st a1) =? (aeval st a2) = (aeval st (fold_constants_aexp a1)) =? (aeval st (fold_constants_aexp a2)).

    But the soundness of constant folding for arithmetic expressions (fold_constants_aexp_sound) gives us

    aeval st a1 = aeval st (fold_constants_aexp a1) aeval st a2 = aeval st (fold_constants_aexp a2),

    completing the case.

  • (Doing induction when there are a lot of constructors makes specifying variable names a chore, but Coq doesn't always choose nice variable names. We can rename entries in the context with the rename tactic: rename a into a1 will change a to a1 in the current goal and context.)

    Exercise: 3 stars, standard (fold_constants_com_sound)

    Complete the WHILE case of the following proof.

    Soundness of (0 + n) Elimination, Redux

    Exercise: 4 stars, advanced, optional (optimize_0plus)

    Recall the definition optimize_0plus from the Imp chapter of Logical Foundations:

    Fixpoint optimize_0plus (e:aexp) : aexp := match e with | ANum n => ANum n | APlus (ANum 0) e2 => optimize_0plus e2 | APlus e1 e2 => APlus (optimize_0plus e1) (optimize_0plus e2) | AMinus e1 e2 => AMinus (optimize_0plus e1) (optimize_0plus e2) | AMult e1 e2 => AMult (optimize_0plus e1) (optimize_0plus e2) end.

    Note that this function is defined over the old aexps, without states.

    Write a new version of this function that accounts for variables, plus analogous ones for bexps and commands:

    optimize_0plus_aexp optimize_0plus_bexp optimize_0plus_com

    Prove that these three functions are sound, as we did for fold_constants_*. Make sure you use the congruence lemmas in the proof of optimize_0plus_com -- otherwise it will be long!

    Then define an optimizer on commands that first folds constants (using fold_constants_com) and then eliminates 0 + n terms (using optimize_0plus_com).

  • Give a meaningful example of this optimizer's output.

  • Prove that the optimizer is sound. (This part should be very easy.)
  • Proving Inequivalence

    Suppose that c1 is a command of the form X ::= a1;; Y ::= a2 and c2 is the command X ::= a1;; Y ::= a2', where a2' is formed by substituting a1 for all occurrences of X in a2. For example, c1 and c2 might be:

    c1 = (X ::= 42 + 53;; Y ::= Y + X) c2 = (X ::= 42 + 53;; Y ::= Y + (42 + 53))

    Clearly, this particular c1 and c2 are equivalent. Is this true in general?

    We will see in a moment that it is not, but it is worthwhile to pause, now, and see if you can find a counter-example on your own.

    More formally, here is the function that substitutes an arithmetic expression u for each occurrence of a given variable x in another expression a:

    And here is the property we are interested in, expressing the claim that commands c1 and c2 as described above are always equivalent.

    Sadly, the property does not always hold -- i.e., it is not the case that, for all x1, x2, a1, and a2,

    cequiv (x1 ::= a1;; x2 ::= a2) (x1 ::= a1;; x2 ::= subst_aexp x1 a1 a2).

    To see this, suppose (for a contradiction) that for all x1, x2, a1, and a2, we have

    cequiv (x1 ::= a1;; x2 ::= a2) (x1 ::= a1;; x2 ::= subst_aexp x1 a1 a2).

    Consider the following program:

    X ::= X + 1;; Y ::= X

    Note that

    empty_st = X ::= X + 1;; Y ::= X => st1,

    where st1 = (Y !-> 1 ; X !-> 1).

    By assumption, we know that

    cequiv (X ::= X + 1;; Y ::= X) (X ::= X + 1;; Y ::= X + 1)

    so, by the definition of cequiv, we have

    empty_st = X ::= X + 1;; Y ::= X + 1 => st1.

    But we can also derive

    empty_st = X ::= X + 1;; Y ::= X + 1 => st2,

    where st2 = (Y !-> 2 ; X !-> 1). But st1 <> st2, which is a contradiction, since ceval is deterministic!

    Exercise: 4 stars, standard, optional (better_subst_equiv)

    The equivalence we had in mind above was not complete nonsense -- it was actually almost right. To make it correct, we just need to exclude the case where the variable X occurs in the right-hand-side of the first assignment statement.

    Using var_not_used_in_aexp, formalize and prove a correct version of subst_equiv_property.

    Exercise: 3 stars, standard (inequiv_exercise)

    Prove that an infinite loop is not equivalent to SKIP

    Extended Exercise: Nondeterministic Imp

    As we have seen (in theorem ceval_deterministic in the Imp chapter), Imp's evaluation relation is deterministic. However, non-determinism is an important part of the definition of many real programming languages. For example, in many imperative languages (such as C and its relatives), the order in which function arguments are evaluated is unspecified. The program fragment

    x = 0;; f(++x, x)

    might call f with arguments (1, 0) or (1, 1), depending how the compiler chooses to order things. This can be a little confusing for programmers, but it gives the compiler writer useful freedom.

    In this exercise, we will extend Imp with a simple nondeterministic command and study how this change affects program equivalence. The new command has the syntax HAVOC X, where X is an identifier. The effect of executing HAVOC X is to assign an arbitrary number to the variable X, nondeterministically. For example, after executing the program:

    HAVOC Y;; Z ::= Y * 2

    the value of Y can be any number, while the value of Z is twice that of Y (so Z is always even). Note that we are not saying anything about the probabilities of the outcomes -- just that there are (infinitely) many different outcomes that can possibly happen after executing this nondeterministic code.

    In a sense, a variable on which we do HAVOC roughly corresponds to an uninitialized variable in a low-level language like C. After the HAVOC, the variable holds a fixed but arbitrary number. Most sources of nondeterminism in language definitions are there precisely because programmers don't care which choice is made (and so it is good to leave it open to the compiler to choose whichever will run faster).

    We call this new language Himp (``Imp extended with HAVOC'').

    To formalize Himp, we first add a clause to the definition of commands.

    Exercise: 2 stars, standard (himp_ceval)

    Now, we must extend the operational semantics. We have provided a template for the ceval relation below, specifying the big-step semantics. What rule(s) must be added to the definition of ceval to formalize the behavior of the HAVOC command?

    As a sanity check, the following claims should be provable for your definition:

    Finally, we repeat the definition of command equivalence from above:

    Let's apply this definition to prove some nondeterministic programs equivalent / inequivalent.

    Exercise: 3 stars, standard (havoc_swap)

    Are the following two programs equivalent?

    If you think they are equivalent, prove it. If you think they are not, prove that.

    Exercise: 4 stars, standard, optional (havoc_copy)

    Are the following two programs equivalent?

    If you think they are equivalent, then prove it. If you think they are not, then prove that. (Hint: You may find the assert tactic useful.)

    The definition of program equivalence we are using here has some subtle consequences on programs that may loop forever. What cequiv says is that the set of possible terminating outcomes of two equivalent programs is the same. However, in a language with nondeterminism, like Himp, some programs always terminate, some programs always diverge, and some programs can nondeterministically terminate in some runs and diverge in others. The final part of the following exercise illustrates this phenomenon.

    Exercise: 4 stars, advanced (p1_p2_term)

    Consider the following commands:

    Intuitively, p1 and p2 have the same termination behavior: either they loop forever, or they terminate in the same state they started in. We can capture the termination behavior of p1 and p2 individually with these lemmas:

    Exercise: 4 stars, advanced (p1_p2_equiv)

    Use these two lemmas to prove that p1 and p2 are actually equivalent.

    Exercise: 4 stars, advanced (p3_p4_inequiv)

    Prove that the following programs are not equivalent. (Hint: What should the value of Z be when p3 terminates? What about p4?)

    Exercise: 5 stars, advanced, optional (p5_p6_equiv)

    Prove that the following commands are equivalent. (Hint: As mentioned above, our definition of cequiv for Himp only takes into account the sets of possible terminating configurations: two programs are equivalent if and only if the set of possible terminating states is the same for both programs when given a same starting state st. If p5 terminates, what should the final state be? Conversely, is it always possible to make p5 terminate?)

    Additional Exercises

    Exercise: 4 stars, standard, optional (for_while_equiv)

    This exercise extends the optional add_for_loop exercise from the Imp chapter, where you were asked to extend the language of commands with C-style for loops. Prove that the command:

    for (c1 ; b ; c2) { c3 }

    is equivalent to:

    c1 ; WHILE b DO c3 ; c2 END

    Exercise: 3 stars, standard, optional (swap_noninterfering_assignments)

    (Hint: You'll need functional_extensionality for this one.)

    Exercise: 4 stars, advanced, optional (capprox)

    In this exercise we define an asymmetric variant of program equivalence we call program approximation. We say that a program c1 approximates a program c2 when, for each of the initial states for which c1 terminates, c2 also terminates and produces the same final state. Formally, program approximation is defined as follows:

    For example, the program

    c1 = WHILE ~(X = 1) DO X ::= X - 1 END

    approximates c2 = X ::= 1, but c2 does not approximate c1 since c1 does not terminate when X = 0 but c2 does. If two programs approximate each other in both directions, then they are equivalent.

    Find two programs c3 and c4 such that neither approximates the other.

    Find a program cmin that approximates every other program.

    Finally, find a non-trivial property which is preserved by program approximation (when going from left to right).