lesson_2_1

Lesson 2.1: Core Coq tactics and proofs

From mathcomp Require Import all_ssreflect.

Set Implicit Arguments.
Unset Strict Implicit.
Unset Printing Implicit Defensive.
Definition sorry {T} : T. Admitted.
In this lesson, we will learn the main Coq tactics and different proof styles using them. Before starting, let's quickly refresh the core Coq syntax:

Coq's Syntax Cheat-sheet:

The canonical reference is the Coq manual https://coq.inria.fr/refman/
Core typing judgment:
   Γ ⊢ p : T
p has type T under hypothesis Γ . Checking this is the main purpose of Coq's kernel.

Vernacular Commands:

        id ::= $string
       thm ::= Definition | Theorem | Lemma | Example

   command ::= $thm $id $argument* (: $type)? . Proof.? $tactic*. $end .
             | $thm $id $argument* (: $type)? := $term .
             | Fixpoint $id $argument+ (: $type)? := $term.
             | Print $id
             | About $id
             | Check $id
             | Search $search_options

   argument ::= $id | ( $id : $type )

        end ::= Qed | Defined | Admitted

Terms and types:

     kind ::= Prop                             collection of all "formulas"
            | Type                             collection of all "types"

     term ::= $kind
            | $id                              variable or constant
            | forall $argument+ , $type        Pi Type (or type function)
            | fun $argument+ => $term          regular function
            | $term $term                      application
            | match $term with $branch* end.   pattern matching

   branch ::= | $id+ => $term

     type ::= $term                 such that "$term : $kind"
The type A B is notation for (_ : A), B.

Inductives

   inductive_decl ::= Inductive $id $argument* (: $type)? ::= $constructor_decl*

 constructor_decl ::= | $id : $type            where $type has to belong
                                               to the inductive being declared

Some core defined datatypes:

  • bool: Type of booleans (either true or false) Some boolean ops: a && b, a || b, a ==> b, ~~ b.
  • nat: Type of Peano Numbers (either O or n.+1) Some nat ops: a + b, a - b, a - b, a ^ b, a %% b, a /% b.
  • option A: Either Some x , with x : A or None
  • list A: Lists of type A Some list ops: l ++ m, map f l, filter p l.

Some core defined logical connectives:

  • x = y: equality
  • True, False: Basic propositions
  • ¬ P : negation, defined as P False
  • P Q: conjunction
  • P Q: disjunction
  • (x : T), P x: exists (where P a predicate on values of type T

Writing proof terms, tactics, and the proof cycle:

In the previous lesson, we did learn what it means to be a proof. We can now try to prove some simple logical statements:
Definition logic1 P Q : PQP := sorry.
Definition logic2 (P : Prop) : P → ¬ ¬ P := sorry.
Definition nat_inv : n, n = 0 ∨ m, S m = n := sorry.
Definition forall_dist T P Q :
  ( (x : T), P xQ x) →
  ( (x : T), P x) ∧ ( (x : T), Q x) := sorry.
Writing proof terms by hand this way quickly turns into a cumbersome endeavor, as writing proof terms is not conceptually very close to the way humans think of formal proof steps.
Additionally, Coq is a direct descendent from the LCF family of provers, which are built from a different philosophy: their kernel provides a set of "tactical", which are meant to represent human proof steps. In this setting, users start from a "goal", and apply tactics to refine the goals until the end has been reached. Such tactics are trusted, and soundness is kept thanks to some abstraction mechanism in the host language.
For both this reasons, Coq quickly gained support for tactics, which in Coq's case, are term-synthesis procedures. In Coq's case, the initial goal will be the type or proposition we want to build a proof for, and we will use tactics to gradually built the term. Goals correspond to holes in the proof, and we are done when a complete program for the target goal has been built.
Tactics simply user life considerably, as in non-trivial cases, writing low-level proof terms quickly becomes a very complex and prone-to-error task.
The final step of this "proof cycle" is sending the newly-built terms to kernel, as to ensure the term is really a proof of the initial goal.
A typical example of using a tactic is to prove a theorem of the form A B ; we can use the split tactic, which will apply the introduction rule (or constructor) for and and produce 2 separate goals A and B.
Summarizing:
1. user states a particular proposition to be proved, submit to Coq this proposition of "goal"
2. if the proposition itself can be typed (has type Prop or Type), Coq will start the interactive mode, allow tactics to be enter
3. users can then provide a partial term, or call a tactic to synthesize one for them
4. Coq will provide feedback on the newly submitted term, and either produce an error, or update the goal and go back to step 3.
5. If there is no goal, we are done; we will close out proof with the `Qed` command and Coq will re-check the final proof as to catch any possible bugs with tactics.
Theorem forall_dist' T P Q :
  ( (x : T), P xQ x) →
  ( (x : T), P x) ∧ ( (x : T), Q x).
Proof. Admitted.

Proofs and telescopes: moving things around

At this point, you may have realized that proving things in Coq is a very combinatorial process. For example, in the above proof, all that we do is to reorder the arguments, and pass some parameters. With inductive data, the proof process usually involves packing and unpacking such data:
Theorem forall_exists T (P : TProp) : ( x, P x) → x, P x.
Proof. Admitted.
This process can often feel like assembling a puzzle! And indeed, it is in a sense very close to it.
But a key consideration of our puzzle is that some pieces do depend on others. Coq is often called a "dependently-typed programming language", as indeed, types can depend on terms. For example, consider the following type:
   (x y : nat) (x_eq_y : x = y) P (px_true: P x), P y
as we can see, px_true does depend on P and x. So we have no problem in shuffling our statement like this:
   (x y : nat) P (px_true: P x) (x_eq_y : x = y), P y
however, we cannot do this, as then Coq will complain P is not known:
   (x y : nat) (px_true: P x) P (x_eq_y : x = y), P y
Thus, we working with assumptions, we must be aware of the dependencies between hypothesis!
In fact, a large part of building a mechanized proof effectively is relies on having the proper tools for the management of context and assumptions, both locally and globally. This process is not an exact science, and a few "styles" can be found in the wild.
Styles of proof writing and management greatly differ in several accounts, such as:
  • robustness to change: proofs are living objects, and often subject to continuous refactoring and refinement. Different styles provide different trade-offs in this aspect
  • verbosity: some proof styles optimize for the writer, some others for the reader of the proof
  • efficiency: some tactics and styles can be orders of magnitude faster
  • automation: some proof styles favor highly "automated" styles, whereas others follow what's written very closely.
In this lesson, we will make a quite opinionated choice, and select ssreflect as our proof language of choice. ssreflect and the mathematical component library were built for the development of large-scale mathematical proofs, and favor a highly-modular, robust, and efficient style. Moreover, the maturity of the proof language is quite high.
The ssreflect proof language is quite different from the one loaded by default from the Coq prelude, and often misunderstood, as many of its design rationales are maybe not documented in the most public places. In particular, I would highlight the following features:
  • principled "strict" approach to telescope management: ssreflect doesn't allow to perform implicit ill-formed telescope / context manipulation. If the user desires to do, they must indicate it explicitly.
  • efficient assumption processing: ssreflect favors to process the assumptions just as they appear, for example, instead of adding an equality to the context, we may as well just rewrite with it and dispose the same moment we see it; similarly for assumptions that need case analysis. Moreover, this approach minimizes the amount of names that user needs to assign, creating more robust proofs.
  • strict naming policy: all objects introduced in the proof context must be explicitly named, or forbidden to be referenced by name.
  • efficient use of space: ssreflect favor a 80-column writing style, and a reasonably compact proof writing style, getting closer to human-workflow
  • separation of implementation and specification: this usually results in more robust proofs. For example, it is not allowed to depend on the enumeration of a finite type, but only on some permutation of it.
  • orthogonality / compositionality: ssreflect tries to minimize the amount of tactics, and tries to ensure they compose properly.
In particular, I'd dare to say that point one corrects a design flaw of Coq, which will be discussed more in-depth in the case analysis section.

The assumption / telescope stack

The first thing that the ssreflect proof language provides a set of "tacticals" operators, : and which will allow tactics to perform context manipulation.
Usually, our starting goal will have the following shape:
=========================================================================
   forall (x : T1) (y : T2 x) ... (z : Tn x y ...), Goal x y ... z
Note how in this case, each assumption (and the terminal Goal) does depend on all the previous one. We can place all the hypothesis in the assumption context using move hyp1 ... hypn, which results in a goal:
  hyp1 : T1
  hyp2 : T2 hyp1
  ...
  hypn : Tn hyp1 hyp2 ...
==============
   Goal hyp1 hyp2 ... hypn
note that this is equivalent to synthesizing a term fun hyp1 ... hypn _ , where _ is a hole. We can do move: hyp1 ... hypn to revert the goal to its original state.
We refer to and : as operators as indeed they always appear together with a tactic:
  • tac $intro_pattern this form will run tac , then apply the $intro_pattern to the elements in the goals.
  • tac: $gen_pattern this form will push the elements of $gen_pattern to the goal, then run tac.
When a tactic appears alone, for example move, it usually acts on the top of the goal.
A special case is rewrite which can be seen as operating over the whole goal at a time.
Note that these tacticals are not tactics proper in the sense that just the arrow and the colon can just operate on the assumptions, we still need a tactic to operate in the goal. It is very important to keep this "polarity" in mind when doing proofs with math-comp. Usually, the operations that we can do on the assumptions are the dual of the ones we can do in the conclusion.

Introduction and generalization patterns

We will briefly describe what $intro_pattern and $gen_pattern are, for a complete reference you can consult the ssreflect reference manual, distributed with Coq.
An $intro_pattern is a list of operations to perform on the top of stack, we have the following cases:
  • $id: if a name is there, the hypothesis will be introduced in the context with name $id (if such name is free)
  • [pat | ... | pat]: this will tell Coq to try to destruct the hypothesis if it is an inductive, it is similar to doing case. If the hypothesis is not an inductive, it errors. For equality, it will try to run injection
  • $intro_pattern | ... | $intro_pattern: when tac produces several goals, this will apply each intro_pattern to the goal. Cardinalities have to match.
  • or <-: this will try to rewrite the rest of the goal with the hypothesis, if it is an equality.
  • /$id: this will try apply to the top of goal the view $id (more on views later)
  • //: this will try to solve the goal with the done tactic.
  • /=: this will call simpl to simplify the goal.
For the purposes of this tutorial, we will consider a $gen_pattern as a list of terms; whose types will be placed at the top of the goal, right to left.
As you will see, while the patterns look a bit cryptic, they will allow us for very compact and efficient context management.

Moving backwards, or forward?

A last thing to consider before starting to look into the tactics is whether we are proving in "forward" or "backwards" style:
  • forward proving is usually done in mathematics, and it is when we transform the hypothesis until we obtain the goal
  • a backwards style of proving is when we transform the goal until it is one of the hypothesis.a
Of course, in many instances, we use a combination of the two styles.
Lemma forward_backwards P Q R (h0 : P) (h1 : PQ) (h2 : QR) : R.
Proof. Admitted.

Core tactics

We now briefly describe the core tactics that we will use:a

move Placeholder tactic

The move tactic is a placeholder for the cases we want to just do context manipulation. If used alone, it just calls puts the top of the goal in weak normal form.

by Closing proofs

The tactical by tac ensures that tac will solve the goal, modulo some trivial things.
It is very convenient for proof maintenance, as if a proof is broken it tends to keep the breakage local.

apply: using hypothesis and constants

The most primitive tactic in Coq is apply: H, with variants refine and exact . apply H will use hypothesis H : T to solve goal T, or if H : U T, it will perform backwards chaining. Examples:
Lemma move_ex1 : natnatnat.
Proof. by moven1 n2; apply: n2. Qed.
Print move_ex1.

Lemma move_ex2 (x : natnat) : (natnat).
Proof. by move: x; apply. Qed.
Print move_ex2.
Note that actually apply: name is really the composition of : with apply! So it will first introduce x at the top of the stack, then apply will work as RPN stack calculators!
While unusual, this is pretty efficient when writing proofs.
Lemma apply_ex1 n y (H : n = 3 → y = 4) (U : n = 1 + 2) : 4 = y.
Proof. by apply: esym; apply: H; apply: U. Qed.
A more compact form is (exact is basically an alias for by apply)
Lemma apply_ex2 n y (H : n = 3 → y = 4) (U : n = 1 + 2) : 4 = y.
Proof. exact/esym/H/U. Qed.
Note that as you get more familiar with the system, and read more high-quality proofs, you will discover countless ways to do things more efficiently. Coq can of course solve this goal automatically:
Lemma apply_ex3 n y (H : n = 3 → y = 4) (U : n = 1 + 2) : 4 = y.
Proof. Admitted.

have: adding partial results

Likely the most important tactic for building mathematical proofs is the have tactic, designed to perform effective forward reasoning. have is very useful in a lot of contexts, and has two basic functions:
  • have $pat := thm arg1 ... argn. will perform lemma instantiation, where thm is an already existing object which can take n (or more) arguments. This is also specially useful to debug apply going wrong.
  • have $pat : T. will start a proof for the intermediate lemma T, then add H : T to the assumptions of the main goal.
$pat can be empty, or an intro pattern (not exactly the same, but very close)
We usually use have as a counterpart for the paper form "let S be some Bar".
Have is a very powerful tactic and admits a few more forms and variants, such as suff and wlog, see the documentation for a complete reference.
Lemma have_ex1 n y (H : n = 3 → y = 4) (U : n = 1 + 2) : 4 = y.
Proof. Admitted.

Lemma have_ex2 n y (H : n = 3 → y = 4) (U : n = 1 + 2) : 4 = y.
Proof. Admitted.

rewrite replacing equals by equals

The second most important tactic for mathematical reasoning is rewrite. The idea of rewrite is simple, given a hypothesis H : a = b, rewrite H will replace all b for a in the goal.
rewrite -H does reverse the direction, and many other options are available to control unfolding and positions for replacing. The most common ones are:
  • rewrite {$occ_list}H: where $occ_list is a list of positions
  • rewrite [$rew_pat]H: where $rew_pat is a pattern of the subterm to rewrite
rewrite also admits the //, /= and ?H syntax to solve side goals and perform partial evaluation.
A common counterpart for simpl is rewrite /=, but in this case, rewrite supports all the pattern and occurrence selectors.
Lemma rewrite_ex1 m : m + 0 = m.
Proof. Admitted.

Lemma rewrite_ex2 m n p : m + n + p + m = m + m + n + p.
Proof. Admitted.
We will see some more rewrite examples along the class.

case and elim: case analysis and induction

As we have discussed, a proof by case analysis in Coq corresponds to a program that does pattern matching. Similarly, an induction proof will usually do case analysis + well founded recursion.
A particularity of case analysis and induction is that these are scoped operations. Let's see an example:
Lemma case_1 b1 b2 : b1 = b2b1 && b2 = b1.
Proof. Admitted.
Lemma case_2 T (x y : T) : x = yy = x.
Proof. Admitted.
Lemma case_3 P Q : PQQ.
Proof. Admitted.
Lemma case_4 T P Q : P =1 Q → ( (x : T), P x) → (x : T), Q x.
Proof. Admitted.
Getting the scope right is usually the most challenging part of case analysis.
Induction is a bit different, as it usually requires elim to infer
In general, we can safely state that the first rule of induction and proofs is to avoid induction as much as possible! Induction has bad dependency properties due to both depending on the shape of the current goal and the shape of the object we are inducting over.
Moreover, it is very often the case that a quite general lemma can be proven by induction, and the rest of the theory can use it.
Lemma elim_1 m n p : m + n + p = m + (n + p).
Proof. Admitted.
Lemma size_map T U (f : TU) l : size (map f l) = size l.
Proof. Admitted.
Hint: prove an auxiliary lemma
Lemma size_catC T (l s : seq T) : size (l ++ s) = size (s ++ l).
Proof. Admitted.

Domain-specific tactics

Coq has a lot more tactics that are domain-specific, a few examples:
  • constructor (variants split and ): these will try to apply the constructor for a goal that is an inductive
  • auto: performs automatic proof search using a database
  • intuition: solver for intuitionistic predicate calculus
  • lia,lra: artimetic tactics
There are many more tactics on the wild, so don't hesitate to in the manual.

Reflection and proving with computation

We will conclude this lesson by discussing an often overlooked fact, which is that Coq, as opposed to other theorem provers, can actually run very complex programs quite efficiently.
A good example is some arithmetic operations, let's compare the definition of less than in math-comp vs the definition of less than in Coq's stdlib, an in particular less focus on how the proofs differ:
Print le.
Print leq.

Lemma le_1 : 100 ≤ 200.
Proof. Admitted.

Lemma le_2 : leq 100 200.
Proof. Admitted.

Lemma lt_1 n : ~~ (n < n).
Proof. Admitted.

Lemma lt_2 n : ¬ (lt n n).
Proof. Admitted.
While this is a simple example, the same trade-off does apply to, for example, membership in a list.
The Mathematical Components Library provide special support for relating computable versions of propositions to propositional ones.
This relation is captured via the reflect datatype:
About Bool.reflect.
Print reflect.
We can think of reflect P b as "P provable if and if b = true".
Next lessons will take advantage of this predicate extensively, for now we illustrate the difference briefly with this example:
Lemma or4P_demo : [\/ 3 == 4, 4 == 2, 2 == 2 | 0 == 1].
Proof. Admitted.
Indeed, using reflection, we can replace a proof search using a tactic by computation.
It is good to check
About andP.

About mapP.