Lesson 1: Coq Core Topics and Tactics

In our first steps with Coq, we start by importing the required libraries and setting some parameters.

By default, Coq can only understand the quite bare-bones calculus of the Calculus of Inductive Constructions. This is a very "pure" language, but not very useful to do some more realistic examples.

In this course, we will use a mature Coq library, "The Mathematical Components Library" which will provide us with a rich theory on data types and mathematics.

From mathcomp Require Import all_ssreflect.

Set Implicit Arguments.
Unset Strict Implicit.
Unset Printing Implicit Defensive.

Definition sorry {T} : T. Admitted.

Interaction model:

Coq is an interactive theorem prover . That means that we submit commands to the system, and Coq will answer back with some feedback about our input.

The first thing we need to understand in order to get familiar with Coq is how to use this interface. For this class, we will use the jsCoq web-based system, which has a "classical" 3-pane setup.

The main window (left), contains a document with mixed text and Coq code. We can use the arrows in the right menu-bar to submit incrementally the code parts to Coq. jsCoq will highlight the code with different colors depending on the result Coq provides: - grey background means Coq accepted the sentence without problem - yellow background means Coq is processing the sentence - red means that Coq found a problem with the sentence

You can also use the Alt-N (for next) and Alt-P (for previous) keys for a quicker navigation; on Mac, the command key is used instead.

In addition to the colored feedback over the document, we have two panes in the right. The upper panel is usually called "goals panel", and is used to display the pending "proof obligations". The lower right panel is used to display miscellaneous information such as *error details*, search results, etc...

To illustrate this, we can try to prove a quite logical theorem:

Theorem x_implies_x : ∀ P, P → P.
Proof. by auto. Qed.

Basic Coq Commands:

Coq files are organized as mathematical documents using a vernacular language, which are basically a collection of definitions for top-level terms (or proofs) and their types (or propositions).

We will precisely introduce what types and terms are next, but before, we also present some helper commands useful to develop interactive proofs. The basic ones are:

Definition / Theorem:

The core "vernacular" command to interact with Coq is the Theorem/Definition command, which is basically a wrapper to build a new definition that will be checked by Coq's kernel.

The Theorem/Definition command has two forms:

Interactive form:

Theorem $name $arguments : $type. Proof. $tactics. Qed.

In this form, the user specifies the definition / theorem to build, and Coq will switch to "interactive" mode where the user will employ a set of tactics to synthesize the desired term.
Definitional form:

Definition $name $arguments : $type := $term

In this form, the user provides the full term directly.

The difference is best illustrated with an example:

Theorem x_implies_x_v1 : ∀ P, P → P.
Proof. by auto. Qed.

Theorem x_implies_x_v2 P : P → P.
Proof. by auto. Qed.

Definition x_implies_x_v3 : ∀ P, P → P := fun _ x ⇒ x.

Definition x_implies_x_v4 P : P → P := fun x ⇒ x.

It is possible to annotate arguments with types, so we can write:

Definition x_implies_x_v4' (P : Prop) : P → P := fun x ⇒ x.

Print:

The Print command allows us to inspect definitions, theorems, and data types. We can use it in the theorems we just defined:

Print x_implies_x_v1.
Print x_implies_x_v4.
Print x_implies_x_v4'.

Print bool.
Print nat.
Print list.

About:

The About command allows us to print just the type of an object. This is useful as often we only care about the statement of a theorem, not about its proof. You can try with the previous examples and compare to Print:

About x_implies_x_v1.
Print path.
About path.

Check:

About and Print display static metadata for Coq objects, the Check command allows us to actually run Coq's type checking and inference engine with arbitrary expressions. Some examples:

Check path.
Check (path leq 0 [:: 1; 2; 3]).
Check true.
Check (true : bool).

Search:

A very important command in practical use is Search, we will discuss Search more in the upcoming lessons, but we show a small example:

Search (_ ≤ _ + _).

Documentation

Unfortunately Coq doesn't have commands to query documentation. The coqdoc tool is often used to generate, but this is an area that needs urgent improvement.

Section:

The last command we will survey here Section, which helps grouping arguments / hypothesis when several definitions share them. When a section is closed, the definitions inside it are generalized to include these arguments

Section Example.

Variables (n : nat).

Lemma zero_is_left_id_for_plus : 0 + n = n.
Proof. by []. Qed.

Lemma zero_is_right_id_for_plus : n + 0 = n.
Proof. by apply/eqP; elim: n. Qed.

About zero_is_right_id_for_plus.
End Example.

About zero_is_right_id_for_plus.

Coq's Type theory and Functional Programming

As we have briefly discussed in the presentation, Coq is based on a functional calculus called the "Calculus of Constructions" which is a type theory. Type theories where invented in the early 20th century to avoid set-theoretical paradoxes, and were connected to logic and programming languages in the second half of the 20th century.

What is a type?

The exact definition of a type is an open research question. For the purposes of this tutorial, we can think of a type as a set.

In TT it is common to use the colon symbol to denote "some object p is member of a set T", which we write as

   p : T

instead of the more traditional p ∈ T .

From now on, we assume our objects to be programs. Then, we can write 3 : nat , true : bool , and fun x ⇒ x : nat → nat . We can also write true : nat which seems problematic!

Coq's theory has the ability to assign types to objects under assumptions. This is in a sense the characterization of "functional". So we define environments Γ as lists of type assignments for variables x₁ ... xₙ such that Γ := (x₁ : T₁) ... (xₙ : Tₙ) .

We call the resulting ternary relation "a typing judgment":

   Γ ⊢ p : T

which reads as "p is a program of type T under assumptions Γ". This the most important rule in the metatheory of Coq, and of many programming languages.

Membership into the ⊢ relation is usually governed by a set of typing rules, usually written in deductive style. For example, we can write this rule for pairs:

   Γ ⊢ p₁ : T₁     Γ ⊢ p₂ : T₂
  ─────────────────────────────
    Γ ⊢ ⟨ p₁ , p₂ ⟩ : T₁ × T₂

which states that if the upper part of the rule is derivable, so is the conclusion or bottom part.

Given a set of typing rules, a type-checker is a tool that will determine whether a triple Γ , p , T belongs to the ⊢ relation. This process can be decidable or not.

Proofs as types

The key idea that makes Coq usable as a proof checker is the "Curry-Howard" correspondence, which states that types can be understood as logical propositions, and programs can be understood as proofs.

Then, we can read:

   Γ ⊢ p : T

as "p is a proof for proposition T under assumptions Γ", and our typing rules determine our logical theory. It is usually desired for our logical theory to be consistent, that is to say, to require it can't derive a type T and its negation. This is often not the case for type-systems in programming languages, and they for example allow well-typed programs not to terminate, which amounts to a circular proof.

Under this optic, Coq is a tool that is often used to write down a theorem T, and synthesize a suitable p, such that "∅ ⊢ p : T" is derivable.

In addition to the Curry-Howard correspondence, it is also useful to think of truth in the Coq system as an equivalent principle named the Brouwer–Heyting–Kolmogorov interpretation of intuitionistic logic, sometimes called BHK of "realizability" interpretation.

Let's assume that propositions T are drawn from the language of first-order logic. Then assume that we have some Turing-complete language such as the untyped lambda calculus or a Partial Combinatory Algebra, with a distinguished pairing operator, then:

a proof of "P₁ ∧ P₂" is a tuple < p₁, p₂ > where pᵢ is a proof of i.
a proof of "P₁ ∨ P₂" is a pair <s,p> where if s (a boolean) is true then p is a proof of P₁, else p is a proof of P₂.
a proof of "P -> Q" is a function that takes as input a proof of P and returns a proof of Q.
a proof of "∀ x : S, ϕ(x)" is a function that converts an element s of S into a proof of "ϕ(s)"; note the similarity with the interpretation of implication !
a proof of "∃ x : S, ϕ(x)" is a tuple <w,p> where w is an element of S and p is a proof of "ϕ(w)"

In the next section, we will codify this intuition using typing rules.

Coq's programs and types

For better or worse, users need to have a basic understanding of the typing judgment of Coq as to be able to debug errors and generally have a mental model of how their program or proof is going to be checked.

This is "leaky" in a sense, but provides the very important advantage of having proofs objects that can be checked by a minimal kernel implementing Coq's typing rules.

The type universe

The first rule of Coq is that we have two special types Prop and Type, for types that are "logical" and types that are "computational". This is a key feature of Coq, and often confuses users.

So for example, Coq validates nat : Type and P : Prop, Q : Prop ⊢ P ∧ Q : Prop The distinction between Prop and Type is quite technical, and at some point the Coq user will be required to learn more about it.

We can summarize the intuition briefly here: types that are propositions "hide" their proofs, thus it is not possible to observe externally when two proofs are different. This provides the advantage of allowing "logical impredicativity", where the class of formulas is itself a formula. Types in Type do allow observation, but then the class of such types becomes necessarily stratified.

A type inside a type?

Indeed, you may have guessed that Prop and Type are quite strange animals, they are set of types, that contain types!

That seems a bit paradoxical. Indeed, Prop and Type are special in Coq's calculus and can be thought of as proper classes instead of sets. Moreover, the typing relation is asserted usually twice, for example imagine 3 : nat and nat : Type , so in a sense we have 3 ∈ nat ∈ Type .

It is a common error when starting with Coq to arrive to situation where we send to the system 3 : Type, which will fail, nat : nat :

Check (3 : nat : Type).
Fail Check (nat : nat).
Fail Check (3 : Type).

Type formation rules

The main type formation rule for Coq is:

    Γ ⊢ P : Prop   Γ, (x : P) ⊢ Q : Prop
    ────────────────────────────────────
       Γ ⊢ forall (x : P), Q : Prop

Prop vs bool

A common question is whether propositions in Coq are boolean valued.

For example, assume n m : nat, should n = m in bool or in Prop ?

Recall that a proof in Coq is a program. Thus, if we place n = m in bool it means that when we assign a value to these variables, the proof will be an element of bool that is say, true or false. Thus, we have just declared with our types that equality of natural numbers is decidable.

However, Coq is not limited to decidable propositions at all, and this is where Prop is required. In modern practice, there is a tendency to assign the type bool to most constructively boolean propositions.

For example, for the main formation rule, we can only assign the type bool to ∀ (x : P), Q when P is a type of finite cardinality and Q can be assigned type bool too.

Gallina, Coq's programming language

We now proceed to introduce Coq's programming language "Gallina". Gallina allows the definition of functions, propositions, and data in a way that is well-suited for the construction of proof objects.

Recall that the main goal of Coq's type-checker is to verify that a given Gallina program is valid, that is to say, can be assigned some type T.

Given the input Γ ⊢ p : T, Coq's type checker will return yes/no.

Let's now see Gallina's syntax and typing rules.

Variables

Variables in Gallina are just identifiers, and the typing rule will check that they are bound by doing a lookup in the environment, which will determine its type:

   (x : T) ∈ Γ
  ─────────────
    Γ ⊢ x : T

Creating functions / reasoning under hypotheses

Functions in Gallina are specified using a "lambda notation". For example, the identity function is written λ x → x , or in textual form fun x ⇒ x.

In words of Dana Scott, "the lambda calculus captures the intuition of reasoning under a hypothesis". And indeed, we can see that in its typing rule:

   Γ, (x : T) ⊢ e : U     Γ ⊢ T : Prop    Γ, (x : T) ⊢ U: Prop
  ─────────────────────────────────────────────────────────────
           Γ ⊢ λ (x : T) . e : forall (x : T), U

which reads "λ x . e" has type "forall (x : T), U" under Γ if "e" has type "U" under Γ extended with assumption (x : T), and the types are properly built.

When U doesn't depend on x, Coq will render ∀ (x : T), U as T → U.

We can resort to Coq to check some lambda terms:

Check (fun x ⇒ x).
Check (fun (x : nat) ⇒ x).
Check (fun (x : nat) ⇒ erefl x).

Function application / modus-ponens

   Γ ⊢ f : ∀ (x : T), U      Γ ⊢ e : T
  ─────────────────────────────────────
           Γ ⊢ f e : U[e/x]

which reads "e applied to f has type U with e substituted for x if f has the right functional type and e has the right argument type".

We can just check some of the above terms, now with an argument applied:

Check ((fun (x : nat) ⇒ erefl x) 3).

In the case where the type doesn't depend on the argument (non-dependent case), we have:

   Γ ⊢ f : T → U   Γ ⊢ e : T
  ───────────────────────────
        Γ ⊢ f e : U

we can combine some rules to get:

   Γ ⊢ f : T → U     Γ ⊢ e : U → V
  ─────────────────────────────────
   Γ ⊢ λ (x : T) , e (f x) : T → V

which is the familiar modus-ponens rule.

Inductive Data Types

The above rules complete the presentation of the Calculus of Constructions! While we are missing some details related to Type, it is indeed remarkable how minimal it is, for the amount of logical power it packs.

You may wonder "but what about the logical primitives?" "Where is negation, where is or? Where is bool?"

That's a great question, and indeed so far we have been using some primitives that are not part of the above presentation. The reason is that types such as bool, and connectives such as or, or negation, can be encoded in the calculus we presented.

For example, not and bool can be defined by using their "elimination principle":

Definition imp_not (P : Prop) := ∀ (R : Prop), P → R.
Definition imp_bool (P : Prop) := P → P → P.

The other connectives are encoded similarly. However, these encoding are pretty cumbersome to work with, and moreover they have bad computational properties.

Thus, the original Coq Calculus is the extension of the core with inductively defined "data types", similar to the "algebraic data types" found in languages such as OCaml, Haskell, or Rust. Coq's inductives tho have some unique features due to the proof-theoretic nature of Coq which we will explore in the next lesson.

Today, we will focus on more "vanilla" inductive data types.

Inductive definitions

While so far we have been using Gallina to define programs, inductives are truly "pieces of data". For example, we can define the natural numbers in Coq in the following way:

Inductive natural : Type :=
| o : natural
| s : natural → natural.

Let's examine the definition carefully: the first line declares to Coq natural as a type, thus extending the typing rules to admit natural : Type.

The second part of the declaration is what we call "constructors", that is to say, ways to build a natural number. The first constructor is the constant `O` , which is by itself a natural, and the second constructor is S, which takes a natural. So, if x : nat, then S x : nat.

Coq's typing rules are extended so o : natural, and x : natural ⊢ s x : natural.

Constructors can be seen as "tags" that represent different cases, so their name is a user choice. We can as well write:

Inductive natural' : Type :=
| zero : natural'
| succ : natural' → natural'.

or even

Inductive weird_natural : Type :=
  | zero' : weird_natural
  | succ' : weird_natural → weird_natural
  | weird : weird_natural → weird_natural → weird_natural.

The key point is that this specification fully determines what the elements of the natural type are. For the first case, we have that if x : natural, x has to be either O or S x', where x' is another natural.

Coq allows us to define quite arbitrary inductives, and they don't necessarily have be possible to inhabit. For example, this is a valid definition:

Inductive oups : Type :=
| not_possible : 0 = 1 → oups.

The elements of opus are all the proofs of 0 = 1, and as we know Coq is consistent, there is no such proof. So while we could define the data type, we will never be able to build one of its elements.

So inductives extend the "programs are proof" motto to "data is proof". Inductive data types one of the key points in the success of Coq as a tool, as they allow us to model mathematical universes pretty precisely, while at the same time keeping the guarantee that we can't build some data that would lead us to inconsistency.

This is not so easy, given that you may have noticed that for example the natural is recursive in nature. Indeed, Coq places some extra conditions on which inductives are admissible. Most common one is called "the positivity condition".

Dually, when analyzing / consuming some data that is recursive, now Coq has to check that the function doing this is well-founded and thus it will terminate.

Case analysis

Provided some element of an inductive data type, we want to be able to analyze it in base of its declared constructors. For example, given a natural number, we may want to compute if the number is even. Or if our data represents some possible behaviors of a system, equivalently we may want to do a proof by case analysis.

Coq supports by means of the match statement:

Fixpoint even (n : nat) : bool :=
  match n with
  | O ⇒ true
  | S 0 ⇒ false
  | S (S n) ⇒ even n
  end.
Compute (even 10, even 9).

Note that instead of Definition, here we use the Fixpoint keyword, which does tell Coq that our definition is recursive. Coq will check that the definition indeed does terminate. If we try a definition that Coq can't see terminating, Coq will complain:

Fail Fixpoint loop (n : nat) : bool := loop n.

We can disable this check and get a proof of False

#[bypass_check(guard)]
Fixpoint loop (n : nat) : False := loop n.

We can do case analysis for the bad datatype too to obtain a proof of False:

Definition oups_is_bad (o : oups) : False :=
  match o with
  | not_possible p0eq1 ⇒ O_S _ p0eq1
  end.
About O_S.

Logic as Data

Coq's core calculus only has the ∀ logical connective, how can we define the rest of operations as data? Indeed, we can define them as follows:

Module MyDefs.

Inductive False :=.

Definition not P := P → False.

Inductive and (P Q : Prop) :=
| And : P → Q → and P Q.

Inductive or (P Q : Prop) :=
| Left : P → or P Q
| Right : Q → or P Q.

Inductive exist (A : Type) (P : A → Prop) : Type :=
ex : ∀ x : A, P x → exist P.

End MyDefs.

Note that the above definitions match very closely our semantic definition about what a proof is. For example, a proof of or P Q is either a proof of P, tagged with the Left constructor, or a proof of Q, tagged with Right

Specifications in the Coq's Mathematical Universe

A very interesting and powerful use case of inductive definitions is building models of our specifications and logical theories.

Imagine we have the axioms for a commutative magma, that is to say, a type with a binary endo-operator which happens to be commutative. We can specify that as:

Record cmagma :=
  { A :> Type
  ; op : A → A → A
  ; op_comm : commutative op
  }.

As far as we are concerned, this is just a piece of data that packs a type, an operator, and an axiom of the operator. Can we construct such a structure with a type that is not void?

The answer is yes:

Print cmagma.
Definition nat_magma := {| A := nat; op := addn; op_comm := addnC |}.
Check (3 : nat_magma).

We will dwell into this more in the next lesson, but in general we can understand elements of inductive types like elements of a model of Coq's coherent mathematical universe.

Computation and equality

So far, we have been using "computation" and equality in a quite informal sense. The notion of computation in Coq mainly corresponds to the notion of computation for the lambda calculus, that is to way, provided a function f : U → T and a term x : T, the application f u : T will compute by substituting u for the variable in the definition of f.

There are several strategies in terms of evaluation strategies, but the main tricky part is that case analysis can't progress when a variable is in a match position:

Eval cbv in (fun b ⇒ orb b true).
Unset Printing Notations.
Eval cbv in (fun (b : bool) ⇒ orb false true).
Set Printing Notations.

The first result tells us that the if is waiting on b to get a value, and thus we can't progres as we do in the second case. This is logical and expected.

However, it is well-known that in the presence of multiple parameters, we have to bias evaluation.

Something we skimmed over (on purpose) so far was the question of equality of Coq programs. But if you look at the typing rules, it turns out that Coq needs to decide when typing when two types and programs are equal.

Usually, we say that t1 is convertible to t2 if there is a term t such that t1 and t2 will evaluate to it, that is to say:

   ∃ t, t₁ ↦ t ∧ t₂ ↦ t
  ──────────────────────
         t₁ ≈ t₂

for example true || false ≈ true, 3 + 3 ≈ 6, and n.+1 ! ≈ n.+1 × n!, etc...

Thus, when typing, Coq will consider two types equal when they are convertible:

   T ≈ U      Γ ⊢ p : T
  ──────────────────────
        Γ ⊢ p : U

Having convertibility in the type rules as a built-in is extremely useful and another reason for the success of Coq. However, there is a great tension between equality and computation, as we will short illustrate.

Equality, briefly

Equality in type theory is also an advanced research topic, with many possible definitions. Coq defines "user-level" or "propositional" (as opposed to computational) equality as a data type

Print eq.

When combined with the conversion typing rule, we have that all terms that are convertible are equal. However the converse doesn't hold, as we can strictly equate more terms propositionality than by conversion.

We will see more examples in the next lesson, but the following example illustrates the problem:

Section EqComp.
Variable (n : nat).
Check (erefl : 0 + n = n).
Fail Check (erefl : n + 0 = n).
Print Nat.add.
End EqComp.

In the second case, computation blocks. That may be surprising but it is due to + checking the left argument first. As it is a variable, it can't decide which case it is.

Another way to understand this mismatch is by thinking of computation as directed vs equality as undirected.

Proofs by computation

Despite its shortcoming, computation inside Coq is extremely useful to do proofs; when our term is variable-free, we will get a normal form, and that has been used extensively to automate large proofs.

Moreover, we can write algorithms inside Coq and prove their properties, then run them, and obtain a proof over the result. To do a proof by computation, we just need to tell Coq that our proof term is the equality constructor, erefl:

Definition plus33eq6 : 3 + 3 = 6 := erefl.
Definition list_app : [:: 3; 5; 6] = rev (rev [:: 3; 5; 6]) := erefl.

Coq's front-end

We will finish this part by looking into some more internal Coq implementation details, but that arise quite often in the real world. In particular, we will describe the two "elaboration passes" which are used by Coq to go from parsing to core "kernel" terms.

Extensive use in the real world of Coq parsing extension and notations and type inference requires even the casual user to have some knowledge of this layers.

Notations and Implicit Arguments

Contrary to most programming languages, Coq's syntax and parser is highly extensible. This is a must in order to formalize non-trivial mathematics, as good readability is required.

Users and libraries can declare Notations, which extend the parser, and will add some unsugaring rules Coq's first elaboration pass.

Notations can mask some arguments, which then Coq will interpret as holes, to be resolved in the next phase.

Example of notations are numbers, lists, logical operators, etc...

The command Set Printing All. is of great help in the cases where the unsugared output is required to understand why something doesn't work.

Set Printing All.
Theorem or_commutative P Q : P ∨ Q → Q ∨ P.
Proof. by tauto. Qed.
Unset Printing All.

Theorem eq_comm (T : Type) (x y : T) : x = y → y = x.
Proof. by []. Qed.
About eq_comm.
Unset Printing All.

Another important feature that users face a lot are implicit arguments. In many cases, like in equality, the first argument (base type of the equality) will be uniquely determined by the next argument. Thus, Coq allows us to declare this argument as "implicit", and a hole will be inserted for it automatically. Implicits are marked by using braces on the argument declaration, or by a explicit command.

Thus, these two expressions are equivalent eq 3 and @eq nat 3.

Type inference

Coq has a quite advanced type inference engine. By virtue of Coq types containing terms, Coq's type inference engine is also a program synthesis engine on its own.

Type inference is very important as Coq's kernel does require every lambda abstraction to be annotated with the type in order to type-check it.

Moreover, it is often common to omit types of arguments for better readability and let Coq figure out using constraints and unification, and the first phase of implicit argument resolution will produce many "type holes" to be filled by the inference engine.

While it is out of scope for this tutorial, making an effective use of Coq's type inference engine is essential to achieve an efficient proof style in the real world, both at the user level and at the tactic level.

Among others, Coq's type inference engine support bidirectional inference, coercions, type classes, type-directed synthesis of records, and higher-order unification.

Fail Lemma foo : x = x.
Lemma foo' x : x = 1.
Admitted.

We will now move to learn the main Coq tactics and different proof styles using them. Before starting, let's quickly summarize the core Coq syntax we have just seen:

Coq's Syntax Cheat-sheet:

The canonical reference is the Coq manual https://coq.inria.fr/refman/

Core typing judgment:

   Γ ⊢ p : T

p has type T under hypothesis Γ . Checking this is the main purpose of Coq's kernel.

Vernacular Commands:

        id ::= $string
       thm ::= Definition | Theorem | Lemma | Example

   command ::= $thm $id $argument* (: $type)? . Proof.? $tactic*. $end .
             | $thm $id $argument* (: $type)? := $term .
             | Fixpoint $id $argument+ (: $type)? := $term.
             | Print $id
             | About $id
             | Check $id
             | Search $search_options

   argument ::= $id | ( $id : $type )

        end ::= Qed | Defined | Admitted

Terms and types:

     kind ::= Prop                             collection of all "formulas"
            | Type                             collection of all "types"

     term ::= $kind
            | $id                              variable or constant
            | forall $argument+ , $type        Pi Type (or type function)
            | fun $argument+ => $term          regular function
            | $term $term                      application
            | match $term with $branch* end.   pattern matching

   branch ::= | $id+ => $term

     type ::= $term                 such that "$term : $kind"

The type A → B is notation for ∀ (_ : A), B.

Inductives

   inductive_decl ::= Inductive $id $argument* (: $type)? ::= $constructor_decl*

 constructor_decl ::= | $id : $type            where $type has to belong
                                               to the inductive being declared

Some core defined datatypes:

bool: Type of booleans (either true or false) Some boolean ops: a && b, a || b, a ==> b, ~~ b.
nat: Type of Peano Numbers (either O or n.+1) Some nat ops: a + b, a - b, a - b, a ^ b, a %% b, a /% b.
option A: Either Some x , with x : A or None
list A: Lists of type A Some list ops: l ++ m, map f l, filter p l.

Some core defined logical connectives:

x = y: equality
True, False: Basic propositions
¬ P : negation, defined as P → False
P ∧ Q: conjunction
P ∨ Q: disjunction
∃ (x : T), P x: exists (where P a predicate on values of type T

Writing proof terms, tactics, and the proof cycle:

In the previous lesson, we did learn what it means to be a proof. We can now try to prove some simple logical statements writing their corresponding program:

Definition logic1 P Q : P → Q → P := sorry.
Definition logic2 (P : Prop) : P → ¬ ¬ P := sorry.
Definition nat_inv : ∀ n, n = 0 ∨ ∃ m , S m = n := sorry.
Definition forall_dist T P Q :
(∀ (x : T), P x ∧ Q x ) →
(∀ (x : T), P x ) ∧ (∀ (x : T), Q x ) := sorry.

Writing proof terms by hand this way quickly turns into a cumbersome endeavor, as writing proof terms is not conceptually very close to the way humans think of formal proof steps.

Additionally, Coq is a direct descendent from the LCF family of provers, which are built from a different philosophy: their kernel provides a set of "tactical", which are meant to represent human proof steps. In this setting, users start from a "goal", and apply tactics to refine the goals until the end has been reached. Such tactics are trusted, and soundness is kept thanks to some abstraction mechanism in the host language.

For both this reasons, Coq quickly gained support for tactics, which in Coq's case, are term-synthesis procedures. In Coq's case, the initial goal will be the type or proposition we want to build a proof for, and we will use tactics to gradually built the term. Goals correspond to holes in the proof, and we are done when a complete program for the target goal has been built.

Tactics simply user life considerably, as in non-trivial cases, writing low-level proof terms quickly becomes a very complex and prone-to-error task.

The final step of this "proof cycle" is sending the newly-built terms to kernel, as to ensure the term is really a proof of the initial goal.

A typical example of using a tactic is to prove a theorem of the form A ∧ B ; we can use the split tactic, which will apply the introduction rule (or constructor) for and and produce 2 separate goals A and B.

Summarizing:

1. user states a particular proposition to be proved, submit to Coq this proposition of "goal"

2. if the proposition itself can be typed (has type Prop or Type), Coq will start the interactive mode, allow tactics to be enter

3. users can then provide a partial term, or call a tactic to synthesize one for them

4. Coq will provide feedback on the newly submitted term, and either produce an error, or update the goal and go back to step 3.

5. If there is no goal, we are done; we will close out proof with the `Qed` command and Coq will re-check the final proof as to catch any possible bugs with tactics.

Let's play a bit with tactics to prove a simple fact:

Theorem forall_dist' T P Q :
(∀ (x : T), P x ∧ Q x ) →
(∀ (x : T), P x ) ∧ (∀ (x : T), Q x ).
Proof. Admitted.

Proofs and telescopes: moving things around

At this point, you may have realized that proving things in Coq is a very combinatorial process. For example, in the above proof, all that we do is to reorder the arguments, and pass some parameters. With inductive data, the proof process usually involves packing and unpacking such data:

Theorem forall_exists T (P : T → Prop) :
(∀ x, P x ) → ∃ x , P x.
Proof. Admitted.

This process can often feel like assembling a puzzle! And indeed, it is in a sense very close to it.

But a key consideration of our puzzle is that some pieces do depend on others. Coq is often called a "dependently-typed programming language", as indeed, types can depend on terms. For example, consider the following type:
∀ (x y : nat) (x_eq_y : x = y) P (px_true: P x), P y

as we can see, px_true does depend on P and x. So we have no problem in shuffling our statement like this:
∀ (x y : nat) P (px_true: P x) (x_eq_y : x = y), P y

however, we cannot do this, as then Coq will complain P is not known:
∀ (x y : nat) (px_true: P x) P (x_eq_y : x = y), P y

Thus, we working with assumptions, we must be aware of the dependencies between hypothesis!

In fact, a large part of building a mechanized proof effectively is relies on having the proper tools for the management of context and assumptions, both locally and globally. This process is not an exact science, and a few "styles" can be found in the wild.

Styles of proof writing and management greatly differ in several accounts, such as:

robustness to change: proofs are living objects, and often subject to continuous refactoring and refinement. Different styles provide different trade-offs in this aspect
verbosity: some proof styles optimize for the writer, some others for the reader of the proof
efficiency: some tactics and styles can be orders of magnitude faster
automation: some proof styles favor highly "automated" styles, whereas others follow what's written very closely.

In this lesson, we will make a quite opinionated choice, and select ssreflect as our proof language of choice. ssreflect and the mathematical component library were built for the development of large-scale mathematical proofs, and favor a highly-modular, robust, and efficient style. Moreover, the maturity of the proof language is quite high.

The ssreflect proof language is quite different from the one loaded by default from the Coq prelude, and often misunderstood, as many of its design rationales are maybe not documented in the most public places. In particular, I would highlight the following features:

principled "strict" approach to telescope management: ssreflect doesn't allow to perform implicit ill-formed telescope / context manipulation. If the user desires to do, they must indicate it explicitly.
efficient assumption processing: ssreflect favors to process the assumptions just as they appear, for example, instead of adding an equality to the context, we may as well just rewrite with it and dispose the same moment we see it; similarly for assumptions that need case analysis. Moreover, this approach minimizes the amount of names that user needs to assign, creating more robust proofs.
strict naming policy: all objects introduced in the proof context must be explicitly named, or forbidden to be referenced by name.
efficient use of space: ssreflect favor a 80-column writing style, and a reasonably compact proof writing style, getting closer to human-workflow
separation of implementation and specification: this usually results in more robust proofs. For example, it is not allowed to depend on the enumeration of a finite type, but only on some permutation of it.
orthogonality / compositionality: ssreflect tries to minimize the amount of tactics, and tries to ensure they compose properly.

In particular, I'd dare to say that point one corrects a design flaw of Coq, which will be discussed more in-depth in the case analysis section.

The assumption / telescope stack

The first thing that the ssreflect proof language provides a set of "tacticals" operators, : and ⇒ which will allow tactics to perform context manipulation.

Usually, our starting goal will have the following shape:

=========================================================================
   forall (x : T1) (y : T2 x) ... (z : Tn x y ...), Goal x y ... z

Note how in this case, each assumption (and the terminal Goal) does depend on all the previous one. We can place all the hypothesis in the assumption context using move⇒ hyp1 ... hypn, which results in a goal:

  hyp1 : T1
  hyp2 : T2 hyp1
  ...
  hypn : Tn hyp1 hyp2 ...
==============
   Goal hyp1 hyp2 ... hypn

note that this is equivalent to synthesizing a term fun hyp1 ... hypn ⇒ _ , where _ is a hole. We can do move: hyp1 ... hypn to revert the goal to its original state.

We refer to ⇒ and : as operators as indeed they always appear together with a tactic:

tac⇒ $intro_pattern this form will run tac , then apply the $intro_pattern to the elements in the goals.
tac: $gen_pattern this form will push the elements of $gen_pattern to the goal, then run tac.

When a tactic appears alone, for example move, it usually acts on the top of the goal / stack.

A special case is rewrite which can be seen as operating over the whole goal at a time.

Note that these tacticals are not tactics proper in the sense that just the arrow and the colon can just operate on the assumptions, we still need a tactic to operate in the goal. It is very important to keep this "polarity" in mind when doing proofs with math-comp. Usually, the operations that we can do on the assumptions are the dual of the ones we can do in the conclusion.

Introduction and generalization patterns

We will briefly describe what $intro_pattern and $gen_pattern are, for a complete reference you can consult the ssreflect reference manual, distributed with Coq.

An $intro_pattern is a list of operations to perform on the top of stack, we have the following cases:

$id: if a name is there, the hypothesis will be introduced in the context with name $id (if such name is free)
[pat | ... | pat]: this will tell Coq to try to destruct the hypothesis if it is an inductive, it is similar to doing case. If the hypothesis is not an inductive, it errors. For equality, it will try to run injection
$intro_pattern | ... | $intro_pattern: when tac produces several goals, this will apply each intro_pattern to the goal. Cardinalities have to match.
→ or <-: this will try to rewrite the rest of the goal with the hypothesis, if it is an equality.
{ id }: this will clear the id from the list of hypothesis
/$id: this will try apply to the top of goal the view $id (more on views later)
//: this will try to solve the goal with the done tactic.
/=: this will call simpl to simplify the goal.

For the purposes of this tutorial, we will consider a $gen_pattern as a list of terms; whose types will be placed at the top of the goal, right to left.

As you will see, while the patterns look a bit cryptic, they will allow us for very compact and efficient context management.

Moving backwards, or forward?

A last thing to consider before starting to look into the tactics is whether we are proving in "forward" or "backwards" style:

forward proving is usually done in mathematics, and it is when we transform the hypothesis until we obtain the goal
a backwards style of proving is when we transform the goal until it is one of the hypothesis.a

Of course, in many instances, we use a combination of the two styles.

Lemma forward_backwards P Q R (h0 : P) (h1 : P → Q) (h2 : Q → R) : R.
Proof. Admitted.

Core tactics

We now briefly describe the core tactics that we will use:

move: Placeholder tactic

The move tactic is a placeholder for the cases we want to just do context manipulation. If used alone, it just calls puts the top of the goal in weak normal form.

by: Closing proofs

The tactical by tac ensures that tac will solve the goal, modulo some trivial things.

It is very convenient for proof maintenance, as if a proof is broken it tends to keep the breakage local.

apply: using hypothesis and constants

The most primitive tactic in Coq is apply: H, with variants refine and exact . apply: H will use hypothesis H : T to solve goal T, or if H : U → T, it will perform backwards chaining, that is to say, the new goal will be U. Examples:

Lemma move_ex1 : nat → nat → nat.
Proof. by move⇒ n1 n2; apply: n2. Qed.
Print move_ex1.

Lemma move_ex2 (fn : nat → nat) : (nat → nat).
Proof. by move: fn; apply. Qed.
Print move_ex2.

Note that actually apply: name is really the composition of : with apply, which acts on top of the stack! So the previous example will actually first introduce fn at the top of the stack, then apply will try to use it. This is akin to how RPN stack calculators work!

While unusual, this is pretty efficient when writing proofs, and moreover, composes pretty well with the goal/stack manipulation operators.

Lemma apply_ex1 n y (H : n = 3 → y = 4) (U : n = 1 + 2) : 4 = y.
Proof. by apply: esym; apply: H; apply: U. Qed.

A more compact form is (exact is basically an alias for by apply)

Lemma apply_ex2 n y (H : n = 3 → y = 4) (U : n = 1 + 2) : 4 = y.
Proof. exact/esym/H/U. Qed.

Note that as you get more familiar with the system, and read more high-quality proofs, you will discover countless ways to do things more efficiently. Coq can of course solve this goal automatically:

Lemma apply_ex3 n y (H : n = 3 → y = 4) (U : n = 1 + 2) : 4 = y.
Proof. Admitted.

have: stating partial results

Likely the most important tactic for building mathematical proofs is the have tactic, designed to perform effective forward reasoning. have is very useful in a lot of contexts, and has two basic functions:

have $pat := thm arg1 ... argn. will perform lemma instantiation, where thm is an already existing object which can take n (or more) arguments. This is also specially useful to debug apply when it goes wrong.
have $pat : T. will start a proof for the intermediate lemma T, then add H : T to the assumptions of the main goal.

$pat can be empty, in which case the result will be added to the top of the stack, or an intro pattern (see the manual for more information, as have accepts additional patterns)

We usually use have as a counterpart for the paper form "let S be some Bar".

Have is a very powerful tactic and admits a few more forms and variants, such as suff and wlog, see the documentation for a complete reference.

Lemma have_ex1 n y (H : n = 3 → y = 4) (U : n = 1 + 2) : 4 = y.
Proof. Admitted.

Lemma have_ex2 n y (H : n = 3 → y = 4) (U : n = 1 + 2) : 4 = y.
Proof. Admitted.

rewrite replacing equals by equals

The second most important tactic for mathematical reasoning is rewrite. The idea of rewrite is simple, given a hypothesis H : a = b, rewrite H will replace all b for a in the goal.

rewrite -H does reverse the direction, and many other options are available to control unfolding and positions for replacing. The most common ones are:

rewrite {$occ_list}H: where $occ_list is a list of positions
rewrite [$rew_pat]H: where $rew_pat is a pattern of the subterm to rewrite

rewrite also admits the //, /=, ?H, and /id syntax to solve side goals and perform partial evaluation.

A common counterpart for simpl is rewrite /=, but in this case, rewrite supports all the pattern and occurrence selectors.

Let's see some examples of rewrite and the rewrite patterns in action:

Lemma rewrite_ex1 m : m + 0 = m.
Proof. Admitted.

Lemma rewrite_ex2 m n p : m + n + p + m = m + m + n + p.
Proof. Admitted.

We will see some more rewrite examples along the tutorial.

case and elim: case analysis and induction

As we have discussed, a proof by case analysis in Coq corresponds to a program that does pattern matching. Similarly, an induction proof will usually do case analysis + well founded recursion.

A particularity of case analysis and induction is that these are scoped operations. Let's see some examples:

Lemma case_1 b1 b2 : b1 = b2 → b1 && b2 = b1.
Proof. Admitted.
Lemma case_2 T (x y : T) : x = y → y = x.
Proof. Admitted.
Lemma case_3 P Q : P ∧ Q → Q.
Proof. Admitted.
Lemma case_4 T P Q : P =1 Q → (∃ (x : T ), P x ) → ∃ (x : T ), Q x.
Proof. Admitted.

Getting the scope or invariant right is usually the most challenging part of case analysis and induction.

In general, we can safely state that the first rule of induction and proofs is to avoid induction as much as possible!

Induction has bad dependency properties due to both depending on the shape of the current goal and the shape of the object we are inducting over.

Moreover, it is very often the case that a quite general lemma can be proven by induction, and the rest of the theory can use it. Let's prove some statements by induction:

Lemma elim_1 m n p : m + n + p = m + (n + p).
Proof. Admitted.
Lemma size_map T U (f : T → U) l : size (map f l) = size l.
Proof. Admitted.

Hint: prove an auxiliary lemma

Lemma size_catC T (l s : seq T) : size (l ++ s) = size (s ++ l).
Proof. Admitted.

Domain-specific tactics

Coq has a lot more tactics that are domain-specific, a few examples:

constructor (variants split and ∃): these will try to apply the constructor for a goal that is an inductive
auto: performs automatic proof search using a database
intuition: solver for intuitionistic predicate calculus
lia,lra: arithmetic / simplex tactics

There are many more tactics used in the wild, so don't hesitate to look in the manual.