Records: Adding Records to STLC

Adding Records

We saw in chapter MoreStlc how records can be treated as just syntactic sugar for nested uses of products. This is OK for simple examples, but the encoding is informal (in reality, if we actually treated records this way, it would be carried out in the parser, which we are eliding here), and anyway it is not very efficient. So it is also interesting to see how records can be treated as first-class citizens of the language. This chapter shows how.

Recall the informal definitions we gave before:

Syntax:

       t ::=                          Terms:
           | {i1=t1, ..., in=tn}         record
           | t.i                         projection
           | ...

       v ::=                          Values:
           | {i1=v1, ..., in=vn}         record value
           | ...

       T ::=                          Types:
           | {i1:T1, ..., in:Tn}         record type
           | ...

Reduction:

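                               tn --> tn'
   --------------------------------------------------------------------  (ST_Rcd)
   {i1=v1, ..., im=vm, in=tn, ...} --> {i1=v1, ..., im=vm, in=tn', ...}

                               t1 --> t1'
                             --------------                            (ST_Proj1)
                             t1.i --> t1'.i

                        -------------------------                    (ST_ProjRcd)
                        {..., i=vi, ...}.i --> vi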

Typing:

            Gamma |- t1 : T1     ...     Gamma |- tn : Tn
          --------------------------------------------------            (T_Rcd)
          Gamma |- {i1=t1, ..., in=tn} : {i1:T1, ..., in:Tn}

                    Gamma |- t : {..., i:Ti, ...}
                    -----------------------------                      (T_Proj)
                          Gamma |- t.i : Ti

Formalizing Records

Syntax and Operational Semantics

The most obvious way to formalize the syntax of record types would be this:
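The definition introduced here is not reproduced in this text. A minimal sketch of what it could look like, assuming a record type is an association list from string labels to types. The module name FirstTry, the helper alist, and the constructor Base are illustrative assumptions; TRcd is the constructor the following paragraph refers to.

```coq
From Coq Require Import Strings.String Lists.List.

Module FirstTry.

(* An association list: field labels paired with payloads.
   The name alist is an assumption made for this sketch. *)
Definition alist (X : Type) := list (string * X).

Inductive ty : Type :=
  | Base  : string -> ty
  | Arrow : ty -> ty -> ty
  | TRcd  : alist ty -> ty.   (* a record type wraps a list of fields *)

(* Inspect the induction principle that Coq generates automatically. *)
Check ty_ind.

End FirstTry.
```

The principle printed by Check has a TRcd case that quantifies over an arbitrary alist ty and supplies no hypothesis about the types stored inside it.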

Unfortunately, we run into a limitation of Coq here: this type does not automatically give us the induction principle we expect. The induction hypothesis in the TRcd case tells us nothing about the ty elements of the list, which makes it useless for the proofs we want to do.

It is possible to get a better induction principle out of Coq, but the details of how this is done are not very pretty, and the principle we obtain is not as intuitive to use as the ones Coq generates automatically for simple Inductive definitions.

Fortunately, there is a different way of formalizing records that is, in some ways, even simpler and more natural: instead of using the standard Coq list type, we can essentially incorporate its constructors (nil and cons) in the syntax of our types.

Similarly, at the level of terms, we have constructors trnil, for the empty record, and rcons, which adds a single field to the front of a list of fields.
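The nil/cons syntax described above can be sketched as follows. The constructor names RNil, RCons, trnil, rcons, rproj, var, app, abs and Arrow match the ones used in the code later in this chapter; Base (a family of uninterpreted base types) and the module name are assumptions of this sketch.

```coq
From Coq Require Import Strings.String.

Module RecordSyntaxSketch.

Inductive ty : Type :=
  | Base  : string -> ty
  | Arrow : ty -> ty -> ty
  | RNil  : ty                          (* the empty record type *)
  | RCons : string -> ty -> ty -> ty.   (* label, field type, rest of the record type *)

Inductive tm : Type :=
  | var   : string -> tm
  | app   : tm -> tm -> tm
  | abs   : string -> ty -> tm -> tm
  (* records *)
  | rproj : tm -> string -> tm          (* projection t.i *)
  | trnil : tm                          (* the empty record *)
  | rcons : string -> tm -> tm -> tm.   (* label, field, rest of the record *)

End RecordSyntaxSketch.
```

Because ty and tm are now ordinary (non-nested) inductive types, the automatically generated induction principles include hypotheses for every field and for the tail of a record.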

Some examples...

{ i1:A }

{ i1:A->B, i2:A }
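In the nil/cons syntax these two record types could be written as below. This is a sketch: the labels are plain string literals, A and B are modelled as Base "A" and Base "B", and the definition and module names are hypothetical.

```coq
From Coq Require Import Strings.String.
Open Scope string_scope.

Module RecordExamplesSketch.

Inductive ty : Type :=
  | Base  : string -> ty
  | Arrow : ty -> ty -> ty
  | RNil  : ty
  | RCons : string -> ty -> ty -> ty.

(* { i1:A } *)
Definition one_field : ty :=
  RCons "i1" (Base "A") RNil.

(* { i1:A->B, i2:A } *)
Definition two_fields : ty :=
  RCons "i1" (Arrow (Base "A") (Base "B"))
    (RCons "i2" (Base "A") RNil).

End RecordExamplesSketch.
```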

Well-Formedness

One issue with generalizing the abstract syntax for records from lists to the nil/cons presentation is that it introduces the possibility of writing strange types like this...

where the tail of a record type is not actually a record type!
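One such strange type, sketched under the assumption that ty has the four constructors Base, Arrow, RNil and RCons. The name weird_type is the one used in the next paragraph; the concrete label and base types are illustrative.

```coq
From Coq Require Import Strings.String.
Open Scope string_scope.

Module WeirdTypeSketch.

Inductive ty : Type :=
  | Base  : string -> ty
  | Arrow : ty -> ty -> ty
  | RNil  : ty
  | RCons : string -> ty -> ty -> ty.

(* The third argument of RCons should be a record type,
   but here it is the base type B. *)
Definition weird_type : ty := RCons "X" (Base "A") (Base "B").

End WeirdTypeSketch.
```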

We'll structure our typing judgement so that no ill-formed types like weird_type are ever assigned to terms. To support this, we define predicates record_ty and record_tm, which identify record types and terms, and well_formed_ty which rules out the ill-formed types.

First, a type is a record type if it is built with just RNil and RCons at the outermost level.

With this, we can define well-formed types.

Note that record_ty is not recursive -- it just checks the outermost constructor. The well_formed_ty property, on the other hand, verifies that the whole type is well formed in the sense that the tail of every record (the second argument to RCons) is a record.
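The two predicates can be sketched as inductive propositions. The predicate names record_ty and well_formed_ty come from the text; the constructor names (RTnil, RTcons, wfBase, ...) and the two closing examples are assumptions of this sketch.

```coq
From Coq Require Import Strings.String.
Open Scope string_scope.

Module WellFormedSketch.

Inductive ty : Type :=
  | Base  : string -> ty
  | Arrow : ty -> ty -> ty
  | RNil  : ty
  | RCons : string -> ty -> ty -> ty.

(* Checks only the outermost constructor. *)
Inductive record_ty : ty -> Prop :=
  | RTnil  : record_ty RNil
  | RTcons : forall i T1 T2, record_ty (RCons i T1 T2).

(* Checks the whole type: every RCons tail must be a record type. *)
Inductive well_formed_ty : ty -> Prop :=
  | wfBase  : forall i, well_formed_ty (Base i)
  | wfArrow : forall T1 T2,
      well_formed_ty T1 -> well_formed_ty T2 ->
      well_formed_ty (Arrow T1 T2)
  | wfRNil  : well_formed_ty RNil
  | wfRCons : forall i T1 T2,
      well_formed_ty T1 -> well_formed_ty T2 -> record_ty T2 ->
      well_formed_ty (RCons i T1 T2).

(* A type whose tail is not a record still passes the shallow check... *)
Example outer_is_record : record_ty (RCons "X" (Base "A") (Base "B")).
Proof. apply RTcons. Qed.

(* ...but fails the deep one. *)
Example tail_not_record : ~ well_formed_ty (RCons "X" (Base "A") (Base "B")).
Proof.
  intros Hwf. inversion Hwf; subst.
  match goal with
  | Hrec : record_ty (Base _) |- _ => inversion Hrec
  end.
Qed.

End WellFormedSketch.
```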

Of course, we should also be concerned about ill-formed terms, not just types; but typechecking can rule those out without the help of an extra well_formed_tm definition because it already examines the structure of terms. All we need is an analog of record_ty saying that a term is a record term if it is built with trnil and rcons.
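A sketch of this term-level analog. The predicate name record_tm is the one used in the typing rules later in the chapter; the constructor names rtnil and rtcons and the module name are assumptions.

```coq
From Coq Require Import Strings.String.

Module RecordTmSketch.

Inductive ty : Type :=
  | Base  : string -> ty
  | Arrow : ty -> ty -> ty
  | RNil  : ty
  | RCons : string -> ty -> ty -> ty.

Inductive tm : Type :=
  | var   : string -> tm
  | app   : tm -> tm -> tm
  | abs   : string -> ty -> tm -> tm
  | rproj : tm -> string -> tm
  | trnil : tm
  | rcons : string -> tm -> tm -> tm.

(* Like record_ty, this looks only at the outermost constructor. *)
Inductive record_tm : tm -> Prop :=
  | rtnil  : record_tm trnil
  | rtcons : forall i t1 t2, record_tm (rcons i t1 t2).

End RecordTmSketch.
```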

Substitution

Substitution extends easily.
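A sketch of the extended substitution function, assuming the nil/cons term syntax. It uses String.eqb from Coq's standard library as a stand-in for the eqb_string function that the chapter's own code relies on; the two examples are illustrative additions.

```coq
From Coq Require Import Strings.String.
Open Scope string_scope.

Module SubstSketch.

Inductive ty : Type :=
  | Base  : string -> ty
  | Arrow : ty -> ty -> ty
  | RNil  : ty
  | RCons : string -> ty -> ty -> ty.

Inductive tm : Type :=
  | var   : string -> tm
  | app   : tm -> tm -> tm
  | abs   : string -> ty -> tm -> tm
  | rproj : tm -> string -> tm
  | trnil : tm
  | rcons : string -> tm -> tm -> tm.

Fixpoint subst (x : string) (s : tm) (t : tm) : tm :=
  match t with
  | var y => if String.eqb x y then s else t
  | abs y T t1 =>
      abs y T (if String.eqb x y then t1 else subst x s t1)
  | app t1 t2 => app (subst x s t1) (subst x s t2)
  (* The record cases just push the substitution into subterms. *)
  | rproj t1 i => rproj (subst x s t1) i
  | trnil => trnil
  | rcons i t1 tr1 => rcons i (subst x s t1) (subst x s tr1)
  end.

Notation "'[' x ':=' s ']' t" := (subst x s t) (at level 20).

(* Substitution reaches into record fields. *)
Example subst_field :
  subst "x" trnil (rcons "i1" (var "x") trnil) = rcons "i1" trnil trnil.
Proof. reflexivity. Qed.

(* A binder for the same variable blocks substitution, as before. *)
Example subst_shadowed :
  subst "x" trnil (abs "x" RNil (var "x")) = abs "x" RNil (var "x").
Proof. reflexivity. Qed.

End SubstSketch.
```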

Reduction

A record is a value if all of its fields are.
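This can be sketched as an inductive predicate with one case for abstractions and two for records. The constructor names v_abs, v_rnil and v_rcons, and the closing example, are assumptions of this sketch.

```coq
From Coq Require Import Strings.String.
Open Scope string_scope.

Module ValueSketch.

Inductive ty : Type :=
  | Base  : string -> ty
  | Arrow : ty -> ty -> ty
  | RNil  : ty
  | RCons : string -> ty -> ty -> ty.

Inductive tm : Type :=
  | var   : string -> tm
  | app   : tm -> tm -> tm
  | abs   : string -> ty -> tm -> tm
  | rproj : tm -> string -> tm
  | trnil : tm
  | rcons : string -> tm -> tm -> tm.

Inductive value : tm -> Prop :=
  | v_abs : forall x T11 t12,
      value (abs x T11 t12)
  | v_rnil : value trnil
  | v_rcons : forall i v1 vr,
      value v1 ->       (* the first field is a value *)
      value vr ->       (* and so is the rest of the record *)
      value (rcons i v1 vr).

(* A one-field record holding an abstraction is a value. *)
Example one_field_value :
  value (rcons "i1" (abs "x" (Base "A") (var "x")) trnil).
Proof.
  apply v_rcons.
  - apply v_abs.
  - apply v_rnil.
Qed.

End ValueSketch.
```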

To define reduction, we'll need a utility function for extracting one field from a record term:
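A sketch of such a lookup function, mirroring the type-level Tlookup shown later in the chapter. The name tlookup is the one used by the ST_ProjRcd rule; String.eqb stands in for the chapter's eqb_string, and the examples are illustrative.

```coq
From Coq Require Import Strings.String.
Open Scope string_scope.

Module TlookupSketch.

Inductive ty : Type :=
  | Base  : string -> ty
  | Arrow : ty -> ty -> ty
  | RNil  : ty
  | RCons : string -> ty -> ty -> ty.

Inductive tm : Type :=
  | var   : string -> tm
  | app   : tm -> tm -> tm
  | abs   : string -> ty -> tm -> tm
  | rproj : tm -> string -> tm
  | trnil : tm
  | rcons : string -> tm -> tm -> tm.

(* Walk down the rcons spine and return the first field labelled i. *)
Fixpoint tlookup (i : string) (tr : tm) : option tm :=
  match tr with
  | rcons i' t tr' =>
      if String.eqb i i' then Some t else tlookup i tr'
  | _ => None
  end.

Example lookup_second :
  tlookup "i2" (rcons "i1" trnil (rcons "i2" (var "y") trnil))
  = Some (var "y").
Proof. reflexivity. Qed.

Example lookup_missing :
  tlookup "i3" (rcons "i1" trnil trnil) = None.
Proof. reflexivity. Qed.

End TlookupSketch.
```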

The step function uses this term-level lookup function in the projection rule.

Reserved Notation "t1 '-->' t2" (at level 40).

Inductive step : tm -> tm -> Prop :=
  | ST_AppAbs : forall x T11 t12 v2,
         value v2 ->
         (app (abs x T11 t12) v2) --> ([x:=v2]t12)
  | ST_App1 : forall t1 t1' t2,
         t1 --> t1' ->
         (app t1 t2) --> (app t1' t2)
  | ST_App2 : forall v1 t2 t2',
         value v1 ->
         t2 --> t2' ->
         (app v1 t2) --> (app v1 t2')
  | ST_Proj1 : forall t1 t1' i,
        t1 --> t1' ->
        (rproj t1 i) --> (rproj t1' i)
  | ST_ProjRcd : forall tr i vi,
        value tr ->
        tlookup i tr = Some vi ->
        (rproj tr i) --> vi
  | ST_Rcd_Head : forall i t1 t1' tr2,
        t1 --> t1' ->
        (rcons i t1 tr2) --> (rcons i t1' tr2)
  | ST_Rcd_Tail : forall i v1 tr2 tr2',
        value v1 ->
        tr2 --> tr2' ->
        (rcons i v1 tr2) --> (rcons i v1 tr2')

where "t1 '-->' t2" := (step t1 t2).

Notation multistep := (multi step).
Notation "t1 '-->*' t2" := (multistep t1 t2) (at level 40).

Hint Constructors step.

(* ----------------------------------------------------------------- *)

Typing

Next we define the typing rules. These are nearly direct transcriptions of the inference rules shown above: the only significant difference is the use of well_formed_ty. In the informal presentation we used a grammar that only allowed well-formed record types, so we didn't have to add a separate check.

One sanity condition that we'd like to maintain is that, whenever has_type Gamma t T holds, it will also be the case that well_formed_ty T, so that has_type never assigns ill-formed types to terms. In fact, we prove this theorem below. However, we don't want to clutter the definition of has_type with unnecessary uses of well_formed_ty. Instead, we place well_formed_ty checks only where needed: where an inductive call to has_type won't already be checking the well-formedness of a type. For example, we check well_formed_ty T in the T_Var case, because there is no inductive has_type call that would enforce this. Similarly, in the T_Abs case, we require a proof of well_formed_ty T11 because the inductive call to has_type only guarantees that T12 is well-formed.

Fixpoint Tlookup (i:string) (Tr:ty) : option ty :=
  match Tr with
  | RCons i' T Tr' =>
      if eqb_string i i' then Some T else Tlookup i Tr'
  | _ => None
  end.

Definition context := partial_map ty.

Reserved Notation "Gamma '|-' t '\in' T" (at level 40).

Inductive has_type : context -> tm -> ty -> Prop :=
  | T_Var : forall Gamma x T,
      Gamma x = Some T ->
      well_formed_ty T ->
      Gamma |- (var x) \in T
  | T_Abs : forall Gamma x T11 T12 t12,
      well_formed_ty T11 ->
      (update Gamma x T11) |- t12 \in T12 ->
      Gamma |- (abs x T11 t12) \in (Arrow T11 T12)
  | T_App : forall T1 T2 Gamma t1 t2,
      Gamma |- t1 \in (Arrow T1 T2) ->
      Gamma |- t2 \in T1 ->
      Gamma |- (app t1 t2) \in T2
  (* records: *)
  | T_Proj : forall Gamma i t Ti Tr,
      Gamma |- t \in Tr ->
      Tlookup i Tr = Some Ti ->
      Gamma |- (rproj t i) \in Ti
  | T_RNil : forall Gamma,
      Gamma |- trnil \in RNil
  | T_RCons : forall Gamma i t T tr Tr,
      Gamma |- t \in T ->
      Gamma |- tr \in Tr ->
      record_ty Tr ->
      record_tm tr ->
      Gamma |- (rcons i t tr) \in (RCons i T Tr)

where "Gamma '|-' t '\in' T" := (has_type Gamma t T).

Hint Constructors has_type.

(* ================================================================= *)

Examples

Exercise: 2 stars, standard (examples)

Finish the proofs below. Feel free to use Coq's automation features in these proofs. However, if you are not confident about how the type system works, you may want to carry out the proofs first using the basic features (apply instead of eapply, in particular) and then perhaps compress them using automation. Before starting to prove anything, make sure you understand what each statement is saying.

Properties of Typing

The proofs of progress and preservation for this system are essentially the same as for the pure simply typed lambda-calculus, but we need to add some technical lemmas involving records.

Well-Formedness
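One technical lemma of this kind says that looking up a field in a well-formed type yields a well-formed type. The sketch below is a standalone version: the lemma name wf_rcd_lookup, the constructor names of the two predicates, and the use of String.eqb in place of eqb_string are assumptions, and the proof script is only one possible way to carry out the induction.

```coq
From Coq Require Import Strings.String.

Module WfLookupSketch.

Inductive ty : Type :=
  | Base  : string -> ty
  | Arrow : ty -> ty -> ty
  | RNil  : ty
  | RCons : string -> ty -> ty -> ty.

Inductive record_ty : ty -> Prop :=
  | RTnil  : record_ty RNil
  | RTcons : forall i T1 T2, record_ty (RCons i T1 T2).

Inductive well_formed_ty : ty -> Prop :=
  | wfBase  : forall i, well_formed_ty (Base i)
  | wfArrow : forall T1 T2,
      well_formed_ty T1 -> well_formed_ty T2 ->
      well_formed_ty (Arrow T1 T2)
  | wfRNil  : well_formed_ty RNil
  | wfRCons : forall i T1 T2,
      well_formed_ty T1 -> well_formed_ty T2 -> record_ty T2 ->
      well_formed_ty (RCons i T1 T2).

Fixpoint Tlookup (i : string) (Tr : ty) : option ty :=
  match Tr with
  | RCons i' T Tr' =>
      if String.eqb i i' then Some T else Tlookup i Tr'
  | _ => None
  end.

Lemma wf_rcd_lookup : forall i T Ti,
  well_formed_ty T ->
  Tlookup i T = Some Ti ->
  well_formed_ty Ti.
Proof.
  intros i T Ti Hwf.
  induction Hwf; simpl.
  - (* Base: lookup returns None *)
    intros Hlk; discriminate Hlk.
  - (* Arrow: lookup returns None *)
    intros Hlk; discriminate Hlk.
  - (* RNil: lookup returns None *)
    intros Hlk; discriminate Hlk.
  - (* RCons: either the head field matches or we recurse on the tail *)
    destruct (String.eqb i _); intros Hlk.
    + inversion Hlk; subst; assumption.
    + auto.
Qed.

End WfLookupSketch.
```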

Field Lookup

Lemma: If v is a value, empty |- v : T, and Tlookup i T returns Some Ti, then tlookup i v returns Some ti for some term ti such that empty |- ti : Ti.

Proof: By induction on the typing derivation Htyp. Since Tlookup i T = Some Ti, T must be a record type. This, together with the fact that v is a value, eliminates most cases by inspection, leaving only the T_RCons case.

If the last step in the typing derivation is by T_RCons, then v = rcons i0 t tr and T = RCons i0 T0 Tr for some i0, t, tr, T0 and Tr.

This leaves two possibilities to consider: either i0 = i or not.

If i = i0, then since Tlookup i (RCons i0 T0 Tr) = Some Ti we have T0 = Ti. It follows that t itself satisfies the theorem.

On the other hand, suppose i <> i0. Then
    Tlookup i T = Tlookup i Tr
and
    tlookup i v = tlookup i tr,
so the result follows from the induction hypothesis.

Here is the formal statement:
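The formal statement is not reproduced in this text. The sketch below records its likely shape as a standalone proposition, abstracting over the chapter's definitions with Section variables so that it can be checked on its own. The name has_type_empty (standing for typing in the empty context), the definition name, and the order of the hypotheses are assumptions; the order is chosen to match how lookup_field_in_value is applied in the proofs of progress and preservation.

```coq
From Coq Require Import Strings.String.

Section LookupFieldStatement.

  (* Stand-ins for the chapter's syntax and judgments. *)
  Variables tm ty : Type.
  Variable value : tm -> Prop.
  Variable has_type_empty : tm -> ty -> Prop.   (* empty |- t \in T *)
  Variable tlookup : string -> tm -> option tm.
  Variable Tlookup : string -> ty -> option ty.

  Definition lookup_field_in_value_statement : Prop :=
    forall v T i Ti,
      value v ->
      has_type_empty v T ->
      Tlookup i T = Some Ti ->
      exists ti, tlookup i v = Some ti /\ has_type_empty ti Ti.

End LookupFieldStatement.
```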

Progress

Theorem progress : forall t T,
     empty |- t \in T ->
     value t \/ exists t', t --> t'.
Proof with eauto.
  (* Theorem: Suppose empty |- t : T.  Then either
       1. t is a value, or
       2. t --> t' for some t'.
     Proof: By induction on the given typing derivation. *)
  intros t T Ht.
  remember (@empty ty) as Gamma.
  generalize dependent HeqGamma.
  induction Ht; intros HeqGamma; subst.
  - (* T_Var *)
    (* The final rule in the given typing derivation cannot be
       T_Var, since it can never be the case that
       empty |- x : T (since the context is empty). *)
    inversion H.
  - (* T_Abs *)
    (* If the T_Abs rule was the last used, then
       t = abs x T11 t12, which is a value. *)
    left...
  - (* T_App *)
    (* If the last rule applied was T_App, then t = t1 t2,
       and we know from the form of the rule that
         empty |- t1 : T1 -> T2
         empty |- t2 : T1
       By the induction hypothesis, each of t1 and t2 either is a value
       or can take a step. *)
    right.
    destruct IHHt1; subst...
    + (* t1 is a value *)
      destruct IHHt2; subst...
      * (* t2 is a value *)
      (* If both t1 and t2 are values, then we know that
         t1 = abs x T11 t12, since abstractions are the only
         values that can have an arrow type.  But
         (abs x T11 t12) t2 --> [x:=t2]t12 by ST_AppAbs. *)
        inversion H; subst; try solve_by_invert.
        exists ([x:=t2]t12)...
      * (* t2 steps *)
        (* If t1 is a value and t2 --> t2', then
           t1 t2 --> t1 t2' by ST_App2. *)
        destruct H0 as [t2' Hstp]. exists (app t1 t2')...
    + (* t1 steps *)
      (* Finally, if t1 --> t1', then t1 t2 --> t1' t2
         by ST_App1. *)
      destruct H as [t1' Hstp]. exists (app t1' t2)...
  - (* T_Proj *)
    (* If the last rule in the given derivation is T_Proj, then
       the term has the form rproj t i, where
           empty |- t : Tr
       By the IH, t either is a value or takes a step. *)
    right. destruct IHHt...
    + (* rcd is value *)
      (* If t is a value, then we may use lemma
         lookup_field_in_value to show tlookup i t = Some ti
         for some ti, which gives us rproj t i --> ti by
         ST_ProjRcd. *)
      destruct (lookup_field_in_value _ _ _ _ H0 Ht H)
        as [ti [Hlkup _]].
      exists ti...
    + (* rcd_steps *)
      (* On the other hand, if t --> t', then
         rproj t i --> rproj t' i by ST_Proj1. *)
      destruct H0 as [t' Hstp]. exists (rproj t' i)...
  - (* T_RNil *)
    (* If the last rule in the given derivation is T_RNil,
       then t = trnil, which is a value. *)
    left...
  - (* T_RCons *)
    (* If the last rule is T_RCons, then the term has the form
       rcons i t tr, where
         empty |- t : T
         empty |- tr : Tr
       By the IH, each of t and tr either is a value or can
       take a step. *)
    destruct IHHt1...
    + (* head is a value *)
      destruct IHHt2; try reflexivity.
      * (* tail is a value *)
      (* If t and tr are both values, then rcons i t tr
         is a value as well. *)
        left...
      * (* tail steps *)
        (* If t is a value and tr --> tr', then
           rcons i t tr --> rcons i t tr' by
           ST_Rcd_Tail. *)
        right. destruct H2 as [tr' Hstp].
        exists (rcons i t tr')...
    + (* head steps *)
      (* If t --> t', then
         rcons i t tr --> rcons i t' tr
         by ST_Rcd_Head. *)
      right. destruct H1 as [t' Hstp].
      exists (rcons i t' tr)... Qed.

(* ----------------------------------------------------------------- *)

Context Invariance

Inductive appears_free_in : string -> tm -> Prop :=
  | afi_var : forall x,
      appears_free_in x (var x)
  | afi_app1 : forall x t1 t2,
      appears_free_in x t1 -> appears_free_in x (app t1 t2)
  | afi_app2 : forall x t1 t2,
      appears_free_in x t2 -> appears_free_in x (app t1 t2)
  | afi_abs : forall x y T11 t12,
        y <> x  ->
        appears_free_in x t12 ->
        appears_free_in x (abs y T11 t12)
  | afi_proj : forall x t i,
     appears_free_in x t ->
     appears_free_in x (rproj t i)
  | afi_rhead : forall x i ti tr,
      appears_free_in x ti ->
      appears_free_in x (rcons i ti tr)
  | afi_rtail : forall x i ti tr,
      appears_free_in x tr ->
      appears_free_in x (rcons i ti tr).

Hint Constructors appears_free_in.

Lemma context_invariance : forall Gamma Gamma' t S,
     Gamma |- t \in S  ->
     (forall x, appears_free_in x t -> Gamma x = Gamma' x)  ->
     Gamma' |- t \in S.
Proof with eauto.
  intros. generalize dependent Gamma'.
  induction H;
    intros Gamma' Heqv...
  - (* T_Var *)
    apply T_Var... rewrite <- Heqv...
  - (* T_Abs *)
    apply T_Abs... apply IHhas_type. intros y Hafi.
    unfold update, t_update. destruct (eqb_stringP x y)...
  - (* T_App *)
    apply T_App with T1...
  - (* T_RCons *)
    apply T_RCons... Qed.

Lemma free_in_context : forall x t T Gamma,
   appears_free_in x t ->
   Gamma |- t \in T ->
   exists T', Gamma x = Some T'.
Proof with eauto.
  intros x t T Gamma Hafi Htyp.
  induction Htyp; inversion Hafi; subst...
  - (* T_Abs *)
    destruct IHHtyp as [T' Hctx]... exists T'.
    unfold update, t_update in Hctx.
    rewrite false_eqb_string in Hctx...
Qed.

(* ----------------------------------------------------------------- *)

Preservation

Lemma substitution_preserves_typing : forall Gamma x U v t S,
     (update Gamma x U) |- t \in S  ->
     empty |- v \in U   ->
     Gamma |- ([x:=v]t) \in S.
Proof with eauto.
  (* Theorem: If x|->U;Gamma |- t : S and empty |- v : U, then
     Gamma |- [x:=v]t : S. *)
  intros Gamma x U v t S Htypt Htypv.
  generalize dependent Gamma. generalize dependent S.
  (* Proof: By induction on the term t.  Most cases follow
     directly from the IH, with the exception of var,
     abs, and rcons. The first two aren't automatic because we
     must reason about how the variables interact. In the
     case of rcons, we must do a little extra work to show
     that substituting into a term doesn't change whether
     it is a record term. *)
  induction t;
    intros S Gamma Htypt; simpl; inversion Htypt; subst...
  - (* var *)
    simpl. rename s into y.
    (* If t = y, we know that
         empty |- v : U and
         x|->U; Gamma |- y : S
       and, by inversion, update Gamma x U y = Some S.
       We want to show that Gamma |- [x:=v]y : S.

       There are two cases to consider: either x=y or x<>y. *)
    unfold update, t_update in H0.
    destruct (eqb_stringP x y) as [Hxy|Hxy].
    + (* x=y *)
    (* If x = y, then we know that U = S, and that
       [x:=v]y = v. So what we really must show is that
       if empty |- v : U then Gamma |- v : U.  We have
       already proven a more general version of this theorem,
       called context invariance! *)
      subst.
      inversion H0; subst. clear H0.
      eapply context_invariance...
      intros x Hcontra.
      destruct (free_in_context _ _ S empty Hcontra)
        as [T' HT']...
      inversion HT'.
    + (* x<>y *)
    (* If x <> y, then Gamma y = Some S and the substitution
       has no effect.  We can show that Gamma |- y : S by
       T_Var. *)
      apply T_Var...
  - (* abs *)
    rename s into y. rename t into T11.
    (* If t = abs y T11 t0, then we know that
         x|->U; Gamma |- abs y T11 t0 : T11->T12
         y|->T11; x|->U; Gamma |- t0 : T12
         empty |- v : U
       As our IH, we know that forall S Gamma,
         x|->U; Gamma |- t0 : S -> Gamma |- [x:=v]t0 : S.

       We can calculate that
         [x:=v]t = abs y T11 (if eqb_string x y then t0 else [x:=v]t0),
       and we must show that Gamma |- [x:=v]t : T11->T12.  We know
       we will do so using T_Abs, so it remains to be shown that:
         y|->T11; Gamma |- if eqb_string x y then t0 else [x:=v]t0 : T12
       We consider two cases: x = y and x <> y. *)
    apply T_Abs...
    destruct (eqb_stringP x y) as [Hxy|Hxy].
    + (* x=y *)
      (* If x = y, then the substitution has no effect.  Context
         invariance shows that y|->T11; y|->U; Gamma and
         y|->T11; Gamma are equivalent.  Since t0 : T12 under the
         former context, this is also the case under the latter. *)
      eapply context_invariance...
      subst.
      intros x Hafi. unfold update, t_update.
      destruct (eqb_string y x)...
    + (* x<>y *)
      (* If x <> y, then context invariance and the IH allow
         us to show that
           y|->T11; x|->U; Gamma |- t0 : T12       =>
           x|->U; y|->T11; Gamma |- t0 : T12       =>
           y|->T11; Gamma |- [x:=v]t0 : T12 *)
      apply IHt. eapply context_invariance...
      intros z Hafi. unfold update, t_update.
      destruct (eqb_stringP y z)...
      subst. rewrite false_eqb_string...
  - (* rcons *)
    apply T_RCons... inversion H7; subst; simpl...
Qed.

Theorem preservation : forall t t' T,
     empty |- t \in T  ->
     t --> t'  ->
     empty |- t' \in T.
Proof with eauto.
  intros t t' T HT.
  (* Theorem: If empty |- t : T and t --> t', then
     empty |- t' : T. *)
  remember (@empty ty) as Gamma. generalize dependent HeqGamma.
  generalize dependent t'.
  (* Proof: By induction on the given typing derivation.
     Many cases are contradictory (T_Var, T_Abs) or follow
     directly from the IH (T_RCons).  We show just the
     interesting ones. *)
  induction HT;
    intros t' HeqGamma HE; subst; inversion HE; subst...
  - (* T_App *)
    (* If the last rule used was T_App, then t = t1 t2,
       and three rules could have been used to show t --> t':
       ST_App1, ST_App2, and ST_AppAbs. In the first two
       cases, the result follows directly from the IH. *)
    inversion HE; subst...
    + (* ST_AppAbs *)
      (* For the third case, suppose
           t1 = abs x T11 t12
         and
           t2 = v2.  We must show that empty |- [x:=v2]t12 : T2.
         We know by assumption that
             empty |- abs x T11 t12 : T1->T2
         and by inversion
             x:T1 |- t12 : T2
         We have already proven substitution_preserves_typing, and
             empty |- v2 : T1
         holds by assumption, so we are done. *)
      apply substitution_preserves_typing with T1...
      inversion HT1...
  - (* T_Proj *)
    (* If the last rule was T_Proj, then t = rproj t1 i.
       Two rules could have caused t --> t': ST_Proj1 and
       ST_ProjRcd.  The typing of t' follows from the IH
       in the former case, so we only consider ST_ProjRcd.

       Here we have that t1 is a record value.  Since rule
       T_Proj was used, we know empty |- t1 \in Tr and
       Tlookup i Tr = Some Ti for some Tr.
       We may therefore apply lemma lookup_field_in_value
       to find the record element this projection steps to. *)
    destruct (lookup_field_in_value _ _ _ _ H2 HT H)
      as [vi [Hget Htyp]].
    rewrite H4 in Hget. inversion Hget. subst...
  - (* T_RCons *)
    (* If the last rule was T_RCons, then t = rcons i t0 tr
       for some i, t0 and tr such that record_tm tr.  If
       the step is by ST_Rcd_Head, the result is immediate by
       the IH.  If the step is by ST_Rcd_Tail, tr --> tr2'
       for some tr2' and we must also use lemma
       step_preserves_record_tm to show record_tm tr2'. *)
    apply T_RCons... eapply step_preserves_record_tm...
Qed.