This file is used as a demonstration of the jsCoq tool and is not an official version of the Software Foundations book by Benjamin Pierce et al. License:
Copyright (c) 2012 Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions: The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software. THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
The simply typed lambda-calculus (STLC) is a tiny core calculus embodying the key concept of functional abstraction, which shows up in pretty much every real-world programming language in some form (functions, procedures, methods, etc.).
We will follow exactly the same pattern as in the previous chapter when formalizing this calculus (syntax, small-step semantics, typing rules) and its main properties (progress and preservation). The new technical challenges (which will take some work to deal with) all arise from the mechanisms of variable binding and substitution.
The STLC is built on some collection of base types -- booleans, numbers, strings, etc. The exact choice of base types doesn't matter -- the construction of the language and its theoretical properties work out pretty much the same -- so for the sake of brevity let's take just Bool for the moment. At the end of the chapter we'll see how to add more base types, and in later chapters we'll enrich the pure STLC with other useful constructs like pairs, records, subtyping, and mutable state.
Starting from the booleans, we add three things:
This gives us the following collection of abstract syntax constructors (written out here in informal BNF notation -- we'll formalize it below).
Informal concrete syntax: t ::= x variable | λ x:T1.t2 abstraction | t1 t2 application | true constant true | false constant false | if t1 then t2 else t3 conditional
The \ symbol (backslash, in ascii) in a function abstraction
λ x:T1.t2 is generally written as a greek letter lambda
(hence
the name of the calculus). The variable x is called the
parameter to the function; the term t1 is its body. The
annotation :T specifies the type of arguments that the function
can be applied to.
Some examples:
The identity function for booleans.
The identity function for booleans, applied to the boolean true.
The boolean not
function.
The constant function that takes every (boolean) argument to true.
A two-argument function that takes two booleans and returns the first one. (Note that, as in Coq, a two-argument function is really a one-argument function whose body is also a one-argument function.)
A two-argument function that takes two booleans and returns the first one, applied to the booleans false and true.
Note that, as in Coq, application associates to the left -- i.e., this expression is parsed as ((λ x:Bool. λ y:Bool. x) false) true.
A higher-order function that takes a function f (from booleans to booleans) as an argument, applies f to true, and applies f again to the result.
The same higher-order function, applied to the constantly false function.
As the last several examples show, the STLC is a language of higher-order functions: we can write down functions that take other functions as arguments and/or return other functions as results.
Another point to note is that the STLC doesn't provide any
primitive syntax for defining named functions -- all functions
are anonymous.
We'll see in chapter MoreStlc that it is easy
to add named functions to what we've got -- indeed, the
fundamental naming and binding mechanisms are exactly the same.
The types of the STLC include Bool, which classifies the boolean constants true and false as well as more complex computations that yield booleans, plus arrow types that classify functions.
T ::= Bool | T1 -> T2 For example:
Note that an abstraction λ x:T.t (formally, tabs x T t) is always annotated with the type T of its parameter, in contrast to Coq (and other functional languages like ML, Haskell, etc.), which use type inference to fill in missing annotations. We're not considering type inference here, to keep things simple.
Some examples...
idB = λ x:Bool. x
idBB = λ x:Bool->Bool. x
idBBBB = λ x:(Bool->Bool) -> (Bool->Bool). x
k = λ x:Bool. λ y:Bool. x
notB = λ x:Bool. if x then false else true
(We write these as Notations rather than Definitions to make things easier for auto.)
To define the small-step semantics of STLC terms, we begin -- as always -- by defining the set of values. Next, we define the critical notions of free variables and substitution, which are used in the reduction rule for application expressions. And finally we give the small-step relation itself.
To define the values of the STLC, we have a few cases to consider.
First, for the boolean part of the language, the situation is clear: true and false are the only values. An if expression is never a value.
Second, an application is clearly not a value: It represents a function being invoked on some argument, which clearly still has work left to do.
Third, for abstractions, we have a choice:
Coq, in its built-in functional programming langauge, makes the first choice -- for example, Eval simpl in (fun x:bool => 3 + 4) yields fun x:bool => 7.
Most real-world functional programming languages make the second choice -- reduction of a function's body only begins when the function is actually applied to an argument. We also make the second choice here.
Finally, having made the choice not to reduce under abstractions,
we don't need to worry about whether variables are values, since
we'll always be reducing programs from the outside in,
and that
means the step relation will always be working with closed
terms (ones with no free variables).
Now we come to the heart of the STLC: the operation of substituting one term for a variable in another term.
This operation will be used below to define the operational semantics of function application, where we will need to substitute the argument term for the function parameter in the function's body. For example, we reduce (λ x:Bool. if x then true else x) false to if false then true else false ]] by substituting false for the parameter x in the body of the function.
In general, we need to be able to substitute some given
term s for occurrences of some variable x in another term t.
In informal discussions, this is usually written [x:=s]t and
pronounced substitute x with s in t.
Here are some examples:
The last example is very important: substituting x with true in λ x:Bool. x does not yield λ x:Bool. true! The reason for this is that the x in the body of λ x:Bool. x is bound by the abstraction: it is a new, local name that just happens to be spelled the same as some global name x.
Here is the definition, informally... x:=sx = s x:=sy = y if x <> y x:=s(λ x:T11.t12) = λ x:T11. t12 x:=s(λ y:T11.t12) = λ y:T11. x:=st12 if x <> y x:=s(t1 t2) = (x:=st1) (x:=st2) x:=strue = true x:=sfalse = false x:=s(if t1 then t2 else t3) = if x:=st1 then x:=st2 else x:=st3 ]] ... and formally:
Technical note: Substitution becomes trickier to define if we consider the case where s, the term being substituted for a variable in some other term, may itself contain free variables. Since we are only interested here in defining the step relation on closed terms (i.e., terms like λ x:Bool. x, that do not mention variables are not bound by some enclosing lambda), we can skip this extra complexity here, but it must be dealt with when formalizing richer languages.
The definition that we gave above uses Coq's Fixpoint facility to define substitution as a function. Suppose, instead, we wanted to define substitution as an inductive relation substi. We've begun the definition by providing the Inductive header and one of the constructors; your job is to fill in the rest of the constructors.
The small-step reduction relation for STLC now follows the same
pattern as the ones we have seen before. Intuitively, to reduce a
function application, we first reduce its left-hand side until it
becomes a literal function; then we reduce its right-hand
side (the argument) until it is also a value; and finally we
substitute the argument for the bound variable in the body of the
function. This last rule, written informally as
(λ x:T.t12) v2 ==> x:=v2t12
is traditionally called beta-reduction
.
value v2
t1 ==> t1'
value v1 t2 ==> t2'
t1 ==> t1'
Example: ((λ x:Bool->Bool. x) (λ x:Bool. x)) ==>* (λ x:Bool. x) i.e. (idBB idB) ==>* idB
Example: ((λ x:Bool->Bool. x) ((λ x:Bool->Bool. x) (λ x:Bool. x))) ==>* (λ x:Bool. x) i.e. (idBB (idBB idB)) ==>* idB.
Example: ((λ x:Bool->Bool. x) (λ x:Bool. if x then false else true)) true) ==>* false i.e. ((idBB notB) ttrue) ==>* tfalse.
Example: ((λ x:Bool->Bool. x) ((λ x:Bool. if x then false else true) true)) ==>* false i.e. (idBB (notB ttrue)) ==>* tfalse.
A more automatic proof
Again, we can use the normalize tactic from above to simplify the proof.
Question: What is the type of the term x y
?
Answer: It depends on the types of x and y!
I.e., in order to assign a type to a term, we need to know what assumptions we should make about the types of its free variables.
This leads us to a three-place typing judgment
, informally
written Gamma ⊢ t : T, where Gamma is a
typing context
-- a mapping from variables to their types.
We hide the definition of partial maps in a module since it is actually defined in SfLib.
Informally, we'll write Gamma, x:T for extend the partial
function Gamma to also map x to T.
Formally, we use the
function extend to add a binding to a partial map.
Gamma x = T
Gamma , x:T11 ⊢ t12 \in T12
Gamma ⊢ t1 \in T11->T12 Gamma ⊢ t2 \in T11
Gamma ⊢ t1 \in Bool Gamma ⊢ t2 \in T Gamma ⊢ t3 \in T
We can read the three-place relation Gamma ⊢ t \in T as:
to the term t we can assign the type T using as types for
the free variables of t the ones specified in the context
Gamma.
Note that since we added the has_type constructors to the hints database, auto can actually solve this one immediately.
Another example: empty ⊢ λ x:A. λ y:A->A. y (y x)) \in A -> (A->A) -> A.
empty ⊢ λ x:Bool->B. λ y:Bool->Bool. \z:Bool. y (x z) \in T.
We can also show that terms are not typable. For example, let's formally check that there is no typing derivation assigning a type to the term λ x:Bool. λ y:Bool, x y -- i.e., ~ exists T, empty ⊢ λ x : Bool. λ y : Bool, x y : T.