

May 10, 2020

In Further Praise of Dependent Types

Posted by David Corfield

After an exciting foray into the rarefied atmosphere of current geometry, I want to return to something more prosaic – dependent types – the topic treated in Chapter 2 of my book. We already had a paean on this subject a few years ago in Mike’s post – In Praise of Dependent Types – hence the title of this one.

I’ve just watched Kevin Buzzard’s talk – Is HoTT the way to do mathematics? Kevin is a number theorist at Imperial College London who’s looking to train his undergraduates to produce computer-checked proofs of mainstream mathematics (e.g., theorems in algebraic geometry concerning rings and schemes) in the Lean theorem-prover.

Why Lean? Well, at (12:14) in the talk Kevin claims of mathematicians that

They use dependent types, even though they don’t know they are using dependent types.

Let’s hope they receive this news with the delight of Molière’s Mr. Jourdain:

« Par ma foi ! il y a plus de quarante ans que je dis de la prose sans que j’en susse rien, et je vous suis le plus obligé du monde de m’avoir appris cela. »

“My faith! For more than forty years I have been speaking prose while knowing nothing of it, and I am the most obliged person in the world to you for telling me so.”

So Kevin has chosen Lean over set-based provers since it employs these dependent types. He then goes on in his talk to consider whether one should prefer a system with univalence built in, but that’s a topic for another day.

I have a simple explanation as to why mathematicians are using dependent types. They’re doing so because first and foremost they are speakers of natural language, and as such they are already using dependent types. This latter point is something we might have learned way back in 1994 from Aarne Ranta in his excellent Type-theoretic Grammar.

Recognising the commonality between natural and mathematical language, Ranta tells us in a very recent paper that mathematics is a good place to look to understand how language works:

The main characteristic of the language of mathematics is perhaps that different members of the community can completely agree about the meaning of each word and sentence. This means that one can take these meanings for granted and analyse the underlying mechanisms in isolation from uncertainty about what the correct analysis of data is. In this sense, the language of mathematics provides “laboratory conditions” for the study of semantics and pragmatics. (Ranta 2020, p. 122).

Ranta, A. 2020. ‘Some remarks on pragmatics in the language of mathematics’, Journal of Pragmatics 160, pp. 120-122.

So, a good place to try and understand underlying linguistic structure is informal mathematics, something he studied in his 1993 paper Type theory and the informal language of mathematics. This investigates both directions of the passage between formal and informal mathematics: sugaring and formalization.

Important topics for the dependent type-theoretic account of language are pronouns and quantifiers. Starting with the latter, any dependent type $x: A \vdash B(x): Type$ gives rise to two non-dependent types:

$\vdash \sum_{x: A} B(x),\ \prod_{x: A} B(x): Type.$

These assignments are left and right adjoints to the one which sends a type $C$ to the constant dependent type:

$x: A \vdash C: Type.$

(Exercise: Assuming these adjunctions, demonstrate the arithmetic identities for natural numbers, $a^{\sum_i b_i} = \prod_i a^{b_i}$ and $\prod_i (b_i)^a = (\prod_i b_i)^a$.)

When $B(x)$ is a proposition, these quantified types are ‘The $A$s which are $B$’ (‘truncated’ to ‘Some $A$ is $B$’) and ‘Every $A$ is $B$’.
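
Here is a minimal Lean 4 sketch of these constructions (the particular choices of $A$ and $B(x)$ are illustrative assumptions, nothing more):

    -- A dependent type over Nat: for each n, the Boolean tuples of length n.
    def B (n : Nat) : Type := Fin n → Bool

    -- The two non-dependent types it gives rise to:
    #check Sigma B            -- Σ n, B n : an n together with an element of B n
    #check (n : Nat) → B n    -- Π n, B n : an element of B n for every n

    -- When B(x) is a proposition, the truncation of Σ is ∃ ('some'), Π is ∀ ('every'):
    def P (n : Nat) : Prop := 0 < n
    #check ∃ n, P n           -- 'Some natural number is positive'
    #check ∀ n, P n           -- 'Every natural number is positive'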

The kind of ambiguity that Ranta says is missing in mathematics occurs frequently in natural language. Philosophers of language puzzle about how we determine the scope of a quantifier, as when someone utters ‘Every bottle is green’. Certainly it doesn’t mean every bottle in the world. It probably doesn’t mean every bottle in the house or even every one in the room. Of course, the context matters. For instance, maybe the person who says this has just returned from a store with a selection of wine.

Dependent type theory understands there to be a host of implicit parameters. ‘Every’ is ‘$every_A$’ where $A$ is a type. Similarly for demonstratives, ‘this’ and ‘that’. The need for parameters is felt by the kind of philosopher of language who wants to downplay the role of the context as an independent determinant of meaning, one who, like Jason Stanley, could maintain:

All truth-conditional effects of extra-linguistic context can be traced to logical form. (Language in Context, OUP, 2007, p. 30)

Stanley speaks of ‘context-dependency’, ‘domain indices’, ‘covert pronominal elements’, and ‘unpronounced syntactic structure’. My claim is that if this has any chance of working, it has to be through the logical form of dependent type theory. On hearing ‘every’ in ‘every bottle is green’, we search for the corresponding implicit type. ‘Bottle’ is a cue and may lead us with the context to the intended type of ‘bottles that have just been brought home from the store’.
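
A small Lean 4 sketch of the idea (with ‘Bottle’ and ‘green’ standing in for the contextually resolved type and predicate, so purely illustrative):

    -- 'every' carries a type parameter A, to be resolved from context.
    def every (A : Type) (P : A → Prop) : Prop := ∀ x : A, P x

    -- 'Every bottle is green', where Bottle is the contextually intended type,
    -- e.g. the bottles just brought home from the store.
    axiom Bottle : Type
    axiom green : Bottle → Prop
    #check every Bottle green   -- : Prop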

Consider another example from Stanley:

Every time John lit a cigarette, it rained.

Despite appearances this can’t merely be quantification over times. If John always lights a cigarette in New York and at the same moment it always rains in London, this doesn’t support the proposition. It must rain at the same location as John.

I would want to say that the verb ‘rain’ has implicit parameters for time and place. An achievement, such as ‘John lights a cigarette’, has an associated time and place. The proposition is claiming that every achievement which is John lighting a cigarette occurs at a time and place such that it rains then and there.

  • $X: Achievement,\ x: X \vdash t(x): Time,\ l(x): Location$

  • $u: Time,\ v: Location \vdash Rain(u, v): Prop$

  • $\vdash John\; lights\; a\; cigarette: Achievement$

Hence,

  • $z: John\; lights\; a\; cigarette \vdash t(z): Time,\ l(z): Location$

  • $\vdash \prod_{z: John\; lights\; a\; cigarette} Rain(t(z), l(z)): Type$.

This final type corresponds to the target sentence.
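
A hedged Lean 4 transcription of these judgments (packaging an achievement as a type equipped with its time and place maps; all names are assumptions):

    axiom Time : Type
    axiom Location : Type
    axiom Rain : Time → Location → Prop

    -- An achievement is a type whose elements each carry a time and a place.
    structure Achievement where
      Carrier : Type
      t : Carrier → Time
      l : Carrier → Location

    axiom johnLightsACigarette : Achievement

    -- 'Every time John lit a cigarette, it rained':
    #check ∀ z : johnLightsACigarette.Carrier,
      Rain (johnLightsACigarette.t z) (johnLightsACigarette.l z)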

Briefly on pronouns, one of the first pieces of natural language I saw rendered in dependent type theory was Sundholm’s version of the Donkey sentence:

Every farmer who owns a donkey beats it.

This may strike people as a little unnatural, but it’s easy to devise sentences of a similar form, especially if we choose ‘any’ rather than ‘every’:

Anyone who owns a gun should register it.

(A treatment of the subtleties of this choice and comparable ones can be found in Zeno Vendler’s Each and Every, Any and All.)

The point is that there’s a relation between two kinds of thing, here people and guns, where one person may have no, one, or many guns. We use the indefinite article ‘a’ for gun, and yet seem to want to refer to something particular by using ‘it’.

Every $A$ that $R$s a $B$, $S$s it,

rendered in type theory as

$\prod_{z: \sum_{x: A} \sum_{y: B} R(x, y)} S(p(z), p(q(z)))$, where $p$ and $q$ are the projections to first and second components. We see ‘it’ corresponds to $p(q(z))$.

A first-order quantification, on the other hand, will call for two universal quantifiers

$\forall x \forall y\, (A(x) \,\&\, B(y) \,\&\, R(x, y) \to S(x, y))$, which seems further from the original linguistic form.
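
For concreteness, a Lean 4 sketch of the Π-over-Σ rendering above (a subtype plays the role of the propositional component; all the names are assumptions):

    axiom Farmer : Type
    axiom Donkey : Type
    axiom owns : Farmer → Donkey → Prop
    axiom beats : Farmer → Donkey → Prop

    -- Π over the Σ-type of farmers paired with a donkey they own.
    -- 'Every farmer who owns a donkey beats it'; 'it' is the projection z.2.1.
    #check ∀ z : (x : Farmer) × {y : Donkey // owns x y}, beats z.1 z.2.1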

Ranta gives an example from informal Euclidean geometry:

Every point that lies outside a line determines a parallel to it.

$\prod_{z: \sum_{x: point} \sum_{y: line} Outside(x, y)} DAP(p(z), p(q(z)))$

Perhaps a more natural sounding mathematical example is

Any number which has a proper divisor is greater than it.

In more formal mathematics, we tend to deploy letters, so

Any number, $n$, which has a proper divisor, $m$, is such that $n \gt m$.
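
The same shape in a Lean 4 sketch (where ‘ProperDivisor’ encodes one possible reading of the phrase):

    -- One reading: m is a proper divisor of n if m divides n and m ≠ n.
    def ProperDivisor (m n : Nat) : Prop := m ∣ n ∧ m ≠ n

    -- 'Any number which has a proper divisor is greater than it'
    -- (stating the sentence's form as a type, not claiming it is provable):
    #check ∀ z : (n : Nat) × {m : Nat // ProperDivisor m n}, z.1 > z.2.1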

I think Ranta is right: we could learn a lot from informal mathematical language.

Posted at May 10, 2020 4:57 PM UTC


13 Comments

Re: In Further Praise of Dependent Types

I have heard computer scientists complaining that our (mathematical) notation is ambiguous. They don’t like [0,1] for the real unit interval, because it looks exactly like the list [0,1] even though it is manifestly clear to any mathematician that the meaning of [0,1] is completely unambiguous in any context. The CS people like ]0,1] and (0,1] even less because this messes up their parsing. We’re stuck with something like Ioc 0 1 right now (“Interval, open, closed”). Come on. Lean 4 has an amazing parser designed by Sebastian Ullrich and even though it won’t be perfect, it will be better than Lean 3’s. I believe that writing mathematics in a computer file which looks the same as the mathematics we write on blackboards is a really important step towards making software such as Lean usable for mathematicians. We are working on it. I think that some of the computer scientists who claim it’s sloppy or ambiguous just don’t understand it well enough.

Posted by: Kevin Buzzard on May 10, 2020 6:29 PM

Re: In Further Praise of Dependent Types

I (a mathematical logician) complain that the mathematical language and notation that my colleagues teach our students is ambiguous, and in some cases it is manifestly not clear to the students, and I think the clarity of thought needed is not there, because there are so many ambiguities in mathematical culture and behaviour that experts simply don’t see, but learners struggle with.

One example: Serre complained about number theorists using the word “constant”. Only with context and experience can you work out which quantities a number theorist’s “constant” does not depend on!

An example from a different context (BBC Bitesize Daily): “Remember, the larger the minus number, the smaller it is”. This is only an approximate quotation, but the ambiguity was there, namely being inconsistent about the choice of words used for magnitude and for the order.

I absolutely agree with Kevin that writing the same mathematics to be human-readable and machine-readable is essential.

Posted by: Jonathan Kirby on May 10, 2020 8:04 PM

Re: In Further Praise of Dependent Types

there are so many ambiguities in mathematical culture and behaviour that experts simply don’t see, but learners struggle with.

Right. So the processes of formalizing and sugaring between the kind of language experts speak and something acceptable to a theorem-prover should be illuminating.

It would also be valuable to look at changing expert language over time. E.g., it took me a while to figure out what Euclid I, Prop. 14 means:

If with any straight line, and at a point on it, two straight lines not lying on the same side make the sum of the adjacent angles equal to two right angles, then the two straight lines are in a straight line with one another.

Posted by: David Corfield on May 11, 2020 7:47 AM

Re: In Further Praise of Dependent Types

I enjoyed the talk, very interesting! Lean is good and has a large library. The most recent thing I saw is “A formal proof of the independence of the continuum hypothesis” (in Lean, using Boolean-valued models). Here’s a fairly quick survey of some of the automated theorem provers/assistants around (from 2006; it doesn’t mention a few of the others that appear at the annual theorem-proving competitions – Vampire, E, SPASS): The Seventeen Provers of the World, edited by Freek Wiedijk (foreword, Dana Scott). In addition, there are industrial-strength SAT-solvers (and a newish development, SMT). My impression is that much of this tends to be studied by different groups, with different interests. At some point, particularly if a nice theorem prover/assistant appeared with a user-friendly interface (not an IDE), a GUI with point-and-click, easy access to settings via menus, etc., I suppose these things may come together for the average mathematician to use, along with computer algebra, which is already easily available, e.g. in Mathematica. I guess someone at Wolfram needs to insert a proper theorem prover/assistant into Mathematica. I doubt it makes much difference what its underlying system is, so long as it’s fast and it has a good library.

Posted by: Jeffrey Ketland on May 13, 2020 11:11 AM

Re: In Further Praise of Dependent Types

the meaning of [0,1] is completely unambiguous in any context

This is of course true, and any computer scientists who claim otherwise are wrong. Both readings of [0,1] are constructions, meaning that they have their type checked rather than synthesised. The type the expression is checked against will disambiguate it.
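
For comparison, the current state of affairs in Lean 4 with mathlib (an illustrative sketch; Set.Icc and Set.Ioc are mathlib’s interval constructors):

    import Mathlib

    #check ([0, 1] : List ℕ)      -- bracket notation: parsed as a list literal
    #check Set.Icc (0 : ℝ) 1      -- the closed unit interval, spelt Icc for now
    #check Set.Ioc (0 : ℝ) 1      -- the half-open interval ]0, 1], spelt Ioc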

Unfortunately, even though the theory works out fine, there are a fair few technical problems in the way of current proof assistants doing what I (and you) mean. In Agda, [ 0 , 1 ], ] 0 , 1 ], &c would all be fine, but because Agda makes no distinction between letters and symbols, you need all them spaces. Then, I guess (based on your comment), Lean recognises the symbols [ and ], but reads too much into them. Another problem is that type-based disambiguation has its limits if we want to be able to parse without doing type-checking (which is probably good for performance and simplicity of implementation). Also, as far as I know, in all current proof assistants, there’s no support for making smart constructors be checked, like constructors are. This becomes increasingly important in large formalisations, where you’re working less and less with the raw implementation of anything in particular (as I’m sure you’re familiar with).

Posted by: James Wood on May 18, 2020 8:23 PM

Re: In Further Praise of Dependent Types

A few years ago there was some interest in a dissertation then just done by Mohan Ganesalingam, about understanding the natural language used in mathematical writing. I think Tim Gowers did some work with Ganesalingam related to this afterwards. The thesis was online but later it was published as a book (review), so the online version was taken down. He does refer to Ranta’s earlier work.

It seems relevant to some aspects of this discussion so I’m mentioning it in case anyone wants to check further into it. I thought it looked interesting but I didn’t pursue it further than that. A web search can probably find more info.

Posted by: solrize on May 12, 2020 9:49 AM

Re: In Further Praise of Dependent Types

Thanks for this. I should take a look.

The 2020 article by Ranta I mentioned is here. It’s a response to

Marco Ruffino, Luca San Mauro, Giorgio Venturi, ‘At least one black sheep: Pragmatics and mathematical language’,

which discusses Ganesalingam’s book. As always, there’s disagreement as to how to separate the ‘pragmatic’ from the ‘semantic’. In the book it had been claimed that

to a first approximation, mathematics does not exhibit any pragmatic phenomena.

Just before the passage I quoted in my post, Ranta writes:

I strongly agree with the main thesis of San Mauro et al. that mathematical language is permeated by pragmatic phenomena. Like San Mauro et al., I find it regrettable that “philosophers of language rarely consider mathematical language as a possible case-study to test their hypotheses” (San Mauro et al. this issue, p. 1). At the same time, I find Ganesalingam’s The Language of Mathematics to be an important contribution that sheds light not only on its explicit object of study but also on language in general, revealing fundamental mechanisms of semantics and pragmatics, even though Ganesalingam prefers to call them semantic as soon as they are properly understood.

Posted by: David Corfield on May 12, 2020 10:21 AM

Re: In Further Praise of Dependent Types

Ganesalingam in The Language of Mathematics uses Discourse Representation Theory (DRT) where Ranta uses Dependent Type Theory (DTT). My money is still on DTT. Is anyone using DRT in computerised mathematics?

According to Ganesalingam

DRT was developed to handle Geach’s ‘donkey sentences’ (Geach, 1980) and, as we will show in §3.3.1, there are similar sentences in real mathematical texts. We will also be able to adapt DRT to overcome deficiencies noted by Ranta in his own work; see §3.3.2 for details. (p. 15, n6)

DTT claims to do well with donkey sentences too, as I said in the post.

As far as I can see from the preview, one supposed deficiency of DTT is treated in §3.3.6:

Our approach also correctly handles an area which Ranta highlights as a deficiency of his own analysis. He notes that the plural pronoun ‘they’ is sometimes given a distributive reading and sometimes a collective reading; cf.

Nor do we quite understand the use of the plural pronoun They, which is sometimes distributive, paraphrasable by the term conjunction e.g.

If $A$ and $B$ do not lie outside the line $a$, they are incident on it

but sometimes used in the place of the “surface term conjunction”, so that it fuses together the arguments of the predicate, e.g.

If $a$ and $b$ do not converge, they are parallel.

(Ranta, 1994, p. 12)

Our analysis correctly distinguishes distributive and collective readings. ‘they’ is always taken to be a pronoun introducing an underlined plural discourse referent, requiring a plural antecedent. The difference in the readings of the two sentences given above is accounted for by the fact that ‘incident on it’ is distributive (as a single point is incident on a single line) and the adjective ‘parallel’ is collective (a single line cannot be parallel, but a collection of lines can). (p. 64)

How very odd. I wonder why Ranta believed this to be difficult for DTT? Suppose we have $x, y: A \vdash P(x), Q(x, y): Prop$ and suppose we have $a, b: A$ and $P(a), P(b), Q(a, b)$ all true. Then we would say ‘$a$ and $b$ are $A$s. They are $P$. They are $Q$.’ if sugared versions allow this, such as when $Q$ says of two lines that they’re parallel. The form of the predicates makes clear which ‘they’ is intended.
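
In a Lean 4 sketch (all names assumed):

    axiom A : Type
    axiom P : A → Prop        -- a distributive predicate, e.g. 'is incident on it'
    axiom Q : A → A → Prop    -- a collective predicate, e.g. 'are parallel'
    axiom a : A
    axiom b : A

    -- 'a and b are As. They are P. They are Q.'
    #check P a ∧ P b          -- distributive 'they': term conjunction
    #check Q a b              -- collective 'they': fused arguments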

Maybe it’s not clear when such a $Q(a, b)$ affords the sugaring, ‘They are $Q$’. We wouldn’t do this for an asymmetric relation, such as less than. Certainly many symmetric relations are fine: ‘They are parallel/equal/isometric/homeomorphic’.

Posted by: David Corfield on May 12, 2020 11:47 AM

Re: In Further Praise of Dependent Types

Later, referring to the same paper by Ranta, Ganesalingam writes (pp. 81-82):

The analysis given there is unable to cope with kinds of variable introduction that cannot be formalised as quantifiers. For example, consider the variable-introducing sentence:

If a point $A$ lies outside a line $a$, $A$ determines a parallel to $a$.

Ranta notes that his theory is unable to cope with this sentence because it would need to analyse ‘if a point $A$ lies outside a line $a$’ as a quantifier, and the variable names would not be usable outside the scope of this quantifier (Ranta, 1994, p. 13).

The sentence is just the result of term elimination for the dependent product:

$f: \prod_{z: \sum_{x: point} \sum_{y: line} Outside(x, y)} DAP(p(z), p(q(z))).$

I don’t see how dependent type theory finds this kind of case problematic.

Posted by: David Corfield on May 12, 2020 11:00 PM

Re: In Further Praise of Dependent Types

I’m just idly wondering: maybe there’s some difference in what you’re expecting from DTT and what Ranta and Ganesalingam want from a linguistic theory. You’re showing that DTT can express the intended meanings of informal math, but how does DTT help uncover the meaning? Like if you wanted to program a computer to read informal math. That’s a guess at why they find DTT unsatisfactory in these examples.

Posted by: Matt Oliveri on May 13, 2020 2:17 PM

Re: In Further Praise of Dependent Types

I think maybe Ranta is after something more demanding than I am, a translation algorithm between DTT and informal mathematics. It certainly won’t be one-to-one:

An early implementation of sugaring … found 1128 variants of a donkey sentence with the same structure as the axiom of parallels.

To give an idea, elsewhere he gives nine natural language sugarings of $\sum_{x: man} \sum_{y: donkey} (x\; owns\; y)$ by applying three different operations twice:

  • there is a man and there is a donkey and he owns it,
  • there is a man and he owns a donkey,
  • there is a man and there is a donkey that he owns,
  • there is a donkey and a man owns it,
  • a man owns a donkey,
  • there is a donkey that a man owns,
  • there is a man such that there is a donkey and he owns it,
  • there is a man who owns a donkey,
  • there is a man such that there is a donkey that he owns.

I guess the problem he raises about

If a point $A$ lies outside a line $a$, $A$ determines a parallel to $a$,

is that even with such multiple sugarings, we don’t reach this from a type. But we could just say so much the worse for that formulation. Isn’t it more properly expressed as

Given a point $A$ lying outside a line $a$, then $A$ determines a parallel to $a$,

a sugaring of something like

$A: Point,\ a: Line,\ p: Outside(A, a) \vdash f(A, a, p): DAP(A, a)$?
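
In Lean 4 terms this last judgment is routine (a sketch with assumed names):

    axiom Point : Type
    axiom Line : Type
    axiom Outside : Point → Line → Prop
    axiom DAP : Point → Line → Prop

    -- f witnesses the axiom of parallels as a dependent function.
    axiom f : ∀ (A : Point) (a : Line), Outside A a → DAP A a

    -- A : Point, a : Line, p : Outside A a ⊢ f A a p : DAP A a;
    -- A and a remain available for later reference.
    example (A : Point) (a : Line) (p : Outside A a) : DAP A a := f A a p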

Posted by: David Corfield on May 14, 2020 8:17 AM

Re: In Further Praise of Dependent Types

Isn’t it more properly expressed as

Given a point $A$ lying outside a line $a$, then $A$ determines a parallel to $a$,

They both sound fine to me, and equivalent.

Here is a version that avoids giving names to proofs:

$A: Point,\ a: Line \vdash Outside(A, a) \to DAP(A, a)\; true$

or

$\forall (A: Point)\, (a: Line),\ Outside(A, a) \to DAP(A, a)$

So the “main thing” is arguably an implication, with quantifiers being an “obvious” technical necessity.

Posted by: Matt Oliveri on May 14, 2020 9:31 PM

etymology of sugaring?

I am intrigued by the term `sugaring’: my first guess was that it must be related to Mary Poppins’ spoonful of sugar (that helps the medicine go down) but then I found

syntactic sugar

Nevertheless I think it is interestingly related to the notion of `exuberance’ in the theory of translation, used by linguists and philologists, in particular

Alton L Becker

(connected with problems of translations between natural languages with very different categorizations of the universe of discourse, or with subject/object vs topic/comment structures).

Posted by: jackjohnson on May 15, 2020 3:05 PM
