Skip to content

Commit

Permalink
editing pass; too many changes to document, diff is best way to get a…
Browse files Browse the repository at this point in the history
…n idea.
  • Loading branch information
JacquesCarette committed Sep 4, 2024
1 parent 9fa502f commit 6c1e23f
Showing 1 changed file with 37 additions and 36 deletions.
73 changes: 37 additions & 36 deletions paper/paper.md
Original file line number Diff line number Diff line change
Expand Up @@ -14,7 +14,7 @@ authors:
orcid: 0000-0000-0000-0000
affiliation: 2
- name: Carette, Jacques
orcid: 0000-0000-0000-0000
orcid: 0000-0001-8993-9804
affiliation: 3
- name: McKinna, James
orcid: 0000-0001-6745-2560
Expand Down Expand Up @@ -48,20 +48,21 @@ Unlike the standard libraries of traditional programming languages, the Agda sta

# Statement of need

Most programming languages include a "standard" library offering a basic set of algorithms, data structures, and operating system operations.
Most programming languages include a "standard" library offering a basic set of algorithms, data structures, and operating system procedures.
However, there are two reasons why a standard library is particularly important in Agda compared to traditional programming languages.

First, like other theorem provers, the Agda language provides only a minimal core set of primitives from which programs can be constructed.
As a result, many concepts traditionally considered part of a language must be defined within the program itself.
While this approach reduces compiler complexity and enhances its reliability, it also means that users have access to fewer built-in definitions initially.
For example, in a fresh Agda environment, there is no predefined notion of an integer or a string, let alone more complex data structures such as arrays or maps.
This approach reduces compiler complexity and enhances its reliability, and also shows the strength of the core language
itself as it can indeed push these concepts out to the library.
For example, in a fresh Agda environment, there is no predefined notion of an integer or a string, let alone more complex data structures such as arrays or maps. Thus the crucial need for a standard library.

This comment has been minimized.

Copy link
@MatthewDaggitt

MatthewDaggitt Sep 5, 2024

Contributor

Has this edit has lost the emphasis of why the Agda standard library is more crucial than other standard libraries? I was trying to talk about the (small) downside of this?

This comment has been minimized.

Copy link
@JacquesCarette

JacquesCarette Sep 5, 2024

Author Contributor

I think Lean has few built-ins too. I forget exactly how it goes in Coq. So I don't think that particular point was strong.

This comment has been minimized.

Copy link
@MatthewDaggitt

MatthewDaggitt Sep 5, 2024

Contributor

Yes, I think so. Okay maybe we should just drop this paragraph altogether given @jamesmckinna 's comments on Zulip.

This comment has been minimized.

Copy link
@jamesmckinna

jamesmckinna Sep 5, 2024

Contributor

Actually, I liked the original point, but thought we should be more careful about String. But if I think about it a bit more, how many 'real'/conventional data types are not backed by Agda.Builtins (even Float!)... maybe it's more noteworthy for the non-specialist that Agda takes nat over int, for all the 'obvious' reasons...?

This comment has been minimized.

Copy link
@JacquesCarette

JacquesCarette Sep 9, 2024

Author Contributor

I agree about nat over int, but that's something for the long design paper.


Second, Agda users often seek to prove that programs constructed using data types from the standard library are "correct." Therefore, the standard library must provide not only the operations for these data types but also proofs of their correctness (e.g., proving that integer addition is commutative or that string concatenation is associative).
Without the Agda standard library, something as simple as defining a function to reverse a string and proving that it preserves the length of the string would require hundreds of lines of code.
Second, Agda users often seek to prove that programs constructed using data types from the standard library are "correct."
Therefore, the standard library needs to provide all the necessary building blocks: not just operations for these data types but also proofs of their basic properties (e.g., that integer addition is commutative or string concatenation is associative). Starting from just the language, something as simple as defining a string reversing function and proving that it preserves the length of the string would require hundreds of lines of code.

This comment has been minimized.

Copy link
@MatthewDaggitt

MatthewDaggitt Sep 5, 2024

Contributor

Very minor but I don't think "string reversing function" is grammatical? Maybe "string-reversing function" is but was "function to reverse a string" really that bad?

This comment has been minimized.

Copy link
@JacquesCarette

JacquesCarette Sep 5, 2024

Author Contributor

"function to reverse a string" is not that bad on its own, but since many of the sentences are themselves fairly verbose, any 'lightening up' seemed worth it (to me). I can't say I'm strong enough on English grammar to say for sure that "string reversing function" is/isn't grammatical... but sounds fine to my hears!

This comment has been minimized.

Copy link
@jamesmckinna

jamesmckinna Sep 5, 2024

Contributor

I'd go for the hyphenated version, if we think that shortening is worth it. I've tried to do some sentence-shortening (a personal weakness of mine during revision) but maybe we should agree the substantive content ahead of such things?


# Impact

The Agda standard library has been used in a wide range of projects, too numerous to list exhaustively.
The Agda standard library (hereafter: `agda-stdlib`) has been used in a wide range of projects, too numerous to list exhaustively.
A diverse selection of such projects, not intended as endorsements over any others, includes:

- Formalisation of category theory [@hu2021categories]
Expand All @@ -74,58 +75,58 @@ A diverse selection of such projects, not intended as endorsements over any othe

- Verification of routing protocols [@daggitt2023routing]

As one of the largest existing Agda libraries, the standard library has had a synergistic relationship with the development of Agda itself, prompting the implementation of several new language features.
For example, the standard library is designed to be compatible with several different compiler options, including `--cubical` and `--safe`.
To enable this, in 2019 Agda categorised all language options into two categories of ''infective'' and ''coinfective'', allowing the library to precisely partition code that can be used under certain flag combinations.
This categorisation enables the library to integrate safe code natively defined in Agda with code that uses unsafe operating system calls, while maintaining the safety guarantees of the former.
`agda-stdlib` has had a synergistic relationship with the development of Agda itself, prompting the implementation of several new language features.

This comment has been minimized.

Copy link
@MatthewDaggitt

MatthewDaggitt Sep 5, 2024

Contributor

Starting a sentence with a quoted noun is a little aesthetically unpleasing?

This comment has been minimized.

Copy link
@JacquesCarette

JacquesCarette Sep 5, 2024

Author Contributor

I agree, and I hesitated. And yet the sentence, as constructed, really says the right thing. So I went with it.

This comment has been minimized.

Copy link
@jamesmckinna

jamesmckinna Sep 5, 2024

Contributor

(Tried to do this a GitHub suggestion, but it's come out as a quote instead?)

The development of `agda-stdlib` has had a synergistic relationship with that of Agda itself, prompting the implementation of several new language features.
For example, `agda-stdlib` is designed to be compatible with several different compiler options, including `--cubical` and `--safe`.
To enable this, in 2019 Agda categorised all language options into two categories of ''infective'' and ''coinfective'', allowing any library to precisely partition code that can be used under certain flag combinations.
This categorisation enables libraries to integrate safe Agda code with code that uses unsafe operating system calls, while maintaining the safety guarantees of the former.

Additionally, the library has directly influenced the language by requesting the ability to attach custom messages to definitions, which are then displayed by the compiler when the definitions are used, enabling the implementation of deprecation warnings.
Additionally, the development needs of `agda-stdlib` has directly influenced the language by requesting the ability to attach custom messages to definitions, which are then displayed by the compiler when the definitions are used, enabling the implementation of deprecation warnings. This lets ends users more easily evolve their code along with the evolution of `agda-stdlib`.

This comment has been minimized.

Copy link
@MatthewDaggitt

MatthewDaggitt Sep 5, 2024

Contributor

Should be have not has now that we've used the plural "development needs"?

This comment has been minimized.

Copy link
@JacquesCarette

JacquesCarette Sep 5, 2024

Author Contributor

Yes please.


# Design

Designing a standard library for an ITP such as Agda presents several challenges.

Firstly, as discussed, the standard library contains many of the foundational mathematical results used to prove program correctness.
Even though the library currently focuses on discrete mathematics - reflecting the bias in its user base towards programming language theory - organising this material into a coherent and logical structure is extremely challenging [@carette2020leveraging].
There is a constant tension between being as general as possible (e.g., defining operations over general algebraic structures) and providing clear, straightforward, and intuitive definitions (e.g., defining operations concretely over integers).
Additionally, there is a persistent temptation to introduce new representations of existing mathematical objects that are easier to work with for a particular problem, which comes at the cost of duplicating the theory for the new representation.
Theorem provers like Isabelle [@paulson1994isabelle] and Coq [@coq2024manual] approach these problems by having very minimal standard libraries and encouraging the use of external libraries developed by the community, which reduces the emphasis on ensuring the existence of canonical definitions for certain concepts.
On the other hand, MathLib [@van2020maintaining] for Lean aims to provide a repository of canonical definitions.
The design of the Agda standard library leans more towards the Lean approach, with a high bar set for adding alternative formalisations of the same result.
Firstly, as discussed, `agda-stdlib` contains much of the foundational mathematics used to prove program correctness.
While the focus on discrete mathematics and algebra reflects the bias in its user base towards programming language theory, organising this material into a coherent and logical structure is extremely challenging [@carette2020leveraging].
There is constant tension between being as general as possible (e.g., defining operations over general algebraic structures) and providing clear, straightforward, and intuitive definitions (e.g., defining operations concretely over integers).
Additionally, there is a persistent temptation to introduce new representations of existing mathematical objects that are easier to work with for a particular application, which comes at the cost of duplicating the theory for the new representation.
Theorem provers like Isabelle [@paulson1994isabelle] and Coq [@coq2024manual] approach these problems by having very minimal standard libraries and encouraging the use of external libraries developed by the community, which reduces the emphasis on ensuring the existence of canonical definitions for certain concepts, at the cost of lack of interoperability between variabous packages.
On the other hand, like `agda-stdlib`, MathLib [@van2020maintaining] for Lean aims to provide a repository of canonical definitions.

A second challenge is that Agda was the first major ITP to fully embrace dependently-typed programming as the default.
With the exception of Idris, a more recent entrant to the field [@brady2013idris], other major theorem provers either do not support dependent types or encourage them only to be used sparingly.
In contrast, nearly everything in the Agda standard library makes use of dependent types, with proofs consisting of evidence-bearing terms of the relevant dependent types.
With the exception of Idris, a more recent entrant to the field [@brady2013idris], other major theorem provers either do not support dependent types or encourage spare usage.
In contrast, nearly everything in `agda-stdlib` makes use of dependent types, with correctness-related invariants being closely integrated with definitions.
Furthermore most proofs consist of evidence-bearing terms for the relevant types, rather than being "irrelevant".
As a result, the library provides relatively sophisticated features like polymorphic n-ary functions [@allais2019generic], regular expressions which provide proof of membership when compiled and applied, and proof-carrying `All` and `Any` predicates for containers [citation?].
While this provides powerful tools for users, learning how to design such a large-scale, dependently-typed library is an on-going journey.
While this provides powerful tools for users, learning how to design such a large-scale, dependently-typed library is an on-going journey. `agda-stdlib` is the first library to tackle this challenge.

This comment has been minimized.

Copy link
@jamesmckinna

jamesmckinna Sep 5, 2024

Contributor

'ongoing' is fine!

Relatedly, the standard library has been used as a test bed for the design of the Agda language itself, as evidenced by the library's inclusion of three different notions of co-inductive data types.

Agda’s unique support for dependently-parameterized modules has also significantly influenced the library’s design.
Although type classes are a common mechanism for creating interfaces and overloading syntax in other functional languages such as Haskell [@haskell2010], the Agda standard library has so far found little need to use them extensively.
While Agda supports a very general form of type classes via instance search, the ability to use qualified, parameterized modules as first-class objects appears to reduce their necessity compared to other functional languages.
Although type classes are a common mechanism for creating interfaces and overloading syntax in other functional languages such as Haskell [@haskell2010], and other ITPs like Coq and Lean's MathLib use then extensively as a core feature of their design, the Agda standard library has so far found little need to use them much.

This comment has been minimized.

Copy link
@jamesmckinna

jamesmckinna Sep 5, 2024

Contributor

'them' not 'then'

This comment has been minimized.

Copy link
@jamesmckinna

jamesmckinna Sep 5, 2024

Contributor

As for the content+intention behind the sentence... we surely have users such as Conal Elliott who would much rather be able to make heavy use of haskell-style idiomatics, but various things seem to frustrate that (mostly to do with instance inference, I think?), rather than stdlib design decisions per se... are we defending stdlib design decisions on their own feet, or hinting that (design/implementation decisions in) Agda has led to particular choices for the library? I realise the latter point is 'self-evident', but equally, we should try to stay even-handed about any such experimental/instrumental bias in our thinking?

This comment has been minimized.

Copy link
@jamesmckinna

jamesmckinna Sep 5, 2024

Contributor

Also, in the last clause: 'the Agda standard library has'->'agda-stdlib has', but I wonder even if it would be better to say out loud 'the developers of agda-stdlib have'?

This comment has been minimized.

Copy link
@JacquesCarette

JacquesCarette Sep 9, 2024

Author Contributor

I like this 'out loud' suggestion, will implement. As to the previous comment, I think it's "too deep" for this shallow paper.

While Agda supports a very general form of instance search, the ability to use qualified, parameterized modules as first-class objects appears to reduce their necessity compared to the languages mentioned above.

This comment has been minimized.

Copy link
@MatthewDaggitt

MatthewDaggitt Sep 5, 2024

Contributor

"little need to use them much" doesn't read very well. Doubling up of "little" and "much".

This comment has been minimized.

Copy link
@JacquesCarette

JacquesCarette Sep 5, 2024

Author Contributor

Completely agree! Suggestions? For this one, just go ahead and edit the whole sentence. I think 'extensively' was going overboard, but I kind of broke the sentence in trying to fix that.

This comment has been minimized.

Copy link
@jamesmckinna

jamesmckinna Sep 5, 2024

Contributor

Eh? My comments above now seem to mean these earlier ones of yours are unmoored form their original references... sorry about that!

Additionally, module parameters enable the safe and scalable embedding of non-constructive mathematics into a constructive system.
Since Agda is entirely constructive, the vast majority of the standard library is also constructive.
However, the library provides the option to perform non-constructive, classical reasoning, such as the law of excluded middle, by passing the relevant axioms as module parameters.
This allows users to write such code without directly having to postulate the non-constructive axioms, which would prevent them from using the code with the `--safe` compiler flag.
Since Agda is entirely constructive, the vast majority of `agda-stdlib` is also constructive.
Non-constructive classical reasoning can be achieved by passing the relevant axioms as module parameters.

This comment has been minimized.

Copy link
@MatthewDaggitt

MatthewDaggitt Sep 5, 2024

Contributor

Necessary comma has been accidentally removed after "non-constructive"

This comment has been minimized.

Copy link
@JacquesCarette

JacquesCarette Sep 5, 2024

Author Contributor

Go ahead and re-insert.

This comment has been minimized.

Copy link
@jamesmckinna

jamesmckinna Sep 9, 2024

Contributor

Plus some additional bridge text.

This enables users to write provably 'safe' non-constructive code, i.e. with having to *postulate* non-constructive axioms.

This comment has been minimized.

Copy link
@MatthewDaggitt

MatthewDaggitt Sep 5, 2024

Contributor

Did you mean "without" rather than "with"?

This comment has been minimized.

Copy link
@JacquesCarette

JacquesCarette Sep 5, 2024

Author Contributor

Good catch!


# Testing

One of the advantages of creating a standard library for an ITP is that proving the correctness of the defined operations is an integral part of the library itself.
As a result, there is no need for test suites to verify functional correctness.
However, the library’s test suite does address two critical areas.
The first area is the foreign function interface with the underlying operating system (e.g., reading from the command line, file access, timers).
Since the correctness of the bindings to external libraries or the underlying OS' primitives cannot be reasoned about in Agda itself, these operations are included in the test suite.
One of the advantages of ITPs is that correctness proofs are regarding as an integral part of creating a collection of structures and operations.

This comment has been minimized.

Copy link
@MatthewDaggitt

MatthewDaggitt Sep 5, 2024

Contributor

Should be "regarded" not "regarding". I also don't think that the "creating a collection of structures and operations" is as relevant as what was there before.

This comment has been minimized.

Copy link
@JacquesCarette

JacquesCarette Sep 5, 2024

Author Contributor

Agree on the grammar.

On the second part: I guess we're not quite agreeing on what point ought to be made here is. If we figure that out, we'll figure out which sentence to put there.

This comment has been minimized.

Copy link
@MatthewDaggitt

MatthewDaggitt Sep 5, 2024

Contributor

I guess my question is what's the difference to you between "a library" and "a collection of structures and operations"?

In my mind you could imagine a library with no structures at all... (okay, not a standard library but a library). I'm also not sure our readers are going to know what we mean by "structures" where as they will know what a "library" is.

This comment has been minimized.

Copy link
@jamesmckinna

jamesmckinna Sep 5, 2024

Contributor

Still: should we spell out the mechanics of 'structure' and 'bundle' as we use them, cf. 'Pebble-style' etc.?
(and, my own view, that we still haven't quite got the whole thing right... yet!?)

This comment has been minimized.

Copy link
@JacquesCarette

JacquesCarette Sep 9, 2024

Author Contributor

Rethinking about it: for the purposes of this short paper, "operations" will resonate more with the reader than "a collection of structures and operations". This needs more space to handle properly - so I've brought that wording back.

Spelling things out definitely belongs in the long paper.

Thus there is no need for test suites to verify functional correctness.
However the library’s test suite does address two critical areas.
First is the foreign function interface with the underlying operating system (e.g., reading from the command line, file access, timers).
Correctness of bindings to an external library or the underlying OS' primitives cannot be reasoned about in Agda itself, these operations are tested externally, i.e. in a test suite.
The second area is performance.
The performance of a program cannot be analysed within Agda, making it necessary to include performance tests.
Although the library currently includes a few performance tests, this has not so far been a major priority for the community, and remains an area in need of further work.
Performance also cannot be analysed internally, making it necessary to include performance tests.
This part of the test suite is sparser, as this has not yet been a major priority for the community.

# Notable achievements in version 2.0

This short paper outlines the state of version 2.0 of the Agda standard library, in which we believe we have successfully addressed some of the significant design challenges present in version 1.0. Key improvements include:
We outline the state of version 2.0 of `agda-stdlib`, where we believe we have successfully addressed some of the significant design challenges present in versions 1.0-1.7. Key improvements include:

- Minimized Dependency Graphs: We have reduced the depth of dependency graphs within the library, ensuring that the most commonly used modules rely on fewer parts of the library. This change has resulted in significantly faster load times for users during interactive development.

- Standardisation of Mathematical Object Construction: We have standardised the construction of mathematical objects such as groups, rings, orders, equivalences, etc., from their sub-objects, enhancing consistency and usability.
- Standardisation: We have standardised the construction of mathematical objects such as groups, rings, orders, equivalences, etc., from their sub-objects, enhancing consistency and usability. We have also worked on standardizing morphisms of such objects.

- Introduction of a Tactics Library: We’ve introduced a small but growing tactics library. Experiments have shown that these tactics are currently slower than those in comparable systems, indicating a potential area for future improvements in Agda itself.

Expand Down

1 comment on commit 6c1e23f

@jamesmckinna
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Have propagated many of the corrections/rephrasings discussed above in my latest commit.

Please sign in to comment.