
feat: Create Asymptotics/GrowthRates#468

Open
Timeroot wants to merge 12 commits into leanprover:main from Timeroot:growthrates_1

Conversation

@Timeroot

Timeroot commented Apr 3, 2026

I'll copy the docstring from the file, because I think it explains pretty well what the goal here is. I'm trying to PR this into CSLib from CircuitComp.

Asymptotic Growth Rates

Named Growth Rates

This file collects common "growth rates" that show up in complexity theory. While
IsBigO expresses growth rate up to a multiplicative constant, there are other important
classes not directly expressible as IsBigO. In rough order of literature frequency:

  • GrowthRate.poly: Polynomial growth, typically written poly(n) or n ^ O(1).
  • GrowthRate.polylog: (log n)^k, that is, a polynomial in the logarithm.
  • GrowthRate.exp: Exponential growth with any rate, often written exp(O(n)).
  • GrowthRate.sublinear: Sublinear growth, typically written o(n).
  • GrowthRate.quasilinear: Growth as O(n * (log n)^k)
  • GrowthRate.quasipoly: Growth as O(2 ^ (log n)^k)
  • GrowthRate.primitiveRecursive: Growth as a primitive recursive function.
  • GrowthRate.computable: Any computable function. This excludes, for instance, the Busy
    Beaver function.

These are all given as a GrowthRate := Set (ℕ → ℕ). We have GrowthRate.bigO as a thin wrapper
around Asymptotics.IsBigO, likewise for littleO.
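
As a rough illustration of the "thin wrapper" idea, here is a minimal Lean sketch, assuming Lean 4 with Mathlib. The exact definitions in the PR may differ: the namespace `GrowthSketch`, the cast to `ℝ`, and the use of `Filter.atTop` are assumptions made for this example. It also shows why `poly` is not a single `IsBigO` class, since it quantifies over the exponent.

```lean
import Mathlib

namespace GrowthSketch

/-- A growth rate is a set of functions `ℕ → ℕ`, as in the PR. -/
abbrev GrowthRate := Set (ℕ → ℕ)

/-- Assumed shape of `bigO`: `f` is `O(g)` along `atTop` after casting to `ℝ`. -/
def bigO (g : ℕ → ℕ) : GrowthRate :=
  {f | Asymptotics.IsBigO Filter.atTop (fun n ↦ (f n : ℝ)) (fun n ↦ (g n : ℝ))}

/-- Assumed shape of `littleO`: the same wrapper around `IsLittleO`. -/
def littleO (g : ℕ → ℕ) : GrowthRate :=
  {f | Asymptotics.IsLittleO Filter.atTop (fun n ↦ (f n : ℝ)) (fun n ↦ (g n : ℝ))}

/-- Assumed shape of `poly`: `O(n ^ k)` for some `k`, a union of big-O classes. -/
def poly : GrowthRate :=
  {f | ∃ k : ℕ, f ∈ bigO (fun n ↦ n ^ k)}

/-- Every function is big-O of itself. -/
example (f : ℕ → ℕ) : f ∈ bigO f :=
  Asymptotics.isBigO_refl _ _

end GrowthSketch
```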

We also provide aliases for some of the more common Big-O classes, in order to work
with them more cleanly.

  • GrowthRate.const: O(1)
  • GrowthRate.log: O(log n)
  • GrowthRate.sqrt: O(sqrt n)
  • GrowthRate.linear: O(n)
  • GrowthRate.linearithmic: O(n * log n)
  • GrowthRate.two_pow: O(2 ^ n)
  • GrowthRate.e_pow: O(Real.exp n)

Where they involve functions with different definitions on
distinct types (e.g. Nat.sqrt vs. Real.sqrt, or (2 : ℕ) ^ · vs. (2 : ℝ) ^ ·), we
want to have both forms.

Since all of these rates are Sets, their ordering of "faster growth" is given by the
subset relation ⊆. That is, where you might want to write f ≤ g where f and g
are growth rates, this is best written as f ⊆ g.
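
A small sketch of what the subset ordering looks like in practice, assuming Lean 4 with Mathlib. The namespace `OrderSketch`, the local `bigO` definition, and the lemma name `bigO_subset_bigO` are hypothetical and only serve this example; the PR's own statements may be phrased differently.

```lean
import Mathlib

namespace OrderSketch

abbrev GrowthRate := Set (ℕ → ℕ)

/-- Assumed definition: functions that are `O(g)` after casting to `ℝ`. -/
def bigO (g : ℕ → ℕ) : GrowthRate :=
  {f | Asymptotics.IsBigO Filter.atTop (fun n ↦ (f n : ℝ)) (fun n ↦ (g n : ℝ))}

/-- If `g` is `O(h)`, then the class `bigO g` sits below `bigO h` in the `⊆` order. -/
theorem bigO_subset_bigO {g h : ℕ → ℕ}
    (hgh : Asymptotics.IsBigO Filter.atTop (fun n ↦ (g n : ℝ)) (fun n ↦ (h n : ℝ))) :
    bigO g ⊆ bigO h := by
  intro f hf
  -- `hf` unfolds to an `IsBigO` statement, so transitivity applies.
  exact Asymptotics.IsBigO.trans hf hgh

end OrderSketch
```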

Lawful Growth Rates

We call a GrowthRate lawful if it is closed under dominating sequences, addition, and
composition with a sublinear function; and is nontrivial (it contains at least one function
besides zero).

This last condition is equivalent to containing the constant function 1; or, containing any
two distinct functions. These conditions are enough to get most desirable properties, and all
of the above have LawfulGrowthRate instances. This allows reusable proofs for many common properties,
such as invariance under affine scaling.
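
To give a feel for how such a class could be packaged, here is a simplified Lean sketch, assuming Lean 4 with Mathlib. It is not the PR's definition: the field names and the exact form of each condition are guesses, "dominating sequences" is read here as eventual pointwise domination, and the composition axiom is left out because its precise statement is not given above.

```lean
import Mathlib

namespace LawfulSketch

abbrev GrowthRate := Set (ℕ → ℕ)

/-- Hypothetical, simplified version of the lawfulness conditions. -/
class LawfulGrowthRate (S : GrowthRate) : Prop where
  /-- If `g` is eventually bounded by some `f ∈ S`, then `g ∈ S`. -/
  mem_dominating {f g : ℕ → ℕ} :
    Filter.Eventually (fun n ↦ g n ≤ f n) Filter.atTop → f ∈ S → g ∈ S
  /-- Closed under pointwise addition. -/
  mem_add {f g : ℕ → ℕ} : f ∈ S → g ∈ S → (f + g) ∈ S
  /-- Nontriviality, stated as containing the constant function `1`. -/
  one_mem : (fun _ ↦ 1 : ℕ → ℕ) ∈ S

/-- The set of all functions trivially satisfies every condition. -/
instance : LawfulGrowthRate (Set.univ : GrowthRate) where
  mem_dominating _ _ := Set.mem_univ _
  mem_add _ _ := Set.mem_univ _
  one_mem := Set.mem_univ _

end LawfulSketch
```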

Main Theorems

Most theorems in this file fall into one of three categories:

  • Equivalent definitions. Sometimes it's more convenient to have expressions as ℕ → ℕ,
    sometimes it's more convenient to work with real numbers (e.g. for e ^ n, or for different
    bases of logarithms). For instance, GrowthRate.log_iff_rlog relates GrowthRate.log to the
    real function Real.log, instead of its definition in terms of Nat.log 2.
  • Ordering. For instance, GrowthRate.exp_ssubset_primitiveRecursive shows that exp is a strict
    subset of primitiveRecursive.
  • Closure properties. For instance, GrowthRate.linear_comp says that any LawfulGrowthRate is
    closed under composition with any function f ∈ GrowthRate.linear. GrowthRate.poly_comp says
    that GrowthRate.poly is closed under composition. And GrowthRate.exp_mul says that
    GrowthRate.exp is closed under multiplication.
/-- A **Growth rate** is just any collection of `ℕ → ℕ`, but as a type alias intended for
discussing how quickly certain classes of functions grow, as is often needed in asymptotic runtime
or memory analysis in computational complexity theory. A `LawfulGrowthRate` instance puts
constraints on this set behaving well in various ways.
-/
abbrev GrowthRate := Set (ℕ → ℕ)

I realize this is a big file at the moment. I've already done a lot of cleanup (many of the proofs, but not the majority, were from Aristotle), but I'll admit there's still plenty of room for improvement. It's doing a lot of things, and a lot of them are messy: converting between Real logs and powers, Ints, and Nats; different bases and different powers; and showing these things are all equivalent. The hope is that when these inevitably come up in other asymptotic analysis, it's a bit easier; and it gives a "canonical spelling" for certain classes of growth rates, in a way that is currently lacking.

I do think it's probably better to split the file, and I'm hoping for input as to where. I'm also especially looking for feedback on the choice of LawfulGrowthRate axioms; I could reasonably see having more or fewer.

chenson2018 self-assigned this Apr 7, 2026
@sorrachai
Collaborator

I would recommend adding additional common classes (nearly linear and almost linear time have been standard terms in TCS).

GrowthRate.subpolynomial: Growth as O(n^{o(1)}) (e.g., 2^{\sqrt{\log n}})
GrowthRate.nearly_linear: Growth as O(n * (log n)^k)
GrowthRate.almost_linear: Growth as O(n * n^{o(1)})

I would reserve quasi-linear for O(n * alpha(n)), where alpha(n) is the inverse Ackermann function.

@ctchou
Collaborator

ctchou commented May 28, 2026

There is also the iterated logarithm:
https://en.wikipedia.org/wiki/Iterated_logarithm

@Timeroot
Author

I agree about subpolynomial certainly.

The Foo-linear terminology seems like a mess in the literature!

Here's a source for "quasilinear" meaning O(n * log^k(n)): https://en.wikipedia.org/wiki/Time_complexity#Quasilinear_time
The other first two results I find for "quasilinear" agree as well, https://link.springer.com/chapter/10.1007/3-540-57785-8_134 and https://cse.buffalo.edu/~regan/papers/pdf/NRS95.pdf

I'm not sure about nearly_linear and almost_linear being particularly standard. When I google them, the first results I get are
https://arxiv.org/abs/2011.05365
https://arxiv.org/abs/1304.2338
which (respectively) use the words "nearly-linear time" and "almost-linear-time" to refer to the same O(n ^ {1 + o(1)}) behavior. https://arxiv.org/abs/2407.10830 uses "almost-linear time" this way as well.

https://cstheory.stackexchange.com/questions/39584/examples-of-quasilinear-vs-essentially-linear-time-translatable-models uses quasilinear to mean O(n * log^k(n)), and "essentially linear" to mean O(n ^ {1 + o(1)}).
https://www.sciencedirect.com/science/article/pii/S0196677404000732 and https://cr.yp.to/papers/powers-ams.pdf seem to use "essentially linear" for this as well.

https://cstheory.stackexchange.com/a/36697 suggests O(n ^ (1 + epsilon)) -- or more precisely, the intersection of these for all epsilon > 0 -- to mean "near-linear", but doesn't seem to cite any literature uses of this form.

It's worth noting that "quasilinear" has an unrelated meaning coming from convex analysis, and this notion is actually in Mathlib. It has a separate meaning in economics as well (a function in two variables which is linear in one variable but not the other).

I tried looking up "pseudolinear" to see if that was a thing, it seems not to be.

Based on this, it seems like quasilinear = O(n * log^k(n)) is quite standard for a growth rate, and we should keep that. All of "nearly linear", "almost linear", and "essentially linear" seem to typically imply O(n ^ {1 + o(1)}). I think using any of those three terms would be fine, and I couldn't find a clear winner. I couldn't find any standard term for O(n * alpha(n)) or O(n ^ (1 + epsilon)). I'm not convinced we need one for the former (I want to add a Mul instance on GrowthRate that makes it mostly easy), and for the latter maybe I would suggest "inf_linear"?
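
To make the three candidate notions concrete, here is a minimal Lean sketch, assuming Lean 4 with Mathlib. The namespace `LinearSketch`, the definition names, and the particular way each class is formalized are assumptions for illustration only, not the definitions in the PR. In particular, `n ^ {1 + o(1)}` is modelled with an explicit exponent function `ε` tending to `0`.

```lean
import Mathlib

namespace LinearSketch

abbrev GrowthRate := Set (ℕ → ℕ)

/-- `O(n * (log n) ^ k)` for some `k`. -/
def quasilinear : GrowthRate :=
  {f | ∃ k : ℕ, Asymptotics.IsBigO Filter.atTop
    (fun n ↦ (f n : ℝ)) (fun n : ℕ ↦ (n : ℝ) * Real.log (n : ℝ) ^ k)}

/-- `O(n ^ (1 + o(1)))`: the exponent is `1 + ε n` for some `ε` tending to `0`. -/
def almostLinear : GrowthRate :=
  {f | ∃ ε : ℕ → ℝ, Filter.Tendsto ε Filter.atTop (nhds 0) ∧
    Asymptotics.IsBigO Filter.atTop
      (fun n ↦ (f n : ℝ)) (fun n : ℕ ↦ (n : ℝ) ^ (1 + ε n))}

/-- The intersection of `O(n ^ (1 + ε))` over all fixed `ε > 0`. -/
def infLinear : GrowthRate :=
  {f | ∀ ε : ℝ, 0 < ε → Asymptotics.IsBigO Filter.atTop
    (fun n ↦ (f n : ℝ)) (fun n : ℕ ↦ (n : ℝ) ^ (1 + ε))}

end LinearSketch
```

The structural difference is in the quantifiers: `quasilinear` and `almostLinear` each ask for a single witness (`k` or `ε`), while `infLinear` demands a bound for every fixed `ε`, with a constant that may depend on `ε`.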

@Timeroot
Author

For now, I'll look at adding GrowthRate.inverseAckermann and GrowthRate.iteratedLog, and I'll write some Add and Mul instances to show how that can work.

I'll start on splitting the files.

@sorrachai
Collaborator

sorrachai commented May 28, 2026

The term "nearly linear" vs. "almost linear" has been standard for the last 10 years (you can check any algorithm conference from the past 10 years, and you will find this term with high probability). In your reference, $\tilde O(n)$ refers to nearly linear time. In your referred paper, they use tilde to omit the polylogs factor, whereas they list the subpolynomial factor explicitly in $\tilde O(n^{1+o(1)})$ which refers to almost linear time. If you look closely at that paper, they define the tilde notation to hide nearly linear time. You don't see quasi-linear terms in recent algorithm papers anymore (in the last 10 years).

[Attached image: Screenshot 2026-05-28 at 20 31 10]

You can sample a few more if you want to get the most robust statistics.

@Timeroot
Author

Timeroot commented May 28, 2026

Now I'm confused. Surely \tilde{O}(n ^{1 + o(1)}) is the same as O(n^{1 + o(1)}), and would be (under your definition) "almost-linear time", and not "nearly-linear time"? And the title of the paper is "nearly-linear time", and their main result is that the algorithm runs in time \tilde{O}(n ^{1 + o(1)}).

I liked your suggestion of just looking directly at recent conferences. (I wasn't aware that this might be a recent, 10-year shift in terminology, so especially thanks for that tip!) I took a quick survey of papers accepted to STOC 2025 with some variant of 'linear' in their title or abstract, to get statistics:

FOCS 2025:

I guess that seems pretty compelling. :)

So I'll change quasilinear to nearLinear, and add almostLinear. I'm using the underscore-free version of the name because these are defs (Sets of functions) and I want to follow that convention.


4 participants