From f259d225c5352692980b3c1d8093bb888b396af7 Mon Sep 17 00:00:00 2001 From: Bruce Collie Date: Thu, 22 Feb 2024 17:25:24 +0000 Subject: [PATCH] Add attribute-based overloading (#3985) This PR is the next step towards the changes suggested in https://github.com/runtimeverification/k/issues/3966; having added a new syntax for the `symbol(_)` attribute in #3978, we can now address the second use of `klabel(_)` (marking overloaded sets of symbols). The changes in this PR are as follows: * Add a new attribute `overload(_)` that behaves identically to the current semantics of `klabel(_)` with respect to the construction of the overload partial ordering, but that does not participate in the _naming_ of symbols. * Add checks for the new attribute: * Disallow `klabel(_)` and `overload(_)` on the same production; overload sets must be converted fully to use the new mechanism. * Warn when an overload set of only one production is generated. * Document the semantics of overloading in the user manual based on our conversations in #3966. * Add a new integration test that checks a simple example of overloading works as expected to disambiguate parses. Once this PR is merged, we can begin the long process of deprecating `klabel(_)` and `symbol`! --------- Co-authored-by: rv-jenkins --- docs/user_manual.md | 54 +++++++++++- .../checkWarns/singletonOverload.k | 11 +++ .../checkWarns/singletonOverload.k.out | 11 +++ .../regression-new/checks/overloadKLabel.k | 6 ++ .../checks/overloadKLabel.k.out | 6 ++ .../tests/regression-new/overload/Makefile | 6 ++ .../tests/regression-new/overload/exp.test | 1 + .../regression-new/overload/exp.test.out | 3 + .../tests/regression-new/overload/lval.test | 1 + .../regression-new/overload/lval.test.out | 3 + .../tests/regression-new/overload/test.k | 13 +++ .../kframework/compile/checks/CheckAtt.java | 9 ++ .../java/org/kframework/kompile/Kompile.java | 23 +++++ .../utils/errorsystem/KException.java | 1 + .../scala/org/kframework/attributes/Att.scala | 1 + .../org/kframework/definition/outer.scala | 86 ++++++++++++++++--- 16 files changed, 218 insertions(+), 17 deletions(-) create mode 100644 k-distribution/tests/regression-new/checkWarns/singletonOverload.k create mode 100644 k-distribution/tests/regression-new/checkWarns/singletonOverload.k.out create mode 100644 k-distribution/tests/regression-new/checks/overloadKLabel.k create mode 100644 k-distribution/tests/regression-new/checks/overloadKLabel.k.out create mode 100644 k-distribution/tests/regression-new/overload/Makefile create mode 100644 k-distribution/tests/regression-new/overload/exp.test create mode 100644 k-distribution/tests/regression-new/overload/exp.test.out create mode 100644 k-distribution/tests/regression-new/overload/lval.test create mode 100644 k-distribution/tests/regression-new/overload/lval.test.out create mode 100644 k-distribution/tests/regression-new/overload/test.k diff --git a/docs/user_manual.md b/docs/user_manual.md index cfb08129172..2d5d0388565 100644 --- a/docs/user_manual.md +++ b/docs/user_manual.md @@ -349,11 +349,55 @@ this way; it would be an error to apply `symbol(foo)` to another production in the module above. Additionally, `symbol(_)` with an argument may not co-occur with the `klabel(_)` attribute (see below). +### `overload` attribute + +K supports _subsort overloading_[^maude-overload] on symbols, whereby a +constructor can have a more specific sort for certain arguments. For example, +consider the following productions derived from a C-like language semantics: +```k +syntax Exp ::= LVal + | Exp "." Id +syntax LVal ::= LVal "." Id +``` + +Here, it is useful for the result of the dot operator to be an `LVal` if the +left-hand side is itself an `LVal`. However, there is an issue with the code +as written: if `L()` is a term of sort `LVal`, then the program `L() . x` has a +parsing ambiguity between the two productions for the dot operator. To resolve +this, we can mark the productions as _overloads_: +```k +syntax Exp ::= LVal + | Exp "." Id [overload(_._)] +syntax LVal ::= LVal "." Id [overload(_._)] +``` + +Now, the parser will select the _most specific_ overloaded production when it +resolves ambiguities in `L() . x` (that is, `L() . x` parses to a term of sort +`LVal`. + +Formally, the compiler organises productions into a partial order that defines +the overload relation as follows. We say that `P` is a more specific overload +of `Q` if: +* `P` and `Q` have the same `overload(_)` attribute. Note that the argument + supplied has no semantic meaning other than as a key grouping productions + together. +* Let `S_P` be the sort of `P`, and `S_p1` etc. be the sorts of its arguments + (c.f. for `Q`). The tuple `(S_P, S_p1, ..., S_pN)` must be elementwise + _strictly less than_ `(S_Q, S_q1, ..., S_qN)` according to the definition's + subsorting relationship. That is, a term from production `P` is a restriction + of one from production `Q`; when its arguments are more precise, we can give + the result a more precise sort. + ### `klabel(_)` and `symbol` attributes **Note: the `klabel(_), symbol` approach described in this section is a legacy -feature that will be removed in the future. In new code, it should currently -only be used to opt in to symbol overloading.** +feature that will be removed in the future. New code should use the `symbol(_)` +and `overload(_)` attributes to opt into explicit naming and overloading +respectively.** + +_References here to "overloading" are explained in the section above; the use +of the `klabel(_)` attribute without `symbol` is equivalent to the new +`overload(_)` syntax._ By default K generates for each syntax definition a long and obfuscated klabel string, which serves as a unique internal identifier and also is used in kast @@ -3014,6 +3058,7 @@ arguments. A legend describing how to interpret the index follows. | `multiplicity = "_"` | cell | all | [Collection Cells: `multiplicity` and `type` attributes](#collection-cells-multiplicity-and-type-attributes) | | `non-assoc` | prod | all | [Symbol priority and associativity](#symbol-priority-and-associativity) | | `one-path` | claim | haskell | [`all-path` and `one-path` attributes to distinguish reachability claims](#all-path-and-one-path-attributes-to-distinguish-reachability-claims) | +| `overload(_)` | prod | all | [`overload(_)` attribute](#overload-attribute) | | `owise` | rule | all | [`owise` and `priority` attributes](#owise-and-priority-attributes) | | `prec(_)` | token | all | [`prec` attribute](#prec-attribute) | | `prefer` | prod | all | [Symbol priority and associativity](#symbol-priority-and-associativity) | @@ -3147,7 +3192,8 @@ sed 's/hook(//' | sed 's/)//' | sort | uniq | grep -v org.kframework All of these hooks will also eventually need documentation. -[^1]: More precisely, a lightly-customized debugger built using the LLDB Python - API. [^unique-symbol]: Except for in a very limited number of special cases from the K standard library. +[^maude-overload]: The [Maude documentation](https://maude.lcc.uma.es/maude-manual/maude-manualch3.html#x15-310003.6) + has an example in a context that's somewhat similar to K; discussion of + _ad-hoc_ overloading is not relevant. diff --git a/k-distribution/tests/regression-new/checkWarns/singletonOverload.k b/k-distribution/tests/regression-new/checkWarns/singletonOverload.k new file mode 100644 index 00000000000..6466b993c45 --- /dev/null +++ b/k-distribution/tests/regression-new/checkWarns/singletonOverload.k @@ -0,0 +1,11 @@ +module SINGLETONOVERLOAD-SYNTAX +endmodule + +module SINGLETONOVERLOAD + imports ID + + syntax LVal ::= L() [unused] + | LVal "." Id [unused, overload(_.)] + syntax Exp ::= LVal + | Exp "." Id [unused, overload(_._)] +endmodule diff --git a/k-distribution/tests/regression-new/checkWarns/singletonOverload.k.out b/k-distribution/tests/regression-new/checkWarns/singletonOverload.k.out new file mode 100644 index 00000000000..9c7d52ab526 --- /dev/null +++ b/k-distribution/tests/regression-new/checkWarns/singletonOverload.k.out @@ -0,0 +1,11 @@ +[Error] Compiler: Production has an `overload(_)` attribute but is not an overload of any other production. + Source(singletonOverload.k) + Location(8,19,8,53) + 8 | | LVal "." Id [unused, overload(_.)] + . ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ +[Error] Compiler: Production has an `overload(_)` attribute but is not an overload of any other production. + Source(singletonOverload.k) + Location(10,19,10,54) + 10 | | Exp "." Id [unused, overload(_._)] + . ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ +[Error] Compiler: Had 2 structural errors. diff --git a/k-distribution/tests/regression-new/checks/overloadKLabel.k b/k-distribution/tests/regression-new/checks/overloadKLabel.k new file mode 100644 index 00000000000..6543bc08c37 --- /dev/null +++ b/k-distribution/tests/regression-new/checks/overloadKLabel.k @@ -0,0 +1,6 @@ +module OVERLOADKLABEL + imports ID + syntax Exp ::= LVal + | Exp "." Id [klabel(_._), overload(_._)] + syntax LVal ::= LVal "." Id [overload(_._)] +endmodule diff --git a/k-distribution/tests/regression-new/checks/overloadKLabel.k.out b/k-distribution/tests/regression-new/checks/overloadKLabel.k.out new file mode 100644 index 00000000000..44d93e75af2 --- /dev/null +++ b/k-distribution/tests/regression-new/checks/overloadKLabel.k.out @@ -0,0 +1,6 @@ +[Error] Compiler: The attributes `klabel(_)` and `overload(_)` may not occur together. + Source(overloadKLabel.k) + Location(4,19,4,59) + 4 | | Exp "." Id [klabel(_._), overload(_._)] + . ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ +[Error] Compiler: Had 1 structural errors. diff --git a/k-distribution/tests/regression-new/overload/Makefile b/k-distribution/tests/regression-new/overload/Makefile new file mode 100644 index 00000000000..86158b0f8d9 --- /dev/null +++ b/k-distribution/tests/regression-new/overload/Makefile @@ -0,0 +1,6 @@ +DEF=test +EXT=test +TESTDIR=. +KOMPILE_FLAGS=--syntax-module TEST + +include ../../../include/kframework/ktest.mak diff --git a/k-distribution/tests/regression-new/overload/exp.test b/k-distribution/tests/regression-new/overload/exp.test new file mode 100644 index 00000000000..c2dd4496210 --- /dev/null +++ b/k-distribution/tests/regression-new/overload/exp.test @@ -0,0 +1 @@ +E().x diff --git a/k-distribution/tests/regression-new/overload/exp.test.out b/k-distribution/tests/regression-new/overload/exp.test.out new file mode 100644 index 00000000000..acdaffbbc93 --- /dev/null +++ b/k-distribution/tests/regression-new/overload/exp.test.out @@ -0,0 +1,3 @@ + + E ( ) . x ~> .K + diff --git a/k-distribution/tests/regression-new/overload/lval.test b/k-distribution/tests/regression-new/overload/lval.test new file mode 100644 index 00000000000..01809623228 --- /dev/null +++ b/k-distribution/tests/regression-new/overload/lval.test @@ -0,0 +1 @@ +L().x diff --git a/k-distribution/tests/regression-new/overload/lval.test.out b/k-distribution/tests/regression-new/overload/lval.test.out new file mode 100644 index 00000000000..9fbc8d8cdd8 --- /dev/null +++ b/k-distribution/tests/regression-new/overload/lval.test.out @@ -0,0 +1,3 @@ + + done ( ) ~> .K + diff --git a/k-distribution/tests/regression-new/overload/test.k b/k-distribution/tests/regression-new/overload/test.k new file mode 100644 index 00000000000..4def9d98c6b --- /dev/null +++ b/k-distribution/tests/regression-new/overload/test.k @@ -0,0 +1,13 @@ +module TEST + imports ID + + syntax LVal ::= L() + syntax Exp ::= E() + | LVal + + syntax Exp ::= Exp "." Id [overload(_._)] + syntax LVal ::= LVal "." Id [overload(_._)] + + syntax KItem ::= done() + rule _:LVal => done() +endmodule diff --git a/kernel/src/main/java/org/kframework/compile/checks/CheckAtt.java b/kernel/src/main/java/org/kframework/compile/checks/CheckAtt.java index 75c6e6272b7..468a8c1a009 100644 --- a/kernel/src/main/java/org/kframework/compile/checks/CheckAtt.java +++ b/kernel/src/main/java/org/kframework/compile/checks/CheckAtt.java @@ -71,6 +71,7 @@ private void checkProduction(Production prod) { checkTerminatorKLabel(prod); checkLatex(prod); checkSymbolKLabel(prod); + checkKLabelOverload(prod); } private void checkUnrecognizedAtts(T term) { @@ -349,4 +350,12 @@ private void checkSymbolKLabel(Production prod) { } } } + + private void checkKLabelOverload(Production prod) { + if (prod.att().contains(Att.KLABEL()) && prod.att().contains(Att.OVERLOAD())) { + errors.add( + KEMException.compilerError( + "The attributes `klabel(_)` and `overload(_)` may not occur together.", prod)); + } + } } diff --git a/kernel/src/main/java/org/kframework/kompile/Kompile.java b/kernel/src/main/java/org/kframework/kompile/Kompile.java index 5513c993fc4..985d0d70e2e 100644 --- a/kernel/src/main/java/org/kframework/kompile/Kompile.java +++ b/kernel/src/main/java/org/kframework/kompile/Kompile.java @@ -66,6 +66,7 @@ import org.kframework.utils.Stopwatch; import org.kframework.utils.StringUtil; import org.kframework.utils.errorsystem.KEMException; +import org.kframework.utils.errorsystem.KException; import org.kframework.utils.errorsystem.KExceptionManager; import org.kframework.utils.file.FileUtil; import org.kframework.utils.file.JarInfo; @@ -544,6 +545,8 @@ public void structuralChecks( Option kModule, Set excludedModuleTags) { checkAnywhereRules(modules); + checkOverloads(modules); + boolean isSymbolic = excludedModuleTags.contains(Att.CONCRETE()); CheckRHSVariables checkRHSVariables = new CheckRHSVariables(errors, !isSymbolic, kompileOptions.backend); @@ -631,6 +634,26 @@ public void structuralChecks( } } + private void checkOverloads(Module module) { + var withOverload = module.productions().filter(p -> p.att().contains(Att.OVERLOAD())).toSeq(); + + stream(withOverload) + .forEach( + p -> { + if (!module.overloads().elements().contains(p)) { + kem.registerCompilerWarning( + KException.ExceptionType.SINGLETON_OVERLOAD, + errors, + "Production has an `overload(_)` attribute but is not an overload of any other production.", + p); + } + }); + } + + private void checkOverloads(scala.collection.Set modules) { + stream(modules).forEach(this::checkOverloads); + } + private void checkAnywhereRules(scala.collection.Set modules) { if (kompileOptions.backend.equals(Backends.HASKELL) && !kompileOptions.allowAnywhereRulesHaskell) { diff --git a/kore/src/main/java/org/kframework/utils/errorsystem/KException.java b/kore/src/main/java/org/kframework/utils/errorsystem/KException.java index 4cdbc270cc0..01d62084973 100755 --- a/kore/src/main/java/org/kframework/utils/errorsystem/KException.java +++ b/kore/src/main/java/org/kframework/utils/errorsystem/KException.java @@ -140,6 +140,7 @@ public enum ExceptionType { DEPRECATED_DIRECTORY_FLAG, DEPRECATED_SYMBOL, MISSING_HOOK, + SINGLETON_OVERLOAD, FIRST_HIDDEN, // warnings below here are hidden by default USELESS_RULE, UNRESOLVED_FUNCTION_SYMBOL, diff --git a/kore/src/main/scala/org/kframework/attributes/Att.scala b/kore/src/main/scala/org/kframework/attributes/Att.scala index 1ff01063413..7c2f6f00a5f 100644 --- a/kore/src/main/scala/org/kframework/attributes/Att.scala +++ b/kore/src/main/scala/org/kframework/attributes/Att.scala @@ -251,6 +251,7 @@ object Att { final val NO_EVALUATORS = Key.builtin("no-evaluators", KeyParameter.Forbidden, onlyon[Production]) final val ONE_PATH = Key.builtin("one-path", KeyParameter.Forbidden, onlyon2[Claim, Module]) final val OWISE = Key.builtin("owise", KeyParameter.Forbidden, onlyon[Rule]) + final val OVERLOAD = Key.builtin("overload", KeyParameter.Required, onlyon[Production]) final val PARSER = Key.builtin("parser", KeyParameter.Required, onlyon[Production]) final val PREC = Key.builtin("prec", KeyParameter.Required, onlyon[Production]) final val PREFER = Key.builtin("prefer", KeyParameter.Forbidden, onlyon[Production]) diff --git a/kore/src/main/scala/org/kframework/definition/outer.scala b/kore/src/main/scala/org/kframework/definition/outer.scala index 2a18b199f66..13a8e1c7459 100644 --- a/kore/src/main/scala/org/kframework/definition/outer.scala +++ b/kore/src/main/scala/org/kframework/definition/outer.scala @@ -56,25 +56,85 @@ trait Sorting { POSet(subsortRelations) } - def computeOverloadPOSet(subsorts: POSet[Sort], prods: Set[Production]): POSet[Production] = { - def isLessThan(p1: Production, p2: Production): Boolean = - p1.klabel.isDefined && - p1.klabelAtt == p2.klabelAtt && - p1.nonterminals.size == p2.nonterminals.size && - subsorts.lessThanEq(p1.sort, p2.sort) && - p1.nonterminals - .zip(p2.nonterminals) - .forall(pair => subsorts.lessThanEq(pair._1.sort, pair._2.sort)) && - (p1.sort != p2.sort || p1.nonterminals.map(_.sort) != p2.nonterminals.map(_.sort)) && - p1 != p2 + /** + * Let `p` and `q` be the following productions: + * {{{ + * P ::= p(P1, ..., PN) + * Q ::= q(Q1, ..., QN) + * }}} + * Let `<:` be the subsort relation for this definition. We say that `p` is strictly less than `q` + * with respect to the _overload partial order_ if: + * - `p` and `q` are not the same production + * - `P <: Q` + * - for all n, `Pn <: Qn` + * - `p` and `q` are not _identically_ sorted + * That is, a production `p` is substitutable where a production `q` is expected. + * + * This ordering defines which productions participate in overload selection with each other. + */ + private def isLessThan(p1: Production, p2: Production, subsorts: POSet[Sort]): Boolean = + p1.nonterminals.size == p2.nonterminals.size && + subsorts.lessThanEq(p1.sort, p2.sort) && + p1.nonterminals + .zip(p2.nonterminals) + .forall(pair => subsorts.lessThanEq(pair._1.sort, pair._2.sort)) && + (p1.sort != p2.sort || p1.nonterminals.map(_.sort) != p2.nonterminals.map(_.sort)) && + p1 != p2 + + /** + * Compute an overload ordering based on productions with the same `overload(_)` attribute. + */ + private def computeAttributeOverloadPOSet( + subsorts: POSet[Sort], + prods: Set[Production] + ): Set[(Production, Production)] = { + val prodsToConsider = + prods.toSeq.filter(_.att.contains(Att.OVERLOAD)).groupBy(_.att.get(Att.OVERLOAD)) + val pairs: Iterable[(Production, Production)] = for { + x <- prodsToConsider.values + p1 <- x + p2 <- x if isLessThan(p1, p2, subsorts) + } yield (p1, p2) + pairs.toSet + } + + /** + * Compute an overload ordering based on productions with the same `klabel`. This ordering will be + * deprecated in the future in favour of the explicit `overload(_)` attribute. + */ + private def computeKLabelOverloadPOSet( + subsorts: POSet[Sort], + prods: Set[Production] + ): Set[(Production, Production)] = { + def areKLabelOverloadable(p1: Production, p2: Production): Boolean = + p1.klabel.isDefined && p1.klabelAtt == p2.klabelAtt && isLessThan(p1, p2, subsorts) + val prodsForOverloads = prods.toSeq.filter(_.klabelAtt.isDefined).groupBy(_.klabelAtt) val pairs: Iterable[(Production, Production)] = for { x <- prodsForOverloads.values p1 <- x - p2 <- x if isLessThan(p1, p2) + p2 <- x if areKLabelOverloadable(p1, p2) } yield (p1, p2) - POSet(pairs.toSet) + pairs.toSet } + + /** + * Combine the orderings induced by `klabel(_)` and `overload(_)` to produce an overall partial + * ordering for production overloading. + * + * In the future, the `klabel(_)` mechanism will be removed. + * + * Note that for now, while the two methods are both supported, we rely here on the compiler + * rejecting productions that use both attributes to ensure that the two orderings are disjoint. + */ + def computeOverloadPOSet( + subsorts: POSet[Sort], + prods: Set[Production] + ): POSet[Production] = + POSet( + computeAttributeOverloadPOSet(subsorts, prods) ++ computeKLabelOverloadPOSet(subsorts, prods) + ) + } object Module {