Skip to content

Booster: Optimize KMap internal data structure to improve Map operation performance #4138

@Stevengre

Description

@Stevengre

Background

Booster currently uses [(Term, Term)] (a Haskell linked list) as the key-value pair storage for KMapF:

data TermF t = ... | KMapF KMapDefinition [(t, t)] (Maybe t) | ...

This design has several performance bottlenecks when maps grow in size. This issue provides a detailed analysis of the root causes and proposes four possible approaches for discussion.


Problem Analysis

1. Core operations have excessive complexity

All key-based lookup operations use O(n) linear scans — lookup, findIndex, partition, and elem on the pairs list:

mapLookupHook :: BuiltinFunction
mapLookupHook args
| [KMap _ pairs _mbRest, key] <- args =
-- if the key is not found, return Nothing (no result),
-- regardless of whether the key _could_ still be there.
pure $ lookup key pairs
| [_other, _] <- args =
pure Nothing -- not an internalised map, maybe a function call
| otherwise =
arityError "MAP.lookup" 2 args

mapUpdateHook :: BuiltinFunction
mapUpdateHook args
| [KMap def pairs mbRest, key, newValue] <- args = do
case findIndex ((== key) . fst) pairs of
Just idx ->
-- key was found (syntactically), update pairs list
let newPairs = take idx pairs <> ((key, newValue) : drop (idx + 1) pairs)
in pure $ Just $ KMap def newPairs mbRest
Nothing -- key could be in unevaluated or opaque part
| Just _ <- mbRest ->
pure Nothing -- have opaque part, no result
| any (not . isConstructorLike_ . fst) pairs ->
pure Nothing -- have unevaluated keys, no result
| not $ isConstructorLike_ key ->
pure Nothing -- unevaluated update key, no result
| otherwise -> -- key certain to be absent, no rest: add pair
pure $ Just $ KMap def ((key, newValue) : pairs) Nothing
| [_other, _, _] <- args =
pure Nothing -- not an internalised map, maybe a function call
| otherwise =
arityError "MAP.update" 3 args

mapRemoveHook :: BuiltinFunction
mapRemoveHook args
| [m@(KMap def pairs mbRest), key] <- args = do
case findIndex ((== key) . fst) pairs of
Just idx ->
-- key was found (syntactically), remove it
let newPairs = take idx pairs <> drop (idx + 1) pairs
in pure $ Just $ KMap def newPairs mbRest
Nothing -- key could be in unevaluated or opaque part
| Just _ <- mbRest ->
pure Nothing -- have opaque part, no result
| any (not . isConstructorLike_ . fst) pairs ->
pure Nothing -- have unevaluated keys, no result
| not $ isConstructorLike_ key ->
pure Nothing -- remove key unevaluated, no result
| otherwise -> -- key certain to be absent, no rest: map unchanged
pure $ Just m
| [_other, _] <- args =
pure Nothing -- not an internalised map, maybe a function call
| otherwise =
arityError "MAP.remove" 2 args

mapInKeysHook :: BuiltinFunction
mapInKeysHook args
| [key, KMap _ pairs mbRest] <- args = do
-- only consider evaluated keys, return Nothing if any unevaluated ones are present
let (eval'edKeys, uneval'edKeys) =
partition isConstructorLike_ (map fst pairs)
case (key `elem` eval'edKeys, key `elem` uneval'edKeys) of
(True, _) ->
-- constructor-like (evaluated) key is present
pure $ Just $ boolTerm True
(False, True) ->
-- syntactically-equal unevaluated key is present
pure $ Just $ boolTerm True
(False, False)
| Nothing <- mbRest -- no opaque rest
, isConstructorLike_ key -- key to search is evaluated
, null uneval'edKeys -> -- no keys unevaluated
pure $ Just $ boolTerm False
| otherwise -> -- key could be present once evaluated
pure Nothing
| [_, _other] <- args = do
-- other `shouldHaveSort` "SortMap"
pure Nothing -- not an internalised map, maybe a function call
| otherwise =
arityError "MAP.in_keys" 2 args

2. Smart constructor unconditionally sorts and deduplicates

Every construction of a KMap via the smart constructor calls sortAndDeduplicate — O(n log n) with expensive Ord. This is triggered after every substitution and evaluation, even when keys have not changed:

pattern KMap :: KMapDefinition -> [(Term, Term)] -> Maybe Term -> Term
pattern KMap def keyVals rest <- Term _ (KMapF def keyVals rest)
where
KMap def keyVals rest =
let argAttributes =
case (keyVals, rest) of
([], Nothing) -> mempty
([], Just s) -> getAttributes s
(_ : _, Nothing) -> foldl1' (<>) $ concatMap (\(k, v) -> [getAttributes k, getAttributes v]) keyVals
(_ : _, Just r) ->
foldl' (<>) (getAttributes r) $ concatMap (\(k, v) -> [getAttributes k, getAttributes v]) keyVals
(keyVals', rest') = case rest of
Just (KMap def' kvs r) | def' == def -> (kvs, r)
r -> ([], r)
newKeyVals = sortAndDeduplicate $ keyVals ++ keyVals'
newRest = rest'
in case (newKeyVals, newRest) of
([], Just r) -> r
_ ->
Term
argAttributes
{ hash =
Hashable.hash
( "KMap" :: ByteString
, def
, map (\(k, v) -> (hash $ getAttributes k, hash $ getAttributes v)) newKeyVals
, hash . getAttributes <$> newRest
)
}
$ KMapF def newKeyVals newRest

sortAndDeduplicate :: Ord a => [a] -> [a]
sortAndDeduplicate = Set.toAscList . Set.fromList

3. Ord Term is not optimized

Ord Term is auto-derived, comparing TermAttributes fields in declaration order. The first field compared is variables :: Set Variable (O(k) cost), while the cached hash :: Int (O(1)) is only the third field. In contrast, Eq Term is already manually optimized with hash-first comparison:

data TermAttributes = TermAttributes
{ variables :: !(Set Variable)
, isEvaluated :: !Bool
-- ^ false for function calls, true for
-- variables, recursive through AndTerm
, hash :: !Int
, isConstructorLike :: !Bool
-- ^ Means that logic equality is the same as syntactic equality.
-- True for domain values and constructor symbols (recursive
-- through arg.s), recursive through AndTerm.
, canBeEvaluated :: !Bool
-- ^ false for function calls, variables, and AndTerms
}
deriving stock (Eq, Ord, Show, Generic, Data, Lift)
deriving anyclass (NFData, Hashable)

instance Eq Term where
Term TermAttributes{hash = hash1} t1f == Term TermAttributes{hash = hash2} t2f =
hash1 == hash2 && t1f == t2f -- compare directly to cater for collisions

4. Matching repeatedly creates temporary data structure conversions

matchMaps converts [(Term, Term)] to a temporary Data.Map for intersection/difference on every match. The subject map is converted and discarded each time — pure O(n log n) overhead:

let patternKeyVals = map (first (Substitution.substituteInTerm st.mSubstitution)) patKeyVals
-- check for duplicate keys
checkDuplicateKeys patternKeyVals patRest
checkDuplicateKeys subjKeyVals subjRest
let patMap = Map.fromList patternKeyVals
subjMap = Map.fromList subjKeyVals
-- handles syntactically identical keys in pattern and subject
commonMap = Map.intersectionWith (,) patMap subjMap
patExtra = patMap `Map.withoutKeys` Map.keysSet commonMap
subjExtra = subjMap `Map.withoutKeys` Map.keysSet commonMap
rest <-
matchRemainderMaps
(toRemainderMap (Map.toList patExtra) patRest)
(toRemainderMap (Map.toList subjExtra) subjRest)
enqueueRegularProblems $ (Seq.fromList $ Map.elems commonMap) >< rest

5. No distinction between concrete and symbolic keys

Builtin operations and matching frequently check isConstructorLike_ and call partition, indicating concrete and symbolic keys are semantically different. The current storage does not reflect this, requiring runtime checks every time:

toRemainderMap :: [(Term, Term)] -> Maybe Term -> RemainderMap
toRemainderMap kvs rest =
let (conc, sym) = partitionConcreteKeys kvs
in foldr
(uncurry ConstructorKey)
(Rest $ foldr (uncurry OtherKey) (maybe EmptyMap Remainder rest) sym)
conc
where
partitionConcreteKeys :: [(Term, Term)] -> ([(Term, Term)], [(Term, Term)])
partitionConcreteKeys = partition (\(Term attrs _, _) -> attrs.isConstructorLike)


Proposed Approaches

Approach A: Optimize Ord Term + avoid unnecessary sorting

Keep [(t, t)] unchanged, make two optimizations:

A1. Hand-write Ord Term with hash-first comparison:

instance Ord Term where
    compare (Term a1 t1) (Term a2 t2) =
        compare a1.hash a2.hash <> compare t1 t2

A2. After substitution/evaluation, detect whether keys have changed; skip sortAndDeduplicate if unchanged.

Pros Cons
Smallest change, lowest risk Single key operations remain O(n) linear scan
Does not affect TermF's Functor/Foldable/Traversable Matching still requires temporary Map.fromList conversion
Beneficial regardless of which other approach is adopted Does not change asymptotic complexity, only optimizes constant factors

Approach B: Data.Map Term Term + optimized Ord Term

Replace [(t, t)] with Data.Map Term Term, with hand-written Ord Term (hash-first).

Pros Cons
Single key operations drop from O(n) to O(log n) Requires hand-written Functor/Foldable/Traversable for TermF
Matching needs no temporary conversion When keys change, still requires O(n log n) rebuild
Built-in sort and dedup, smart constructor simplifies Does not distinguish concrete/symbolic keys
Output determinism (maintains sorted order) Relies on Ord Term, still slower than O(1) hash even when optimized
intersection/difference and other set operations directly available

Regarding the TermF Functor constraint: currently [(t,t)] allows derived fmap to automatically apply to both keys and values, while Data.Map's fmap only applies to values. After investigation, only 3 functions implicitly rely on the derived instance to process KMapF keys: modifyVariablesInT, checkTermSymbols, and getCell — none are on performance hot paths (substitution, matching, and evaluation all manually pattern match). Hand-writing Functor/Foldable/Traversable instances resolves this.

Approach C: Data.HashMap Term Term

Leverage Term's existing cached hash (Hashable Term is O(1)), replacing [(t, t)] with Data.HashMap Term Term.

Pros Cons
Single key operations O(1) amortized Output non-determinismHashMap.toList order is unstable, affecting test golden files, serialization consistency, debugging
Construction O(n), no sorting needed No efficient isSubmapOfBy
Fully utilizes cached hash Does not distinguish concrete/symbolic keys
Rebuild after substitution/evaluation is O(n) Externalise needs sorting for deterministic output → additional O(n log n)
Requires hand-written Functor/Foldable/Traversable for TermF Worst case O(n) on hash collision

Approach D: Separate concrete / symbolic storage

Following the pattern of the old kore backend's NormalizedAc, separate key-value pairs by key type:

KMapF KMapDefinition
    (HashMap Term Term)   -- concrete keys (constructorLike = True, no variables)
    [(Term, Term)]        -- symbolic keys (contain variables or function calls, typically 0-2)
    (Maybe Term)          -- opaque rest

Reference implementation in the old kore backend:

data NormalizedAc (collection :: Type -> Type -> Type) key child = NormalizedAc
{ elementsWithVariables :: [Element collection child]
-- ^ Non-concrete elements of the structure.
-- These would be of sorts @(Int, String)@ for a map from @Int@ to @String@.
, concreteElements :: HashMap key (Value collection child)
-- ^ Concrete elements of the structure.
-- These would be of sorts @(Int, String)@ for a map from @Int@ to @String@.
, opaque :: [child]
-- ^ Unoptimized (i.e. non-element) parts of the structure.
}

Pros Cons
Concrete key lookup O(1) Largest amount of changes
Substitution can skip concrete keys entirely (no variables) Requires hand-written Functor/Foldable/Traversable for TermF
Matching needs no temporary conversion or partition Every operation must handle both parts separately
isConstructorLike_ runtime checks eliminated After substitution, symbolic keys may become concrete, requiring reclassification
Production-proven in kore's NormalizedAc Need to ensure consistent criteria for concrete key determination
Perfectly matches the actual concrete/symbolic distinction in K semantics Externalise needs sorting concrete keys for deterministic output

Full Pipeline Comparison Summary

n = map size, p = pattern pairs (typically 1-3), n_s = symbolic pairs (typically 0-2).
Common case: subject map has all concrete keys, pattern has a few symbolic keys.

Stage Current A (Ord opt) B (Map) C (HashMap) D (Separate)
Internalise O(n log n) expensive Ord O(n log n) fast Ord O(n log n) fast Ord O(n) O(n)
Matching: conversion O(n log n) expensive Ord O(n log n) fast Ord O(0) O(0) O(0)
Matching: intersect/diff O(p + n) O(p + n) O(p + n) O(min(p,n)) O(p)
Matching: partition O(p + n) O(p + n) O(p + n) O(p + n) O(0)
Substitution: iterate O(n) all O(n) all O(n) all O(n) all O(n) value + O(n_s) key
Substitution: rebuild O(n log n) unconditional O(0) common case O(n) map value O(n) map value O(n_s)
Evaluation: iterate O(n) key+value O(n) key+value O(n) key+value O(n) key+value O(n) value only
Evaluation: rebuild O(n log n) unconditional O(0) common case O(n) map value O(n) map value O(0) key unchanged
Single key lookup O(n) O(n) O(log n) O(1) O(1)
Output determinism Yes Yes Yes No (needs extra sort) Needs sort for concrete keys
Per-rule total ~O(n log n) × 3 expensive Ord ~O(n log n) × 2 fast Ord ~O(n) × 3 ~O(n) × 3 ~O(n) + O(p)

Notes

  • KSet has similar issues ([t] storage + sortAndDeduplicate on every construction) and could benefit from the same optimization approach.
  • KList using [t] is reasonable (ordered, no dedup needed), though LIST.get and other random access operations could benefit from Data.Seq if needed.
  • Approach A's Ord Term optimization is beneficial regardless of which other approach is adopted.

Metadata

Metadata

Assignees

Labels

No labels
No labels

Type

No type

Projects

No projects

Milestone

No milestone

Relationships

None yet

Development

No branches or pull requests

Issue actions