-
Notifications
You must be signed in to change notification settings - Fork 43
Booster: Optimize KMap internal data structure to improve Map operation performance #4138
Description
Background
Booster currently uses [(Term, Term)] (a Haskell linked list) as the key-value pair storage for KMapF:
data TermF t = ... | KMapF KMapDefinition [(t, t)] (Maybe t) | ...This design has several performance bottlenecks when maps grow in size. This issue provides a detailed analysis of the root causes and proposes four possible approaches for discussion.
Problem Analysis
1. Core operations have excessive complexity
All key-based lookup operations use O(n) linear scans — lookup, findIndex, partition, and elem on the pairs list:
haskell-backend/booster/library/Booster/Builtin/MAP.hs
Lines 150 to 159 in c7e22cb
| mapLookupHook :: BuiltinFunction | |
| mapLookupHook args | |
| | [KMap _ pairs _mbRest, key] <- args = | |
| -- if the key is not found, return Nothing (no result), | |
| -- regardless of whether the key _could_ still be there. | |
| pure $ lookup key pairs | |
| | [_other, _] <- args = | |
| pure Nothing -- not an internalised map, maybe a function call | |
| | otherwise = | |
| arityError "MAP.lookup" 2 args |
haskell-backend/booster/library/Booster/Builtin/MAP.hs
Lines 48 to 68 in c7e22cb
| mapUpdateHook :: BuiltinFunction | |
| mapUpdateHook args | |
| | [KMap def pairs mbRest, key, newValue] <- args = do | |
| case findIndex ((== key) . fst) pairs of | |
| Just idx -> | |
| -- key was found (syntactically), update pairs list | |
| let newPairs = take idx pairs <> ((key, newValue) : drop (idx + 1) pairs) | |
| in pure $ Just $ KMap def newPairs mbRest | |
| Nothing -- key could be in unevaluated or opaque part | |
| | Just _ <- mbRest -> | |
| pure Nothing -- have opaque part, no result | |
| | any (not . isConstructorLike_ . fst) pairs -> | |
| pure Nothing -- have unevaluated keys, no result | |
| | not $ isConstructorLike_ key -> | |
| pure Nothing -- unevaluated update key, no result | |
| | otherwise -> -- key certain to be absent, no rest: add pair | |
| pure $ Just $ KMap def ((key, newValue) : pairs) Nothing | |
| | [_other, _, _] <- args = | |
| pure Nothing -- not an internalised map, maybe a function call | |
| | otherwise = | |
| arityError "MAP.update" 3 args |
haskell-backend/booster/library/Booster/Builtin/MAP.hs
Lines 118 to 138 in c7e22cb
| mapRemoveHook :: BuiltinFunction | |
| mapRemoveHook args | |
| | [m@(KMap def pairs mbRest), key] <- args = do | |
| case findIndex ((== key) . fst) pairs of | |
| Just idx -> | |
| -- key was found (syntactically), remove it | |
| let newPairs = take idx pairs <> drop (idx + 1) pairs | |
| in pure $ Just $ KMap def newPairs mbRest | |
| Nothing -- key could be in unevaluated or opaque part | |
| | Just _ <- mbRest -> | |
| pure Nothing -- have opaque part, no result | |
| | any (not . isConstructorLike_ . fst) pairs -> | |
| pure Nothing -- have unevaluated keys, no result | |
| | not $ isConstructorLike_ key -> | |
| pure Nothing -- remove key unevaluated, no result | |
| | otherwise -> -- key certain to be absent, no rest: map unchanged | |
| pure $ Just m | |
| | [_other, _] <- args = | |
| pure Nothing -- not an internalised map, maybe a function call | |
| | otherwise = | |
| arityError "MAP.remove" 2 args |
haskell-backend/booster/library/Booster/Builtin/MAP.hs
Lines 182 to 206 in c7e22cb
| mapInKeysHook :: BuiltinFunction | |
| mapInKeysHook args | |
| | [key, KMap _ pairs mbRest] <- args = do | |
| -- only consider evaluated keys, return Nothing if any unevaluated ones are present | |
| let (eval'edKeys, uneval'edKeys) = | |
| partition isConstructorLike_ (map fst pairs) | |
| case (key `elem` eval'edKeys, key `elem` uneval'edKeys) of | |
| (True, _) -> | |
| -- constructor-like (evaluated) key is present | |
| pure $ Just $ boolTerm True | |
| (False, True) -> | |
| -- syntactically-equal unevaluated key is present | |
| pure $ Just $ boolTerm True | |
| (False, False) | |
| | Nothing <- mbRest -- no opaque rest | |
| , isConstructorLike_ key -- key to search is evaluated | |
| , null uneval'edKeys -> -- no keys unevaluated | |
| pure $ Just $ boolTerm False | |
| | otherwise -> -- key could be present once evaluated | |
| pure Nothing | |
| | [_, _other] <- args = do | |
| -- other `shouldHaveSort` "SortMap" | |
| pure Nothing -- not an internalised map, maybe a function call | |
| | otherwise = | |
| arityError "MAP.in_keys" 2 args |
2. Smart constructor unconditionally sorts and deduplicates
Every construction of a KMap via the smart constructor calls sortAndDeduplicate — O(n log n) with expensive Ord. This is triggered after every substitution and evaluation, even when keys have not changed:
haskell-backend/booster/library/Booster/Pattern/Base.hs
Lines 580 to 609 in c7e22cb
| pattern KMap :: KMapDefinition -> [(Term, Term)] -> Maybe Term -> Term | |
| pattern KMap def keyVals rest <- Term _ (KMapF def keyVals rest) | |
| where | |
| KMap def keyVals rest = | |
| let argAttributes = | |
| case (keyVals, rest) of | |
| ([], Nothing) -> mempty | |
| ([], Just s) -> getAttributes s | |
| (_ : _, Nothing) -> foldl1' (<>) $ concatMap (\(k, v) -> [getAttributes k, getAttributes v]) keyVals | |
| (_ : _, Just r) -> | |
| foldl' (<>) (getAttributes r) $ concatMap (\(k, v) -> [getAttributes k, getAttributes v]) keyVals | |
| (keyVals', rest') = case rest of | |
| Just (KMap def' kvs r) | def' == def -> (kvs, r) | |
| r -> ([], r) | |
| newKeyVals = sortAndDeduplicate $ keyVals ++ keyVals' | |
| newRest = rest' | |
| in case (newKeyVals, newRest) of | |
| ([], Just r) -> r | |
| _ -> | |
| Term | |
| argAttributes | |
| { hash = | |
| Hashable.hash | |
| ( "KMap" :: ByteString | |
| , def | |
| , map (\(k, v) -> (hash $ getAttributes k, hash $ getAttributes v)) newKeyVals | |
| , hash . getAttributes <$> newRest | |
| ) | |
| } | |
| $ KMapF def newKeyVals newRest |
haskell-backend/booster/library/Booster/Pattern/Base.hs
Lines 168 to 169 in c7e22cb
| sortAndDeduplicate :: Ord a => [a] -> [a] | |
| sortAndDeduplicate = Set.toAscList . Set.fromList |
3. Ord Term is not optimized
Ord Term is auto-derived, comparing TermAttributes fields in declaration order. The first field compared is variables :: Set Variable (O(k) cost), while the cached hash :: Int (O(1)) is only the third field. In contrast, Eq Term is already manually optimized with hash-first comparison:
haskell-backend/booster/library/Booster/Pattern/Base.hs
Lines 121 to 135 in c7e22cb
| data TermAttributes = TermAttributes | |
| { variables :: !(Set Variable) | |
| , isEvaluated :: !Bool | |
| -- ^ false for function calls, true for | |
| -- variables, recursive through AndTerm | |
| , hash :: !Int | |
| , isConstructorLike :: !Bool | |
| -- ^ Means that logic equality is the same as syntactic equality. | |
| -- True for domain values and constructor symbols (recursive | |
| -- through arg.s), recursive through AndTerm. | |
| , canBeEvaluated :: !Bool | |
| -- ^ false for function calls, variables, and AndTerms | |
| } | |
| deriving stock (Eq, Ord, Show, Generic, Data, Lift) | |
| deriving anyclass (NFData, Hashable) |
haskell-backend/booster/library/Booster/Pattern/Base.hs
Lines 155 to 157 in c7e22cb
| instance Eq Term where | |
| Term TermAttributes{hash = hash1} t1f == Term TermAttributes{hash = hash2} t2f = | |
| hash1 == hash2 && t1f == t2f -- compare directly to cater for collisions |
4. Matching repeatedly creates temporary data structure conversions
matchMaps converts [(Term, Term)] to a temporary Data.Map for intersection/difference on every match. The subject map is converted and discarded each time — pure O(n log n) overhead:
haskell-backend/booster/library/Booster/Pattern/Match.hs
Lines 699 to 715 in c7e22cb
| let patternKeyVals = map (first (Substitution.substituteInTerm st.mSubstitution)) patKeyVals | |
| -- check for duplicate keys | |
| checkDuplicateKeys patternKeyVals patRest | |
| checkDuplicateKeys subjKeyVals subjRest | |
| let patMap = Map.fromList patternKeyVals | |
| subjMap = Map.fromList subjKeyVals | |
| -- handles syntactically identical keys in pattern and subject | |
| commonMap = Map.intersectionWith (,) patMap subjMap | |
| patExtra = patMap `Map.withoutKeys` Map.keysSet commonMap | |
| subjExtra = subjMap `Map.withoutKeys` Map.keysSet commonMap | |
| rest <- | |
| matchRemainderMaps | |
| (toRemainderMap (Map.toList patExtra) patRest) | |
| (toRemainderMap (Map.toList subjExtra) subjRest) | |
| enqueueRegularProblems $ (Seq.fromList $ Map.elems commonMap) >< rest |
5. No distinction between concrete and symbolic keys
Builtin operations and matching frequently check isConstructorLike_ and call partition, indicating concrete and symbolic keys are semantically different. The current storage does not reflect this, requiring runtime checks every time:
haskell-backend/booster/library/Booster/Pattern/Match.hs
Lines 626 to 635 in c7e22cb
| toRemainderMap :: [(Term, Term)] -> Maybe Term -> RemainderMap | |
| toRemainderMap kvs rest = | |
| let (conc, sym) = partitionConcreteKeys kvs | |
| in foldr | |
| (uncurry ConstructorKey) | |
| (Rest $ foldr (uncurry OtherKey) (maybe EmptyMap Remainder rest) sym) | |
| conc | |
| where | |
| partitionConcreteKeys :: [(Term, Term)] -> ([(Term, Term)], [(Term, Term)]) | |
| partitionConcreteKeys = partition (\(Term attrs _, _) -> attrs.isConstructorLike) |
Proposed Approaches
Approach A: Optimize Ord Term + avoid unnecessary sorting
Keep [(t, t)] unchanged, make two optimizations:
A1. Hand-write Ord Term with hash-first comparison:
instance Ord Term where
compare (Term a1 t1) (Term a2 t2) =
compare a1.hash a2.hash <> compare t1 t2A2. After substitution/evaluation, detect whether keys have changed; skip sortAndDeduplicate if unchanged.
| Pros | Cons |
|---|---|
| Smallest change, lowest risk | Single key operations remain O(n) linear scan |
Does not affect TermF's Functor/Foldable/Traversable |
Matching still requires temporary Map.fromList conversion |
| Beneficial regardless of which other approach is adopted | Does not change asymptotic complexity, only optimizes constant factors |
Approach B: Data.Map Term Term + optimized Ord Term
Replace [(t, t)] with Data.Map Term Term, with hand-written Ord Term (hash-first).
| Pros | Cons |
|---|---|
| Single key operations drop from O(n) to O(log n) | Requires hand-written Functor/Foldable/Traversable for TermF |
| Matching needs no temporary conversion | When keys change, still requires O(n log n) rebuild |
| Built-in sort and dedup, smart constructor simplifies | Does not distinguish concrete/symbolic keys |
| Output determinism (maintains sorted order) | Relies on Ord Term, still slower than O(1) hash even when optimized |
intersection/difference and other set operations directly available |
Regarding the TermF Functor constraint: currently [(t,t)] allows derived fmap to automatically apply to both keys and values, while Data.Map's fmap only applies to values. After investigation, only 3 functions implicitly rely on the derived instance to process KMapF keys: modifyVariablesInT, checkTermSymbols, and getCell — none are on performance hot paths (substitution, matching, and evaluation all manually pattern match). Hand-writing Functor/Foldable/Traversable instances resolves this.
Approach C: Data.HashMap Term Term
Leverage Term's existing cached hash (Hashable Term is O(1)), replacing [(t, t)] with Data.HashMap Term Term.
| Pros | Cons |
|---|---|
| Single key operations O(1) amortized | Output non-determinism — HashMap.toList order is unstable, affecting test golden files, serialization consistency, debugging |
| Construction O(n), no sorting needed | No efficient isSubmapOfBy |
| Fully utilizes cached hash | Does not distinguish concrete/symbolic keys |
| Rebuild after substitution/evaluation is O(n) | Externalise needs sorting for deterministic output → additional O(n log n) |
Requires hand-written Functor/Foldable/Traversable for TermF |
Worst case O(n) on hash collision |
Approach D: Separate concrete / symbolic storage
Following the pattern of the old kore backend's NormalizedAc, separate key-value pairs by key type:
KMapF KMapDefinition
(HashMap Term Term) -- concrete keys (constructorLike = True, no variables)
[(Term, Term)] -- symbolic keys (contain variables or function calls, typically 0-2)
(Maybe Term) -- opaque restReference implementation in the old kore backend:
haskell-backend/kore/src/Kore/Internal/NormalizedAc.hs
Lines 169 to 178 in c7e22cb
| data NormalizedAc (collection :: Type -> Type -> Type) key child = NormalizedAc | |
| { elementsWithVariables :: [Element collection child] | |
| -- ^ Non-concrete elements of the structure. | |
| -- These would be of sorts @(Int, String)@ for a map from @Int@ to @String@. | |
| , concreteElements :: HashMap key (Value collection child) | |
| -- ^ Concrete elements of the structure. | |
| -- These would be of sorts @(Int, String)@ for a map from @Int@ to @String@. | |
| , opaque :: [child] | |
| -- ^ Unoptimized (i.e. non-element) parts of the structure. | |
| } |
| Pros | Cons |
|---|---|
| Concrete key lookup O(1) | Largest amount of changes |
| Substitution can skip concrete keys entirely (no variables) | Requires hand-written Functor/Foldable/Traversable for TermF |
| Matching needs no temporary conversion or partition | Every operation must handle both parts separately |
isConstructorLike_ runtime checks eliminated |
After substitution, symbolic keys may become concrete, requiring reclassification |
Production-proven in kore's NormalizedAc |
Need to ensure consistent criteria for concrete key determination |
| Perfectly matches the actual concrete/symbolic distinction in K semantics | Externalise needs sorting concrete keys for deterministic output |
Full Pipeline Comparison Summary
n = map size, p = pattern pairs (typically 1-3), n_s = symbolic pairs (typically 0-2).
Common case: subject map has all concrete keys, pattern has a few symbolic keys.
| Stage | Current | A (Ord opt) | B (Map) | C (HashMap) | D (Separate) |
|---|---|---|---|---|---|
| Internalise | O(n log n) expensive Ord | O(n log n) fast Ord | O(n log n) fast Ord | O(n) | O(n) |
| Matching: conversion | O(n log n) expensive Ord | O(n log n) fast Ord | O(0) | O(0) | O(0) |
| Matching: intersect/diff | O(p + n) | O(p + n) | O(p + n) | O(min(p,n)) | O(p) |
| Matching: partition | O(p + n) | O(p + n) | O(p + n) | O(p + n) | O(0) |
| Substitution: iterate | O(n) all | O(n) all | O(n) all | O(n) all | O(n) value + O(n_s) key |
| Substitution: rebuild | O(n log n) unconditional | O(0) common case | O(n) map value | O(n) map value | O(n_s) |
| Evaluation: iterate | O(n) key+value | O(n) key+value | O(n) key+value | O(n) key+value | O(n) value only |
| Evaluation: rebuild | O(n log n) unconditional | O(0) common case | O(n) map value | O(n) map value | O(0) key unchanged |
| Single key lookup | O(n) | O(n) | O(log n) | O(1) | O(1) |
| Output determinism | Yes | Yes | Yes | No (needs extra sort) | Needs sort for concrete keys |
| Per-rule total | ~O(n log n) × 3 expensive Ord | ~O(n log n) × 2 fast Ord | ~O(n) × 3 | ~O(n) × 3 | ~O(n) + O(p) |
Notes
- KSet has similar issues (
[t]storage +sortAndDeduplicateon every construction) and could benefit from the same optimization approach. - KList using
[t]is reasonable (ordered, no dedup needed), thoughLIST.getand other random access operations could benefit fromData.Seqif needed. - Approach A's
Ord Termoptimization is beneficial regardless of which other approach is adopted.