Commit f6a04fe

JohannesLichtenberger and Johannes Lichtenberger authored
Feature/sdb select item replace support (#819)
* docs: add DeweyIDs and secondary index types to architecture

- Add DeweyID Index to the Primary Indexes diagram
- Document DeweyID storage (inline in KeyValueLeafPages) and benefits
- Add comprehensive Secondary Index Types section with:
  - Path Index: PCR → NodeKeys mapping, use cases
  - Name Index: QNm hash → NodeKeys mapping, use cases
  - CAS Index: Value+Path → NodeKeys mapping, range query support
- Include visual examples for each index type

* docs: add DeweyIDs and secondary index types to architecture

- Add DeweyID Index to the Primary Indexes diagram
- Document DeweyID storage (inline in KeyValueLeafPages) and benefits
- Add comprehensive Secondary Index Types section with:
  - Path Index: PCR → NodeKeys mapping, use cases
  - Name Index: QNm hash → NodeKeys mapping, use cases
  - CAS Index: Value+Path → NodeKeys mapping, range query support
- Include visual examples for each index type

* Update terminology from 'document store' to 'node store'

* Revise Document Store vs. Node Store section

Updated the comparison between Document Store and Node Store to enhance clarity and detail.

* refactor: rename IndexBackendType.HOT_TRIE to HOT

HOT stands for Height-Optimized Trie, so HOT_TRIE was redundant.

Also includes architecture documentation improvements:
- Add DeweyIDs and secondary index types documentation
- Fix various accuracy issues in diagrams and examples
- Update default SLIDING_SNAPSHOT window to 4
- Add PostOrderAxis and LevelOrderAxis to spatial axes
- Add between-timestamps transaction example

* feat(io): Add page checksum verification with XXH3

- Add PageHasher utility class for fast XXH3 hashing (~15 GB/s) with backward compatibility for SHA-256 hashes from legacy databases
- Add SirixCorruptionException for detailed corruption error reporting
- Add verifyChecksumsOnRead configuration option (default: false)
- Update all writers (FileChannel, File, IOUring, MMFile) to use XXH3
- Update all readers to verify checksums when enabled:
  - Non-KVLP pages: verify on compressed bytes before decompression
  - KVLP pages: verify on uncompressed bytes after decompression
- Ensure page fragment hashes are propagated for verification
- Add comprehensive unit tests for PageHasher, SirixCorruptionException, and configuration

Hash algorithm is auto-detected by length (8 bytes = XXH3, 32 bytes = SHA-256) for seamless backward compatibility with existing databases.

* perf: Replace ByteBuffer with zero-allocation bit manipulation for hashing

High-performance optimizations aligned with financial/HFT system best practices:
- HashAlgorithm enum now uses direct bit manipulation for longToBytes/bytesToLong instead of ByteBuffer allocations (eliminates heap allocation in hot paths)
- Added zero-allocation long-based API (computeHashLong, verifyLong) as primary interface for verification hot paths
- PageHasher now provides both:
  - Default XXH3 convenience methods (compute(byte[]), computeLong(byte[]))
  - Explicit algorithm methods for extensibility
- ResourceConfiguration now includes hashAlgorithm field (defaults to XXH3) for future algorithm extensibility
- All writers/readers updated to use the new API
- Added HASH_LENGTH and DEFAULT_ALGORITHM constants for HFT-style clarity

Zero-copy design preserved: native MemorySegments still use direct address hashing. Verification hot path uses primitive long comparison instead of Arrays.equals().

* fix: Use consistent compressed-data hashing for all page types

The checksum verification was failing because:
- KVLP pages computed hash on UNCOMPRESSED data
- Non-KVLP pages computed hash on COMPRESSED data
- Verification tried to detect KVLP from first byte of COMPRESSED data, which doesn't work since LZ4 compressed data doesn't preserve the page type

Fix: All page types now consistently hash COMPRESSED data.

This:
- Simplifies the verification logic (no KVLP special cases)
- Avoids the impossible task of detecting page type from compressed bytes
- Provides consistent behavior across all storage backends

Removed:
- KVLP-specific hash computation in PageKind.serializePage
- KVLP-specific verification methods in AbstractReader and FileReader
- KVLP detection from compressed data in verifyChecksumIfNeeded

* fix: Remove unused PageHasher import from PageKind

* fix: Remove broken RevisionIndex benchmark files

These files referenced non-existent io.sirix.io.RevisionIndex class and used JMH annotations in the wrong source set (main instead of jmh).

* feat: Implement UpdatableJsonItem interface for sdb:select-item replace support

This commit implements support for using sdb:select-item as a target in 'replace json value of' expressions in JSONiq.

Changes:
- JsonDBItem.java: Extend UpdatableJsonItem interface with default replaceValue() implementation that navigates to parent and performs the replacement
- JsonItemSequence.java: Add replaceValue() static method that handles different node types (object values, array elements) for replacement
- libraries.gradle: Update Brackit dependency to 0.6-SNAPSHOT

This change requires the corresponding Brackit changes in the feature/sdb-select-item-replace-support branch.

---------

Co-authored-by: Johannes Lichtenberger <johannes.lichtenberger@sirix.io>
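
The checksum items in the commit message (algorithm detection by stored hash length, shift-based long/byte conversion, and primitive long comparison) can be illustrated with a minimal sketch. This is not the PageHasher or HashAlgorithm code from the repository: the class name `ChecksumSketch`, its method signatures, and the big-endian byte order are assumptions made for illustration only.

```java
/** Hypothetical sketch; not Sirix's PageHasher or HashAlgorithm. */
final class ChecksumSketch {

  enum Algorithm { XXH3, SHA_256 }

  /** Picks the algorithm from the stored hash length: 8 bytes = XXH3, 32 bytes = SHA-256. */
  static Algorithm detect(byte[] storedHash) {
    switch (storedHash.length) {
      case 8:
        return Algorithm.XXH3;
      case 32:
        return Algorithm.SHA_256;
      default:
        throw new IllegalArgumentException("Unsupported hash length: " + storedHash.length);
    }
  }

  /** Encodes a long into 8 bytes using shifts only (big-endian order is an assumption). */
  static byte[] longToBytes(long value) {
    final byte[] out = new byte[8];
    for (int i = 7; i >= 0; i--) {
      out[i] = (byte) value; // lowest 8 bits go to the current position
      value >>>= 8;
    }
    return out;
  }

  /** Decodes the first 8 bytes back into a long without an intermediate ByteBuffer. */
  static long bytesToLong(byte[] bytes) {
    long value = 0;
    for (int i = 0; i < 8; i++) {
      value = (value << 8) | (bytes[i] & 0xFFL); // mask avoids sign extension
    }
    return value;
  }

  /** Compares two 64-bit hashes as primitives instead of comparing byte arrays. */
  static boolean verifyLong(long expected, long actual) {
    return expected == actual;
  }
}
```

Comparing two `long` values avoids both the array allocation and the element-wise loop that a byte-array comparison would need, which is the motivation the commit message gives for the long-based API.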
1 parent 3e5e111 commit f6a04fe

3 files changed

Lines changed: 136 additions & 5 deletions

Lines changed: 21 additions & 2 deletions
@@ -1,15 +1,34 @@
 package io.sirix.query.json;

-import io.brackit.query.jdm.json.JsonItem;
+import io.brackit.query.jdm.Sequence;
+import io.brackit.query.jdm.json.UpdatableJsonItem;
 import io.sirix.api.json.JsonNodeReadOnlyTrx;
+import io.sirix.api.json.JsonNodeTrx;
 import io.sirix.api.json.JsonResourceSession;

-public interface JsonDBItem extends JsonItem {
+public interface JsonDBItem extends UpdatableJsonItem {
   JsonResourceSession getResourceSession();

   JsonNodeReadOnlyTrx getTrx();

   long getNodeKey();

   JsonDBCollection getCollection();
+
+  /**
+   * Default implementation of replaceValue that navigates to parent and performs replacement.
+   * This enables the use of sdb:select-item with replace expressions.
+   */
+  @Override
+  default void replaceValue(Sequence newValue) {
+    final JsonNodeReadOnlyTrx rtx = getTrx();
+    rtx.moveTo(getNodeKey());
+
+    final JsonResourceSession resourceSession = getResourceSession();
+    final JsonNodeTrx wtx = resourceSession.getNodeTrx().orElseGet(resourceSession::beginNodeTrx);
+    wtx.moveTo(getNodeKey());
+
+    // Use the JsonItemSequence utility to perform the replacement
+    JsonItemSequence.replaceValue(wtx, newValue, getCollection());
+  }
 }
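
The default method above reuses an already open write transaction and only begins a new one when none exists (`getNodeTrx().orElseGet(...)`). A minimal, self-contained sketch of that pattern follows. `Session`, `WriteTrx`, `DbItem`, and `UpdatableItem` are hypothetical stand-ins invented for illustration, not the Sirix or Brackit types, and the single-open-transaction behaviour of `Session` is an assumption of the sketch.

```java
import java.util.Optional;

/** Hypothetical stand-ins; none of these types are the Sirix or Brackit APIs. */
final class DefaultReplaceSketch {

  /** Toy write transaction that records where it is positioned and what it replaced. */
  static final class WriteTrx {
    long position = -1;
    String lastReplacement;

    void moveTo(long nodeKey) {
      position = nodeKey;
    }

    void replace(String newValue) {
      lastReplacement = position + "=" + newValue;
    }
  }

  /** Toy session that hands out one write transaction and counts how often one was begun. */
  static final class Session {
    private WriteTrx open;
    int begun = 0;

    Optional<WriteTrx> getNodeTrx() {
      return Optional.ofNullable(open);
    }

    WriteTrx beginNodeTrx() {
      begun++;
      open = new WriteTrx();
      return open;
    }
  }

  interface UpdatableItem {
    void replaceValue(String newValue);
  }

  interface DbItem extends UpdatableItem {
    Session getSession();

    long getNodeKey();

    /** Every implementor inherits the replacement logic without extra code. */
    @Override
    default void replaceValue(String newValue) {
      final Session session = getSession();
      // Reuse the open write transaction; begin one only if none is open.
      final WriteTrx wtx = session.getNodeTrx().orElseGet(session::beginNodeTrx);
      wtx.moveTo(getNodeKey());
      wtx.replace(newValue);
    }
  }

  /** Builds an item that only supplies its session and node key. */
  static DbItem item(Session session, long nodeKey) {
    return new DbItem() {
      @Override
      public Session getSession() {
        return session;
      }

      @Override
      public long getNodeKey() {
        return nodeKey;
      }
    };
  }
}
```

Because `orElseGet` takes a supplier, `beginNodeTrx` is only invoked when the `Optional` is empty, so two items on the same session share one transaction in this sketch.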

bundles/sirix-query/src/main/java/io/sirix/query/json/JsonItemSequence.java

Lines changed: 113 additions & 1 deletion
@@ -1,11 +1,33 @@
 package io.sirix.query.json;

-import io.brackit.query.atomic.*;
+import io.brackit.query.atomic.Bool;
+import io.brackit.query.atomic.Dec;
+import io.brackit.query.atomic.Dbl;
+import io.brackit.query.atomic.Flt;
+import io.brackit.query.atomic.Int;
+import io.brackit.query.atomic.Int32;
+import io.brackit.query.atomic.Int64;
+import io.brackit.query.atomic.Null;
+import io.brackit.query.atomic.Numeric;
+import io.brackit.query.atomic.Str;
+import io.brackit.query.atomic.Atomic;
 import io.brackit.query.jdm.Item;
 import io.brackit.query.jdm.Sequence;
+import io.brackit.query.jdm.json.Array;
+import io.brackit.query.jdm.json.Object;
 import io.brackit.query.util.ExprUtil;
+import io.sirix.access.trx.node.json.objectvalue.ArrayValue;
+import io.sirix.access.trx.node.json.objectvalue.BooleanValue;
+import io.sirix.access.trx.node.json.objectvalue.NullValue;
+import io.sirix.access.trx.node.json.objectvalue.NumberValue;
+import io.sirix.access.trx.node.json.objectvalue.ObjectValue;
+import io.sirix.access.trx.node.json.objectvalue.StringValue;
 import io.sirix.api.json.JsonNodeTrx;
+import io.sirix.node.NodeKind;

+/**
+ * Utility class for JSON item operations.
+ */
 final class JsonItemSequence {

   void insert(Sequence value, JsonNodeTrx trx, final long nodeKey) {
@@ -94,4 +116,94 @@ void insert(Sequence value, JsonNodeTrx trx, final long nodeKey) {
       }
     }
   }
+
+  /**
+   * Replace the value at the current transaction position with a new value.
+   * Handles different node types (object values, array elements).
+   *
+   * @param wtx the write transaction positioned at the node to replace
+   * @param newValue the new value to replace with
+   * @param collection the collection for creating new items
+   */
+  static void replaceValue(JsonNodeTrx wtx, Sequence newValue, JsonDBCollection collection) {
+    final NodeKind kind = wtx.getKind();
+
+    // For object value nodes (STRING_VALUE, BOOLEAN_VALUE, etc.),
+    // we need to replace the value directly
+    if (isObjectValueNode(kind)) {
+      replaceObjectValue(wtx, newValue);
+    } else {
+      // For array elements or object/array replacement
+      replaceArrayElement(wtx, newValue);
+    }
+  }
+
+  private static boolean isObjectValueNode(NodeKind kind) {
+    return kind == NodeKind.OBJECT_STRING_VALUE
+        || kind == NodeKind.OBJECT_NUMBER_VALUE
+        || kind == NodeKind.OBJECT_BOOLEAN_VALUE
+        || kind == NodeKind.OBJECT_NULL_VALUE;
+  }
+
+  private static void replaceObjectValue(JsonNodeTrx wtx, Sequence newValue) {
+    if (newValue instanceof Array) {
+      wtx.replaceObjectRecordValue(new ArrayValue());
+      insertSubtree(newValue, wtx);
+    } else if (newValue instanceof Object) {
+      wtx.replaceObjectRecordValue(new ObjectValue());
+      insertSubtree(newValue, wtx);
+    } else if (newValue instanceof Str) {
+      wtx.replaceObjectRecordValue(new StringValue(((Str) newValue).stringValue()));
+    } else if (newValue instanceof Null) {
+      wtx.replaceObjectRecordValue(new NullValue());
+    } else if (newValue instanceof Bool) {
+      wtx.replaceObjectRecordValue(new BooleanValue(newValue.booleanValue()));
+    } else if (newValue instanceof Numeric) {
+      replaceWithNumeric(wtx, newValue);
+    }
+  }
+
+  private static void replaceWithNumeric(JsonNodeTrx wtx, Sequence newValue) {
+    switch (newValue) {
+      case Int anInt -> wtx.replaceObjectRecordValue(new NumberValue(anInt.intValue()));
+      case Int32 int32 -> wtx.replaceObjectRecordValue(new NumberValue(int32.intValue()));
+      case Int64 int64 -> wtx.replaceObjectRecordValue(new NumberValue(int64.longValue()));
+      case Flt flt -> wtx.replaceObjectRecordValue(new NumberValue(flt.floatValue()));
+      case Dbl dbl -> wtx.replaceObjectRecordValue(new NumberValue(dbl.doubleValue()));
+      case Dec dec -> wtx.replaceObjectRecordValue(new NumberValue(dec.decimalValue()));
+      default -> {
+      }
+    }
+  }
+
+  private static void replaceArrayElement(JsonNodeTrx wtx, Sequence newValue) {
+    // Delete old and insert new at same position
+    final long leftSiblingKey = wtx.getLeftSiblingKey();
+    final long parentKey = wtx.getParentKey();
+
+    wtx.remove();
+
+    if (leftSiblingKey != -1) {
+      wtx.moveTo(leftSiblingKey);
+      insertAsRightSibling(wtx, newValue);
+    } else {
+      wtx.moveTo(parentKey);
+      insertAsFirstChild(wtx, newValue);
+    }
+  }
+
+  private static void insertSubtree(Sequence value, JsonNodeTrx trx) {
+    final Item item = ExprUtil.asItem(value);
+    trx.insertSubtreeAsLastChild(item);
+  }
+
+  private static void insertAsRightSibling(JsonNodeTrx wtx, Sequence newValue) {
+    final Item item = ExprUtil.asItem(newValue);
+    wtx.insertSubtreeAsRightSibling(item);
+  }
+
+  private static void insertAsFirstChild(JsonNodeTrx wtx, Sequence newValue) {
+    final Item item = ExprUtil.asItem(newValue);
+    wtx.insertSubtreeAsFirstChild(item);
+  }
 }
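
`replaceArrayElement` keeps the position of the replaced node by remembering its left sibling and parent before removal, then re-inserting either as the right sibling of that left neighbour or as the parent's first child. The sketch below shows the same idea on a toy sibling-linked tree. The `Node` class and all helper methods are hypothetical and unrelated to the Sirix node model; the root node is assumed never to be removed.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical toy tree; not the Sirix node model. */
final class ReplaceInPlaceSketch {

  static final class Node {
    final String value;
    Node parent;
    Node firstChild;
    Node left;
    Node right;

    Node(String value) {
      this.value = value;
    }
  }

  static void insertAsFirstChild(Node parent, Node child) {
    child.parent = parent;
    child.left = null;
    child.right = parent.firstChild;
    if (parent.firstChild != null) {
      parent.firstChild.left = child;
    }
    parent.firstChild = child;
  }

  static void insertAsRightSibling(Node anchor, Node node) {
    node.parent = anchor.parent;
    node.left = anchor;
    node.right = anchor.right;
    if (anchor.right != null) {
      anchor.right.left = node;
    }
    anchor.right = node;
  }

  /** Unlinks a non-root node from its parent and siblings. */
  static void remove(Node node) {
    if (node.left != null) {
      node.left.right = node.right;
    } else {
      node.parent.firstChild = node.right;
    }
    if (node.right != null) {
      node.right.left = node.left;
    }
    node.parent = node.left = node.right = null;
  }

  /** Delete old and insert new at the same position. */
  static void replace(Node old, Node replacement) {
    // Both links must be captured before removal, since remove() clears them.
    final Node left = old.left;
    final Node parent = old.parent;
    remove(old);
    if (left != null) {
      insertAsRightSibling(left, replacement);
    } else {
      insertAsFirstChild(parent, replacement);
    }
  }

  /** Child values in document order. */
  static List<String> children(Node parent) {
    final List<String> out = new ArrayList<>();
    for (Node n = parent.firstChild; n != null; n = n.right) {
      out.add(n.value);
    }
    return out;
  }

  /** Builds a parent node whose children hold the given values in order. */
  static Node arrayOf(String... values) {
    final Node parent = new Node("[]");
    Node prev = null;
    for (String v : values) {
      final Node n = new Node(v);
      if (prev == null) {
        insertAsFirstChild(parent, n);
      } else {
        insertAsRightSibling(prev, n);
      }
      prev = n;
    }
    return parent;
  }
}
```

Capturing the neighbour links before the removal is the essential step: after `remove`, the old node no longer knows where it used to sit.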

libraries.gradle

Lines changed: 2 additions & 2 deletions
@@ -18,7 +18,7 @@ implLibraries = [
     guava : 'com.google.guava:guava:31.0.1-jre',
     guavaTestlib : 'com.google.guava:guava-testlib:31.0.1-jre',
     checkerFramework : 'org.checkerframework:checker:3.21.2',
-    brackit : 'io.sirix:brackit:0.5',
+    brackit : 'io.sirix:brackit:0.6-SNAPSHOT',
     caffeine : 'com.github.ben-manes.caffeine:caffeine:3.1.6',
     snappyJava : 'org.xerial.snappy:snappy-java:1.1.8.4',
     lz4 : 'org.lz4:lz4-java:1.8.0',
@@ -56,7 +56,7 @@ testLibraries = [
     junitPlatformRunner : 'org.junit.platform:junit-platform-runner:1.8.1',
     mockitoCore : 'org.mockito:mockito-core:5.19.0',
     byteBuddy : 'net.bytebuddy:byte-buddy:1.17.5',
-    brackit : 'io.sirix:brackit:0.4:tests',
+    brackit : 'io.sirix:brackit:0.6-SNAPSHOT:tests',
     kotlinTestJunit : 'org.jetbrains.kotlin:kotlin-test-junit:1.5.31',
     junit : 'junit:junit:4.13.2',
     vertxJunit5 : "io.vertx:vertx-junit5:$vertxVersion",
