Changes from 43 commits
Commits
49 commits
2d569f7
wip: add OpenTelemetry metrics instrumentation
PavelPashov Feb 20, 2026
364ed40
add noop metrics
PavelPashov Nov 6, 2025
2e3cb2a
fix: check if metrics are initialized in sendCommand
PavelPashov Nov 7, 2025
cd31dfa
fix(otel): optimize metrics tracking by eliminating promise chaining …
PavelPashov Nov 10, 2025
0d55b20
refactor(otel): organize metrics into specialized metric groups
PavelPashov Nov 12, 2025
67d99ca
fix(otel): revert metrics to be tracked in the sendCommand method
PavelPashov Nov 17, 2025
ce74dc1
perf(metrics): optimize command metrics with factory pattern and inli…
PavelPashov Nov 18, 2025
0eedb01
feat: Add new OTEL_ATTRIBUTES constants
PavelPashov Jan 16, 2026
a14d159
feat: Add client-side-caching metric group with 4 new metric names
PavelPashov Jan 16, 2026
46ebd09
feat: Create error categorization stub function
PavelPashov Jan 16, 2026
cfe88dc
feat: Add pool name formatting utility function
PavelPashov Jan 16, 2026
bccbdf7
feat: Refactor recordConnectionCreateTime to use closure pattern
PavelPashov Jan 16, 2026
91305c4
feat: Rename redis.client.errors.handled to redis.client.errors
PavelPashov Jan 16, 2026
28c1a1b
feat: Update IOTelResiliencyMetrics interface with internal flag and …
PavelPashov Jan 16, 2026
27a090a
feat: Add redis.client.connection.notification attribute to maintenan…
PavelPashov Jan 16, 2026
63d2bcf
feat: Add db.response.status_code attribute extraction for Redis errors
PavelPashov Jan 16, 2026
3a0d9de
feat: Wire redis.client.connection.closed metric with close reason at…
PavelPashov Jan 17, 2026
97e1539
feat: Add db.client.connection.wait_time method with closure pattern
PavelPashov Jan 17, 2026
d5bce49
feat: Add db.client.connection.use_time method with closure pattern
PavelPashov Jan 17, 2026
eda93d2
feat: Integrate wait_time and use_time metrics into connection pool
PavelPashov Jan 17, 2026
bfa15be
feat: Add CSC metric instruments to registerInstruments
PavelPashov Jan 17, 2026
c2bc36e
feat: Create IOTelClientSideCacheMetrics interface and implementation…
PavelPashov Jan 17, 2026
55aba1d
feat: Wire redis.client.csc.requests metric to cache hit/miss detection
PavelPashov Jan 17, 2026
f0a47cb
feat: Wire redis.client.csc.items metric to track cache size changes
PavelPashov Jan 17, 2026
9a08be7
feat: Wire redis.client.csc.evictions metric with eviction reason
PavelPashov Jan 17, 2026
41f0188
feat: Wire redis.client.csc.network_saved metric to estimate bytes sa…
PavelPashov Jan 17, 2026
d22a0fb
refactor: Add count parameter to recordCacheEviction to batch evictio…
PavelPashov Jan 17, 2026
996d4ee
fix: typo
PavelPashov Jan 17, 2026
c0feedb
feat: Implement redis.client.pubsub.messages metric for pub/sub publi…
PavelPashov Jan 17, 2026
406a9c6
feat: Implement redis.client.stream.produce.messages metric for strea…
PavelPashov Jan 17, 2026
d167cc9
refactor: move pub/sub and stream metrics recording to command wrapper
PavelPashov Jan 18, 2026
d377b92
refactor: align metric groups with instrumentation spec
PavelPashov Jan 19, 2026
ce23121
feat: add error classification helper and enrich metrics attributes
PavelPashov Jan 21, 2026
89aaf1b
refactor: replace command wrapper with onSuccess hook for metrics
PavelPashov Jan 24, 2026
912e27d
feat(client): add client identity tracking for OpenTelemetry metrics
PavelPashov Feb 25, 2026
ad2fd8d
feat(otel): convert metrics to observable gauges with client registry
PavelPashov Feb 16, 2026
e843d0d
feat(otel): refine metrics coverage and remove connection use_time in…
PavelPashov Feb 25, 2026
6d14d9c
refactor(otel): resolve metric attributes via clientId registry lookup
PavelPashov Feb 20, 2026
4305d30
refactor(otel): propagate clientId through runtime metrics paths
PavelPashov Feb 20, 2026
19c95b8
test(otel): expand metrics coverage and add test utilities
PavelPashov Feb 20, 2026
34cd991
test(otel): fix flaky test
PavelPashov Feb 20, 2026
ee5f2b0
feat(otel): align metric attributes/config and expand observability c…
PavelPashov Feb 25, 2026
d0f42d3
test(otel): add maintenance metrics e2e scenario with standalone FI c…
PavelPashov Feb 25, 2026
3e19cad
refactor(otel): use instrumentation scope name and stop injecting res…
PavelPashov Feb 25, 2026
baac384
fix(opentelemetry): rename stream bucket config
PavelPashov Feb 26, 2026
08c7413
docs(opentelemetry): add metrics docs/examples
PavelPashov Feb 26, 2026
f8a51c9
fix(otel): scope redirection error dedupe to cluster retry path
PavelPashov Mar 4, 2026
181d086
fix(opentelemetry): normalize db.namespace and server.port to strings…
PavelPashov Mar 4, 2026
bc14732
fix(otel): disable recordNetworkBytesSaved for CSC
PavelPashov Mar 5, 2026
92 changes: 90 additions & 2 deletions package-lock.json

1 change: 1 addition & 0 deletions packages/client/index.ts
@@ -36,3 +36,4 @@ export { GEO_REPLY_WITH, GeoReplyWith } from './lib/commands/GEOSEARCH_WITH';
export { SetOptions, CLIENT_KILL_FILTERS, FAILOVER_MODES, CLUSTER_SLOT_STATES, COMMAND_LIST_FILTER_BY, REDIS_FLUSH_MODES } from './lib/commands'

export { BasicClientSideCache, BasicPooledClientSideCache } from './lib/client/cache';
export { OpenTelemetry } from './lib/opentelemetry';
1 change: 1 addition & 0 deletions packages/client/lib/RESP/types.ts
@@ -293,6 +293,7 @@ export type Command = {
TRANSFORM_LEGACY_REPLY?: boolean;
transformReply: TransformReply | Record<RespVersions, TransformReply>;
unstableResp3?: boolean;
onSuccess?: (args: ReadonlyArray<RedisArgument>, reply: unknown, clientId: string) => void;
Review comment (Contributor Author): @nkaradzhov we might want to rename onSuccess to something more metrics/otel related

};

export type RedisCommands = Record<string, Command>;
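
To illustrate how an optional `onSuccess` hook of this shape might be consumed, here is a minimal sketch. Only the hook signature is taken from the diff above. The synchronous `runCommand` helper, the `CommandLike` interface, the simplified `RedisArgument` stand-in, and the per-client counter are all hypothetical and are not the client's actual execution path.

```typescript
// Simplified stand-in for the client's RedisArgument type (assumption).
type RedisArgument = string | Uint8Array;

// Hook signature copied from the diff; everything else here is hypothetical.
type OnSuccess = (
  args: ReadonlyArray<RedisArgument>,
  reply: unknown,
  clientId: string
) => void;

interface CommandLike {
  onSuccess?: OnSuccess;
}

// Hypothetical executor: the hook runs only if `send` returns without throwing.
function runCommand(
  command: CommandLike,
  args: ReadonlyArray<RedisArgument>,
  clientId: string,
  send: (args: ReadonlyArray<RedisArgument>) => unknown
): unknown {
  const reply = send(args);
  command.onSuccess?.(args, reply, clientId);
  return reply;
}

// Illustrative metric sink: counts successful publishes per client.
const publishedByClient = new Map<string, number>();

const publishLike: CommandLike = {
  onSuccess: (_args, _reply, clientId) => {
    publishedByClient.set(clientId, (publishedByClient.get(clientId) ?? 0) + 1);
  }
};

// Successful call: the hook fires once.
runCommand(publishLike, ["PUBLISH", "news", "hello"], "client-1", () => 1);

// Failing call: the hook is skipped because `send` throws first.
let failed = false;
try {
  runCommand(publishLike, ["PUBLISH", "news", "hello"], "client-1", () => {
    throw new Error("simulated failure");
  });
} catch {
  failed = true;
}
```

Keeping the hook on the command definition means the metric logic lives next to the command it describes, and a failed command never reaches it.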
40 changes: 39 additions & 1 deletion packages/client/lib/client/cache.ts
@@ -2,6 +2,7 @@ import { EventEmitter } from 'stream';
import RedisClient from '.';
import { RedisArgument, ReplyUnion, TransformReply, TypeMapping } from '../RESP/types';
import { BasicCommandParser } from './parser';
import { OTelMetrics, CSC_RESULT, CSC_EVICTION_REASON } from '../opentelemetry';

/**
* A snapshot of cache statistics.
@@ -484,6 +485,7 @@ export abstract class ClientSideCacheProvider extends EventEmitter {
abstract invalidate(key: RedisArgument | null): void;
abstract clear(): void;
abstract stats(): CacheStats;
abstract size(): number;
abstract onError(): void;
abstract onClose(): void;
}
@@ -551,21 +553,41 @@ export class BasicClientSideCache extends ClientSideCacheProvider {

// "2"
let cacheEntry = this.get(cacheKey);

if (cacheEntry) {
// If instanceof is "too slow", can add a "type" and then use an "as" cast to call proper getters.
if (cacheEntry instanceof ClientSideCacheEntryValue) { // "2b1"
this.#statsCounter.recordHits(1);
OTelMetrics.instance.clientSideCacheMetrics.recordCacheRequest(
CSC_RESULT.HIT,
client._clientId,
);
// Estimate bytes saved by avoiding network round-trip
// Note: JSON.stringify approximation; actual RESP wire size may differ (especially for Buffers)
const bytesEstimate = JSON.stringify(cacheEntry.value).length;

Review comment: What do you think about wrapping it with Buffer.byteLength(JSON.stringify(cacheEntry.value), 'utf8')? If the stored data isn't just ASCII, it makes sense to measure the UTF-8 byte count.

Reply (Contributor Author): That's a good catch!
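
The difference the reviewer points at can be shown with a small sketch. `String.prototype.length` counts UTF-16 code units, while `Buffer.byteLength(..., 'utf8')` counts encoded bytes, so the two estimates diverge as soon as the value contains non-ASCII characters. The helper names below are hypothetical; only `JSON.stringify` and `Buffer.byteLength` are real APIs, and neither estimate claims to match the actual RESP wire size.

```typescript
import { Buffer } from "node:buffer";

// Hypothetical helper: mirrors the estimate in the diff (UTF-16 code units).
function estimateByLength(value: unknown): number {
  return JSON.stringify(value).length;
}

// Hypothetical helper: mirrors the reviewer's suggestion (UTF-8 bytes).
function estimateByUtf8Bytes(value: unknown): number {
  return Buffer.byteLength(JSON.stringify(value), "utf8");
}

// For ASCII-only data both estimates agree.
const asciiLength = estimateByLength("hello");
const asciiBytes = estimateByUtf8Bytes("hello");

// "é" is one UTF-16 code unit but two UTF-8 bytes, so the estimates differ by one.
const accentedLength = estimateByLength("héllo");
const accentedBytes = estimateByUtf8Bytes("héllo");
```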

OTelMetrics.instance.clientSideCacheMetrics.recordNetworkBytesSaved(
bytesEstimate,
client._clientId,
);

return structuredClone(cacheEntry.value);
} else if (cacheEntry instanceof ClientSideCacheEntryPromise) { // 2b2
// This counts as a miss since the value hasn't been fully loaded yet.
this.#statsCounter.recordMisses(1);
OTelMetrics.instance.clientSideCacheMetrics.recordCacheRequest(
CSC_RESULT.MISS,
client._clientId,
);
reply = await cacheEntry.promise;
} else {
throw new Error("unknown cache entry type");
}
} else { // 3/3a
this.#statsCounter.recordMisses(1);
OTelMetrics.instance.clientSideCacheMetrics.recordCacheRequest(
CSC_RESULT.MISS,
client._clientId,
);

const startTime = performance.now();
const promise = fn();
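
The hit/miss rule in this hunk (a stored value is a hit; an in-flight promise or an absent entry is a miss) can be sketched in isolation. The discriminated-union entry shape, `classifyLookup`, and the `tally` object are hypothetical illustrations and not the cache's real types; the union is one way to realize the "add a type" idea from the `instanceof` comment above.

```typescript
// Hypothetical entry shape: a discriminated union instead of instanceof checks.
type CacheEntry<T> =
  | { kind: "value"; value: T }
  | { kind: "pending"; promise: Promise<T> };

type CacheResult = "hit" | "miss";

// Only a fully loaded value counts as a hit; pending and absent entries are misses.
function classifyLookup<T>(entry: CacheEntry<T> | undefined): CacheResult {
  return entry !== undefined && entry.kind === "value" ? "hit" : "miss";
}

// Illustrative stand-in for a request counter keyed by result.
const tally: Record<CacheResult, number> = { hit: 0, miss: 0 };

const lookups: Array<CacheEntry<number> | undefined> = [
  { kind: "value", value: 42 },
  { kind: "pending", promise: Promise.resolve(42) },
  undefined
];

for (const entry of lookups) {
  tally[classifyLookup(entry)] += 1;
}
```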
@@ -616,22 +638,34 @@ export class BasicClientSideCache extends ClientSideCacheProvider {

override invalidate(key: RedisArgument | null) {
if (key === null) {
// Server requested to invalidate all keys
const oldSize = this.size();
this.clear(false);
// Record invalidations as server-initiated evictions
if (oldSize > 0) {
OTelMetrics.instance.clientSideCacheMetrics.recordCacheEviction(CSC_EVICTION_REASON.INVALIDATION, oldSize);
Review comment (Low Severity): Cache eviction metrics missing clientId for attribute resolution

All recordCacheEviction calls omit clientId, so resolveClientAttributes returns undefined and the resulting eviction metrics lack server.address, server.port, and db.client.connection.pool.name attributes. In contrast, recordCacheRequest and recordNetworkBytesSaved in the same class do pass client._clientId, making CSC eviction metrics inconsistent with other CSC metrics and impossible to correlate by server.

Additional Locations (2)

}
this.emit("invalidate", key);

return;
}

const keySet = this.#keyToCacheKeySetMap.get(key.toString());
if (keySet) {
let deletedCount = 0;
for (const cacheKey of keySet) {
const entry = this.#cacheKeyToEntryMap.get(cacheKey);
if (entry) {
entry.invalidate();
deletedCount++;
}
this.#cacheKeyToEntryMap.delete(cacheKey);
}
this.#keyToCacheKeySetMap.delete(key.toString());
if (deletedCount > 0) {
// Record invalidations as server-initiated evictions
OTelMetrics.instance.clientSideCacheMetrics.recordCacheEviction(CSC_EVICTION_REASON.INVALIDATION, deletedCount);
}
}

this.emit('invalidate', key);
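
The review note about `recordCacheEviction` omitting `clientId` can be illustrated with a rough sketch of registry-based attribute resolution. Everything here is an assumption made for illustration: the registry contents, the `reason` attribute key, the `"invalidation"` string, and in particular the trailing `clientId` parameter, which is only one possible shape for a fix. The diff itself shows the call with just a reason and a count.

```typescript
type Attributes = Record<string, string>;

// Hypothetical registry keyed by clientId; the values are made up for illustration.
const clientRegistry = new Map<string, Attributes>([
  ["client-1", {
    "server.address": "localhost",
    "server.port": "6379",
    "db.client.connection.pool.name": "localhost:6379"
  }]
]);

// Without a clientId there is nothing to look up, so no attributes are resolved.
function resolveClientAttributes(clientId?: string): Attributes | undefined {
  return clientId === undefined ? undefined : clientRegistry.get(clientId);
}

const recorded: Array<{ count: number; attributes: Attributes }> = [];

// Assumed signature: the optional third parameter is a guess at one possible fix.
function recordCacheEviction(reason: string, count = 1, clientId?: string): void {
  recorded.push({
    count,
    // Spreading undefined adds no keys, which is how attributes go missing.
    attributes: { reason, ...resolveClientAttributes(clientId) }
  });
}

// Call without clientId: only the reason attribute is present.
recordCacheEviction("invalidation", 3);

// Call with clientId: server and pool attributes are attached as well.
recordCacheEviction("invalidation", 3, "client-1");
```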
@@ -660,6 +694,8 @@ export class BasicClientSideCache extends ClientSideCacheProvider {
if (val && !val.validate()) {
this.delete(cacheKey);
this.#statsCounter.recordEvictions(1);
// Entry failed validation - this is TTL expiry since invalidation marks are handled separately
OTelMetrics.instance.clientSideCacheMetrics.recordCacheEviction(CSC_EVICTION_REASON.TTL);
this.emit("cache-evict", cacheKey);

return undefined;
@@ -690,13 +726,15 @@ export class BasicClientSideCache extends ClientSideCacheProvider {
const oldEntry = this.#cacheKeyToEntryMap.get(cacheKey);

if (oldEntry) {
- count--; // overwriting, so not incrementig
+ count--; // overwriting, so not incrementing
oldEntry.invalidate();
}

if (this.maxEntries > 0 && count >= this.maxEntries) {
this.deleteOldest();
this.#statsCounter.recordEvictions(1);
// Eviction due to cache capacity limit
OTelMetrics.instance.clientSideCacheMetrics.recordCacheEviction(CSC_EVICTION_REASON.FULL);
}

this.#cacheKeyToEntryMap.set(cacheKey, cacheEntry);
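
The capacity check in this hunk (an overwrite does not count toward the limit, and the oldest entry is dropped once the limit is reached) can be sketched with a plain `Map`, which iterates in insertion order. `BoundedCache` and its `fullEvictions` counter are hypothetical stand-ins, not the real `BasicClientSideCache` or its metrics calls.

```typescript
// Hypothetical bounded cache; a counter stands in for the eviction metric.
class BoundedCache<V> {
  private readonly entries = new Map<string, V>();
  private readonly maxEntries: number;
  fullEvictions = 0;

  constructor(maxEntries: number) {
    this.maxEntries = maxEntries;
  }

  set(key: string, value: V): void {
    let count = this.entries.size;

    if (this.entries.has(key)) {
      count--; // overwriting, so the entry count does not grow
      this.entries.delete(key); // re-inserted below as the newest entry
    }

    if (this.maxEntries > 0 && count >= this.maxEntries) {
      // Map iteration follows insertion order, so the first key is the oldest.
      const oldest = this.entries.keys().next().value;
      if (oldest !== undefined) {
        this.entries.delete(oldest);
        this.fullEvictions++; // eviction due to the capacity limit
      }
    }

    this.entries.set(key, value);
  }

  has(key: string): boolean {
    return this.entries.has(key);
  }

  size(): number {
    return this.entries.size;
  }
}

const cache = new BoundedCache<number>(2);
cache.set("a", 1);
cache.set("b", 2);
cache.set("a", 3); // overwrite: no eviction
cache.set("c", 4); // at capacity: "b" is now the oldest and is evicted
```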