MAINT: add bytes metrics into opensearch source #3646

chenqi0805 · 2023-11-13T20:17:00Z

Description

This PR adds the bytesReceived and bytesProcessed metrics to the opensearch source.

Issues Resolved

Resolves #[Issue number to be closed when this PR is merged]

Check List

- New functionality includes testing.
- New functionality has a documentation issue. Please link to it in this PR.
- New functionality has javadoc added
- Commits are signed with a real name per the DCO

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

Signed-off-by: George Chen <[email protected]>

dlvenable · 2023-11-13T20:31:18Z

...rch/src/main/java/org/opensearch/dataprepper/plugins/source/opensearch/worker/PitWorker.java

@@ -191,11 +195,14 @@ private void processIndex(final SourcePartition<OpenSearchIndexProgressState> op

            searchWithSearchAfterResults.getDocuments().stream().map(Record::new).forEach(record -> {
                try {
+                    final long documentBytes = objectMapper.writeValueAsBytes(record.getData().getJsonNode()).length;


Writing this to bytes is going to add a performance hit.

Also, there is a possibility that this will be somewhat different than the input since we are looking at the Event here rather than the actual JSON document.

The JsonNode in the event is essentially the SearchResults/hit/source JsonNode. That document unit is the same for bytesReceived and bytesProcessed, so users can compare the two. I am open to alternative units.

I see that we set it to the document _source here:

data-prepper/data-prepper-plugins/opensearch/src/main/java/org/opensearch/dataprepper/plugins/source/opensearch/worker/client/OpenSearchAccessor.java

Line 296 in b80b565

.withData(hit.source())

That may or may not be the best metric, but I think it can work.

I do not have a better solution for the performance hit yet, since the SDK client returns ObjectNode...
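To make the unit under discussion concrete, here is a minimal, self-contained sketch of measuring each document once and recording that same size for both metrics. It is not the code from this PR: `RecordingSummary`, `BytesMetricsSketch`, and `processDocuments` are hypothetical names. Documents are plain JSON strings here rather than JsonNode instances serialized with ObjectMapper, and the buffer is a simple `Consumer`. The choice to record bytesProcessed only after a successful write is also an assumption.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

// Hypothetical stand-in for a metrics summary: it only remembers what was recorded.
final class RecordingSummary {
    private final List<Long> values = new ArrayList<>();

    void record(final long amount) {
        values.add(amount);
    }

    long total() {
        long sum = 0;
        for (final long value : values) {
            sum += value;
        }
        return sum;
    }

    int count() {
        return values.size();
    }
}

final class BytesMetricsSketch {
    // Assumption: the document is already a JSON string, so its size is its UTF-8 length.
    static long sizeInBytes(final String json) {
        return json.getBytes(StandardCharsets.UTF_8).length;
    }

    // Measures each document once and uses the same number for both metrics,
    // so the two totals stay directly comparable.
    static void processDocuments(final List<String> documents,
                                 final Consumer<String> buffer,
                                 final RecordingSummary bytesReceived,
                                 final RecordingSummary bytesProcessed) {
        for (final String document : documents) {
            final long documentBytes = sizeInBytes(document);
            bytesReceived.record(documentBytes);
            try {
                buffer.accept(document);
                bytesProcessed.record(documentBytes);
            } catch (final RuntimeException e) {
                // The write failed, so the document counts as received but not processed.
            }
        }
    }

    public static void main(final String[] args) {
        final RecordingSummary received = new RecordingSummary();
        final RecordingSummary processed = new RecordingSummary();
        final List<String> written = new ArrayList<>();
        processDocuments(List.of("{\"a\":1}", "{\"b\":\"\u00e9\"}"), written::add, received, processed);
        System.out.println("bytesReceived=" + received.total() + " bytesProcessed=" + processed.total());
    }
}
```

Because the size is computed once per document, any gap between the two totals reflects documents that failed to reach the buffer rather than a difference in how the bytes were counted.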

dlvenable · 2023-11-13T20:36:57Z

...src/test/java/org/opensearch/dataprepper/plugins/source/opensearch/worker/PitWorkerTest.java

@@ -381,7 +419,9 @@ void run_with_getNextPartition_with_valid_existing_point_in_time_does_not_create
        verify(sourceCoordinator, times(0)).saveProgressStateForPartition(eq(partitionKey), eq(openSearchIndexProgressState));
        verify(sourceCoordinator, times(0)).updatePartitionForAcknowledgmentWait(anyString(), any(Duration.class));

+        verify(bytesReceivedSummary, times(3)).record(0L);


This doesn't verify the behavior sufficiently. I think we should mock ObjectMapper to return a byte[] of sizes that we can check against.

e.g.

int expectedDataSize1 = 10;
int expectedDataSize2 = 20;
...
when(objectMapper.writeValueAsBytes(testData1)).thenReturn(new byte[expectedDataSize1]);
...
verify(bytesReceivedSummary).record(expectedDataSize1);
verify(bytesReceivedSummary).record(expectedDataSize2);
...
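The suggestion above relies on Mockito stubbing. The same idea can be sketched with hand-written fakes so that it runs without any test library. Everything here is hypothetical and for illustration only: `ByteSerializer` stands in for the `writeValueAsBytes` call, `StubSerializer` returns arrays of predetermined lengths, and `SizeRecorder` plays the role of the summary being verified.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical seam standing in for the serializer's writeValueAsBytes call.
interface ByteSerializer {
    byte[] writeValueAsBytes(Object value);
}

// Stub that returns a byte[] of a predetermined length for each known input.
final class StubSerializer implements ByteSerializer {
    private final Map<Object, Integer> sizes = new HashMap<>();

    StubSerializer willReturnSize(final Object value, final int size) {
        sizes.put(value, size);
        return this;
    }

    @Override
    public byte[] writeValueAsBytes(final Object value) {
        // Unknown inputs serialize to an empty array.
        return new byte[sizes.getOrDefault(value, 0)];
    }
}

// Hypothetical stand-in for the summary; it keeps every recorded value in order.
final class SizeRecorder {
    final List<Long> recorded = new ArrayList<>();

    void record(final long amount) {
        recorded.add(amount);
    }
}

final class StubbedSizeCheck {
    // Code under test: records the serialized length of each document.
    static void recordSizes(final List<?> documents,
                            final ByteSerializer serializer,
                            final SizeRecorder bytesReceived) {
        for (final Object document : documents) {
            bytesReceived.record(serializer.writeValueAsBytes(document).length);
        }
    }

    public static void main(final String[] args) {
        final StubSerializer serializer = new StubSerializer()
                .willReturnSize("doc-1", 10)
                .willReturnSize("doc-2", 20);
        final SizeRecorder recorder = new SizeRecorder();
        recordSizes(List.of("doc-1", "doc-2"), serializer, recorder);
        System.out.println(recorder.recorded);
    }
}
```

Using distinct, non-zero sizes per document means the check fails if the wrong document is measured or a document is skipped, which a blanket `record(0L)` verification cannot detect.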

MAINT: add bytes metrics

81abc29

Signed-off-by: George Chen <[email protected]>

chenqi0805 requested review from engechas, graytaylor0, dinujoh, kkondaka, asifsmohammed, dlvenable and oeyh as code owners November 13, 2023 20:17

dlvenable reviewed Nov 13, 2023

MAINT: test cases

de980f1

Signed-off-by: George Chen <[email protected]>

chenqi0805 requested a review from dlvenable November 13, 2023 22:34

dlvenable approved these changes Nov 13, 2023

kkondaka approved these changes Nov 13, 2023

chenqi0805 merged commit abfe319 into opensearch-project:main Nov 13, 2023
62 of 64 checks passed

chenqi0805 deleted the maint/opensearch-source-bytes-metrics branch November 13, 2023 22:54
