Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Enhancements to map_to_list processor #4033

Merged
merged 11 commits into from
Feb 14, 2024
Original file line number Diff line number Diff line change
Expand Up @@ -5,6 +5,8 @@

package org.opensearch.dataprepper.plugins.processor.mutateevent;

import com.fasterxml.jackson.core.JsonProcessingException;
import com.fasterxml.jackson.databind.ObjectMapper;
import org.opensearch.dataprepper.expression.ExpressionEvaluator;
import org.opensearch.dataprepper.metrics.PluginMetrics;
import org.opensearch.dataprepper.model.annotations.DataPrepperPlugin;
Expand All @@ -27,6 +29,7 @@
@DataPrepperPlugin(name = "map_to_list", pluginType = Processor.class, pluginConfigurationType = MapToListProcessorConfig.class)
public class MapToListProcessor extends AbstractProcessor<Record<Event>, Record<Event>> {
private static final Logger LOG = LoggerFactory.getLogger(MapToListProcessor.class);
private static final ObjectMapper OBJECT_MAPPER = new ObjectMapper();
private final MapToListProcessorConfig config;
private final ExpressionEvaluator expressionEvaluator;
private final Set<String> excludeKeySet = new HashSet<>();
Expand All @@ -49,36 +52,75 @@ public Collection<Record<Event>> doExecute(final Collection<Record<Event>> recor
}

try {
final Map<String, Object> sourceMap = recordEvent.get(config.getSource(), Map.class);
final List<Map<String, Object>> targetList = new ArrayList<>();

Map<String, Object> modifiedSourceMap = new HashMap<>();
for (final Map.Entry<String, Object> entry : sourceMap.entrySet()) {
if (excludeKeySet.contains(entry.getKey())) {
if (config.getRemoveProcessedFields()) {
modifiedSourceMap.put(entry.getKey(), entry.getValue());
final Map<String, Object> sourceMap = getSourceMap(recordEvent);

if (config.getConvertFieldToList()) {
final List<List<Object>> targetNestedList = new ArrayList<>();

for (final Map.Entry<String, Object> entry : sourceMap.entrySet()) {
if (!excludeKeySet.contains(entry.getKey())) {
targetNestedList.add(List.of(entry.getKey(), entry.getValue()));
}
continue;
}
targetList.add(Map.of(
config.getKeyName(), entry.getKey(),
config.getValueName(), entry.getValue()
));
}

if (config.getRemoveProcessedFields()) {
recordEvent.put(config.getSource(), modifiedSourceMap);
}
removeProcessedFields(sourceMap, recordEvent);
recordEvent.put(config.getTarget(), targetNestedList);
} else {
final List<Map<String, Object>> targetList = new ArrayList<>();
for (final Map.Entry<String, Object> entry : sourceMap.entrySet()) {
if (!excludeKeySet.contains(entry.getKey())) {
targetList.add(Map.of(
config.getKeyName(), entry.getKey(),
config.getValueName(), entry.getValue()
));
}
}
removeProcessedFields(sourceMap, recordEvent);
recordEvent.put(config.getTarget(), targetList);
}

recordEvent.put(config.getTarget(), targetList);
} catch (Exception e) {
LOG.error("Fail to perform Map to List operation", e);
//TODO: add tagging on failure
recordEvent.getMetadata().addTags(config.getTagsOnFailure());
}
}
return records;
}

private Map<String, Object> getSourceMap(Event recordEvent) throws JsonProcessingException {
final Map<String, Object> sourceMap;
if (config.getSource() == null) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this a desirable feature?

If so, can we make it take in / instead of null? I think this could lead to confusion when users don't set the source field they are probably not getting what they want.

Copy link
Collaborator Author

@oeyh oeyh Feb 2, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, we will need it for #3965.

If so, can we make it take in / instead of null? I think this could lead to confusion when users don't set the source field they are probably not getting what they want.

In JsonPointer specification, it is actually the empty string "" that refers to the whole document and "/" refers to the field with "" as key. If we remove the restrictions in JacksonEvent that key cannot be empty string, we should already support referring to the root document with Json Pointer "". What do you think of doing it this way?

Copy link
Collaborator Author

@oeyh oeyh Feb 5, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Did some testing if we use "/" as source, it will try to get the value with empty string as key.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@oeyh , That is a very useful finding.

I think we will need to make the syntax be explicit then:

source: ''

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Implemented this in recent commits.

// Source is root
sourceMap = OBJECT_MAPPER.treeToValue(recordEvent.getJsonNode(), Map.class);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have a high level question here, if you get the entire JSON node if source is null, I think it's not going to work. Correct me if I am wrong.
For example, if the following is the event.

{
  "my-map": {
    "key1": "value1",
    "key2": "value2",
    "key3": "value3"
  }
}

The result will be

[
  ["my-map", {
    "key1": "value1",
    "key2": "value2",
    "key3": "value3"
  }]
]

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point. I think adding test cases and documentation for euch cases would help.

Copy link
Collaborator Author

@oeyh oeyh Feb 2, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@asifsmohammed Good question, though it isn't necessarily related to root being the source. I think it's an issue with nested map in the source map. Currently we assume the source map structure is flat, so what you show is the expected result. We can have a recursive option to go deeper into each field in the future if needed.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added readme and more test cases.

} else {
sourceMap = recordEvent.get(config.getSource(), Map.class);
}
return sourceMap;
}

private void removeProcessedFields(Map<String, Object> sourceMap, Event recordEvent) {
if (!config.getRemoveProcessedFields()) {
return;
}

if (config.getSource() == null) {
// Source is root
for (final Map.Entry<String, Object> entry : sourceMap.entrySet()) {
if (excludeKeySet.contains(entry.getKey())) {
continue;
}
recordEvent.delete(entry.getKey());
}
} else {
Map<String, Object> modifiedSourceMap = new HashMap<>();
for (final Map.Entry<String, Object> entry : sourceMap.entrySet()) {
if (excludeKeySet.contains(entry.getKey())) {
modifiedSourceMap.put(entry.getKey(), entry.getValue());
}
}
recordEvent.put(config.getSource(), modifiedSourceMap);
}
}

@Override
public void prepareForShutdown() {
}
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -18,8 +18,6 @@ public class MapToListProcessorConfig {
private static final List<String> DEFAULT_EXCLUDE_KEYS = new ArrayList<>();
private static final boolean DEFAULT_REMOVE_PROCESSED_FIELDS = false;

@NotEmpty
@NotNull
@JsonProperty("source")
private String source;

Expand All @@ -43,6 +41,12 @@ public class MapToListProcessorConfig {
@JsonProperty("remove_processed_fields")
private boolean removeProcessedFields = DEFAULT_REMOVE_PROCESSED_FIELDS;

@JsonProperty("convert_field_to_list")
private boolean convertFieldToList = false;

@JsonProperty("tags_on_failure")
private List<String> tagsOnFailure;

public String getSource() {
return source;
}
Expand Down Expand Up @@ -70,4 +74,12 @@ public List<String> getExcludeKeys() {
public boolean getRemoveProcessedFields() {
return removeProcessedFields;
}

public boolean getConvertFieldToList() {
return convertFieldToList;
}

public List<String> getTagsOnFailure() {
return tagsOnFailure;
}
}
Original file line number Diff line number Diff line change
Expand Up @@ -18,6 +18,7 @@

import java.util.ArrayList;
import java.util.Collections;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.UUID;
Expand Down Expand Up @@ -48,6 +49,8 @@ void setUp() {
lenient().when(mockConfig.getMapToListWhen()).thenReturn(null);
lenient().when(mockConfig.getExcludeKeys()).thenReturn(new ArrayList<>());
lenient().when(mockConfig.getRemoveProcessedFields()).thenReturn(false);
lenient().when(mockConfig.getConvertFieldToList()).thenReturn(false);
lenient().when(mockConfig.getTagsOnFailure()).thenReturn(new ArrayList<>());
}

@Test
Expand All @@ -72,6 +75,28 @@ void testMapToListSuccessWithDefaultOptions() {
assertSourceMapUnchanged(resultEvent);
}

@Test
void testMapToListSuccessWithRootAsSource() {
when(mockConfig.getSource()).thenReturn(null);

final MapToListProcessor processor = createObjectUnderTest();
final Record<Event> testRecord = createFlatTestRecord();
final List<Record<Event>> resultRecord = (List<Record<Event>>) processor.doExecute(Collections.singletonList(testRecord));

assertThat(resultRecord.size(), is(1));

final Event resultEvent = resultRecord.get(0).getData();
List<Map<String, Object>> resultList = resultEvent.get("my-list", List.class);

assertThat(resultList.size(), is(3));
assertThat(resultList, containsInAnyOrder(
Map.of("key", "key1", "value", "value1"),
Map.of("key", "key2", "value", "value2"),
Map.of("key", "key3", "value", "value3")
));
assertSourceMapUnchangedForFlatRecord(resultEvent);
}

@Test
void testMapToListSuccessWithCustomKeyNameValueName() {
final String keyName = "custom-key-name";
Expand Down Expand Up @@ -130,6 +155,25 @@ void testExcludedKeysAreNotProcessed() {
assertSourceMapUnchanged(resultEvent);
}

@Test
void testExcludedKeysAreNotProcessedWithRootAsSource() {
when(mockConfig.getSource()).thenReturn(null);
when(mockConfig.getExcludeKeys()).thenReturn(List.of("key1", "key3", "key5"));

final MapToListProcessor processor = createObjectUnderTest();
final Record<Event> testRecord = createFlatTestRecord();
final List<Record<Event>> resultRecord = (List<Record<Event>>) processor.doExecute(Collections.singletonList(testRecord));

assertThat(resultRecord.size(), is(1));

final Event resultEvent = resultRecord.get(0).getData();
List<Map<String, Object>> resultList = resultEvent.get("my-list", List.class);

assertThat(resultList.size(), is(1));
assertThat(resultList.get(0), is(Map.of("key", "key2", "value", "value2")));
assertSourceMapUnchangedForFlatRecord(resultEvent);
}

@Test
void testRemoveProcessedFields() {
when(mockConfig.getExcludeKeys()).thenReturn(List.of("key1", "key3", "key5"));
Expand All @@ -155,6 +199,76 @@ void testRemoveProcessedFields() {
assertThat(resultEvent.get("my-map/key3", String.class), is("value3"));
}

@Test
void testRemoveProcessedFieldsWithRootAsSource() {
when(mockConfig.getSource()).thenReturn(null);
when(mockConfig.getExcludeKeys()).thenReturn(List.of("key1", "key3", "key5"));
when(mockConfig.getRemoveProcessedFields()).thenReturn(true);

final MapToListProcessor processor = createObjectUnderTest();
final Record<Event> testRecord = createFlatTestRecord();
final List<Record<Event>> resultRecord = (List<Record<Event>>) processor.doExecute(Collections.singletonList(testRecord));

assertThat(resultRecord.size(), is(1));

final Event resultEvent = resultRecord.get(0).getData();
List<Map<String, Object>> resultList = resultEvent.get("my-list", List.class);

assertThat(resultList.size(), is(1));
assertThat(resultList.get(0), is(Map.of("key", "key2", "value", "value2")));

assertThat(resultEvent.containsKey("key1"), is(true));
assertThat(resultEvent.get("key1", String.class), is("value1"));
assertThat(resultEvent.containsKey("key2"), is(false));
assertThat(resultEvent.containsKey("key3"), is(true));
assertThat(resultEvent.get("key3", String.class), is("value3"));
}

@Test
public void testConvertFieldToListSuccess() {
when(mockConfig.getConvertFieldToList()).thenReturn(true);

final MapToListProcessor processor = createObjectUnderTest();
final Record<Event> testRecord = createTestRecord();
final List<Record<Event>> resultRecord = (List<Record<Event>>) processor.doExecute(Collections.singletonList(testRecord));

assertThat(resultRecord.size(), is(1));

final Event resultEvent = resultRecord.get(0).getData();
List<List<Object>> resultList = resultEvent.get("my-list", List.class);

assertThat(resultList.size(), is(3));
assertThat(resultList, containsInAnyOrder(
List.of("key1", "value1"),
List.of("key2", "value2"),
List.of("key3", "value3")
));
assertSourceMapUnchanged(resultEvent);
}

@Test
public void testConvertFieldToListSuccessWithRootAsSource() {
when(mockConfig.getSource()).thenReturn(null);
when(mockConfig.getConvertFieldToList()).thenReturn(true);

final MapToListProcessor processor = createObjectUnderTest();
final Record<Event> testRecord = createFlatTestRecord();
final List<Record<Event>> resultRecord = (List<Record<Event>>) processor.doExecute(Collections.singletonList(testRecord));

assertThat(resultRecord.size(), is(1));

final Event resultEvent = resultRecord.get(0).getData();
List<List<Object>> resultList = resultEvent.get("my-list", List.class);

assertThat(resultList.size(), is(3));
assertThat(resultList, containsInAnyOrder(
List.of("key1", "value1"),
List.of("key2", "value2"),
List.of("key3", "value3")
));
assertSourceMapUnchangedForFlatRecord(resultEvent);
}

@Test
public void testEventNotProcessedWhenTheWhenConditionIsFalse() {
final String whenCondition = UUID.randomUUID().toString();
Expand All @@ -172,6 +286,25 @@ public void testEventNotProcessedWhenTheWhenConditionIsFalse() {
assertSourceMapUnchanged(resultEvent);
}

@Test
void testFailureTagsAreAddedWhenException() {
// non-existing source key
when(mockConfig.getSource()).thenReturn("my-other-map");
final List<String> testTags = List.of("tag1", "tag2");
when(mockConfig.getTagsOnFailure()).thenReturn(testTags);

final MapToListProcessor processor = createObjectUnderTest();
final Record<Event> testRecord = createTestRecord();
final List<Record<Event>> resultRecord = (List<Record<Event>>) processor.doExecute(Collections.singletonList(testRecord));

assertThat(resultRecord.size(), is(1));

final Event resultEvent = resultRecord.get(0).getData();
assertThat(resultEvent.containsKey("my-list"), is(false));
assertSourceMapUnchanged(resultEvent);
assertThat(resultEvent.getMetadata().getTags(), is(new HashSet<>(testTags)));
}

private MapToListProcessor createObjectUnderTest() {
return new MapToListProcessor(pluginMetrics, mockConfig, expressionEvaluator);
}
Expand All @@ -188,10 +321,28 @@ private Record<Event> createTestRecord() {
return new Record<>(event);
}

private Record<Event> createFlatTestRecord() {
final Map<String, String> data = Map.of(
"key1", "value1",
"key2", "value2",
"key3", "value3");
final Event event = JacksonEvent.builder()
.withData(data)
.withEventType("event")
.build();
return new Record<>(event);
}

private void assertSourceMapUnchanged(final Event resultEvent) {
assertThat(resultEvent.containsKey("my-map"), is(true));
assertThat(resultEvent.get("my-map/key1", String.class), is("value1"));
assertThat(resultEvent.get("my-map/key2", String.class), is("value2"));
assertThat(resultEvent.get("my-map/key3", String.class), is("value3"));
}

private void assertSourceMapUnchangedForFlatRecord(final Event resultEvent) {
assertThat(resultEvent.get("key1", String.class), is("value1"));
assertThat(resultEvent.get("key2", String.class), is("value2"));
assertThat(resultEvent.get("key3", String.class), is("value3"));
}
}
Loading