PPL: Add json_extract function #3262

Open
wants to merge 79 commits into
base: main
70152eb
added implementation
14yapkc1 Jan 3, 2025
76c3995
added doctest, integ-tests, and unit tests
14yapkc1 Jan 6, 2025
ce2c551
addressed PR comments
kenrickyap Jan 6, 2025
ad1bde3
fixed unit tests
kenrickyap Jan 7, 2025
ccf47a2
addressed pr comments
kenrickyap Jan 7, 2025
acc76a0
addressed PR comments
kenrickyap Jan 7, 2025
519c6f2
removed unused dependencies
kenrickyap Jan 7, 2025
2e319fe
linting
kenrickyap Jan 7, 2025
ee0820d
addressed pr comment and rolling back disabled test case
kenrickyap Jan 8, 2025
d44fc5a
Merge branch 'main' into feature/json-valid
kenrickyap Jan 8, 2025
3407d4a
removed disabled import
kenrickyap Jan 9, 2025
7ef6cc9
Update docs/user/ppl/functions/json.rst
kenrickyap Jan 9, 2025
e5e90ac
Update integ-test/src/test/java/org/opensearch/sql/ppl/JsonFunctionIT…
kenrickyap Jan 9, 2025
2187a5a
nit
kenrickyap Jan 9, 2025
5e1e488
Merge branch 'feature/json-valid' of https://github.com/Bit-Quill/ope…
kenrickyap Jan 9, 2025
3512b33
fixed integ test
kenrickyap Jan 9, 2025
9fea606
change text type to keyword
kenrickyap Jan 9, 2025
fbc54bc
addressed PR comments
kenrickyap Jan 10, 2025
31ad2a4
fix doc-test
kenrickyap Jan 11, 2025
2b2a8f3
added null test
kenrickyap Jan 14, 2025
dc96563
Merge branch 'main' into feature/json-valid
acarbonetto Jan 15, 2025
1913bfe
SQL: adding error case unit tests for json_valid
acarbonetto Jan 15, 2025
67d979d
json_valid: null and missing should return false
acarbonetto Jan 15, 2025
aa6b723
PPL: Add json and cast to json functions
acarbonetto Jan 8, 2025
4c99235
PPL: Update json cast for review
acarbonetto Jan 8, 2025
9ccde7f
Fix testes
acarbonetto Jan 9, 2025
4306bf3
spotless
acarbonetto Jan 9, 2025
613137b
Fix tests
acarbonetto Jan 14, 2025
ab28872
SPOTLESS
acarbonetto Jan 14, 2025
3ec16e0
Clean up for merge
acarbonetto Jan 15, 2025
6dbf37b
added implementation
14yapkc1 Jan 3, 2025
b8c6d68
added doctest, integ-tests, and unit tests
14yapkc1 Jan 6, 2025
afb668c
addressed pr comments
kenrickyap Jan 7, 2025
54ef183
addressed PR comments
kenrickyap Jan 7, 2025
d841394
removed unused dependencies
kenrickyap Jan 7, 2025
25fb527
linting
kenrickyap Jan 7, 2025
4a20d08
addressed pr comment and rolling back disabled test case
kenrickyap Jan 8, 2025
fdc4729
removed disabled import
kenrickyap Jan 9, 2025
707a0b9
nit
kenrickyap Jan 9, 2025
4f28211
Update integ-test/src/test/java/org/opensearch/sql/ppl/JsonFunctionIT…
kenrickyap Jan 9, 2025
9ec6335
fixed integ test
kenrickyap Jan 9, 2025
3324e66
SQL: adding error case unit tests for json_valid
acarbonetto Jan 15, 2025
7123c35
json_valid: null and missing should return false
acarbonetto Jan 15, 2025
dbca991
PPL: Add json and cast to json functions
acarbonetto Jan 8, 2025
7df87cb
PPL: Update json cast for review
acarbonetto Jan 8, 2025
cd45fcc
Fix testes
acarbonetto Jan 9, 2025
6f5dc07
spotless
acarbonetto Jan 9, 2025
0aae36e
Fix tests
acarbonetto Jan 14, 2025
b225f28
SPOTLESS
acarbonetto Jan 14, 2025
78af4f8
Clean up for merge
acarbonetto Jan 15, 2025
b84282a
clean up unit tests
acarbonetto Jan 15, 2025
1e23286
Add casting from undefined
acarbonetto Jan 15, 2025
343f5a2
Add cast to scalar from undefined expression
acarbonetto Jan 16, 2025
e8b6df3
Add test for missing/null
acarbonetto Jan 16, 2025
ab9be75
Clean up merge conflicts
acarbonetto Jan 17, 2025
788be9d
Fix jacoco coverage
acarbonetto Jan 17, 2025
a9721bf
Move to Switch by json type
acarbonetto Jan 17, 2025
daa95ff
Merge branch 'main' into feature/acarbo_json_cast_ppl
acarbonetto Jan 20, 2025
018e462
functionality implemented
kenrickyap Jan 20, 2025
c6c6cc1
Remove conflicted files
acarbonetto Jan 21, 2025
a5652ea
Add doctext row
acarbonetto Jan 21, 2025
2cd10a2
added integ-test and doc test
kenrickyap Jan 22, 2025
cd78ddd
fixed integ tests
kenrickyap Jan 22, 2025
afb385f
unit tests
kenrickyap Jan 23, 2025
0e91b2e
Merge branch 'main' into feature/json-extract
kenrickyap Jan 23, 2025
794db8a
finnished unit tests
kenrickyap Jan 23, 2025
0f0b8d4
update doctest
kenrickyap Jan 23, 2025
f030057
addessed comments
kenrickyap Jan 27, 2025
2b08007
added addition edge cases for unit tests
kenrickyap Jan 28, 2025
be52786
Merge branch 'feature/acarbo_json_cast_ppl' into feature/json-extract
kenrickyap Jan 28, 2025
6bd2f40
Merge branch 'feature/acarbo_json_cast_ppl' into feature/json-extract
kenrickyap Jan 28, 2025
0b9e9e4
Merge branch 'feature/json-extract' of https://github.com/Bit-Quill/o…
kenrickyap Jan 28, 2025
6678be4
addressed PR comments
kenrickyap Jan 29, 2025
e57fa21
fix code coverage
14yapkc1 Jan 30, 2025
112be65
Update core/src/test/java/org/opensearch/sql/expression/json/JsonFunc…
kenrickyap Jan 30, 2025
306ac97
address comments
kenrickyap Jan 30, 2025
75e9cc3
fix build error
kenrickyap Jan 30, 2025
77827bb
Merge branch 'main' into feature/json-extract
kenrickyap Jan 31, 2025
80f44e2
add header
kenrickyap Jan 31, 2025
1 change: 1 addition & 0 deletions core/build.gradle
Original file line number Diff line number Diff line change
@@ -54,6 +54,7 @@ dependencies {
api "com.fasterxml.jackson.core:jackson-core:${versions.jackson}"
api "com.fasterxml.jackson.core:jackson-databind:${versions.jackson_databind}"
api "com.fasterxml.jackson.core:jackson-annotations:${versions.jackson}"
api group: 'com.jayway.jsonpath', name: 'json-path', version: '2.9.0'
api group: 'com.google.code.gson', name: 'gson', version: '2.8.9'
api group: 'com.tdunning', name: 't-digest', version: '3.3'
api project(':common')
Original file line number Diff line number Diff line change
@@ -13,6 +13,7 @@
import static org.opensearch.sql.expression.function.BuiltinFunctionName.CAST_TO_FLOAT;
import static org.opensearch.sql.expression.function.BuiltinFunctionName.CAST_TO_INT;
import static org.opensearch.sql.expression.function.BuiltinFunctionName.CAST_TO_IP;
import static org.opensearch.sql.expression.function.BuiltinFunctionName.CAST_TO_JSON;
import static org.opensearch.sql.expression.function.BuiltinFunctionName.CAST_TO_LONG;
import static org.opensearch.sql.expression.function.BuiltinFunctionName.CAST_TO_SHORT;
import static org.opensearch.sql.expression.function.BuiltinFunctionName.CAST_TO_STRING;
@@ -56,6 +57,7 @@ public class Cast extends UnresolvedExpression {
.put("timestamp", CAST_TO_TIMESTAMP.getName())
.put("datetime", CAST_TO_DATETIME.getName())
.put("ip", CAST_TO_IP.getName())
.put("json", CAST_TO_JSON.getName())
.build();

/** The source expression cast from. */
12 changes: 12 additions & 0 deletions core/src/main/java/org/opensearch/sql/expression/DSL.java
@@ -687,6 +687,14 @@ public static FunctionExpression jsonValid(Expression... expressions) {
return compile(FunctionProperties.None, BuiltinFunctionName.JSON_VALID, expressions);
}

public static FunctionExpression jsonExtract(Expression... expressions) {
return compile(FunctionProperties.None, BuiltinFunctionName.JSON_EXTRACT, expressions);
}

public static FunctionExpression stringToJson(Expression value) {
return compile(FunctionProperties.None, BuiltinFunctionName.JSON, value);
}

public static Aggregator avg(Expression... expressions) {
return aggregate(BuiltinFunctionName.AVG, expressions);
}
@@ -843,6 +851,10 @@ public static FunctionExpression castIp(Expression value) {
return compile(FunctionProperties.None, BuiltinFunctionName.CAST_TO_IP, value);
}

public static FunctionExpression castJson(Expression value) {
return compile(FunctionProperties.None, BuiltinFunctionName.CAST_TO_JSON, value);
}

public static FunctionExpression typeof(Expression value) {
return compile(FunctionProperties.None, BuiltinFunctionName.TYPEOF, value);
}
Original file line number Diff line number Diff line change
@@ -206,6 +206,8 @@ public enum BuiltinFunctionName {

/** Json Functions. */
JSON_VALID(FunctionName.of("json_valid")),
JSON(FunctionName.of("json")),
JSON_EXTRACT(FunctionName.of("json_extract")),

/** GEOSPATIAL Functions. */
GEOIP(FunctionName.of("geoip")),
@@ -238,6 +240,7 @@ public enum BuiltinFunctionName {
CAST_TO_TIMESTAMP(FunctionName.of("cast_to_timestamp")),
CAST_TO_DATETIME(FunctionName.of("cast_to_datetime")),
CAST_TO_IP(FunctionName.of("cast_to_ip")),
CAST_TO_JSON(FunctionName.of("cast_to_json")),
TYPEOF(FunctionName.of("typeof")),

/** Relevance Function. */
Original file line number Diff line number Diff line change
@@ -7,8 +7,10 @@

import static org.opensearch.sql.data.type.ExprCoreType.BOOLEAN;
import static org.opensearch.sql.data.type.ExprCoreType.STRING;
import static org.opensearch.sql.data.type.ExprCoreType.UNDEFINED;
import static org.opensearch.sql.expression.function.FunctionDSL.define;
import static org.opensearch.sql.expression.function.FunctionDSL.impl;
import static org.opensearch.sql.expression.function.FunctionDSL.nullMissingHandling;

import lombok.experimental.UtilityClass;
import org.opensearch.sql.expression.function.BuiltinFunctionName;
@@ -20,10 +22,24 @@
public class JsonFunctions {
public void register(BuiltinFunctionRepository repository) {
repository.register(jsonValid());
repository.register(jsonFunction());
repository.register(jsonExtract());
}

private DefaultFunctionResolver jsonValid() {
return define(
BuiltinFunctionName.JSON_VALID.getName(), impl(JsonUtils::isValidJson, BOOLEAN, STRING));
}

private DefaultFunctionResolver jsonFunction() {
return define(
BuiltinFunctionName.JSON.getName(),
impl(nullMissingHandling(JsonUtils::castJson), UNDEFINED, STRING));
}

private DefaultFunctionResolver jsonExtract() {
return define(
BuiltinFunctionName.JSON_EXTRACT.getName(),
impl(JsonUtils::extractJson, UNDEFINED, STRING, STRING));
}
}
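
The registrations above treat null and missing inputs differently: `json` wraps `JsonUtils::castJson` in `nullMissingHandling`, while `json_extract` registers `JsonUtils::extractJson` directly and therefore has to check for null and missing itself. The following is a minimal JDK-only sketch of that wrapper pattern. The `Value` class, the `NULL` and `MISSING` sentinels, and this `nullMissingHandling` helper are hypothetical stand-ins, not the project's `ExprValue` or `FunctionDSL` types, and the short-circuit order (missing first, then null) is an assumption.

```java
import java.util.function.UnaryOperator;

final class NullMissingSketch {
  /** Hypothetical stand-in for an expression value; not the project's ExprValue. */
  static final class Value {
    final String text;

    Value(String text) {
      this.text = text;
    }
  }

  // Hypothetical sentinels, compared by identity.
  static final Value NULL = new Value(null);
  static final Value MISSING = new Value(null);

  /** Wraps fn so that MISSING and NULL inputs are returned unchanged and fn never sees them. */
  static UnaryOperator<Value> nullMissingHandling(UnaryOperator<Value> fn) {
    return v -> {
      if (v == MISSING) {
        return MISSING;
      }
      if (v == NULL) {
        return NULL;
      }
      return fn.apply(v);
    };
  }

  // An example function that would throw on a null text if it were called unwrapped.
  static final UnaryOperator<Value> UPPER =
      nullMissingHandling(v -> new Value(v.text.toUpperCase()));

  public static void main(String[] args) {
    System.out.println(UPPER.apply(new Value("abc")).text);
    System.out.println(UPPER.apply(NULL) == NULL);
    System.out.println(UPPER.apply(MISSING) == MISSING);
  }
}
```

A function registered without such a wrapper receives the sentinels as ordinary arguments, which is why `extractJson` begins with its own null and missing check.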
Original file line number Diff line number Diff line change
@@ -17,6 +17,7 @@
import static org.opensearch.sql.data.type.ExprCoreType.STRING;
import static org.opensearch.sql.data.type.ExprCoreType.TIME;
import static org.opensearch.sql.data.type.ExprCoreType.TIMESTAMP;
import static org.opensearch.sql.data.type.ExprCoreType.UNDEFINED;
import static org.opensearch.sql.expression.function.FunctionDSL.impl;
import static org.opensearch.sql.expression.function.FunctionDSL.implWithProperties;
import static org.opensearch.sql.expression.function.FunctionDSL.nullMissingHandling;
@@ -42,6 +43,7 @@
import org.opensearch.sql.expression.function.BuiltinFunctionRepository;
import org.opensearch.sql.expression.function.DefaultFunctionResolver;
import org.opensearch.sql.expression.function.FunctionDSL;
import org.opensearch.sql.utils.JsonUtils;

@UtilityClass
public class TypeCastOperators {
@@ -57,6 +59,7 @@ public static void register(BuiltinFunctionRepository repository) {
repository.register(castToDouble());
repository.register(castToBoolean());
repository.register(castToIp());
repository.register(castToJson());
repository.register(castToDate());
repository.register(castToTime());
repository.register(castToTimestamp());
@@ -105,7 +108,8 @@ private static DefaultFunctionResolver castToShort() {
impl(
nullMissingHandling((v) -> new ExprShortValue(v.booleanValue() ? 1 : 0)),
SHORT,
- BOOLEAN));
+ BOOLEAN),
+ impl(nullMissingHandling((v) -> v), SHORT, UNDEFINED));
}

private static DefaultFunctionResolver castToInt() {
@@ -119,7 +123,8 @@ private static DefaultFunctionResolver castToInt() {
impl(
nullMissingHandling((v) -> new ExprIntegerValue(v.booleanValue() ? 1 : 0)),
INTEGER,
- BOOLEAN));
+ BOOLEAN),
+ impl(nullMissingHandling((v) -> v), INTEGER, UNDEFINED));
}

private static DefaultFunctionResolver castToLong() {
@@ -133,7 +138,8 @@ private static DefaultFunctionResolver castToLong() {
impl(
nullMissingHandling((v) -> new ExprLongValue(v.booleanValue() ? 1L : 0L)),
LONG,
- BOOLEAN));
+ BOOLEAN),
+ impl(nullMissingHandling((v) -> v), LONG, UNDEFINED));
}

private static DefaultFunctionResolver castToFloat() {
@@ -147,7 +153,8 @@ private static DefaultFunctionResolver castToFloat() {
impl(
nullMissingHandling((v) -> new ExprFloatValue(v.booleanValue() ? 1f : 0f)),
FLOAT,
- BOOLEAN));
+ BOOLEAN),
+ impl(nullMissingHandling((v) -> v), FLOAT, UNDEFINED));
}

private static DefaultFunctionResolver castToDouble() {
@@ -161,7 +168,8 @@ private static DefaultFunctionResolver castToDouble() {
impl(
nullMissingHandling((v) -> new ExprDoubleValue(v.booleanValue() ? 1D : 0D)),
DOUBLE,
- BOOLEAN));
+ BOOLEAN),
+ impl(nullMissingHandling((v) -> v), DOUBLE, UNDEFINED));
}

private static DefaultFunctionResolver castToBoolean() {
@@ -173,7 +181,8 @@ private static DefaultFunctionResolver castToBoolean() {
STRING),
impl(
nullMissingHandling((v) -> ExprBooleanValue.of(v.doubleValue() != 0)), BOOLEAN, DOUBLE),
- impl(nullMissingHandling((v) -> v), BOOLEAN, BOOLEAN));
+ impl(nullMissingHandling((v) -> v), BOOLEAN, BOOLEAN),
+ impl(nullMissingHandling((v) -> v), BOOLEAN, UNDEFINED));
}

private static DefaultFunctionResolver castToIp() {
@@ -183,6 +192,12 @@ private static DefaultFunctionResolver castToIp() {
impl(nullMissingHandling((v) -> v), IP, IP));
}

private static DefaultFunctionResolver castToJson() {
return FunctionDSL.define(
BuiltinFunctionName.CAST_TO_JSON.getName(),
impl(nullMissingHandling(JsonUtils::castJson), UNDEFINED, STRING));
}

private static DefaultFunctionResolver castToDate() {
return FunctionDSL.define(
BuiltinFunctionName.CAST_TO_DATE.getName(),
122 changes: 119 additions & 3 deletions core/src/main/java/org/opensearch/sql/utils/JsonUtils.java
Member

Can you please add a license header? Thanks.

Contributor Author

Added.

@@ -1,18 +1,42 @@
/*
* Copyright OpenSearch Contributors
* SPDX-License-Identifier: Apache-2.0
*/

package org.opensearch.sql.utils;

import static org.opensearch.sql.data.model.ExprValueUtils.*;

import com.fasterxml.jackson.core.JsonProcessingException;
import com.fasterxml.jackson.databind.JsonNode;
import com.fasterxml.jackson.databind.ObjectMapper;
import com.jayway.jsonpath.InvalidJsonException;
import com.jayway.jsonpath.InvalidPathException;
import com.jayway.jsonpath.JsonPath;
import com.jayway.jsonpath.PathNotFoundException;
import java.util.LinkedHashMap;
import java.util.LinkedList;
import java.util.List;
import java.util.Map;
import lombok.experimental.UtilityClass;
import org.opensearch.sql.data.model.ExprBooleanValue;
import org.opensearch.sql.data.model.ExprCollectionValue;
import org.opensearch.sql.data.model.ExprDoubleValue;
import org.opensearch.sql.data.model.ExprIntegerValue;
import org.opensearch.sql.data.model.ExprNullValue;
import org.opensearch.sql.data.model.ExprStringValue;
import org.opensearch.sql.data.model.ExprTupleValue;
import org.opensearch.sql.data.model.ExprValue;
import org.opensearch.sql.data.model.ExprValueUtils;
import org.opensearch.sql.exception.SemanticCheckException;

@UtilityClass
public class JsonUtils {
/**
* Checks if given JSON string can be parsed as valid JSON.
*
* @param jsonExprValue JSON string (e.g. "{\"hello\": \"world\"}").
- * @return true if the string can be parsed as valid JSON, else false.
+ * @return true if the string can be parsed as valid JSON, else false (including null or missing).
*/
public static ExprValue isValidJson(ExprValue jsonExprValue) {
ObjectMapper objectMapper = new ObjectMapper();
@@ -23,9 +47,101 @@ public static ExprValue isValidJson(ExprValue jsonExprValue) {

try {
objectMapper.readTree(jsonExprValue.stringValue());
- return ExprValueUtils.LITERAL_TRUE;
+ return LITERAL_TRUE;
} catch (JsonProcessingException e) {
- return ExprValueUtils.LITERAL_FALSE;
+ return LITERAL_FALSE;
}
}

/**
* Converts a JSON encoded string to a {@link ExprValue}. Expression type will be UNDEFINED.
*
* @param json JSON string (e.g. "{\"hello\": \"world\"}").
* @return ExprValue returns an expression that best represents the provided JSON-encoded string.
* <ol>
* <li>{@link ExprTupleValue} if the JSON is an object
* <li>{@link ExprCollectionValue} if the JSON is an array
* <li>{@link ExprDoubleValue} if the JSON is a floating-point number scalar
* <li>{@link ExprIntegerValue} if the JSON is an integral number scalar
* <li>{@link ExprStringValue} if the JSON is a string scalar
* <li>{@link ExprBooleanValue} if the JSON is a boolean scalar
* <li>{@link ExprNullValue} if the JSON is null, empty, or invalid
* </ol>
*/
public static ExprValue castJson(ExprValue json) {
ObjectMapper objectMapper = new ObjectMapper();
JsonNode jsonNode;
try {
jsonNode = objectMapper.readTree(json.stringValue());
} catch (JsonProcessingException e) {
final String errorFormat = "JSON string '%s' is not valid. Error details: %s";
throw new SemanticCheckException(String.format(errorFormat, json, e.getMessage()), e);
}

return processJsonNode(jsonNode);
}

/**
* Extract value of JSON string at given JSON path.
*
* @param json JSON string (e.g. "{\"hello\": \"world\"}").
* @param path JSON path (e.g. "$.hello")
* @return ExprValue of value at given path of json string.
*/
public static ExprValue extractJson(ExprValue json, ExprValue path) {
if (json == LITERAL_NULL || json == LITERAL_MISSING) {
return json;
}

String jsonString = json.stringValue();
String jsonPath = path.stringValue();

if (jsonString.isEmpty()) {
return LITERAL_NULL;
}

try {
Object results = JsonPath.parse(jsonString).read(jsonPath);
return ExprValueUtils.fromObjectValue(results);
} catch (PathNotFoundException e) {
Collaborator

Suggested change
- } catch (PathNotFoundException e) {
+ } catch (PathNotFoundException ignored) {

return LITERAL_NULL;
} catch (InvalidPathException e) {
Collaborator

Please don't reuse variable names:

Suggested change
- } catch (InvalidPathException e) {
+ } catch (InvalidPathException invalidPathException) {

final String errorFormat = "JSON path '%s' is not valid. Error details: %s";
throw new SemanticCheckException(String.format(errorFormat, path, e.getMessage()), e);
} catch (InvalidJsonException e) {
Collaborator

Please don't reuse variable names:

Suggested change
- } catch (InvalidJsonException e) {
+ } catch (InvalidJsonException invalidJsonException) {

final String errorFormat = "JSON string '%s' is not valid. Error details: %s";
throw new SemanticCheckException(String.format(errorFormat, json, e.getMessage()), e);
}
}

private static ExprValue processJsonNode(JsonNode jsonNode) {
switch (jsonNode.getNodeType()) {
case ARRAY:
List<ExprValue> elements = new LinkedList<>();
for (var iter = jsonNode.iterator(); iter.hasNext(); ) {
jsonNode = iter.next();
elements.add(processJsonNode(jsonNode));
}
return new ExprCollectionValue(elements);
case OBJECT:
Map<String, ExprValue> values = new LinkedHashMap<>();
for (var iter = jsonNode.fields(); iter.hasNext(); ) {
Map.Entry<String, JsonNode> entry = iter.next();
values.put(entry.getKey(), processJsonNode(entry.getValue()));
}
return ExprTupleValue.fromExprValueMap(values);
case STRING:
return new ExprStringValue(jsonNode.asText());
case NUMBER:
if (jsonNode.isFloatingPointNumber()) {
return new ExprDoubleValue(jsonNode.asDouble());
}
return new ExprIntegerValue(jsonNode.asLong());
case BOOLEAN:
return jsonNode.asBoolean() ? LITERAL_TRUE : LITERAL_FALSE;
default:
// in all other cases, return null
return LITERAL_NULL;
}
}
}
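
The `extractJson` method above separates three outcomes: a null or missing input is passed through, a path that does not resolve yields `LITERAL_NULL`, and an invalid path or invalid JSON raises a `SemanticCheckException`. The sketch below illustrates that contract using only the JDK. It is not the json-path library: `ExtractContractSketch`, `extract`, and `sample` are hypothetical names, the input is an already-parsed nested `Map` rather than a JSON string, only `$` and `$.key.key` paths are recognized, and `IllegalArgumentException` stands in for `SemanticCheckException`.

```java
import java.util.LinkedHashMap;
import java.util.Map;

final class ExtractContractSketch {
  /**
   * Hypothetical lookup over nested maps. Returns null when the input is null or when the path
   * does not resolve; throws when the path itself is malformed.
   */
  static Object extract(Map<String, Object> doc, String path) {
    if (doc == null) {
      return null; // null input passes through untouched
    }
    if (path == null || !(path.equals("$") || path.startsWith("$."))) {
      throw new IllegalArgumentException("JSON path '" + path + "' is not valid.");
    }
    if (path.equals("$")) {
      return doc; // the root path selects the whole document
    }

    // Validate every segment before traversing, so a malformed path always fails.
    String[] keys = path.substring(2).split("\\.", -1);
    for (String key : keys) {
      if (key.isEmpty()) {
        throw new IllegalArgumentException("JSON path '" + path + "' is not valid.");
      }
    }

    Object current = doc;
    for (String key : keys) {
      if (!(current instanceof Map) || !((Map<?, ?>) current).containsKey(key)) {
        return null; // well-formed path that does not resolve
      }
      current = ((Map<?, ?>) current).get(key);
    }
    return current;
  }

  /** Builds the sample document {"a": {"b": 1}}. */
  static Map<String, Object> sample() {
    Map<String, Object> inner = new LinkedHashMap<>();
    inner.put("b", 1);
    Map<String, Object> doc = new LinkedHashMap<>();
    doc.put("a", inner);
    return doc;
  }

  public static void main(String[] args) {
    Map<String, Object> doc = sample();
    System.out.println(extract(doc, "$.a.b")); // resolves to a value
    System.out.println(extract(doc, "$.a.x")); // does not resolve
    System.out.println(extract(null, "$.a")); // null input
  }
}
```

Returning null for an unresolved path while throwing for a malformed one keeps sparse data from failing a query, but still surfaces a mistake in the query text itself.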