Skip RS CTRL-CHAR to support JSON Text Sequence (RFC7464) #1414

Merged (11 commits) on Mar 31, 2025
8 changes: 8 additions & 0 deletions release-notes/CREDITS-2.x
@@ -467,6 +467,14 @@ Haruki (@stackunderflow111)
when custom characterEscape is used
(2.18.3)

Yanming Zhou (@quaff)
* Requested #633: Allow skipping `RS` CTRL-CHAR to support JSON Text Sequences
(2.19.0)

Fawzi Essam (@iifawzi)
* Contributed #633: Allow skipping `RS` CTRL-CHAR to support JSON Text Sequences
(2.19.0)

Eduard Gomoliako (@Gems)
* Contributed #1356: Make `JsonGenerator::writeTypePrefix` method to not write a
`WRAPPER_ARRAY` when `typeIdDef.id == null`
3 changes: 3 additions & 0 deletions release-notes/VERSION-2.x
@@ -16,6 +16,9 @@ a pure JSON library.

2.19.0 (not yet released)

#633: Allow skipping `RS` CTRL-CHAR to support JSON Text Sequences
(requested by Yanming Z)
(contributed by Fawzi E)
#1328: Optimize handling of `JsonPointer.head()`
#1356: Make `JsonGenerator::writeTypePrefix` method to not write a
`WRAPPER_ARRAY` when `typeIdDef.id == null`
6 changes: 6 additions & 0 deletions src/main/java/com/fasterxml/jackson/core/JsonParser.java
@@ -193,6 +193,12 @@ public enum Feature {
@Deprecated
ALLOW_UNQUOTED_CONTROL_CHARS(false),

/**
* @deprecated Use {@link com.fasterxml.jackson.core.json.JsonReadFeature#ALLOW_RS_CONTROL_CHAR} instead
Member

This may look odd, but `JsonParser.Feature` is to be removed in 3.0 yet is still needed internally (every `StreamReadFeature` and `JsonReadFeature` must map back to one entry).
We don't want users to use `JsonParser.Feature` any more, though.

Contributor Author

Good to know; that will help me take care of it next time.
Thanks for your collaboration on the PRs!

*/
@Deprecated // but due to technical reasons we need this entry too
ALLOW_RS_CONTROL_CHAR(false),

/**
* Feature that can be enabled to accept quoting of all character
* using backslash quoting mechanism: if not enabled, only characters
Original file line number Diff line number Diff line change
@@ -31,6 +31,7 @@ public abstract class ParserMinimalBase extends JsonParser
protected final static int INT_LF = '\n';
protected final static int INT_CR = '\r';
protected final static int INT_SPACE = 0x0020;
protected final static int INT_RS = 0x001E;

// Markup
protected final static int INT_LBRACKET = '[';
@@ -769,6 +770,10 @@ protected void reportUnexpectedNumberChar(int ch, String comment) throws JsonPar
protected void _throwInvalidSpace(int i) throws JsonParseException {
char c = (char) i;
String msg = "Illegal character ("+_getCharDesc(c)+"): only regular white space (\\r, \\n, \\t) is allowed between tokens";

if (i == INT_RS) {
msg += " (consider enabling `JsonReadFeature.ALLOW_RS_CONTROL_CHAR` feature to allow use of Record Separators (\\u001E))";
}
throw _constructReadException(msg);
}

14 changes: 14 additions & 0 deletions src/main/java/com/fasterxml/jackson/core/json/JsonParserBase.java
@@ -26,6 +26,9 @@ public abstract class JsonParserBase
protected final static int FEAT_MASK_NON_NUM_NUMBERS = Feature.ALLOW_NON_NUMERIC_NUMBERS.getMask();
@SuppressWarnings("deprecation")
protected final static int FEAT_MASK_ALLOW_MISSING = Feature.ALLOW_MISSING_VALUES.getMask();
@SuppressWarnings("deprecation")
protected final static int FEAT_MASK_ALLOW_CTRL_RS = Feature.ALLOW_RS_CONTROL_CHAR.getMask();

protected final static int FEAT_MASK_ALLOW_SINGLE_QUOTES = Feature.ALLOW_SINGLE_QUOTES.getMask();
protected final static int FEAT_MASK_ALLOW_UNQUOTED_NAMES = Feature.ALLOW_UNQUOTED_FIELD_NAMES.getMask();
protected final static int FEAT_MASK_ALLOW_JAVA_COMMENTS = Feature.ALLOW_COMMENTS.getMask();
@@ -130,4 +133,15 @@ public final JsonLocation getCurrentLocation() {
public final JsonLocation getTokenLocation() {
return currentTokenLocation();
}

/*
/**********************************************************************
/* Other helper methods
/**********************************************************************
*/

// @since 2.19
protected boolean _isAllowedCtrlCharRS(int i) {
return (i == INT_RS) && (_features & FEAT_MASK_ALLOW_CTRL_RS) != 0;
}
}
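
The helper above pairs a character comparison with a bitmask test against the parser's enabled features. The following is a minimal, self-contained sketch of that pattern. The `RsFeatureMaskSketch` class and its two-entry `Feature` enum are simplified stand-ins invented for illustration, not Jackson's actual types, and the mask layout (`1 << ordinal()`) is an assumption made here for the sketch.

```java
public class RsFeatureMaskSketch {
    // Record Separator control character (0x1E), as in the diff above
    public static final int INT_RS = 0x001E;

    // Hypothetical stand-in for a feature enum; each entry owns one bit
    public enum Feature {
        ALLOW_COMMENTS,
        ALLOW_RS_CONTROL_CHAR;

        public int getMask() {
            return 1 << ordinal();
        }
    }

    public static final int FEAT_MASK_ALLOW_CTRL_RS =
            Feature.ALLOW_RS_CONTROL_CHAR.getMask();

    // True only when the character is RS AND the feature bit is set
    public static boolean isAllowedCtrlCharRS(int ch, int features) {
        return (ch == INT_RS) && (features & FEAT_MASK_ALLOW_CTRL_RS) != 0;
    }

    public static void main(String[] args) {
        int enabled = Feature.ALLOW_RS_CONTROL_CHAR.getMask();
        System.out.println(isAllowedCtrlCharRS(0x1E, enabled)); // true
        System.out.println(isAllowedCtrlCharRS(0x1E, 0));       // false
        System.out.println(isAllowedCtrlCharRS('\t', enabled)); // false
    }
}
```

Checking the character first keeps the common case (any other character) to a single comparison before the feature bit is consulted.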
13 changes: 13 additions & 0 deletions src/main/java/com/fasterxml/jackson/core/json/JsonReadFeature.java
@@ -80,6 +80,19 @@ public enum JsonReadFeature
@SuppressWarnings("deprecation")
ALLOW_UNESCAPED_CONTROL_CHARS(false, JsonParser.Feature.ALLOW_UNQUOTED_CONTROL_CHARS),

/**
* Feature that determines whether parser will allow
* Record Separator (RS) control character ({@code 0x1E})
* as part of ignorable whitespace in JSON input, similar to the TAB character.
* <p>
* Since the official JSON specification permits only a limited set of control
* characters as whitespace, this is a non-standard feature and is disabled by default.
*
* @since 2.19
*/
@SuppressWarnings("deprecation")
ALLOW_RS_CONTROL_CHAR(false, JsonParser.Feature.ALLOW_RS_CONTROL_CHAR),

/**
* Feature that can be enabled to accept quoting of all character
* using backslash quoting mechanism: if not enabled, only characters
Original file line number Diff line number Diff line change
@@ -2504,7 +2504,7 @@ private final int _skipWSOrEnd() throws IOException
_currInputRowStart = _inputPtr;
} else if (i == INT_CR) {
_skipCR();
- } else if (i != INT_TAB) {
+ } else if (i != INT_TAB && !_isAllowedCtrlCharRS(i)) {
_throwInvalidSpace(i);
}
}
@@ -2524,7 +2524,7 @@ private final int _skipWSOrEnd() throws IOException
_currInputRowStart = _inputPtr;
} else if (i == INT_CR) {
_skipCR();
- } else if (i != INT_TAB) {
+ } else if (i != INT_TAB && !_isAllowedCtrlCharRS(i)) {
_throwInvalidSpace(i);
}
}
Original file line number Diff line number Diff line change
@@ -3080,7 +3080,7 @@ private final int _skipWSOrEnd() throws IOException
_currInputRowStart = _inputPtr;
} else if (i == INT_CR) {
_skipCR();
- } else if (i != INT_TAB) {
+ } else if (i != INT_TAB && !_isAllowedCtrlCharRS(i)) {
_throwInvalidSpace(i);
}
}
@@ -3100,7 +3100,7 @@ private final int _skipWSOrEnd() throws IOException
_currInputRowStart = _inputPtr;
} else if (i == INT_CR) {
_skipCR();
- } else if (i != INT_TAB) {
+ } else if (i != INT_TAB && !_isAllowedCtrlCharRS(i)) {
_throwInvalidSpace(i);
}
}
Original file line number Diff line number Diff line change
@@ -942,7 +942,7 @@ private final int _skipWS(int ch) throws IOException
} else if (ch == INT_CR) {
++_currInputRowAlt;
_currInputRowStart = _inputPtr;
- } else if (ch != INT_TAB) {
+ } else if (ch != INT_TAB && !_isAllowedCtrlCharRS(ch)) {
_throwInvalidSpace(ch);
}
}
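
The parser hunks above all make the same change: inside the whitespace-skipping loop, a character at or below the space code point is accepted if it is space, LF, CR, TAB, or (only when the feature is on) RS, and rejected otherwise. A rough standalone sketch of that loop shape follows. `SkipWsSketch`, `skipWs`, and the `allowRs` flag are hypothetical names for illustration; the real parsers also track line and column positions, which this sketch leaves out.

```java
public class SkipWsSketch {
    static final int INT_SPACE = 0x0020;
    static final int INT_RS = 0x001E;

    /**
     * Returns the index of the first non-whitespace character at or after
     * pos, or -1 if the input ends first. Throws on a control character
     * that is not permitted as whitespace.
     */
    public static int skipWs(String input, int pos, boolean allowRs) {
        while (pos < input.length()) {
            int ch = input.charAt(pos);
            if (ch > INT_SPACE) {
                return pos; // start of the next token
            }
            boolean ok = ch == INT_SPACE || ch == '\n' || ch == '\r' || ch == '\t'
                    || (allowRs && ch == INT_RS);
            if (!ok) {
                throw new IllegalArgumentException(
                        String.format("Illegal character (code 0x%02X) at offset %d", ch, pos));
            }
            ++pos;
        }
        return -1;
    }

    public static void main(String[] args) {
        // space, RS, TAB, then '{' at index 3
        System.out.println(skipWs(" \u001E\t{", 0, true)); // 3
        try {
            skipWs("\u001E{", 0, false);
        } catch (IllegalArgumentException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```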
Original file line number Diff line number Diff line change
@@ -0,0 +1,73 @@
package com.fasterxml.jackson.core.read;

import static org.junit.jupiter.api.Assertions.assertEquals;
import static org.junit.jupiter.api.Assertions.fail;

import java.io.StringReader;
import java.nio.charset.StandardCharsets;

import org.junit.jupiter.api.Test;

import com.fasterxml.jackson.core.*;
import com.fasterxml.jackson.core.exc.StreamReadException;
import com.fasterxml.jackson.core.json.JsonReadFeature;
import com.fasterxml.jackson.core.json.async.NonBlockingJsonParser;

// for [core#633]: optionally allow Record-Separator ctrl char
class NonStandardAllowRSTest
extends JUnit5TestBase
{
@Test
void recordSeparatorEnabled() throws Exception {
doRecordSeparationTest(true);
}

@Test
void recordSeparatorDisabled() throws Exception {
doRecordSeparationTest(false);
}

// Testing record separation for all parser implementations
private void doRecordSeparationTest(boolean recordSeparation) throws Exception {
String contents = "{\"key\":true}\u001E";
JsonFactory factory = JsonFactory.builder()
.configure(JsonReadFeature.ALLOW_RS_CONTROL_CHAR, recordSeparation)
.build();
try (JsonParser parser = factory.createParser(contents)) {
verifyRecordSeparation(parser, recordSeparation);
}
try (JsonParser parser = factory.createParser(new StringReader(contents))) {
verifyRecordSeparation(parser, recordSeparation);
}
try (JsonParser parser = factory.createParser(contents.getBytes(StandardCharsets.UTF_8))) {
verifyRecordSeparation(parser, recordSeparation);
}
try (NonBlockingJsonParser parser = (NonBlockingJsonParser) factory.createNonBlockingByteArrayParser()) {
byte[] data = contents.getBytes(StandardCharsets.UTF_8);
parser.feedInput(data, 0, data.length);
parser.endOfInput();
verifyRecordSeparation(parser, recordSeparation);
}
}

private void verifyRecordSeparation(JsonParser parser, boolean recordSeparation) throws Exception {
try {
assertToken(JsonToken.START_OBJECT, parser.nextToken());
String field1 = parser.nextFieldName();
assertEquals("key", field1);
assertToken(JsonToken.VALUE_TRUE, parser.nextToken());
assertToken(JsonToken.END_OBJECT, parser.nextToken());
parser.nextToken(); // trailing RS: skipped as whitespace when enabled, otherwise throws
if (!recordSeparation) {
fail("Should have thrown an exception");
}
} catch (StreamReadException e) {
if (!recordSeparation) {
verifyException(e, "Illegal character ((CTRL-CHAR");
verifyException(e, "consider enabling `JsonReadFeature.ALLOW_RS_CONTROL_CHAR`");
} else {
throw e;
}
}
}
}
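
The test input above places a single RS after one JSON value. In an RFC 7464 JSON Text Sequence, each JSON text is preceded by RS (0x1E) and followed by LF. As a plain-Java illustration of that framing only (it is not how the parser change in this PR works), the sketch below splits a sequence into its individual texts. `JsonSeqSplitSketch` and `splitRecords` are hypothetical names, and the sketch assumes the whole sequence fits in one `String`.

```java
import java.util.ArrayList;
import java.util.List;

public class JsonSeqSplitSketch {
    static final char RS = '\u001E';

    // Splits on RS and drops the trailing LF (and other surrounding
    // whitespace) from each record; empty chunks are skipped.
    public static List<String> splitRecords(String sequence) {
        List<String> records = new ArrayList<>();
        int start = 0;
        while (start <= sequence.length()) {
            int next = sequence.indexOf(RS, start);
            int end = (next < 0) ? sequence.length() : next;
            String text = sequence.substring(start, end).trim();
            if (!text.isEmpty()) {
                records.add(text);
            }
            if (next < 0) {
                break;
            }
            start = next + 1;
        }
        return records;
    }

    public static void main(String[] args) {
        String seq = "\u001E{\"a\":1}\n\u001E[2]\n";
        System.out.println(splitRecords(seq)); // [{"a":1}, [2]]
    }
}
```

With `ALLOW_RS_CONTROL_CHAR` enabled, this kind of pre-splitting should not be needed, since the parser treats RS like any other whitespace between root-level values.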
Original file line number Diff line number Diff line change
@@ -3,6 +3,7 @@
import org.junit.jupiter.api.Test;

import com.fasterxml.jackson.core.*;
import com.fasterxml.jackson.core.exc.StreamReadException;
import com.fasterxml.jackson.core.json.JsonReadFeature;

import static org.junit.jupiter.api.Assertions.*;
@@ -57,6 +58,20 @@ void tabsEnabled() throws Exception
_testTabsEnabled(true);
}

@Test
void recordSeparatorDefault() throws Exception
{
_testRecordSeparatorDefault(false);
_testRecordSeparatorDefault(true);
}

@Test
void recordSeparatorEnabled() throws Exception
{
_testRecordSeparatorEnabled(false);
_testRecordSeparatorEnabled(true);
}

/*
/****************************************************************
/* Secondary test methods
@@ -134,4 +149,43 @@ private void _testTabsEnabled(boolean useStream) throws Exception
assertToken(JsonToken.END_OBJECT, p.nextToken());
p.close();
}

private void _testRecordSeparatorDefault(boolean useStream) throws Exception {
JsonFactory f = new JsonFactory();
String JSON = "[\"val:\"]\u001E";

try (JsonParser p = useStream ? createParserUsingStream(f, JSON, "UTF-8") : createParserUsingReader(f, JSON)) {
assertToken(JsonToken.START_ARRAY, p.nextToken());
try {
p.nextToken(); // val
p.nextToken(); // ]
p.nextToken(); // trailing RS is rejected by default: should throw
fail("Expected exception");
} catch (StreamReadException e) {
verifyException(e, "Illegal character ((CTRL-CHAR");
verifyException(e, "consider enabling `JsonReadFeature.ALLOW_RS_CONTROL_CHAR`");
}
}
}

private void _testRecordSeparatorEnabled(boolean useStream) throws Exception
{
JsonFactory f = JsonFactory.builder()
.configure(JsonReadFeature.ALLOW_RS_CONTROL_CHAR, true)
.build();

String FIELD = "key";
String VALUE = "value";
String JSON = "{ "+q(FIELD)+" : "+q(VALUE)+"}\u001E";
JsonParser p = useStream ? createParserUsingStream(f, JSON, "UTF-8") : createParserUsingReader(f, JSON);

assertToken(JsonToken.START_OBJECT, p.nextToken());
assertToken(JsonToken.FIELD_NAME, p.nextToken());
assertEquals(FIELD, p.getText());
assertToken(JsonToken.VALUE_STRING, p.nextToken());
assertEquals(VALUE, p.getText());
assertToken(JsonToken.END_OBJECT, p.nextToken());
p.nextToken(); // trailing RS is skipped as whitespace: no exception expected
p.close();
}
}