Redact sensitive information in catalog queries #24563

piotrrzysko · 2024-12-23T11:40:02Z

Description

This is a follow-up to #24562 that introduces redacting of security-sensitive information in statements containing connector properties, specifically:

CREATE CATALOG
EXPLAIN CREATE CATALOG
EXPLAIN ANALYZE CREATE CATALOG

The current approach is as follows:

For syntactically valid statements, only properties containing sensitive information are masked.
If a valid query references a nonexistent connector, all properties are masked.
If a query fails before or during parsing, the entire query is masked.
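
The first two cases above can be sketched as follows. This is a minimal illustration, not the PR's code: the class `CatalogRedactionSketch` and the method `redactProperties` are hypothetical names, and an empty `Optional` stands in for "the connector does not exist". The third case (masking the whole query text when parsing fails) is not covered here.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Optional;
import java.util.Set;

// Hypothetical sketch of the masking rules; not the actual SensitiveStatementRedactor.
final class CatalogRedactionSketch
{
    static final String REDACTED_VALUE = "***";

    private CatalogRedactionSketch() {}

    // sensitiveNames is empty when the connector is unknown (an assumption of this sketch);
    // in that case every property value is masked.
    static Map<String, String> redactProperties(Map<String, String> properties, Optional<Set<String>> sensitiveNames)
    {
        Map<String, String> redacted = new LinkedHashMap<>();
        for (Map.Entry<String, String> entry : properties.entrySet()) {
            boolean mask = sensitiveNames
                    .map(names -> names.contains(entry.getKey()))
                    .orElse(true);
            redacted.put(entry.getKey(), mask ? REDACTED_VALUE : entry.getValue());
        }
        return redacted;
    }
}
```

Defaulting to "mask everything" when nothing is known about the connector errs on the side of hiding too much rather than leaking a secret.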

Redacted queries are returned through the REST API, the system.runtime.queries table, and query events (QueryCreatedEvent and QueryCompletedEvent).

Note that this PR currently includes two commits from #24562.

Additional context and related issues

Release notes

( ) This is not user-visible or is docs only, and no release notes are required.
( ) Release notes are required. Please propose a release note for me.
(x) Release notes are required, with the following suggested text:

## Section
* Redact sensitive information in statements containing connector properties. ({issue}`23106`)

The SPI will be used by the engine to redact security-sensitive information in statements that manage catalogs. It has been added at the connector factory level, rather than the connector level, to allow more flexibility in retrieving properties. In some cases, we want to perform redacting before a connector is initialized, for example when we create a new catalog by issuing the CREATE CATALOG statement.

Exposed properties fall into one of the following categories: they are either explicitly marked as security-sensitive or are unknown. The connector assumes that unknown properties might be misspelled security-sensitive properties.

The purpose of the included test is to identify security-sensitive properties that may be used by the connector. It uses the output generated by the maven-dependency-plugin, configured in the connector's pom.xml file. This output contains the connector's runtime classpath, which is then scanned to identify all property names annotated with @Config. Scanning the classpath ensures that all configuration classes are included, even those used conditionally.
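
The "sensitive or unknown" rule described above can be sketched as below. This is an assumption-laden illustration: the interface `RedactablePropertiesSketch`, its method, and the property names are hypothetical stand-ins, and the real SPI method name and signature are not shown in this description. The point is only that a factory-level hook can classify raw property names without instantiating a connector.

```java
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical stand-in for a factory-level hook; not the real SPI.
interface RedactablePropertiesSketch
{
    Set<String> redactablePropertyNames(Map<String, String> catalogProperties);
}

final class ExampleFactorySketch
        implements RedactablePropertiesSketch
{
    // Illustrative property names only; a real connector would derive these from its config classes.
    private static final Set<String> KNOWN = Set.of("connection-url", "connection-user", "connection-password");
    private static final Set<String> SENSITIVE = Set.of("connection-password");

    @Override
    public Set<String> redactablePropertyNames(Map<String, String> catalogProperties)
    {
        Set<String> result = new TreeSet<>();
        for (String name : catalogProperties.keySet()) {
            // Redact when explicitly sensitive, or when unknown (possibly a misspelled sensitive property).
            if (SENSITIVE.contains(name) || !KNOWN.contains(name)) {
                result.add(name);
            }
        }
        return result;
    }
}
```

With this rule, a misspelled `conection-password` is treated as redactable even though it is not a recognized property.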

This commit introduces redacting of security-sensitive information in statements containing connector properties, specifically:

* CREATE CATALOG
* EXPLAIN CREATE CATALOG
* EXPLAIN ANALYZE CREATE CATALOG

The current approach is as follows:

* For syntactically valid statements, only properties containing sensitive information are masked.
* If a valid query references a nonexistent connector, all properties are masked.
* If a query fails before or during parsing, the entire query is masked.

The redacted form is created in DispatchManager and is propagated to all places that create QueryInfo and BasicQueryInfo. Before this change, QueryInfo/BasicQueryInfo stored the raw query text received from the end user. From now on, the text will be altered for the cases listed above.

@JsonConstructor for TrimmedBasicQueryInfo was introduced to facilitate the deserialization of server responses in tests.

hashhar

Looks mostly good to me. Some comments.

hashhar · 2025-01-02T08:31:58Z

core/trino-main/src/main/java/io/trino/FeaturesConfig.java

+        return statementRedactingEnabled;
+    }
+
+    @Config("statement-redacting-enabled")


@mosabua for suggestions about config naming. 😄

I'm not sure we want an option to disable this. Maybe as a temporary kill switch, but we should remove this as soon as we are happy with this feature

Agreed, in that case we can prefix it with experimental., as we have done in the past, to clarify this. Or maybe with deprecated. from the beginning.

hashhar · 2025-01-02T08:32:19Z

core/trino-main/src/main/java/io/trino/connector/DefaultCatalogFactory.java

+    {
+        ConnectorFactory connectorFactory = connectorFactories.get(connectorName);
+        if (connectorFactory == null) {
+            // If someone tries to use a non-existent connector, we assume they


great catch, I didn't think of this case.

hashhar · 2025-01-02T08:35:05Z

core/trino-main/src/main/java/io/trino/sql/SensitiveStatementRedactor.java

+        }
+
+        @Override
+        protected Node visitCreateCatalog(CreateCatalog createCatalog, Void context)


Is there some way to notice when we need to add new node visitors here?

Should this be a "wrapper" like the various Forwarding*** classes, with a test to assert that the full set of methods is overridden? That way, once new methods get added, we'll explicitly need to override them either as a no-op or to redact.

WDYT? Might be overkill for now, so no need to change anything - just to have a discussion.
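
One way such a check could look is a reflection-based helper that lists the visit methods a subclass has not overridden. This is only a sketch of the idea under discussion: `BaseVisitor` and `RedactingVisitor` are hypothetical miniature classes, not the actual AST visitor or redactor.

```java
import java.lang.reflect.Method;
import java.lang.reflect.Modifier;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

final class VisitorCoverageSketch
{
    private VisitorCoverageSketch() {}

    // Hypothetical miniature visitor hierarchy, for illustration only.
    static class BaseVisitor
    {
        protected String visitCreateCatalog(String node) { return node; }

        protected String visitDropCatalog(String node) { return node; }
    }

    static class RedactingVisitor
            extends BaseVisitor
    {
        @Override
        protected String visitCreateCatalog(String node) { return "***"; }
    }

    // Returns the sorted names of visit* methods declared in base but not redeclared in subclass.
    static List<String> missingOverrides(Class<?> base, Class<?> subclass)
    {
        List<String> missing = new ArrayList<>();
        for (Method method : base.getDeclaredMethods()) {
            if (!method.getName().startsWith("visit") || method.isSynthetic() || Modifier.isStatic(method.getModifiers())) {
                continue;
            }
            try {
                subclass.getDeclaredMethod(method.getName(), method.getParameterTypes());
            }
            catch (NoSuchMethodException e) {
                missing.add(method.getName());
            }
        }
        Collections.sort(missing);
        return missing;
    }
}
```

A test could then assert that `missingOverrides` returns an empty list, so that adding a new visit method forces an explicit decision to redact or to pass through.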

hashhar · 2025-01-02T08:37:20Z

core/trino-main/src/main/java/io/trino/dispatcher/DispatchManager.java

@@ -240,7 +248,7 @@ private <C> void createQueryInternal(QueryId queryId, Span querySpan, Slug slug,
            DispatchQuery dispatchQuery = dispatchQueryFactory.createDispatchQuery(


This automatically also handles things like the event listener and QueryResource, right?

Might be worth explicitly calling it out in the commit message (although you do imply that by mentioning anything using QueryInfo/BasicQueryInfo).

dain · 2025-01-02T19:48:04Z

core/trino-main/src/main/java/io/trino/FeaturesConfig.java

+        return statementRedactingEnabled;
+    }
+
+    @Config("statement-redacting-enabled")


I'm not sure we want an option to disable this. Maybe as a temporary kill switch, but we should remove this as soon as we are happy with this feature

dain · 2025-01-02T19:55:45Z

core/trino-main/src/main/java/io/trino/sql/SensitiveStatementRedactor.java

+
+public class SensitiveStatementRedactor
+{
+    public static final String REDACTED_VALUE = "***";


We should consider a better value here than just ***. We could also consider using a special function like $redacted$(), which just throws an exception if you try to actually call that function.

*** seems to be almost what everyone uses for redaction.

Can you expand on the function idea? Is that to make it so that the output of SHOW CREATE CATALOG (as an example) is valid but still fails when you try to run it?

piotrrzysko added 2 commits December 23, 2024 12:31

cla-bot bot added the cla-signed label Dec 23, 2024

This was referenced Dec 23, 2024

Redact sensitive information in catalog queries #23104

Closed

Add connector SPI for returning redactable properties #24562

Draft

piotrrzysko added 3 commits December 23, 2024 14:44

Ensure queries in system.runtime.queries are redacted

042afdf

Ensure queries returned via REST API are redacted

ed595a1

@JsonConstructor for TrimmedBasicQueryInfo was introduced to facilitate the deserialization of server responses in tests.

piotrrzysko force-pushed the redact-sensitive-queries branch from 98470bb to ed595a1 December 23, 2024 13:47

hashhar reviewed Jan 2, 2025

piotrrzysko mentioned this pull request Jan 2, 2025

Extend syntax for Dynamic Catalogs #22188

Open

3 tasks

dain reviewed Jan 2, 2025
