Add connector SPI for returning redactable properties #24562

Draft: wants to merge 2 commits into master

@piotrrzysko piotrrzysko commented Dec 23, 2024

Description

An alternative approach to #23103. The main difference is that in this approach, the properties requiring redaction are selected from those provided by the user, rather than always returning a static set of predefined security-sensitive properties. The benefits are as follows:

  • By default (if a connector doesn't implement the SPI), all properties are masked.
  • Unknown (potentially misspelled) properties can also be treated as redactable.

This PR includes an implementation of the new SPI for the PostgreSQL connector. Once we confirm that the approach is correct, we will apply it to the remaining connectors.
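The masking default described above can be sketched as follows. This is only an illustration under stated assumptions: `CatalogPropertyRedactor`, its `redact` method, the `"***"` mask, and the use of an `Optional` selector to model "the connector does not implement the SPI" are all hypothetical and do not come from this PR.

```java
import java.util.Map;
import java.util.Optional;
import java.util.Set;
import java.util.TreeMap;
import java.util.function.Function;

// Hypothetical engine-side helper; none of these names come from the PR.
final class CatalogPropertyRedactor
{
    private static final String MASK = "***";

    private CatalogPropertyRedactor() {}

    /**
     * @param properties catalog properties exactly as supplied by the user
     * @param selector maps the provided property names to the subset that should be masked;
     *        empty models a connector that does not implement the SPI
     */
    static Map<String, String> redact(
            Map<String, String> properties,
            Optional<Function<Set<String>, Set<String>>> selector)
    {
        // Without a selector, every provided property is treated as redactable.
        Set<String> toMask = selector
                .map(function -> function.apply(properties.keySet()))
                .orElse(properties.keySet());

        Map<String, String> result = new TreeMap<>();
        properties.forEach((name, value) ->
                result.put(name, toMask.contains(name) ? MASK : value));
        return result;
    }
}
```

With an empty selector both `connection-url` and `connection-password` would be masked; with a selector that returns only `connection-password`, the URL stays readable.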

Here is a PR demonstrating how the new SPI could be used to mask security-sensitive properties in queries related to creating catalogs: #24563.

Additional context and related issues

Resolves #22887.

Release notes

( ) This is not user-visible or is docs only, and no release notes are required.
( ) Release notes are required. Please propose a release note for me.
(x) Release notes are required, with the following suggested text:

## SPI
* Add connector SPI for returning redactable properties ({issue}`22887`)

The SPI will be used by the engine to redact security-sensitive
information in statements that manage catalogs. It has been added at the
connector factory level, rather than the connector level, to allow more
flexibility in retrieving properties. In some cases, properties must be
redacted before a connector is initialized, for example when a new
catalog is created with the CREATE CATALOG statement.
Exposed properties fall into one of the following categories: they are
either explicitly marked as security-sensitive or are unknown. The
connector assumes that unknown properties might be misspelled
security-sensitive properties.
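
A minimal sketch of the two categories described above, assuming the connector already knows which property names it defines and which of those are security-sensitive. The class `RedactableSelector`, the method `redactableNames`, and both input sets are hypothetical; the PR's actual SPI method signature is not shown on this page.

```java
import java.util.Set;
import java.util.TreeSet;

// Hypothetical connector-side selection logic, not the PR's implementation.
final class RedactableSelector
{
    private RedactableSelector() {}

    /**
     * @param provided property names supplied by the user
     * @param knownNames every property name the connector defines (assumed input)
     * @param sensitiveNames the subset of knownNames marked security-sensitive (assumed input)
     */
    static Set<String> redactableNames(Set<String> provided, Set<String> knownNames, Set<String> sensitiveNames)
    {
        Set<String> result = new TreeSet<>();
        for (String name : provided) {
            // An unknown name may be a misspelled sensitive property, so it is redacted too.
            boolean unknown = !knownNames.contains(name);
            if (unknown || sensitiveNames.contains(name)) {
                result.add(name);
            }
        }
        return result;
    }
}
```

For example, a misspelled `connection-pasword` is not a known name, so it would be returned alongside the correctly spelled `connection-password`, while `connection-url` would not.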

The purpose of the included test is to identify security-sensitive
properties that may be used by the connector. It uses the output
generated by the maven-dependency-plugin, configured in the connector's
pom.xml file. This output contains the connector's runtime classpath,
which is then scanned to identify all property names annotated with
@Config. Scanning the classpath ensures that all configuration classes
are included, even those used conditionally.
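
The property-collection step can be sketched with plain reflection. This sketch does not scan a classpath or read the maven-dependency-plugin output; it only shows how names could be collected once the candidate classes are in hand. `SketchConfig` and `SketchSecuritySensitive` are local stand-ins for Airlift's `@Config` and `@ConfigSecuritySensitive`, and `ExampleJdbcConfig` and `ConfigPropertyScanner` are hypothetical.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Method;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

// Local stand-in for Airlift's @Config (assumption: the value holds the property name).
@Retention(RetentionPolicy.RUNTIME)
@Target(ElementType.METHOD)
@interface SketchConfig
{
    String value();
}

// Local stand-in for Airlift's @ConfigSecuritySensitive.
@Retention(RetentionPolicy.RUNTIME)
@Target(ElementType.METHOD)
@interface SketchSecuritySensitive {}

// Hypothetical configuration class used only to exercise the scanner.
final class ExampleJdbcConfig
{
    private String url;
    private String password;

    @SketchConfig("connection-url")
    public ExampleJdbcConfig setUrl(String url)
    {
        this.url = url;
        return this;
    }

    @SketchConfig("connection-password")
    @SketchSecuritySensitive
    public ExampleJdbcConfig setPassword(String password)
    {
        this.password = password;
        return this;
    }
}

final class ConfigPropertyScanner
{
    private ConfigPropertyScanner() {}

    // Collects the names of properties whose setters carry both annotations.
    static Set<String> securitySensitiveNames(List<Class<?>> configClasses)
    {
        Set<String> names = new TreeSet<>();
        for (Class<?> configClass : configClasses) {
            for (Method method : configClass.getMethods()) {
                SketchConfig config = method.getAnnotation(SketchConfig.class);
                if (config != null && method.isAnnotationPresent(SketchSecuritySensitive.class)) {
                    names.add(config.value());
                }
            }
        }
        return names;
    }
}
```

A test along these lines could compare the collected names with what the connector reports as security-sensitive and fail when a new sensitive property is not covered.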

public interface ConnectorFactory
{
    String getName();

    @CheckReturnValue
    Connector create(String catalogName, Map<String, String> config, ConnectorContext context);

    /**
     * Extracts property names from the provided set that may include security-sensitive

nit: Extracts -> Returns

feels clearer to me.

getRedactablePropertyNames -> getSecuritySensitivePropertyNames (to align with Airlift's ConfigurationMetadata#isSecuritySensitive and @ConfigSecuritySensitive)? WDYT?

I had the exact same thought about renaming to security sensitive

in commit message: even those used conditionally. -> even those used conditionally or contributed by other modules.

Am I right that scanning the classpath will also include cases where properties are contributed from other modules?

@@ -34,4 +53,97 @@ public void testCreateConnector()
                        "bootstrap.quiet", "true"),
                new TestingPostgreSqlConnectorContext()).shutdown();
    }

    @Test
    void testUnknownPropertiesAreRedactable()

testUnknownPropertiesAreRedactable -> testUnknownPropertiesAreSecuritySensitive (if you decide to change the SPI method name). Here and elsewhere.

    {
        checkArgument(!isNullOrEmpty(name), "name is null or empty");
        this.name = name;
        this.module = requireNonNull(module, "module is null");
        Set<Class<?>> configClasses = ImmutableSet.<Class<?>>builder()

Instead of attempting to list every configuration class, I think we should modify ConfigurationFactory in Airlift to extract the properties for us. I'm thinking (just thoughts after a brief look) that we could have a method that extracts all properties from a set of modules and classifies them into used, unused, and unknown, with used and unused further sub-classified as secure or insecure.
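
One possible reading of that classification, sketched without any Airlift API. The interpretation is an assumption: "used" means defined and provided, "unused" means defined but not provided, and "unknown" means provided but not defined. `PropertyClassifier`, `Category`, and the `defined` map (property name to whether it is security-sensitive) are hypothetical.

```java
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;

// Hypothetical classifier; this is not an existing ConfigurationFactory method.
final class PropertyClassifier
{
    enum Category
    {
        USED_SECURE, USED_INSECURE, UNUSED_SECURE, UNUSED_INSECURE, UNKNOWN
    }

    private PropertyClassifier() {}

    /**
     * @param provided property names supplied by the user
     * @param defined property names the modules define, mapped to true when security-sensitive
     */
    static Map<String, Category> classify(Set<String> provided, Map<String, Boolean> defined)
    {
        Map<String, Category> result = new TreeMap<>();
        defined.forEach((name, sensitive) -> {
            if (provided.contains(name)) {
                result.put(name, sensitive ? Category.USED_SECURE : Category.USED_INSECURE);
            }
            else {
                result.put(name, sensitive ? Category.UNUSED_SECURE : Category.UNUSED_INSECURE);
            }
        });
        for (String name : provided) {
            // Anything the modules do not define falls into the unknown bucket.
            result.putIfAbsent(name, Category.UNKNOWN);
        }
        return result;
    }
}
```

Under this reading, the redactable set would be the `USED_SECURE` and `UNKNOWN` entries.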

Development

Successfully merging this pull request may close these issues.

Add system to identify security sensitive catalog properties
3 participants