- 
                Notifications
    You must be signed in to change notification settings 
- Fork 3.4k
Add MAP type support to the ClickHouse connector #26957
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Conversation
| Reviewer's GuideThis PR implements native ClickHouse MAP type support by extending the client’s type mapping and write mapping logic. It injects type operators, refactors toColumnMapping to route MAP column types through a new mapColumnMapping method that builds a Trino MapType from ClickHouse key/value definitions, and adds corresponding read/write functions. Connector behavior flags and tests are updated to cover MAP round-trips. Sequence diagram for MAP type read/write round-tripsequenceDiagram
    participant Client
    participant ClickHouseClient
    participant ClickHouseDB
    Client->>ClickHouseClient: Write MAP column
    ClickHouseClient->>ClickHouseDB: setObject(index, mapValue)
    ClickHouseDB-->>ClickHouseClient: Returns MAP column data
    ClickHouseClient->>Client: Read MAP column (buildMapValue)
Entity relationship diagram for ClickHouse MAP type mappingerDiagram
    CLICKHOUSE_COLUMN {
        string name
        ClickHouseDataType dataType
        int precision
        int scale
    }
    TRINO_MAP_TYPE {
        Type keyType
        Type valueType
    }
    COLUMN_MAPPING {
        Type type
        ObjectReadFunction readFunction
        ObjectWriteFunction writeFunction
    }
    CLICKHOUSE_COLUMN ||--o| TRINO_MAP_TYPE : "maps to"
    TRINO_MAP_TYPE ||--o| COLUMN_MAPPING : "used in"
    CLICKHOUSE_COLUMN ||--o| COLUMN_MAPPING : "key/value info"
Class diagram for updated ClickHouseClient MAP type supportclassDiagram
    class ClickHouseClient {
        - TypeOperators typeOperators
        - Type uuidType
        - Type ipAddressType
        + Optional<ColumnMapping> toColumnMapping(ConnectorSession, ConnectorColumnHandle, JdbcTypeHandle)
        + Optional<ColumnMapping> toColumnMapping(ConnectorSession, String, int, Optional<Integer>, Optional<Integer>)
        + Optional<ColumnMapping> mapColumnMapping(ConnectorSession, ClickHouseColumn, ClickHouseColumn)
        + WriteMapping toWriteMapping(ConnectorSession, Type)
        + static ObjectWriteFunction mapWriteFunction()
        + static void writeValue(Type, BlockBuilder, Object)
    }
    class MapType {
        + Type getKeyType()
        + Type getValueType()
    }
    class ColumnMapping {
        + static ColumnMapping objectMapping(Type, ObjectReadFunction, ObjectWriteFunction)
    }
    class ObjectReadFunction {
        + static ObjectReadFunction of(Class, Function)
    }
    class ObjectWriteFunction {
        + static ObjectWriteFunction of(Class, BiConsumer)
    }
    ClickHouseClient --> MapType : uses
    ClickHouseClient --> ColumnMapping : returns
    ClickHouseClient --> ObjectReadFunction : uses
    ClickHouseClient --> ObjectWriteFunction : uses
File-Level Changes
 Tips and commandsInteracting with Sourcery
 Customizing Your ExperienceAccess your dashboard to: 
 Getting Help
 | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hey there - I've reviewed your changes and they look great!
Prompt for AI Agents
Please address the comments from this code review:
## Individual Comments
### Comment 1
<location> `plugin/trino-clickhouse/src/main/java/io/trino/plugin/clickhouse/ClickHouseClient.java:659-664` </location>
<code_context>
+        return columnMapping;
+    }
+
+    private Optional<ColumnMapping> toColumnMapping(
+            ConnectorSession session,
+            String typeName,
+            int jdbcType,
+            Optional<Integer> decimalDigits,
+            Optional<Integer> columnSize)
+    {
         ClickHouseVersion version = getClickHouseServerVersion(session);
</code_context>
<issue_to_address>
**issue:** Consider documenting the contract for 'toColumnMapping' regarding Optional parameters.
Since 'orElseThrow()' is called on these Optionals, missing values will cause exceptions. Please verify that all callers supply these parameters, or update the method to handle missing values appropriately.
</issue_to_address>
### Comment 2
<location> `plugin/trino-clickhouse/src/main/java/io/trino/plugin/clickhouse/ClickHouseClient.java:707-708` </location>
<code_context>
                 return Optional.of(varbinaryColumnMapping());
             case UUID:
                 return Optional.of(uuidColumnMapping());
+            case Map:
+                return mapColumnMapping(session, column.getKeyInfo(), column.getValueInfo());
             default:
                 // no-op
</code_context>
<issue_to_address>
**suggestion:** Map type handling may not cover all edge cases for key/value types.
Since unsupported key or value types result in an empty return, this may cause silent failures. Please consider adding explicit error handling or logging to improve user feedback.
Suggested implementation:
```java
        switch (columnDataType) {
            case Bool:
                return Optional.of(varbinaryColumnMapping());
            case UUID:
                return Optional.of(uuidColumnMapping());
            case Map: {
                Optional<ClickHouseColumnMapping> mapping = mapColumnMapping(session, column.getKeyInfo(), column.getValueInfo());
                if (!mapping.isPresent()) {
                    log.warn("Unsupported key or value type for Map column: keyInfo={}, valueInfo={}", column.getKeyInfo(), column.getValueInfo());
                    // Alternatively, you could throw an exception here if you want to fail fast:
                    // throw new IllegalArgumentException("Unsupported key or value type for Map column: keyInfo=" + column.getKeyInfo() + ", valueInfo=" + column.getValueInfo());
                }
                return mapping;
            }
            default:
                // no-op
        }
```
- Ensure that a logger named `log` is available in the class. If not, add:
  `private static final Logger log = Logger.get(ClickHouseClient.class);`
- Decide whether you want to log a warning (as above) or throw an exception for unsupported types.
</issue_to_address>
### Comment 3
<location> `plugin/trino-clickhouse/src/main/java/io/trino/plugin/clickhouse/ClickHouseClient.java:713` </location>
<code_context>
         }
-        switch (typeHandle.jdbcType()) {
+        switch (jdbcType) {
             case Types.TINYINT:
                 return Optional.of(tinyintColumnMapping());
</code_context>
<issue_to_address>
**issue (bug_risk):** Switching on 'jdbcType' instead of 'typeHandle.jdbcType()' may introduce subtle bugs.
Please verify that 'jdbcType' and 'typeHandle.jdbcType()' will always have matching values to prevent incorrect type mapping.
</issue_to_address>
### Comment 4
<location> `plugin/trino-clickhouse/src/test/java/io/trino/plugin/clickhouse/TestClickHouseConnectorTest.java:668-675` </location>
<code_context>
         }
     }
+    @Test
+    @Override
+    public void testInsertMap()
+    {
+        // TODO: Add more types here
</code_context>
<issue_to_address>
**suggestion (testing):** Add more comprehensive MAP type tests, including edge cases.
Please add tests for empty maps, multiple key-value pairs, null values, complex key/value types, and error conditions like duplicate keys or unsupported types to improve coverage.
```suggestion
    @Test
    @Override
    public void testInsertMap()
    {
        // Basic types
        testMapRoundTrip("INTEGER", "2");
        testMapRoundTrip("VARCHAR", "CAST('foobar' AS VARCHAR)");
        // Empty map
        try (TestTable table = newTrinoTable("test_insert_empty_map_", "(col map(INTEGER, INTEGER))")) {
            assertUpdate("INSERT INTO " + table.getName() + " VALUES map(ARRAY[], ARRAY[])", 1);
            assertThat(query("SELECT cardinality(col) FROM " + table.getName()))
                    .matches("VALUES 0");
        }
        // Multiple key-value pairs
        try (TestTable table = newTrinoTable("test_insert_multi_map_", "(col map(INTEGER, VARCHAR))")) {
            assertUpdate("INSERT INTO " + table.getName() + " VALUES map(ARRAY[1,2,3], ARRAY['a','b','c'])", 1);
            assertThat(query("SELECT col[1], col[2], col[3] FROM " + table.getName()))
                    .matches("VALUES ('a', 'b', 'c')");
        }
        // Null values
        try (TestTable table = newTrinoTable("test_insert_null_value_map_", "(col map(INTEGER, VARCHAR))")) {
            assertUpdate("INSERT INTO " + table.getName() + " VALUES map(ARRAY[1,2], ARRAY['x', NULL])", 1);
            assertThat(query("SELECT col[1], col[2] FROM " + table.getName()))
                    .matches("VALUES ('x', NULL)");
        }
        // Complex key/value types (e.g., ARRAY as value)
        try (TestTable table = newTrinoTable("test_insert_complex_map_", "(col map(INTEGER, ARRAY(INTEGER)))")) {
            assertUpdate("INSERT INTO " + table.getName() + " VALUES map(ARRAY[1,2], ARRAY[ARRAY[10,20], ARRAY[30]])", 1);
            assertThat(query("SELECT col[1], col[2] FROM " + table.getName()))
                    .matches("VALUES (ARRAY[10,20], ARRAY[30])");
        }
        // Error: duplicate keys
        try (TestTable table = newTrinoTable("test_insert_duplicate_key_map_", "(col map(INTEGER, INTEGER))")) {
            assertQueryFails(
                    "INSERT INTO " + table.getName() + " VALUES map(ARRAY[1,1], ARRAY[10,20])",
                    "Duplicate keys are not allowed in map");
        }
        // Error: unsupported key type (e.g., map as key)
        try (TestTable table = newTrinoTable("test_insert_unsupported_key_map_", "(col map(map(INTEGER, INTEGER), INTEGER))")) {
            assertQueryFails(
                    "INSERT INTO " + table.getName() + " VALUES map(ARRAY[map(ARRAY[1], ARRAY[2])], ARRAY[10])",
                    ".*map type is not supported as map key.*");
        }
    }
```
</issue_to_address>Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.
| private Optional<ColumnMapping> toColumnMapping( | ||
| ConnectorSession session, | ||
| String typeName, | ||
| int jdbcType, | ||
| Optional<Integer> decimalDigits, | ||
| Optional<Integer> columnSize) | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
issue: Consider documenting the contract for 'toColumnMapping' regarding Optional parameters.
Since 'orElseThrow()' is called on these Optionals, missing values will cause exceptions. Please verify that all callers supply these parameters, or update the method to handle missing values appropriately.
| case Map: | ||
| return mapColumnMapping(session, column.getKeyInfo(), column.getValueInfo()); | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
suggestion: Map type handling may not cover all edge cases for key/value types.
Since unsupported key or value types result in an empty return, this may cause silent failures. Please consider adding explicit error handling or logging to improve user feedback.
Suggested implementation:
        switch (columnDataType) {
            case Bool:
                return Optional.of(varbinaryColumnMapping());
            case UUID:
                return Optional.of(uuidColumnMapping());
            case Map: {
                Optional<ClickHouseColumnMapping> mapping = mapColumnMapping(session, column.getKeyInfo(), column.getValueInfo());
                if (!mapping.isPresent()) {
                    log.warn("Unsupported key or value type for Map column: keyInfo={}, valueInfo={}", column.getKeyInfo(), column.getValueInfo());
                    // Alternatively, you could throw an exception here if you want to fail fast:
                    // throw new IllegalArgumentException("Unsupported key or value type for Map column: keyInfo=" + column.getKeyInfo() + ", valueInfo=" + column.getValueInfo());
                }
                return mapping;
            }
            default:
                // no-op
        }- Ensure that a logger named logis available in the class. If not, add:
 private static final Logger log = Logger.get(ClickHouseClient.class);
- Decide whether you want to log a warning (as above) or throw an exception for unsupported types.
| } | ||
|  | ||
| switch (typeHandle.jdbcType()) { | ||
| switch (jdbcType) { | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
issue (bug_risk): Switching on 'jdbcType' instead of 'typeHandle.jdbcType()' may introduce subtle bugs.
Please verify that 'jdbcType' and 'typeHandle.jdbcType()' will always have matching values to prevent incorrect type mapping.
3cab91d    to
    b520b7c      
    Compare
  
    | .matches("VALUES " + value); | ||
| } | ||
| } | ||
|  | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you please update BaseClickHouseTypeMapping as well? We use ***TypeMapping class to test type mapping in JDBC-based connectors.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yep! Added tests for all the primitives at least. Got all of them working except Timestamp w/ TZ, not exactly sure why but the precision for a timestamp in a Map is coming back as 29 but it should be 0.
Going to add some tests for nested maps tomorrow.
| return Optional.of(varbinaryColumnMapping()); | ||
| case UUID: | ||
| return Optional.of(uuidColumnMapping()); | ||
| case Map: | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you update Type mapping section in clickhouse.md?
|  | ||
| private void testMapRoundTrip(String valueType, String value) | ||
| { | ||
| try (TestTable table = newTrinoTable("test_insert_map_", "(col map(INTEGER, %s) NOT NULL)".formatted(valueType))) { | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This fails without the NOT NULL right now, gotta figure that out or at least improve the error message
17fa9c9    to
    db32831      
    Compare
  
    db32831    to
    889c0a6      
    Compare
  
    | { | ||
| String jdbcTypeName = typeHandle.jdbcTypeName() | ||
| .orElseThrow(() -> new TrinoException(JDBC_ERROR, "Type name is missing: " + typeHandle)); | ||
|  | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
unrelated change.
| throw new TrinoException(GENERIC_INTERNAL_ERROR, "Invalid InetAddress length: " + address.length); | ||
| } | ||
|  | ||
| byte[] bytes = parseIpAddressBytes(address); | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this refactoring needed for the current PR?
Description
Add
MAPtype support to the ClickHouse connector.https://clickhouse.com/docs/sql-reference/data-types/map
ClickHouse maps are multi-value, but when I tried that out I only got one value in the client.
I'll add docs if this looks good to reviewers.
Additional context and related issues
#7103
Release notes
( ) This is not user-visible or is docs only, and no release notes are required.
( ) Release notes are required. Please propose a release note for me.
( ) Release notes are required, with the following suggested text:
Summary by Sourcery
Add MAP type support to the ClickHouse connector by introducing read and write mappings for MapType, refactoring column mapping logic, and adding relevant tests.
New Features:
Enhancements:
Tests: