42 changes: 37 additions & 5 deletions docs/content/docs/connectors/table/formats/protobuf.md
@@ -151,10 +151,8 @@ Format Options
<td>
If this value is set to true, the format will read empty values as the default values defined in the proto file.
If the value is set to false, the format will generate null values if the data element does not exist in the binary protobuf message.
-If proto syntax is proto3, users need to set this to true when using protobuf versions lower than 3.15 as older versions do not support
-checking for field presence which can cause runtime compilation issues. Additionally, primitive types will be set to default values
-instead of null as field presence cannot be checked for them. Please be aware that setting this to true will cause the deserialization
-performance to be much slower depending on schema complexity and message size.
+With Flink's current protobuf version (4.32.1), field presence is properly supported for proto3, allowing null handling for non-primitive types.
+Please be aware that setting this to true will cause the deserialization performance to be much slower depending on schema complexity and message size.
</td>
</tr>
<tr>
@@ -291,4 +289,38 @@ OneOf field
In the serialization process, there's no guarantee that the Flink fields of the same one-of group only contain at most one valid value.
When serializing, each field is set in the order of the Flink schema, so within the same one-of group a field that appears later in the schema overrides one that appears earlier.
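
The override rule described above can be sketched in plain Java. This is only an illustration, not Flink's serializer code: the class `OneOfOverrideSketch`, the method `winningField`, and the assumption that null Flink fields are skipped rather than set are all hypothetical. It walks the members of one one-of group in schema order and reports which member would remain set in the message.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Illustrative only: models "the last non-null member set wins" for one one-of group. */
public class OneOfOverrideSketch {

    /**
     * Walks the members of a single one-of group in schema order and returns the
     * name of the member that remains set, or null if every member is null.
     * Assumption: null values are skipped instead of being written.
     */
    public static String winningField(LinkedHashMap<String, Object> fieldsInSchemaOrder) {
        String winner = null;
        for (Map.Entry<String, Object> field : fieldsInSchemaOrder.entrySet()) {
            if (field.getValue() != null) {
                // Setting a one-of member clears whichever member was set before it.
                winner = field.getKey();
            }
        }
        return winner;
    }

    public static void main(String[] args) {
        LinkedHashMap<String, Object> row = new LinkedHashMap<>();
        row.put("a", 1);      // first member of the group, set first
        row.put("b", "text"); // second member, set afterwards and overrides "a"
        System.out.println(winningField(row)); // prints "b"
    }
}
```

If a row may carry values for several members of the same group, nulling out all but the intended one before writing avoids relying on this ordering.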

-You can refer to [Language Guide (proto2)](https://developers.google.com/protocol-buffers/docs/proto) or [Language Guide (proto3)](https://developers.google.com/protocol-buffers/docs/proto3) for more information about Protobuf types.
+Supported Protobuf Versions
+------------
+
+Flink uses protobuf-java 4.32.1 (corresponding to Protocol Buffers version 32), which includes support for:
+
+- **Proto2 and Proto3 syntax**: Traditional `syntax = "proto2"` and `syntax = "proto3"` definitions
+- **Protobuf Editions**: The new `edition = "2023"` and `edition = "2024"` syntax introduced in Protocol Buffers v27+
+- **Improved proto3 field presence detection**: Better handling of optional fields without the limitations of older protobuf versions
+
+### Using Protobuf Editions
+
+Protobuf Editions provide a unified syntax that combines proto2 and proto3 functionality. If you're using Editions in your `.proto` files, Flink fully supports them:
+
+```
+edition = "2023";
+package com.example;
+option java_package = "com.example";
+option java_multiple_files = true;
+
+message SimpleTest {
+  int64 uid = 1;
+  string name = 2 [features.field_presence = EXPLICIT];
+  // ... rest of your message definition
+}
+```
+
+Editions allow fine-grained control over feature behavior at the file, message, or field level, while maintaining backward compatibility with proto2 and proto3. For more information, see the [Protobuf Editions documentation](https://protobuf.dev/editions/overview/).
+
+Additional Resources
+----------------
+For more information about Protocol Buffers, refer to:
+- [Language Guide (proto2)](https://developers.google.com/protocol-buffers/docs/proto)
+- [Language Guide (proto3)](https://developers.google.com/protocol-buffers/docs/proto3)
+- [Language Guide (Editions)](https://protobuf.dev/programming-guides/editions/) - for the new Editions syntax
+- [Protobuf Editions Overview](https://protobuf.dev/editions/overview/) - understand the motivation and benefits of Editions
2 changes: 1 addition & 1 deletion docs/content/release-notes/flink-2.1.md
@@ -182,4 +182,4 @@ Bump flink-shaded version to 20.0 to support Smile format.
##### [FLINK-37760](https://issues.apache.org/jira/browse/FLINK-37760)

Bump parquet version to 1.15.3 to resolve parquet-avro module
-vulnerability found in [CVE-2025-30065](https://nvd.nist.gov/vuln/detail/CVE-2025-30065).
+vulnerability found in [CVE-2025-30065](https://nvd.nist.gov/vuln/detail/CVE-2025-30065).
@@ -26,7 +26,6 @@
import org.apache.parquet.hadoop.ParquetWriter;
import org.apache.parquet.hadoop.api.WriteSupport;
import org.apache.parquet.io.OutputFile;
-import org.apache.parquet.proto.ProtoWriteSupport;

/** Convenience builder for creating {@link ParquetWriterFactory} instances for Protobuf classes. */
public class ParquetProtoWriters {
@@ -62,7 +61,8 @@ protected ParquetProtoWriterBuilder<T> self() {

         @Override
         protected WriteSupport<T> getWriteSupport(Configuration conf) {
-            return new ProtoWriteSupport<>(clazz);
+            // Use patched implementation compatible with protobuf 4.x
+            return new PatchedProtoWriteSupport<>(clazz);
         }
     }
