Kafka local aws glue registry support #5377

Open
katelfox opened this issue Jan 30, 2025 · 3 comments
Labels
enhancement New feature or request good first issue Good for newcomers

Comments

@katelfox

Is your feature request related to a problem? Please describe.
I am unable to run Data Prepper locally in a Docker container that points to a Glue schema registry running in another Docker container (motoserver).

Describe the solution you'd like
Supporting this only requires allowing the AWS registry URL to be specified and, when it is, populating AWSSchemaRegistryConstants.AWS_ENDPOINT in the configuration.

Describe alternatives you've considered (Optional)
At this point I don't know of any way to consume messages from a local Kafka with a local AWS Glue schema registry.


@dlvenable
Member

It sounds like this could be a new configuration that we could add and use. @katelfox, would you be able to add this feature?

@katelfox
Author

katelfox commented Feb 6, 2025

Maybe. My first attempt wasn't successful, but I hope to have time to go back and make it work. Would someone be able to help or provide guidance? The other thing I've found so far is that a dummy user name and password need to be passed in, and I'm not sure of the best way to do that. For now I am adding them as environment variables when running the Docker container.

@dlvenable
Member

Hello @katelfox,

I think we would need to set the endpoint by adding a new line into this section.

configs.put(AWSSchemaRegistryConstants.AVRO_RECORD_TYPE, AvroRecordType.GENERIC_RECORD.getName());
configs.put(AWSSchemaRegistryConstants.CACHE_TIME_TO_LIVE_MILLIS, "86400000");
configs.put(AWSSchemaRegistryConstants.CACHE_SIZE, "10");
configs.put(AWSSchemaRegistryConstants.COMPATIBILITY_SETTING, Compatibility.FULL);
glueDeserializer = new GlueSchemaRegistryKafkaDeserializer(awsGlueCredentialsProvider, configs);

We also need to make this configurable. That means either adding a new configuration or using one that already exists. I think we can use the existing registry_url configuration. I believe it is not used for Glue, so we could make use of it.

You should be able to get it using kafkaConsumerConfig.getSchemaConfig().getRegistryURL() or something similar to that.
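
As a rough sketch of the idea above (not the actual Data Prepper change), the endpoint would only be added to the map when a registry URL is configured. To keep the sketch self-contained it uses a plain map and a local constant in place of AWSSchemaRegistryConstants.AWS_ENDPOINT. The class name, the helper name putEndpointIfPresent, the key value "endpoint", and the example URL are all assumptions for illustration. In the real code the URL would come from something like kafkaConsumerConfig.getSchemaConfig().getRegistryURL().

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical, self-contained sketch; not part of Data Prepper.
class GlueEndpointConfigSketch {
    // Stand-in for AWSSchemaRegistryConstants.AWS_ENDPOINT.
    // The literal key value here is an assumption.
    static final String AWS_ENDPOINT_KEY = "endpoint";

    // Adds the endpoint override only when a registry URL was configured,
    // so the default AWS endpoint is still used when it is absent.
    static void putEndpointIfPresent(Map<String, Object> configs, String registryUrl) {
        if (registryUrl != null && !registryUrl.trim().isEmpty()) {
            configs.put(AWS_ENDPOINT_KEY, registryUrl.trim());
        }
    }

    public static void main(String[] args) {
        Map<String, Object> configs = new HashMap<>();
        // Example value only: a local mock registry such as motoserver.
        putEndpointIfPresent(configs, "http://localhost:5000");
        System.out.println(configs);
    }
}
```

Leaving the key out when registry_url is not set should keep existing Glue pipelines unchanged, which is why the sketch checks for null or blank values first.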


2 participants