-
Notifications
You must be signed in to change notification settings - Fork 75
fix: Ensure streams with hardcoded schema
attributes (e.g. REST and GraphQL streams) can have their schema overridden by the catalog
#2940
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Reviewer's Guide by SourceryThis pull request modifies the stream schema handling to ensure that the schema provided in the catalog overrides the default stream schema. This change ensures that record processing and replication key validation use the most up-to-date schema from the input catalog. Sequence diagram for applying the catalog and using the effective schemasequenceDiagram
participant Tap
participant Stream
participant Catalog
Tap->>Stream: apply_catalog(catalog)
activate Stream
Stream->>Catalog: catalog_entry.schema.to_dict()
Catalog-->>Stream: schema_dict
Stream->>Stream: _input_schema = schema_dict
deactivate Stream
Tap->>Stream: _generate_record_messages()
activate Stream
Stream->>Stream: schema = effective_schema
Stream->>Stream: conform_record_data_types(record, schema)
deactivate Stream
Updated class diagram for the Stream classclassDiagram
class Stream {
- _tap: Tap
- _tap_state: dict
- _tap_input_catalog: singer.Catalog | None
- _input_schema: dict | None
- _stream_maps: list[StreamMap] | None
- forced_replication_method: str | None
- _replication_key: str | None
+ effective_schema: dict
+ apply_catalog(catalog: singer.Catalog) : None
+ _generate_record_messages()
}
note for Stream._input_schema "New attribute to store the input schema from the catalog"
note for Stream.effective_schema "Returns the input schema if available, otherwise returns the stream's schema"
note for Stream.apply_catalog "Applies the catalog to the stream, including setting the input schema"
note for Stream._generate_record_messages "Uses the effective schema for record processing"
File-Level Changes
Possibly linked issues
Tips and commandsInteracting with Sourcery
Customizing Your ExperienceAccess your dashboard to:
Getting Help
|
Codecov ReportAll modified and coverable lines are covered by tests ✅
Additional details and impacted files@@ Coverage Diff @@
## main #2940 +/- ##
==========================================
+ Coverage 91.49% 91.50% +0.01%
==========================================
Files 63 63
Lines 5292 5300 +8
Branches 677 677
==========================================
+ Hits 4842 4850 +8
Misses 318 318
Partials 132 132 ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
CodSpeed Performance ReportMerging #2940 will not alter performanceComparing Summary
|
…alog-schema-private
…alog-schema-private
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hey @edgarrmondragon - I've reviewed your changes - here's some feedback:
Overall Comments:
- Consider renaming
_input_schema
to_catalog_schema
for clarity. - It might be helpful to add a comment explaining why
_get_schema
is needed.
Here's what I looked at during the review
- 🟢 General issues: all looks good
- 🟢 Security: all looks good
- 🟢 Testing: all looks good
- 🟢 Complexity: all looks good
- 🟢 Documentation: all looks good
Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.
…alog-schema-private
…alog-schema-private
…alog-schema-private
@sourcery-ai review |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hey @edgarrmondragon - I've reviewed your changes - here's some feedback:
Overall Comments:
- Consider renaming
effective_schema
toget_effective_schema
to clarify that it's a computed property. - It might be clearer to set
self._input_schema
tocatalog_entry.schema.to_dict()
in the__init__
method rather than inapply_catalog
.
Here's what I looked at during the review
- 🟢 General issues: all looks good
- 🟢 Security: all looks good
- 🟢 Testing: all looks good
- 🟢 Complexity: all looks good
- 🟢 Documentation: all looks good
Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.
schema
attributes (e.g. REST and GraphQL streams) can have their schema overridden by the catalog
Related
Summary by Sourcery
Modify stream schema handling to allow catalog-provided schema to override the default stream schema
Bug Fixes:
Enhancements:
_get_schema()
to dynamically retrieve the stream's schema, prioritizing input catalog schema over the default schema📚 Documentation preview 📚: https://meltano-sdk--2940.org.readthedocs.build/en/2940/
Summary by Sourcery
Improve stream schema handling to prioritize input catalog schema over default stream schema
Bug Fixes:
Enhancements:
effective_schema
property to dynamically retrieve the stream's schema, prioritizing input catalog schema