Description
Is your feature request related to a problem? Please describe.
Currently, only the neosync_email transformer has an invalid_email_action parameter to handle null/invalid input values. Other Neosync transformers (neosync_firstname, neosync_lastname, neosync_fullname, neosync_string) lack this capability, which causes:
- NOT NULL constraint violations when transformers return null for invalid/null input data
- Inconsistent behavior across the Neosync transformer family
- Configuration complexity where some transformers handle edge cases and others don't
- Data pipeline failures requiring workarounds with template transformers
Describe the solution you'd like
Add a generic invalid_value_action parameter to all Neosync transformers with consistent behavior and values:
invalid_value_action: generate # generate, passthrough, null, reject
Parameter Values:
- generate (default): Generate a valid transformed value even for null/invalid input
- passthrough: Keep the original value unchanged
- null: Explicitly set to null
- reject: Throw an error and stop processing
Transformers that should support this:
- neosync_email (already has invalid_email_action - could be aliased to invalid_value_action)
- neosync_firstname
- neosync_lastname
- neosync_fullname
- neosync_string
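To make the proposed values and the default concrete, here is a minimal sketch of how the parameter could be parsed and validated. It is written in Python purely for illustration and is not pgstream code; the names InvalidValueAction and parse_invalid_value_action are hypothetical.

```python
from enum import Enum
from typing import Optional


class InvalidValueAction(Enum):
    """Hypothetical enum mirroring the four proposed parameter values."""

    GENERATE = "generate"
    PASSTHROUGH = "passthrough"
    NULL = "null"
    REJECT = "reject"


def parse_invalid_value_action(raw: Optional[str]) -> InvalidValueAction:
    """Map a raw config value to an action, defaulting to generate when unset."""
    if raw is None or raw == "":
        # Proposed default from this issue: generate.
        return InvalidValueAction.GENERATE
    try:
        return InvalidValueAction(raw)
    except ValueError:
        allowed = ", ".join(a.value for a in InvalidValueAction)
        raise ValueError(
            f"invalid_value_action must be one of: {allowed}; got {raw!r}"
        ) from None
```

One detail a real implementation would need to check: most YAML parsers read an unquoted null as a null value rather than the string "null", so this sketch would treat `invalid_value_action: null` as unset and fall back to generate. Quoting the value ("null") avoids that.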
Example configuration:
transformations:
  table_transformers:
    - schema: public
      table: users
      column_transformers:
        first_name:
          name: neosync_firstname
          parameters:
            preserve_length: true
            invalid_value_action: generate
        last_name:
          name: neosync_lastname
          parameters:
            preserve_length: true
            invalid_value_action: generate
        email:
          name: neosync_email
          parameters:
            preserve_length: true
            preserve_domain: true
            invalid_value_action: generate # replaces invalid_email_action
Describe alternatives you've considered
- Current workaround: Using template transformers with conditional logic
  name: template
  parameters:
    template: >
      {{ if and (ne .GetValue nil) (ne .GetValue "") }}
      {{ faker "first_name" }}
      {{ else }}
      FirstName
      {{ end }}
  Problems: Verbose, inconsistent, requires template knowledge
- Transformer-specific parameters: Adding invalid_firstname_action, invalid_lastname_action, etc.
  Problems: Increases configuration complexity, inconsistent naming
- Global default handling: Setting a global policy for all transformers
  Problems: Less flexible, doesn't allow per-transformer customization
Additional context
This enhancement would:
- Improve consistency across all Neosync transformers
- Reduce configuration complexity with standardized parameter naming
- Prevent data pipeline failures from NOT NULL constraint violations
- Future-proof any new Neosync transformers added to pgstream
- Maintain backward compatibility by keeping invalid_email_action as an alias
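The backward-compatibility point could look roughly like the sketch below: the generic key wins, and the existing email-specific key is honoured as a fallback. This is an illustration in Python, not pgstream code. The function name resolve_action_param is hypothetical, and the sketch assumes both keys accept the same set of values.

```python
from typing import Any, Mapping, Optional

GENERIC_KEY = "invalid_value_action"
LEGACY_EMAIL_KEY = "invalid_email_action"  # existing neosync_email parameter


def resolve_action_param(params: Mapping[str, Any]) -> Optional[str]:
    """Return the configured action, preferring the generic key over the alias.

    Returns None when neither key is set, so the caller can apply its default.
    """
    generic = params.get(GENERIC_KEY)
    legacy = params.get(LEGACY_EMAIL_KEY)
    if generic is not None and legacy is not None and generic != legacy:
        # Assumption: conflicting settings are a configuration error
        # rather than something to resolve silently.
        raise ValueError(
            f"{GENERIC_KEY}={generic!r} conflicts with {LEGACY_EMAIL_KEY}={legacy!r}"
        )
    return generic if generic is not None else legacy
```

Rejecting conflicting values is one possible design choice; silently preferring the generic key would also work but could hide stale configuration.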
Benefits for common use cases:
- Database replication: Handles missing/corrupt data gracefully during snapshots
- Data anonymization: Ensures all sensitive fields get properly masked even with dirty data
- Testing environments: Provides realistic test data even when source data has gaps
- Compliance: Guarantees consistent data protection across all transformed fields
Implementation suggestion:
The generic invalid_value_action parameter could be implemented at the pgstream integration layer, similar to how invalid_email_action currently works, providing consistent null/invalid value handling across all Neosync transformers without requiring changes to the underlying Neosync library.
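As a rough sketch of what "at the integration layer" could mean, the wrapper below applies the action around any transformer function without touching the transformer itself. It is written in Python for illustration only and does not reflect pgstream's or Neosync's actual interfaces. The names with_invalid_value_action and is_invalid are hypothetical, and treating "invalid" as null or empty string is an assumption; each transformer would likely need its own validity check.

```python
from typing import Callable, Optional

ALLOWED_ACTIONS = ("generate", "passthrough", "null", "reject")


def is_invalid(value: Optional[str]) -> bool:
    # Assumption: a value counts as invalid when it is null or empty.
    return value is None or value == ""


def with_invalid_value_action(
    transform: Callable[[str], str],
    generate: Callable[[], str],
    action: str = "generate",
) -> Callable[[Optional[str]], Optional[str]]:
    """Wrap a transformer so null/invalid input is handled per the action."""
    if action not in ALLOWED_ACTIONS:
        raise ValueError(f"unknown invalid_value_action: {action!r}")

    def wrapped(value: Optional[str]) -> Optional[str]:
        if not is_invalid(value):
            return transform(value)  # valid input: normal transformation
        if action == "generate":
            return generate()  # produce a fresh value with no input
        if action == "passthrough":
            return value  # keep the original value unchanged
        if action == "null":
            return None  # explicitly set to null
        # Only "reject" remains: raise an error and stop processing.
        raise ValueError(f"invalid input value: {value!r}")

    return wrapped
```

For example, `with_invalid_value_action(str.upper, lambda: "Alex")` returns a function that upper-cases valid input and returns the placeholder "Alex" for null or empty input. With action set to "generate", a NOT NULL column would still receive a value.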