-
Notifications
You must be signed in to change notification settings - Fork 1.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[airflow
] Extend rule to check class attributes, methods, arguments (AIR302
)
#15083
Conversation
|
code | total | + violation | - violation | + fix | - fix |
---|---|---|---|---|---|
AIR302 | 7 | 7 | 0 | 0 | 0 |
7ea9b06
to
cbde476
Compare
6e665c4
to
3b8461f
Compare
@MichaReiser, Pinging in case we missed this one 🙂 Would be awesome if we could get this re-reviewed when bandwidth allowed. Thanks a lot! |
Any class in Airflow that inherits these class should not have these methods * `airflow.secrets.base_secrets.BaseSecretsBackend.get_conn_uri` → `airflow.secrets.base_secrets.BaseSecretsBackend.get_conn_value` * `airflow.secrets.base_secrets.BaseSecretsBackend.get_connections` → `airflow.secrets.base_secrets.BaseSecretsBackend.get_connection` * `airflow.hooks.base.BaseHook.get_connections` → use `get_connection` * `airflow.datasets.BaseDataset.iter_datasets` → `airflow.sdk.definitions.asset.BaseAsset.iter_assets` * `airflow.datasets.BaseDataset.iter_dataset_aliases` → `airflow.sdk.definitions.asset.BaseAsset.iter_asset_aliases`
…rules * `airflow.providers_manager.ProvidersManager.dataset_factories` → `airflow.providers_manager.ProvidersManager.asset_factories` * `airflow.providers_manager.ProvidersManager.dataset_uri_handlers` → `airflow.providers_manager.ProvidersManager.asset_uri_handlers` * `airflow.providers_manager.ProvidersManager.dataset_to_openlineage_converters` → `airflow.providers_manager.ProvidersManager.asset_to_openlineage_converters` * `airflow.lineage.hook.DatasetLineageInfo.dataset` → `airflow.lineage.hook.AssetLineageInfo.asset`
…y` is renamed as `max_active_tis_per_dag` in all airflow operators
…have been removed in AirflowPlugin
…nd its subclasses
* `airflow.operators.dagrun_operator.TriggerDagRunLink` → `airflow.operators.trigger_dagrun.TriggerDagRunLink` * `airflow.operators.dagrun_operator.TriggerDagRunOperator` → `airflow.operators.trigger_dagrun.TriggerDagRunOperator` * `airflow.operators.python_operator.BranchPythonOperator` → `airflow.operators.python.BranchPythonOperator` * `airflow.operators.python_operator.PythonOperator` → `airflow.operators.python.PythonOperator` * `airflow.operators.python_operator.PythonVirtualenvOperator` → `airflow.operators.python.PythonVirtualenvOperator` * `airflow.operators.python_operator.ShortCircuitOperator` → `airflow.operators.python.ShortCircuitOperator` * `airflow.operators.latest_only_operator.LatestOnlyOperator` → ` airflow.operators.latest_only.LatestOnlyOperator`
if let (Some(start_element), Some(last_element)) = (rest.first(), rest.last()) { | ||
start_element.starts_with(module) & last_element.ends_with(symbol_suffix) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should the starts_with
be instead a start_element == module
as I think we need to match the module exactly?
Also, I think this should be &&
and not &
(bitwise AND) ;)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The &&
typo makes me wonder how come the tests failed to catch this
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm unable to push to this PR, so here's the change that I made locally to fix this and add some documentation:
diff --git a/crates/ruff_linter/src/rules/airflow/rules/removal_in_3.rs b/crates/ruff_linter/src/rules/airflow/rules/removal_in_3.rs
index 0f6831317..592329fde 100644
--- a/crates/ruff_linter/src/rules/airflow/rules/removal_in_3.rs
+++ b/crates/ruff_linter/src/rules/airflow/rules/removal_in_3.rs
@@ -11,21 +11,35 @@ use ruff_text_size::Ranged;
use crate::checkers::ast::Checker;
+/// Check whether the segments corresponding to the fully qualified name points to a symbol that's
+/// either a builtin or coming from one of the providers in Airflow.
+///
+/// The pattern it looks for are:
+/// - `airflow.providers.**.<module>.**.*<symbol_suffix>` for providers
+/// - `airflow.<module>.**.*<symbol_suffix>` for builtins
+///
+/// where `**` is one or more segments separated by a dot, and `*` is one or more characters.
+///
+/// Examples for the above patterns:
+/// - `airflow.providers.google.cloud.secrets.secret_manager.CloudSecretManagerBackend` (provider)
+/// - `airflow.secrets.base_secrets.BaseSecretsBackend` (builtin)
fn is_airflow_builtin_or_provider(segments: &[&str], module: &str, symbol_suffix: &str) -> bool {
match segments {
["airflow", "providers", rest @ ..] => {
if let (Some(pos), Some(last_element)) =
(rest.iter().position(|&s| s == module), rest.last())
{
+ // Check that the module is not the last element i.e., there's a symbol that's
+ // being used from the `module` that ends with `symbol_suffix`.
pos + 1 < rest.len() && last_element.ends_with(symbol_suffix)
} else {
false
}
}
- ["airflow", rest @ ..] => {
- if let (Some(start_element), Some(last_element)) = (rest.first(), rest.last()) {
- start_element.starts_with(module) & last_element.ends_with(symbol_suffix)
+ ["airflow", first, rest @ ..] => {
+ if let Some(last) = rest.last() {
+ first == module && last.ends_with(symbol_suffix)
} else {
false
}
@@ -35,18 +49,26 @@ fn is_airflow_builtin_or_provider(segments: &[&str], module: &str, symbol_suffix
}
}
+/// Check whether the symbol is coming from the `secrets` builtin or provider module which ends
+/// with `Backend`.
fn is_airflow_secret_backend(segments: &[&str]) -> bool {
is_airflow_builtin_or_provider(segments, "secrets", "Backend")
}
+/// Check whether the symbol is coming from the `hooks` builtin or provider module which ends
+/// with `Hook`.
fn is_airflow_hook(segments: &[&str]) -> bool {
is_airflow_builtin_or_provider(segments, "hooks", "Hook")
}
+/// Check whether the symbol is coming from the `operators` builtin or provider module which ends
+/// with `Operator`.
fn is_airflow_operator(segments: &[&str]) -> bool {
is_airflow_builtin_or_provider(segments, "operators", "Operator")
}
+/// Check whether the symbol is coming from the `log` builtin or provider module which ends
+/// with `TaskHandler`.
fn is_airflow_task_handler(segments: &[&str]) -> bool {
is_airflow_builtin_or_provider(segments, "log", "TaskHandler")
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Additionally, as convention these helper functions would go at the bottom after the rule logic i.e., the rule definition would be up top and then the entrypoint to the rule logic and then all the helper structs and functions.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@dhruvmanila Thanks! This helps a ton! Just updated it and add more docstring to each utility functions
airflow
] Extend rule to check class attributes, methods, arguments (AIR302
)
} | ||
|
||
// Check whether a removed Airflow argument is passed. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In Rust, documentation comments starts with triple forward slashes (///
). I'll update this in my planned follow-up PR.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Got it!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you!
I've a few minor nits but I can't push it to this branch, I'll create a follow-up PR instead.
Thanks a ton! Not sure why you cannot push to this branch 😢 I read through the refactor PR. Everything looks much better! But I noticed there one minor typo and created a PR for this fix #15211 |
Summary
Airflow 3.0 removes various deprecated functions, members, modules, and other values. They have been deprecated in 2.x, but the removal causes incompatibilities that we want to detect. This PR add rules for the following.
airflow.providers_manager.ProvidersManager.dataset_factories
→airflow.providers_manager.ProvidersManager.asset_factories
airflow.providers_manager.ProvidersManager.dataset_uri_handlers
→airflow.providers_manager.ProvidersManager.asset_uri_handlers
airflow.providers_manager.ProvidersManager.dataset_to_openlineage_converters
→airflow.providers_manager.ProvidersManager.asset_to_openlineage_converters
airflow.lineage.hook.DatasetLineageInfo.dataset
→airflow.lineage.hook.AssetLineageInfo.asset
airflow.secrets.base_secrets.BaseSecretsBackend.get_conn_uri
→airflow.secrets.base_secrets.BaseSecretsBackend.get_conn_value
airflow.secrets.base_secrets.BaseSecretsBackend.get_connections
→airflow.secrets.base_secrets.BaseSecretsBackend.get_connection
airflow.hooks.base.BaseHook.get_connections
→ useget_connection
airflow.datasets.BaseDataset.iter_datasets
→airflow.sdk.definitions.asset.BaseAsset.iter_assets
airflow.datasets.BaseDataset.iter_dataset_aliases
→airflow.sdk.definitions.asset.BaseAsset.iter_asset_aliases
filename_template
inairflow.utils.log.file_task_handler.FileTaskHandler
BaseOperator
sla
task_concurrency
→max_active_tis_per_dag
BaseAuthManager
appbuilder
airflow.plugins_manager.AirflowPlugin
executors
(from #43289)hooks
operators
sensors
airflow.hooks.base_hook.BaseHook
→airflow.hooks.base.BaseHook
airflow.operators.dagrun_operator.TriggerDagRunLink
→airflow.operators.trigger_dagrun.TriggerDagRunLink
airflow.operators.dagrun_operator.TriggerDagRunOperator
→airflow.operators.trigger_dagrun.TriggerDagRunOperator
airflow.operators.python_operator.BranchPythonOperator
→airflow.operators.python.BranchPythonOperator
airflow.operators.python_operator.PythonOperator
→airflow.operators.python.PythonOperator
airflow.operators.python_operator.PythonVirtualenvOperator
→airflow.operators.python.PythonVirtualenvOperator
airflow.operators.python_operator.ShortCircuitOperator
→airflow.operators.python.ShortCircuitOperator
airflow.operators.latest_only_operator.LatestOnlyOperator
→airflow.operators.latest_only.LatestOnlyOperator
In additional to the changes above, this PR also add utility functions and improve docstring.
Test Plan
A test fixture is included in the PR.