Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[airflow] Extend rule to check class attributes, methods, arguments (AIR302) #15083

Merged
merged 12 commits into from
Dec 31, 2024

Conversation

Lee-W
Copy link
Contributor

@Lee-W Lee-W commented Dec 20, 2024

Summary

Airflow 3.0 removes various deprecated functions, members, modules, and other values. They have been deprecated in 2.x, but the removal causes incompatibilities that we want to detect. This PR add rules for the following.

  • Removed class attribute
    • airflow.providers_manager.ProvidersManager.dataset_factoriesairflow.providers_manager.ProvidersManager.asset_factories
    • airflow.providers_manager.ProvidersManager.dataset_uri_handlersairflow.providers_manager.ProvidersManager.asset_uri_handlers
    • airflow.providers_manager.ProvidersManager.dataset_to_openlineage_convertersairflow.providers_manager.ProvidersManager.asset_to_openlineage_converters
    • airflow.lineage.hook.DatasetLineageInfo.datasetairflow.lineage.hook.AssetLineageInfo.asset
  • Removed class method (subclasses in airflow should also checked)
    • airflow.secrets.base_secrets.BaseSecretsBackend.get_conn_uriairflow.secrets.base_secrets.BaseSecretsBackend.get_conn_value
    • airflow.secrets.base_secrets.BaseSecretsBackend.get_connectionsairflow.secrets.base_secrets.BaseSecretsBackend.get_connection
    • airflow.hooks.base.BaseHook.get_connections → use get_connection
    • airflow.datasets.BaseDataset.iter_datasetsairflow.sdk.definitions.asset.BaseAsset.iter_assets
    • airflow.datasets.BaseDataset.iter_dataset_aliasesairflow.sdk.definitions.asset.BaseAsset.iter_asset_aliases
  • Removed constructor args (subclasses in airflow should also checked)
    • argument filename_template inairflow.utils.log.file_task_handler.FileTaskHandler
    • in BaseOperator
      • sla
      • task_concurrencymax_active_tis_per_dag
    • in BaseAuthManager
      • appbuilder
  • Removed class variable (subclasses anywhere should be checked)
    • in airflow.plugins_manager.AirflowPlugin
      • executors (from #43289)
      • hooks
      • operators
      • sensors
  • Replaced names
    • airflow.hooks.base_hook.BaseHookairflow.hooks.base.BaseHook
    • airflow.operators.dagrun_operator.TriggerDagRunLinkairflow.operators.trigger_dagrun.TriggerDagRunLink
    • airflow.operators.dagrun_operator.TriggerDagRunOperatorairflow.operators.trigger_dagrun.TriggerDagRunOperator
    • airflow.operators.python_operator.BranchPythonOperatorairflow.operators.python.BranchPythonOperator
    • airflow.operators.python_operator.PythonOperatorairflow.operators.python.PythonOperator
    • airflow.operators.python_operator.PythonVirtualenvOperatorairflow.operators.python.PythonVirtualenvOperator
    • airflow.operators.python_operator.ShortCircuitOperatorairflow.operators.python.ShortCircuitOperator
    • airflow.operators.latest_only_operator.LatestOnlyOperatorairflow.operators.latest_only.LatestOnlyOperator

In additional to the changes above, this PR also add utility functions and improve docstring.

Test Plan

A test fixture is included in the PR.

@Lee-W Lee-W changed the title [airflow]: extend removed method class attributes and more methods(AIR302) [airflow]: extend removed method class attributes and more methods (AIR302) Dec 20, 2024
Copy link
Contributor

github-actions bot commented Dec 20, 2024

ruff-ecosystem results

Linter (stable)

✅ ecosystem check detected no linter changes.

Linter (preview)

ℹ️ ecosystem check detected linter changes. (+7 -0 violations, +0 -0 fixes in 1 projects; 54 projects unchanged)

apache/airflow (+7 -0 violations, +0 -0 fixes)

ruff check --no-cache --exit-zero --ignore RUF9 --output-format concise --preview --select ALL

+ providers/tests/alibaba/cloud/log/test_oss_task_handler.py:189:67: AIR302 `filename_template` is removed in Airflow 3.0
+ providers/tests/amazon/aws/log/test_cloudwatch_task_handler.py:244:13: AIR302 `filename_template` is removed in Airflow 3.0
+ providers/tests/amazon/aws/log/test_s3_task_handler.py:232:70: AIR302 `filename_template` is removed in Airflow 3.0
+ providers/tests/elasticsearch/log/test_es_task_handler.py:673:13: AIR302 `filename_template` is removed in Airflow 3.0
+ providers/tests/fab/auth_manager/test_fab_auth_manager.py:87:26: AIR302 `appbuilder` is removed in Airflow 3.0; The constructor takes no parameter now.
+ providers/tests/google/cloud/log/test_gcs_task_handler.py:295:13: AIR302 `filename_template` is removed in Airflow 3.0
+ providers/tests/microsoft/azure/log/test_wasb_task_handler.py:211:13: AIR302 `filename_template` is removed in Airflow 3.0

Changes by rule (1 rules affected)

code total + violation - violation + fix - fix
AIR302 7 7 0 0 0

@dhruvmanila dhruvmanila added rule Implementing or modifying a lint rule preview Related to preview mode features labels Dec 24, 2024
@Lee-W Lee-W force-pushed the extend-AIR302 branch 2 times, most recently from 7ea9b06 to cbde476 Compare December 25, 2024 03:50
@Lee-W Lee-W force-pushed the extend-AIR302 branch 2 times, most recently from 6e665c4 to 3b8461f Compare December 27, 2024 11:45
@Lee-W
Copy link
Contributor Author

Lee-W commented Dec 29, 2024

@MichaReiser, Pinging in case we missed this one 🙂 Would be awesome if we could get this re-reviewed when bandwidth allowed. Thanks a lot!

Any class in Airflow that inherits these class should not have these methods

* `airflow.secrets.base_secrets.BaseSecretsBackend.get_conn_uri` → `airflow.secrets.base_secrets.BaseSecretsBackend.get_conn_value`
* `airflow.secrets.base_secrets.BaseSecretsBackend.get_connections` → `airflow.secrets.base_secrets.BaseSecretsBackend.get_connection`
* `airflow.hooks.base.BaseHook.get_connections` → use `get_connection`
* `airflow.datasets.BaseDataset.iter_datasets` → `airflow.sdk.definitions.asset.BaseAsset.iter_assets`
* `airflow.datasets.BaseDataset.iter_dataset_aliases` → `airflow.sdk.definitions.asset.BaseAsset.iter_asset_aliases`
…rules

* `airflow.providers_manager.ProvidersManager.dataset_factories` → `airflow.providers_manager.ProvidersManager.asset_factories`
* `airflow.providers_manager.ProvidersManager.dataset_uri_handlers` → `airflow.providers_manager.ProvidersManager.asset_uri_handlers`
* `airflow.providers_manager.ProvidersManager.dataset_to_openlineage_converters` → `airflow.providers_manager.ProvidersManager.asset_to_openlineage_converters`
* `airflow.lineage.hook.DatasetLineageInfo.dataset`  → `airflow.lineage.hook.AssetLineageInfo.asset`
…y` is renamed as `max_active_tis_per_dag` in all airflow operators
* `airflow.operators.dagrun_operator.TriggerDagRunLink` → `airflow.operators.trigger_dagrun.TriggerDagRunLink`
* `airflow.operators.dagrun_operator.TriggerDagRunOperator` → `airflow.operators.trigger_dagrun.TriggerDagRunOperator`
* `airflow.operators.python_operator.BranchPythonOperator` → `airflow.operators.python.BranchPythonOperator`
* `airflow.operators.python_operator.PythonOperator` → `airflow.operators.python.PythonOperator`
* `airflow.operators.python_operator.PythonVirtualenvOperator` → `airflow.operators.python.PythonVirtualenvOperator`
* `airflow.operators.python_operator.ShortCircuitOperator` → `airflow.operators.python.ShortCircuitOperator`
* `airflow.operators.latest_only_operator.LatestOnlyOperator` → `     airflow.operators.latest_only.LatestOnlyOperator`
Comment on lines 27 to 28
if let (Some(start_element), Some(last_element)) = (rest.first(), rest.last()) {
start_element.starts_with(module) & last_element.ends_with(symbol_suffix)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should the starts_with be instead a start_element == module as I think we need to match the module exactly?

Also, I think this should be && and not & (bitwise AND) ;)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The && typo makes me wonder how come the tests failed to catch this

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm unable to push to this PR, so here's the change that I made locally to fix this and add some documentation:

diff --git a/crates/ruff_linter/src/rules/airflow/rules/removal_in_3.rs b/crates/ruff_linter/src/rules/airflow/rules/removal_in_3.rs
index 0f6831317..592329fde 100644
--- a/crates/ruff_linter/src/rules/airflow/rules/removal_in_3.rs
+++ b/crates/ruff_linter/src/rules/airflow/rules/removal_in_3.rs
@@ -11,21 +11,35 @@ use ruff_text_size::Ranged;
 
 use crate::checkers::ast::Checker;
 
+/// Check whether the segments corresponding to the fully qualified name points to a symbol that's
+/// either a builtin or coming from one of the providers in Airflow.
+///
+/// The pattern it looks for are:
+/// - `airflow.providers.**.<module>.**.*<symbol_suffix>` for providers
+/// - `airflow.<module>.**.*<symbol_suffix>` for builtins
+///
+/// where `**` is one or more segments separated by a dot, and `*` is one or more characters.
+///
+/// Examples for the above patterns:
+/// - `airflow.providers.google.cloud.secrets.secret_manager.CloudSecretManagerBackend` (provider)
+/// - `airflow.secrets.base_secrets.BaseSecretsBackend` (builtin)
 fn is_airflow_builtin_or_provider(segments: &[&str], module: &str, symbol_suffix: &str) -> bool {
     match segments {
         ["airflow", "providers", rest @ ..] => {
             if let (Some(pos), Some(last_element)) =
                 (rest.iter().position(|&s| s == module), rest.last())
             {
+                // Check that the module is not the last element i.e., there's a symbol that's
+                // being used from the `module` that ends with `symbol_suffix`.
                 pos + 1 < rest.len() && last_element.ends_with(symbol_suffix)
             } else {
                 false
             }
         }
 
-        ["airflow", rest @ ..] => {
-            if let (Some(start_element), Some(last_element)) = (rest.first(), rest.last()) {
-                start_element.starts_with(module) & last_element.ends_with(symbol_suffix)
+        ["airflow", first, rest @ ..] => {
+            if let Some(last) = rest.last() {
+                first == module && last.ends_with(symbol_suffix)
             } else {
                 false
             }
@@ -35,18 +49,26 @@ fn is_airflow_builtin_or_provider(segments: &[&str], module: &str, symbol_suffix
     }
 }
 
+/// Check whether the symbol is coming from the `secrets` builtin or provider module which ends
+/// with `Backend`.
 fn is_airflow_secret_backend(segments: &[&str]) -> bool {
     is_airflow_builtin_or_provider(segments, "secrets", "Backend")
 }
 
+/// Check whether the symbol is coming from the `hooks` builtin or provider module which ends
+/// with `Hook`.
 fn is_airflow_hook(segments: &[&str]) -> bool {
     is_airflow_builtin_or_provider(segments, "hooks", "Hook")
 }
 
+/// Check whether the symbol is coming from the `operators` builtin or provider module which ends
+/// with `Operator`.
 fn is_airflow_operator(segments: &[&str]) -> bool {
     is_airflow_builtin_or_provider(segments, "operators", "Operator")
 }
 
+/// Check whether the symbol is coming from the `log` builtin or provider module which ends
+/// with `TaskHandler`.
 fn is_airflow_task_handler(segments: &[&str]) -> bool {
     is_airflow_builtin_or_provider(segments, "log", "TaskHandler")
 }

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Additionally, as convention these helper functions would go at the bottom after the rule logic i.e., the rule definition would be up top and then the entrypoint to the rule logic and then all the helper structs and functions.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@dhruvmanila Thanks! This helps a ton! Just updated it and add more docstring to each utility functions

@Lee-W Lee-W requested a review from dhruvmanila December 31, 2024 00:53
@dhruvmanila dhruvmanila changed the title [airflow]: extend removed method class attributes and more methods (AIR302) [airflow] Extend rule to check class attributes, methods, arguments (AIR302) Dec 31, 2024
}

// Check whether a removed Airflow argument is passed.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In Rust, documentation comments starts with triple forward slashes (///). I'll update this in my planned follow-up PR.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Got it!

Copy link
Member

@dhruvmanila dhruvmanila left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you!

I've a few minor nits but I can't push it to this branch, I'll create a follow-up PR instead.

@dhruvmanila dhruvmanila merged commit 253c274 into astral-sh:main Dec 31, 2024
21 checks passed
@Lee-W
Copy link
Contributor Author

Lee-W commented Dec 31, 2024

Thank you!

I've a few minor nits but I can't push it to this branch, I'll create a follow-up PR instead.

Thanks a ton! Not sure why you cannot push to this branch 😢 I read through the refactor PR. Everything looks much better! But I noticed there one minor typo and created a PR for this fix #15211

@jedcunningham jedcunningham deleted the extend-AIR302 branch January 8, 2025 07:28
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
preview Related to preview mode features rule Implementing or modifying a lint rule
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants