BUG: Raise clear error for duplicate id_vars in melt (GH61475) #61484

Open: wants to merge 1 commit into base: main

@ZanirP ZanirP commented May 23, 2025

Comment on lines +242 to +244
# GH61475 - prevent AttributeError when duplicate column
if not hasattr(id_data, "dtype"):
    raise Exception(f"{col} is a duplicate column header")
Member

  1. This should check if not frame.columns.is_unique at the beginning of the function.
  2. A ValueError is more appropriate here than a bare Exception.
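A minimal sketch of the two suggestions above, written against a plain list of column labels rather than pandas internals. The helper name `ensure_unique_columns` and the error message are hypothetical and are not taken from this PR; inside melt, the equivalent test would be `not frame.columns.is_unique`.

```python
from collections import Counter


def ensure_unique_columns(columns):
    """Raise ValueError if any column label appears more than once.

    Hypothetical helper: it stands in for an early uniqueness check
    at the top of melt and is not the code from this PR.
    """
    counts = Counter(columns)
    duplicates = [label for label, n in counts.items() if n > 1]
    if duplicates:
        raise ValueError(f"duplicate column labels: {duplicates}")


# Unique labels pass silently.
ensure_unique_columns(["a", "b", "c"])

# Duplicate labels fail early with a descriptive ValueError.
try:
    ensure_unique_columns(["a", "b", "b"])
except ValueError as err:
    message = str(err)
```

Raising at the top of the function means the caller sees a message about the duplicate labels, instead of an AttributeError surfacing later when a duplicated column is selected.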

Author

Thanks for the feedback! I've moved the check for not frame.columns.is_unique to the beginning of the function and updated the exception type to ValueError as suggested.

A quick clarifying question: melt currently allows duplicate column names in value_vars, as seen in the test test_melt_with_duplicate_columns. With this change, are we treating any duplicate columns in the input DataFrame as a ValueError, not just when the duplicates appear in id_vars?

Member

Ah, good point. I guess specifically when id_vars is not empty, we'll want to raise if not frame.columns.is_unique.

Author

Thanks! Just to make sure I understand:
You're saying that we should only raise an error for duplicate column names if the duplicate is in id_vars, correct? Essentially, if the duplicates are in value_vars, we can let melt work as is, as long as no errors occur?
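The two readings in this exchange are not the same rule, and a small sketch may make the difference concrete. It works on plain lists of labels; the function names and messages are hypothetical, and neither function is the check that was ultimately adopted in pandas.

```python
from collections import Counter


def duplicated_labels(columns):
    """Return the set of labels that occur more than once in `columns`."""
    return {label for label, n in Counter(columns).items() if n > 1}


def check_when_id_vars_given(columns, id_vars):
    """Reading 1: raise if id_vars is non-empty and any label is duplicated."""
    if id_vars and duplicated_labels(columns):
        raise ValueError("column labels must be unique when id_vars is given")


def check_id_vars_only(columns, id_vars):
    """Reading 2: raise only if a duplicated label is itself in id_vars."""
    dups = duplicated_labels(columns)
    clashes = [label for label in id_vars if label in dups]
    if clashes:
        raise ValueError(f"id_vars refer to duplicate column labels: {clashes}")


def raises(check, columns, id_vars):
    """Report whether `check` rejects this combination of inputs."""
    try:
        check(columns, id_vars)
    except ValueError:
        return True
    return False


columns = ["a", "b", "b"]  # "b" is duplicated, "a" is not

# Duplicate appears only outside id_vars: the two readings disagree.
strict_on_a = raises(check_when_id_vars_given, columns, ["a"])
narrow_on_a = raises(check_id_vars_only, columns, ["a"])

# Duplicate is itself an id_var: both readings reject it.
strict_on_b = raises(check_when_id_vars_given, columns, ["b"])
narrow_on_b = raises(check_id_vars_only, columns, ["b"])
```

With `id_vars=["a"]`, the first reading raises while the second does not, which is the case the clarifying question is about.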

@mroeschke added the labels Reshaping (Concat, Merge/Join, Stack/Unstack, Explode) and Error Reporting (Incorrect or improved errors from pandas) on May 23, 2025
Successfully merging this pull request may close these issues.

BUG: More Indicative Error when pd.melt with duplicate columns