janani-gurram
SUMMARY

  1. Fixes the histogram chart so that the chart renders even when the x-axis variable contains NULL values.
  2. Adds unit tests to verify the behavior when:
    • The target column has all nulls (with and without grouping)
    • The target column has some nulls (with and without grouping)

The fix drops rows whose target column is NULL before the histogram is computed. If that leaves the DataFrame empty, the function now returns the empty DataFrame instead of failing.
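
As a rough illustration of the approach described above, the sketch below drops NULL rows and returns early when nothing is left. `histogram_counts` is a hypothetical stand-in written for this example, not the actual function in superset/utils/pandas_postprocessing/histogram.py, and the binning via `numpy.histogram` is an assumption about how the counts might be computed.

```python
import numpy as np
import pandas as pd


def histogram_counts(df: pd.DataFrame, column: str, bins: int = 5) -> pd.DataFrame:
    """Hypothetical helper for illustration only, not Superset's code."""
    # Drop rows whose target column is null; dropna returns a new DataFrame.
    cleaned = df.dropna(subset=[column])
    if cleaned.empty:
        # Nothing left to bin: hand back the empty frame instead of raising.
        return cleaned
    counts, edges = np.histogram(cleaned[column], bins=bins)
    labels = [f"{edges[i]:g} - {edges[i + 1]:g}" for i in range(bins)]
    return pd.DataFrame({"bin": labels, "count": counts})


some_nulls = pd.DataFrame({"a": [1.0, None, 2.0, 3.0, None]})
all_nulls = pd.DataFrame({"a": [None, None]}, dtype=float)

partial = histogram_counts(some_nulls, "a", bins=2)  # bins only the 3 non-null rows
empty = histogram_counts(all_nulls, "a")             # empty DataFrame, no exception
```

The two inputs mirror the "some nulls" and "all nulls" cases listed in the summary; grouping is left out to keep the sketch short.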

BEFORE/AFTER SCREENSHOTS OR ANIMATED GIF

Before
The chart failed to render when the selected column contained NULLs.
Screenshot 2025-10-16 at 11 10 41 AM

After
The chart now renders correctly, ignoring NULL values.
Screenshot 2025-10-16 at 11 11 52 AM

TESTING INSTRUCTIONS

  1. Open any dataset and create a histogram chart.
  2. Select a column that contains NULL values.
  3. Verify that the chart renders correctly, excluding NULL entries.

ADDITIONAL INFORMATION

@janani-gurram janani-gurram marked this pull request as draft October 17, 2025 00:19
@janani-gurram janani-gurram changed the title from "Fix/handle nulls in hist" to "fix(histogram): add NULL handling for histogram" Oct 17, 2025
@korbit-ai korbit-ai bot left a comment

Review by Korbit AI

Korbit automatically attempts to detect when you fix issues in new commits.
Category | Issue | Status
Performance | Inefficient DataFrame copy for null filtering | ✅ Fix detected

Files scanned:
superset/utils/pandas_postprocessing/histogram.py

@dosubot dosubot bot added the viz:charts:histogram Related to the Histogram chart label Oct 17, 2025
@janani-gurram janani-gurram marked this pull request as ready for review October 17, 2025 15:51
@korbit-ai korbit-ai bot left a comment

Review by Korbit AI

Category | Issue | Status
Design | Input DataFrame modified in-place | (open)

Files scanned:
superset/utils/pandas_postprocessing/histogram.py

Comment on lines +52 to +54
df.dropna(subset=[column], inplace=True)
if df.empty:
    return df
Input DataFrame modified in-place (category: Design)

What is the issue?

The function modifies the input DataFrame in-place by dropping rows with NULLs, which violates the principle of not mutating input parameters and can cause unexpected side effects for the caller.

Why this matters

Callers may not expect their original DataFrame to be modified, leading to data loss in the calling code and potential bugs in downstream processing that depends on the original DataFrame structure.

Suggested change ∙ Feature Preview

Call dropna without inplace=True so that it returns a new DataFrame rather than mutating the caller's:

# drop empty values from the target column
df = df.dropna(subset=[column])
if df.empty:
    return df
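
To make the side effect in this review comment concrete, here is a minimal sketch contrasting the two forms. `drop_in_place` and `drop_rebound` are hypothetical names used only for this illustration; they are not functions from the PR.

```python
import pandas as pd


def drop_in_place(df: pd.DataFrame, column: str) -> pd.DataFrame:
    # Mutates the caller's object: the rows disappear from the original too.
    df.dropna(subset=[column], inplace=True)
    return df


def drop_rebound(df: pd.DataFrame, column: str) -> pd.DataFrame:
    # Rebinds the local name to a new DataFrame; the caller's object is untouched.
    df = df.dropna(subset=[column])
    return df


original_a = pd.DataFrame({"a": [1.0, None, 3.0]})
original_b = original_a.copy()

result_a = drop_in_place(original_a, "a")
result_b = drop_rebound(original_b, "a")

print(len(original_a), len(original_b))
```

After the calls, `original_a` has lost its NULL row while `original_b` still holds all three rows, which is the difference the suggested change is after.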
codecov bot commented Oct 17, 2025

Codecov Report

❌ Patch coverage is 0% with 3 lines in your changes missing coverage. Please review.
✅ Project coverage is 71.92%. Comparing base (fb8fca4) to head (55b453f).
⚠️ Report is 18 commits behind head on master.

Files with missing lines | Patch % | Lines
superset/utils/pandas_postprocessing/histogram.py | 0.00% | 3 Missing ⚠️
Additional details and impacted files
@@             Coverage Diff             @@
##           master   #35693       +/-   ##
===========================================
+ Coverage        0   71.92%   +71.92%     
===========================================
  Files           0      589      +589     
  Lines           0    43638    +43638     
  Branches        0     4726     +4726     
===========================================
+ Hits            0    31388    +31388     
- Misses          0    11006    +11006     
- Partials        0     1244     +1244     
Flag | Coverage Δ
hive | 46.27% <0.00%> (?)
mysql | 70.96% <0.00%> (?)
postgres | 71.01% <0.00%> (?)
presto | 49.97% <0.00%> (?)
python | 71.89% <0.00%> (?)
sqlite | 70.61% <0.00%> (?)
unit | 100.00% <ø> (?)

@sfirke sfirke added the 🎪 ⚡ showtime-trigger-start Create new ephemeral environment for this PR label Oct 17, 2025
@github-actions github-actions bot added 🎪 55b453f 🚦 building Environment 55b453f status: building 🎪 55b453f 📅 2025-10-17T18-05 Environment 55b453f created at 2025-10-17T18-05 🎪 55b453f ⌛ 48h Environment 55b453f expires after 48h 🎪 55b453f 🤡 sfirke Environment 55b453f requested by sfirke and removed 🎪 ⚡ showtime-trigger-start Create new ephemeral environment for this PR labels Oct 17, 2025
Contributor

🎪 Showtime is building environment on GHA for 55b453f

@github-actions github-actions bot added 🎪 55b453f 🚦 deploying Environment 55b453f status: deploying 🎪 55b453f 🚦 running Environment 55b453f status: running 🎪 🎯 55b453f Active environment pointer - 55b453f is receiving traffic and removed 🎪 55b453f 🚦 building Environment 55b453f status: building 🎪 55b453f 🚦 deploying Environment 55b453f status: deploying 🎪 55b453f 🚦 running Environment 55b453f status: running labels Oct 17, 2025
@github-actions github-actions bot added 🎪 55b453f 🚦 running Environment 55b453f status: running 🎪 55b453f 🌐 52.37.95.52:8080 Environment 55b453f URL: http://52.37.95.52:8080 (click to visit) and removed 🎪 🎯 55b453f Active environment pointer - 55b453f is receiving traffic labels Oct 17, 2025
Contributor

🎪 Showtime deployed environment on GHA for 55b453f

Environment: http://52.37.95.52:8080 (admin/admin)
Lifetime: 48h auto-cleanup
Updates: New commits create fresh environments automatically

@github-actions github-actions bot removed 🎪 55b453f 📅 2025-10-17T18-05 Environment 55b453f created at 2025-10-17T18-05 🎪 55b453f 🤡 sfirke Environment 55b453f requested by sfirke 🎪 55b453f ⌛ 48h Environment 55b453f expires after 48h 🎪 55b453f 🌐 52.37.95.52:8080 Environment 55b453f URL: http://52.37.95.52:8080 (click to visit) 🎪 55b453f 🚦 running Environment 55b453f status: running labels Oct 19, 2025
@rusackas
Member

Are there cases where we should use zero-imputation for these values (counting them as zero)? If selecting zero imputation in Advanced Analytics solves that use case, we may be good.

@rusackas
Member

rusackas commented Oct 20, 2025

This viz doesn't have the Advanced Analytics feature. It seems worth adding here, since it provides zero-imputation. Advanced Analytics doesn't have an option to strip out null values, however. It's probably better to use Advanced Analytics to "fix" null values OR use a Filter to remove null values right from the control panel. Stripping them out in post-processing doesn't give users the chance to set them to 0; they won't even know there ARE null values, which seems dangerous.
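
The trade-off raised here can be sketched with plain pandas and NumPy. This is only an illustration of how dropping NULLs and imputing them as zero lead to different bin counts; it does not reproduce Superset's Advanced Analytics implementation, and the sample values and bin edges are made up for the example.

```python
import numpy as np
import pandas as pd

values = pd.Series([0.5, None, 2.5, None, 4.5])
edges = [0, 1, 2, 3, 4, 5]

# Option 1: drop nulls (what this PR does); the two null rows are not counted.
dropped_counts, _ = np.histogram(values.dropna(), bins=edges)

# Option 2: impute nulls as 0; the two null rows land in the first bin.
imputed_counts, _ = np.histogram(values.fillna(0), bins=edges)

# Surfacing the number of nulls is one way to avoid hiding them from the user.
null_count = int(values.isna().sum())

print(dropped_counts.tolist(), imputed_counts.tolist(), null_count)
```

With dropping, the histogram silently covers three of the five rows; with zero-imputation, all five rows are counted but the first bin is inflated. Which behavior is right depends on what a NULL means in the dataset, which is why leaving the choice to the user is attractive.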

Labels

size/M viz:charts:histogram Related to the Histogram chart

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Histogram fails to load when null values present in x-axis variable (regression from 4.1.2, occurred in 5.0.0rc3, present in 6.0.0rc2)

4 participants