Histogram fails to load when null values present in x-axis variable (regression from 4.1.2, occurred in 5.0.0rc3, present in 6.0.0rc2) #33738

@sfirke

Bug description

A histogram that rendered fine in 4.1.2 fails to render in 5.0.0rc3:

[Image: screenshot of the chart's error message]

The issue is that there are NULL values in the x-axis variable. When I add an "Is Not Null" filter, the chart renders in 5.0.0rc3.

Previously, these NULLs were excluded automatically and the chart would render, which is reasonable behavior.
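To illustrate the behavior I'd expect, here is a minimal sketch that drops NULLs before binning. It is not Superset's code: `histogram_ignoring_nulls` is a hypothetical helper, and the sample data is made up, reusing only the `days_active` column name from my chart.

```python
import numpy as np
import pandas as pd


def histogram_ignoring_nulls(df: pd.DataFrame, column: str, bins: int = 5):
    """Hypothetical helper: bin a numeric column after excluding NULLs."""
    # NULLs arrive from the database as NaN/None; drop them before binning.
    values = df[column].dropna().to_numpy(dtype=float)
    counts, edges = np.histogram(values, bins=bins)
    return counts, edges


# Made-up data: a numeric column with two NULLs.
df = pd.DataFrame({"days_active": [1, 2, None, 4, 10, None]})
counts, edges = histogram_ignoring_nulls(df, "days_active", bins=3)
print(counts.tolist())  # only the four non-null rows are counted
```

This does in pandas what the "Is Not Null" filter workaround does at the SQL level.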

I found the error message confusing: it looks like a SQL problem, but the chart's underlying SQL did not change. The problem is also not really "non-numeric values" but NULLs, since this is a numeric column.
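One way a validation step could tell NULLs apart from genuinely non-numeric values is sketched below. This is only an assumption about how such a check might look, not the actual logic in `histogram.py`; `validate_histogram_column` is a hypothetical name. The idea is to flag only values that were present before coercion but could not be converted.

```python
import pandas as pd


def validate_histogram_column(df: pd.DataFrame, column: str) -> pd.Series:
    """Hypothetical check: raise only for values that are present but not numeric."""
    original = df[column]
    coerced = pd.to_numeric(original, errors="coerce")
    # Non-null before coercion but NaN afterwards means truly non-numeric.
    non_numeric = original.notna() & coerced.isna()
    if non_numeric.any():
        raise ValueError(f"Column '{column}' contains non-numeric values")
    # NULLs are not an error; exclude them from the values to be binned.
    return coerced.dropna()


# Numeric column with a NULL: passes, and the NULL is dropped.
ok = validate_histogram_column(
    pd.DataFrame({"days_active": [1.0, 2.0, None, 4.0]}), "days_active"
)

# Column with a real non-numeric value: still raises.
try:
    validate_histogram_column(
        pd.DataFrame({"days_active": ["1", "abc", None]}), "days_active"
    )
    raised = False
except ValueError:
    raised = True
```

With a check shaped like this, the "non-numeric values" message would only appear when it is actually accurate.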

The logs:

2025-06-11 02:48:29,472:WARNING:superset.views.error_handling:Exception
Traceback (most recent call last):
  File "/app/.venv/lib/python3.10/site-packages/flask/app.py", line 1484, in full_dispatch_request
    rv = self.dispatch_request()
  File "/app/.venv/lib/python3.10/site-packages/flask/app.py", line 1469, in dispatch_request
    return self.ensure_sync(self.view_functions[rule.endpoint])(**view_args)
  File "/app/.venv/lib/python3.10/site-packages/flask_appbuilder/security/decorators.py", line 101, in wraps
    return f(self, *args, **kwargs)
  File "/app/superset/views/base_api.py", line 120, in wraps
    duration, response = time_function(f, self, *args, **kwargs)
  File "/app/superset/utils/core.py", line 1369, in time_function
    response = func(*args, **kwargs)
  File "/app/superset/utils/log.py", line 304, in wrapper
    value = f(*args, **kwargs)
  File "/app/superset/charts/data/api.py", line 260, in data
    return self._get_data_response(
  File "/app/superset/utils/log.py", line 304, in wrapper
    value = f(*args, **kwargs)
  File "/app/superset/charts/data/api.py", line 423, in _get_data_response
    result = command.run(force_cached=force_cached)
  File "/app/superset/commands/chart/data/get_data_command.py", line 45, in run
    payload = self._query_context.get_payload(
  File "/app/superset/common/query_context.py", line 102, in get_payload
    return self._processor.get_payload(cache_query_context, force_cached)
  File "/app/superset/common/query_context_processor.py", line 752, in get_payload
    query_results = [
  File "/app/superset/common/query_context_processor.py", line 753, in <listcomp>
    get_query_results(
  File "/app/superset/common/query_actions.py", line 227, in get_query_results
    return result_func(query_context, query_obj, force_cached)
  File "/app/superset/common/query_actions.py", line 103, in _get_full
    payload = query_context.get_df_payload(query_obj, force_cached=force_cached)
  File "/app/superset/common/query_context.py", line 123, in get_df_payload
    return self._processor.get_df_payload(
  File "/app/superset/common/query_context_processor.py", line 162, in get_df_payload
    query_result = self.get_query_result(query_obj)
  File "/app/superset/common/query_context_processor.py", line 287, in get_query_result
    df = query_object.exec_post_processing(df)
  File "/app/superset/common/query_object.py", line 452, in exec_post_processing
    df = getattr(pandas_postprocessing, operation)(df, **options)
  File "/app/superset/utils/pandas_postprocessing/histogram.py", line 56, in histogram
    raise ValueError(f"Column '{column}' contains non-numeric values")
ValueError: Column 'days_active' contains non-numeric values
2025-06-11 02:48:29,472:ERROR:superset.views.error_handling:Column 'days_active' contains non-numeric values

Screenshots/recordings

No response

Superset version

5.0.0rc3

Python version

3.9

Node version

16

Browser

Firefox

Additional context

No response

Checklist

  • I have searched Superset docs and Slack and didn't find a solution to my problem.
  • I have searched the GitHub issue tracker and didn't find a similar bug report.
  • I have checked Superset's logs for errors and if I found a relevant Python stacktrace, I included it here as text in the "additional context" section.

Labels

  • #bug:regression: Bugs that are identified as regressions
  • good first issue: Good first issues for new contributors
  • viz:charts:histogram: Related to the Histogram chart
