Problem
When using Sidemantic with Spark SQL / HiveServer2, time dimensions with day granularity are compiled with TRUNC(column, 'DAY').
Example semantic model:
models:
- name: bookings
sql: |
SELECT *
FROM integrations.bookings
WHERE pdate > '2026-01-01'
primary_key: booking_id
dimensions:
- name: state
type: categorical
- name: created_at
type: time
granularity: day
metrics:
- name: bookings_count
agg: count
- name: paid_bookings_count
agg: count
filters:
- state = 'paid'
A query such as:
select bookings.created_at, bookings.paid_bookings_count
from bookings
where bookings.created_at > '2026-05-01'
generates SQL similar to:
WITH bookings_cte AS (
SELECT
booking_id AS booking_id,
state AS state,
TRUNC(created_at, 'DAY') AS created_at,
CASE WHEN state = 'paid' THEN 1 ELSE NULL END AS paid_bookings_count_raw
FROM integrations.bookings
WHERE created_at > '2026-05-01'
)
SELECT
bookings_cte.created_at AS created_at,
COUNT(bookings_cte.paid_bookings_count_raw) AS paid_bookings_count
FROM bookings_cte
GROUP BY
1
In Spark SQL, this does not work as expected. TRUNC(timestamp_col, 'DAY') returns NULL.
Expected behavior
For Spark SQL, day-level timestamp truncation should use DATE_TRUNC, for example:
DATE_TRUNC('DAY', created_at)
Why it matters
Spark’s TRUNC function only supports date truncation to larger calendar units such as year, month, week, and quarter. It does not support DAY as a format for TRUNC.
Spark’s DATE_TRUNC function is the appropriate function for timestamp truncation to day/hour/minute/etc.
Because of this, Sidemantic-generated SQL for type: time dimensions with granularity: day currently produces unusable results on Spark: the dimension becomes NULL and grouping breaks.
Sources
Problem
When using Sidemantic with Spark SQL / HiveServer2, time dimensions with day granularity are compiled with
TRUNC(column, 'DAY').Example semantic model:
A query such as:
generates SQL similar to:
In Spark SQL, this does not work as expected.
TRUNC(timestamp_col, 'DAY')returnsNULL.Expected behavior
For Spark SQL, day-level timestamp truncation should use
DATE_TRUNC, for example:DATE_TRUNC('DAY', created_at)Why it matters
Spark’s
TRUNCfunction only supports date truncation to larger calendar units such as year, month, week, and quarter. It does not support DAY as a format forTRUNC.Spark’s
DATE_TRUNCfunction is the appropriate function for timestamp truncation to day/hour/minute/etc.Because of this, Sidemantic-generated SQL for
type: timedimensions withgranularity: daycurrently produces unusable results on Spark: the dimension becomes NULL and grouping breaks.Sources
DATE_TRUNC: https://spark.apache.org/docs/latest/api/sql/index.html#date_truncTRUNC: https://spark.apache.org/docs/latest/api/sql/index.html#trunc