Fix Compiler dashboard bug for fetching (#7419)

yangw-dev · web-flow · commit e76ebb603100 · 2025-10-29T14:02:35.000-07:00
# Overview

Fix the Compiler Dashboard bug where the query fetches unnecessary metrics.

# Issue

The new compiler inductor also fetches and renders data that does not provide benchmark results. The metrics we are interested in have an `output` value ending with `performance.csv`, `accuracy.csv`, or `smoketest.csv`. This change filters out the noisy metadata backend metrics when data is fetched.

# Demo

Currently in HUD, we have a backend that contains `.csv`: [link1](https://hud.pytorch.org/benchmark/compilers_regression?renderGroupId=main&time.start=2025-10-22T00%3A00%3A00.000Z&time.end=2025-10-29T23%3A59%3A59.999Z&filters.repo=pytorch%2Fpytorch&filters.benchmarkName=compiler&filters.backend=&filters.mode=inference&filters.dtype=bfloat16&filters.deviceName=cuda+%28a100%29&filters.device=cuda&filters.arch=a100&lbranch=main&rbranch=main&rcommit.commit=35f3572fa483a8edb101d5765564e1ae274f3d45&rcommit.workflow_id=18915706635&rcommit.date=2025-10-29T19%3A00%3A00Z&rcommit.branch=main&lcommit.commit=ad4dc52bf6bc09fd3680bcb9bc957203c9cb54f5&lcommit.workflow_id=18697188028&lcommit.date=2025-10-22T01%3A00%3A00Z&lcommit.branch=main&maxSampling=110)

In this PR's demo, no such backend exists anymore: [link2](https://torchci-ehh23x5s6-fbopensource.vercel.app/benchmark/compilers_regression?renderGroupId=main&time.start=2025-10-22T00%3A00%3A00.000Z&time.end=2025-10-29T23%3A59%3A59.999Z&filters.repo=pytorch%2Fpytorch&filters.benchmarkName=compiler&filters.backend=&filters.mode=inference&filters.dtype=bfloat16&filters.deviceName=cuda+%28a100%29&filters.device=cuda&filters.arch=a100&lbranch=main&rbranch=main&rcommit.commit=1b655a87ef137d2cc9603a982532c5e033432daa&rcommit.workflow_id=18899846384&rcommit.date=2025-10-29T09%3A00%3A00Z&rcommit.branch=main&lcommit.commit=d01f15152cdf9a4b693d5c768cef31a0b2a5b012&lcommit.workflow_id=18708183777&lcommit.date=2025-10-22T09%3A00%3A00Z&lcommit.branch=main&maxSampling=110)

# Use endsWith

We use hardcoded `endsWith()` checks to filter out the data. `endsWith()` avoids regex parsing and string scanning, which makes it much faster than a `LIKE` pattern with a leading `%`.

# Investigation

To reproduce the data that is filtered out, run the SQL below in ClickHouse:

```
SELECT DISTINCT benchmark_extra_info['output']
FROM benchmark.oss_ci_benchmark_torchinductor
WHERE
    timestamp >= now() - INTERVAL 2 MONTH
    AND NOT (
        endsWith(benchmark_extra_info['output'], 'performance.csv')
        OR endsWith(benchmark_extra_info['output'], 'accuracy.csv')
        OR endsWith(benchmark_extra_info['output'], 'smoketest.csv')
    )
ORDER BY benchmark_extra_info['output']
```

Results we want to filter out:

```
/var/lib/jenkins/pytorch/test/test-reports/inference_huggingface.csv
/var/lib/jenkins/pytorch/test/test-reports/inference_timm_models.csv
/var/lib/jenkins/pytorch/test/test-reports/inference_torchbench.csv
/var/lib/jenkins/pytorch/test/test-reports/training_huggingface.csv
/var/lib/jenkins/pytorch/test/test-reports/training_timm_models.csv
/var/lib/jenkins/pytorch/test/test-reports/training_torchbench.csv
/var/lib/jenkins/workspace/test/test-reports/inference_huggingface.csv
/var/lib/jenkins/workspace/test/test-reports/inference_timm_models.csv
/var/lib/jenkins/workspace/test/test-reports/inference_torchbench.csv
/var/lib/jenkins/workspace/test/test-reports/training_huggingface.csv
/var/lib/jenkins/workspace/test/test-reports/training_timm_models.csv
/var/lib/jenkins/workspace/test/test-reports/training_torchbench.csv
```
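
For reviewers who want to sanity-check the suffix filter without a ClickHouse session, the predicate described above can be modeled in a few lines of Python. This is only a rough sketch, not code from this PR: the names `EXCLUDED_SUFFIXES`, `keep_row`, and `sample_outputs` are made up for illustration, and the two `example_*.csv` file names are hypothetical stand-ins for `output` values that should be kept.

```python
# Suffixes excluded by the `AND NOT (endsWith(...))` clause added in this PR.
EXCLUDED_SUFFIXES = ("huggingface.csv", "torchbench.csv", "timm_models.csv")


def keep_row(output_path: str) -> bool:
    """Return True when a row with this `output` value survives the filter."""
    # str.endswith accepts a tuple and returns True if any suffix matches.
    return not output_path.endswith(EXCLUDED_SUFFIXES)


sample_outputs = [
    # Taken from the investigation result in the description.
    "/var/lib/jenkins/pytorch/test/test-reports/inference_huggingface.csv",
    "/var/lib/jenkins/workspace/test/test-reports/training_timm_models.csv",
    # Hypothetical file names, made up to show values that are kept.
    "example_performance.csv",
    "example_accuracy.csv",
]

kept = [path for path in sample_outputs if keep_row(path)]
print(kept)  # only the two example_* entries remain
```

Note that this is a denylist on the three noisy suffixes, whereas the investigation query is written as the complement of an allowlist (`performance.csv`, `accuracy.csv`, `smoketest.csv`). The two agree on the paths listed in the investigation result, but they would differ for any future `output` value that matches neither list.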
diff --git a/torchci/clickhouse_queries/compilers_benchmark_api_commit_query/query.sql b/torchci/clickhouse_queries/compilers_benchmark_api_commit_query/query.sql
@@ -15,6 +15,11 @@ WHERE
         )
         OR empty({branches: Array(String)})
     )
+    AND NOT (
+        endsWith(benchmark_extra_info['output'], 'huggingface.csv')
+        OR endsWith(benchmark_extra_info['output'], 'torchbench.csv')
+        OR endsWith(benchmark_extra_info['output'], 'timm_models.csv')
+    )
     AND (
         has({suites: Array(String)}, suite)
         OR empty({suites: Array(String)})
diff --git a/torchci/clickhouse_queries/compilers_benchmark_api_query/query.sql b/torchci/clickhouse_queries/compilers_benchmark_api_query/query.sql
@@ -19,6 +19,11 @@ SELECT
 FROM benchmark.oss_ci_benchmark_torchinductor
 WHERE
     workflow_id IN ({workflows: Array(UInt64)})
+    AND NOT (
+        endsWith(benchmark_extra_info['output'], 'huggingface.csv')
+        OR endsWith(benchmark_extra_info['output'], 'torchbench.csv')
+        OR endsWith(benchmark_extra_info['output'], 'timm_models.csv')
+    )
     AND (
         has(
             {branches: Array(String)},
