Contributor

@pgmpablo157321 pgmpablo157321 commented Oct 14, 2025

  • Updates to the submission checker for the new directory structure
  • File describing the new directory structure
  • Sample of the new directory structure based on v5.1 results

Contributor

github-actions bot commented Oct 14, 2025

MLCommons CLA bot All contributors have signed the MLCommons CLA ✍️ ✅

@pgmpablo157321 pgmpablo157321 marked this pull request as ready for review October 15, 2025 21:18
@pgmpablo157321 pgmpablo157321 requested a review from a team as a code owner October 15, 2025 21:18
```
├── ...
│ ├── closed
│ │ ├── code
```
Contributor

@anandhu-eng anandhu-eng Oct 16, 2025

Hi @pgmpablo157321, will the submitter name be the parent folder of code, results, and systems in the new structure?

Contributor

Or should we put closed or open as the parent of just results?

Also, should we consider moving system.json under the sut folder inside the results directory, and thus get rid of the systems folder as well?

Contributor Author

@anandhu-eng Correct, I just pushed a fix for this

@arjunsuresh Not sure, we can discuss with the WG

@anandhu-eng
Contributor

Hi @pgmpablo157321, while testing the changes I encountered the following error:

[2025-10-17 11:30:18,023 submission_checker.py:3284 ERROR] no compliance dir for closed/MLCommons/results/SR680a_V3_B200SXMx8_TRT/llama2-70b-99.9: closed/MLCommons/compliance/SR680a_V3_B200SXMx8_TRT/llama2-70b-99.9/Server

The detailed log can be found here.

I think the issue is in the preprocessing script, which still rewrites the results folder to compliance here. As a result, the checker looks for the following directory: closed/MLCommons/compliance/SR680a_V3_B200SXMx8_TRT/llama2-70b-99.9/Server

The sample submission files (taken from the inference_results_v5.1 repo) that were used to test the changes can be found here.
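
To make the failure mode concrete, here is a minimal sketch of how a results-to-compliance substitution produces the path from the error message. The helper name `legacy_compliance_path` is hypothetical and is not taken from submission_checker.py or preprocess_submission.py; it only illustrates the mapping described above, assuming the scenario name is appended as the last path component.

```python
from pathlib import PurePosixPath


def legacy_compliance_path(results_path: str, scenario: str) -> str:
    """Map a results path to the compliance path the old layout would use.

    Hypothetical helper for illustration only; not the checker's real code.
    """
    parts = list(PurePosixPath(results_path).parts)
    # Replace only the first "results" component with "compliance".
    # parts.index raises ValueError if there is no "results" component.
    idx = parts.index("results")
    parts[idx] = "compliance"
    # Append the scenario (e.g. "Server") as the final component.
    return str(PurePosixPath(*parts, scenario))


results = "closed/MLCommons/results/SR680a_V3_B200SXMx8_TRT/llama2-70b-99.9"
print(legacy_compliance_path(results, "Server"))
```

If the new layout no longer has a top-level compliance folder, any code path that still performs this kind of substitution will point at a directory that does not exist, which would be consistent with the "no compliance dir" error.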

@anandhu-eng
Contributor

Hi @pgmpablo157321 , this line from preprocess_submission.py triggers the following error:

[2025-10-20 08:05:22,807 preprocess_submission.py:370 ERROR] no submission in closed/MLCommons/measurements
[2025-10-20 08:05:22,807 preprocess_submission.py:370 ERROR] no submission in closed/MLCommons/compliance

Please find the run logs here:
https://github.com/mlcommons/mlperf-automations/actions/runs/18591329608/job/53153591425?pr=689

@pgmpablo157321 pgmpablo157321 force-pushed the submission_dir branch 4 times, most recently from b5aaf4f to 9d619aa Compare October 21, 2025 03:58
@pgmpablo157321
Contributor Author

@anandhu-eng Pushed some fixes but two things seem to be failing:

  1. For some submission generation, there is no calibration.md. I don't think this has anything to do with these changes
  2. For some of the bert runs, the submission logs use the current folder structure, while the submission checker expects the new one. How do you think we should address this? Should the GitHub action be updated with this PR as well? (I added some code to preprocess_submission.py to derive the new folder structure from the old one; maybe the GitHub action could be updated to run it?)
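
One way a workflow could decide whether the conversion step is needed is to detect which layout a submission uses before running the checker. The sketch below is only an illustration under stated assumptions: it assumes that top-level measurements and compliance folders exist only in the old layout (suggested by the "no submission in closed/MLCommons/measurements" errors above, but not confirmed in this thread), and the function name `uses_legacy_layout` is hypothetical rather than part of preprocess_submission.py.

```python
import tempfile
from pathlib import Path

# Assumption: these folders appear under <division>/<submitter>/ only in the
# old layout. The thread does not confirm the final new structure.
LEGACY_ONLY_DIRS = ("measurements", "compliance")


def uses_legacy_layout(submitter_dir: Path) -> bool:
    """Return True if the submitter folder contains any legacy-only folder."""
    return any((submitter_dir / name).is_dir() for name in LEGACY_ONLY_DIRS)


# Small demonstration on throwaway directories.
with tempfile.TemporaryDirectory() as tmp:
    old = Path(tmp) / "old" / "closed" / "MLCommons"
    new = Path(tmp) / "new" / "closed" / "MLCommons"
    for name in ("code", "results", "systems", "measurements", "compliance"):
        (old / name).mkdir(parents=True)
    for name in ("code", "results", "systems"):
        (new / name).mkdir(parents=True)
    print(uses_legacy_layout(old))
    print(uses_legacy_layout(new))
```

A check like this would let the action run the old-to-new conversion only for submissions that still use the current structure, and skip it otherwise.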
