Skip to content

Conversation

@Skn0tt
Copy link
Member

@Skn0tt Skn0tt commented Dec 23, 2025

Updates shard weights so that shards are a little closer. New runtimes:

bot shard 1 shard 2
ubuntu-20 25min 45s 25min 56s
windows-latest 39min 10s 40min 11s
macOS-latest 19min 6s 28min 13s
ubuntu-22 23min 8s 26min 18s
ubuntu-22 25min 1s 26min 29s

No idea why macOS is so disbalanced 🤔
We could probably shard windows across three bots to make everything even.

@Skn0tt Skn0tt self-assigned this Dec 23, 2025
@github-actions

This comment has been minimized.

@github-actions

This comment has been minimized.

@github-actions

This comment has been minimized.

@github-actions

This comment has been minimized.

@Skn0tt Skn0tt marked this pull request as ready for review December 23, 2025 11:37
@Skn0tt
Copy link
Member Author

Skn0tt commented Dec 23, 2025

Here's a heuristic for finding the right weights:

import type { FullConfig, Reporter, Suite, TestCase } from './packages/playwright-test/reporter';

export default class ShardWeightReporter implements Reporter {
  suite!: Suite;
  config!: FullConfig;

  onBegin(config: FullConfig, suite: Suite): void {
    this.suite = suite;
    this.config = config;
  }

  onEnd() {
    const tests = this.suite.allTests();

    const testsByBot = Object.entries(this.groupTestsByBot(tests));
    testsByBot.sort((a, b) => a[0].localeCompare(b[0]));

    console.log('Suggested shard weights per bot:');
    for (const [botName, botTests] of testsByBot) {
      const suggestedWeights = this.calculateShardWeights(botTests);
      console.log(`${botName}:`, this.normaliseWeights(suggestedWeights));
    }
  }

  private groupTestsByBot(tests: TestCase[]): Record<string, TestCase[]> {
    const testsByBot: Record<string, TestCase[]> = {};

    for (const test of tests) {
      let botName = test.tags[0] ?? 'unknown';
      botName = botName.replace(/-\d+$/, ''); // TODO: fix on playwright end. bot name shouldn't contain shard index suffix
      testsByBot[botName] ??= [];
      testsByBot[botName].push(test);
    }

    return testsByBot;
  }

  private sortByTestGroups(tests: TestCase[]) {
    // recreates the sorting that filterForShard gets as input
    tests.sort((a, b) => {
      const shardIndexA = a.results[0]!.shardIndex;
      const shardIndexB = b.results[0]!.shardIndex;
      if (shardIndexA !== shardIndexB)
        return shardIndexA - shardIndexB;

      const startTimeA = a.results[0]!.startTime.getTime();
      const startTimeB = b.results[0]!.startTime.getTime();
      return startTimeA - startTimeB;
    });

  }

  private calculateShardWeights(tests: TestCase[]): number[] {
    if (tests.length === 0)
      return [];

    this.sortByTestGroups(tests);

    const shardTotal = tests[tests.length - 1].results[0]!.shardIndex;

    const totalDuration = this.sum(tests.map(test => test.results[0]!.duration));
    const optimalDuration = totalDuration / shardTotal;

    const weights: number[] = [];
    let currentDuration = 0;
    let currentTestCount = 0;

    for (const test of tests) {
      const duration = test.results[0]!.duration;
      currentDuration += duration;
      currentTestCount++;

      // When we've reached the optimal shard duration, record this shard's weight
      if (currentDuration >= optimalDuration && weights.length < shardTotal - 1) {
        weights.push(currentTestCount);
        currentDuration = 0;
        currentTestCount = 0;
      }
    }
    weights.push(currentTestCount);

    return weights;
  }

  private normaliseWeights(weights: number[]): number[] {
    const total = weights.reduce((a, b) => a + b, 0);
    weights = weights.map(w => Math.floor((w / total) * 100));
    const remaining = 100 - weights.reduce((a, b) => a + b, 0);
    for (let i = 0; i < remaining; i++)
      weights[i % weights.length]++;
    return weights;
  }

  private sum(array: number[]): number {
    return array.reduce((a, b) => a + b, 0);
  }
}

Output when running agains this PR's report:

Suggested shard weights per bot:
@macos-latest-node20: [ 63, 37 ]
@ubuntu-latest-node20: [ 59, 41 ]
@ubuntu-latest-node22: [ 61, 39 ]
@ubuntu-latest-node24: [ 59, 41 ]
@windows-latest-node20: [ 59, 41 ]

with:
node-version: ${{matrix.node-version}}
command: npm run ttest -- --shard ${{ matrix.shardIndex }}/${{ matrix.shardTotal }}
command: npm run ttest -- --shard ${{ matrix.shardIndex }}/${{ matrix.shardTotal }} --shard-weights=58:42
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

so this is meaningfully better than 10:7?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

that should also work. why 10:7?

@github-actions
Copy link
Contributor

github-actions bot commented Jan 5, 2026

Test results for "MCP"

2822 passed, 116 skipped


Merge workflow run.

@github-actions
Copy link
Contributor

github-actions bot commented Jan 5, 2026

Test results for "tests 1"

3 flaky ⚠️ [firefox-library] › library/inspector/cli-codegen-1.spec.ts:1082 › cli codegen › should not throw csp directive violation errors `@firefox-ubuntu-22.04-node20`
⚠️ [playwright-test] › runner.spec.ts:124 › should ignore subprocess creation error because of SIGINT `@macos-latest-node20-1`
⚠️ [playwright-test] › ui-mode-test-output.spec.ts:118 › should collapse repeated console messages for test `@macos-latest-node20-2`

34400 passed, 689 skipped


Merge workflow run.

@Skn0tt Skn0tt merged commit 2c26a01 into microsoft:main Jan 5, 2026
31 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants