perf!: testing yaml optimisations by james-garner-canonical · Pull Request #2361 · canonical/operator

james-garner-canonical · 2026-03-03T04:10:42Z

This PR makes the most drastic changes possible to address the performance concerns raised in #1498. My thinking is that we can see what this breaks and what the maximum possible performance gains are, and then decide which changes (if any) we should actually make.

Changes

Whenever a Context object is created, we load the metadata, trying charmcraft.yaml first and falling back to the legacy files if needed (so 1x open and 1x YAML load).

This PR caches the file loads performed by _CharmSpec.autoload, so we only read from disk once per file per testing process instead of once per Context creation (e.g. once per test).
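As a rough illustration of the caching idea, here is a minimal sketch, not the code in this PR. It assumes the cache is keyed on the resolved absolute path and that callers get a copy of the cached data; the commit messages mention both an absolute path and avoiding accidental mutation, but the details here are guesses. The names `_load_cached`, `load_metadata` and `disk_reads` are hypothetical, and `json.loads` stands in for the YAML load so the sketch needs only the standard library.

```python
import copy
import functools
import json
import tempfile
from pathlib import Path
from typing import Any

# Hypothetical counter, only here to show how often the disk is touched.
disk_reads = 0


@functools.lru_cache(maxsize=None)
def _load_cached(resolved_path: str) -> Any:
    """Read and parse a file once per process for each resolved path."""
    global disk_reads
    disk_reads += 1
    text = Path(resolved_path).read_text()
    # Assumption: json.loads is a stand-in for the real YAML load.
    return json.loads(text)


def load_metadata(path: Path) -> Any:
    """Return a private copy of the cached data for this file."""
    # Resolve first so relative and absolute spellings share one cache entry.
    cached = _load_cached(str(path.resolve()))
    # Copy so that a caller mutating the result cannot corrupt the cache.
    return copy.deepcopy(cached)


with tempfile.TemporaryDirectory() as tmp:
    meta_file = Path(tmp) / "metadata.json"
    meta_file.write_text('{"name": "my-charm"}')
    first = load_metadata(meta_file)
    first["name"] = "mutated"  # only changes the caller's copy
    second = load_metadata(meta_file)  # served from the cache
```

In this sketch the second call does not touch the disk, and the mutation of `first` is not visible in `second`. The trade-off is that a file edited while the process is running would not be re-read.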

With each run, we create an ops.CharmMeta object by reading the text of metadata.yaml and actions.yaml (and probably config.yaml in the future) from the files written to the temporary directory and doing a YAML load.

This PR eliminates these extra file loads entirely by creating the CharmMeta object from the already created _CharmSpec object. We still create the temporary directory, as it's where container writes occur.

With each run, we create a temporary directory and write the three files there, simulating the Juju state (so 3x open and 3x YAML dump), using the dictionary data from the charm spec. This directory is discarded after the run.

This PR drops these file writes. It seems they were only used for creating CharmMeta objects. In principle, charms could assume that these files exist and try to read them, but apparently this doesn't come up in any of the tests for the charms we check.

Running super-tox results in 70 of the 153 charms passing on both main and on this branch.

Two of our tests that explicitly assert on behaviour changed in this PR have been updated.

Benchmarking

Benchmark results look promising. The fast benchmarks don't seem to have changed much, while the slowest one (test_many_tests_autoload_meta) goes from a median of ~9.87 ms to ~4.53 ms. That said, these numbers suggest that there aren't major performance gains on the table here.

Benchmark results on main

--------------------------------------------------------------------------------------------------------- benchmark: 10 tests ---------------------------------------------------------------------------------------------------------
Name (time in us)                                    Min                     Max                   Mean                 StdDev                Median                 IQR            Outliers          OPS            Rounds  Iterations
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
test_context_explicit_meta_config_actions        10.2490 (1.0)        2,441.0980 (23.90)        12.8702 (1.05)         26.6113 (9.39)        11.6620 (1.0)        0.9920 (1.02)      57;1020  77,699.0609 (0.95)      14458           1
test_context_explicit_meta                       10.4990 (1.02)         298.6600 (2.92)         12.2551 (1.0)           7.0336 (2.48)        11.7120 (1.00)       0.9720 (1.0)        26;173  81,598.7972 (1.0)        2986           1
test_full_state                                  42.5590 (4.15)         102.1420 (1.0)          45.2166 (3.69)          2.8327 (1.0)         44.7840 (3.84)       1.0720 (1.10)      146;276  22,115.7564 (0.27)       4332           1
test_context_autoload_meta                      890.4900 (86.89)      1,532.4740 (15.00)       923.1925 (75.33)        39.5865 (13.97)      917.5610 (78.68)     19.9470 (20.52)       22;24   1,083.1977 (0.01)        864           1
test_run_no_observer                          1,939.4570 (189.23)     5,220.7090 (51.11)     2,186.3798 (178.41)      450.4242 (159.01)   2,072.2960 (177.70)    94.9130 (97.65)       12;15     457.3771 (0.01)        147           1
test_run_observed                             2,098.2141 (204.72)     7,426.1350 (72.70)     2,338.8252 (190.85)      539.9316 (190.61)   2,199.7551 (188.63)    97.3194 (100.12)      16;24     427.5651 (0.01)        235           1
test_deferred_events                          2,197.5910 (214.42)     6,072.2670 (59.45)     2,439.1141 (199.03)      564.1207 (199.15)   2,312.8470 (198.32)   104.4535 (107.46)      10;16     409.9849 (0.01)        189           1
test_lots_of_logs                             3,725.1250 (363.46)   170,508.0220 (>1000.0)   5,195.1735 (423.92)   13,039.0260 (>1000.0)  3,971.1170 (340.52)   275.9275 (283.87)       1;24     192.4864 (0.00)        163           1
test_many_tests_explicit_meta                 4,435.6780 (432.79)    77,010.1980 (753.95)    5,682.7027 (463.70)    6,314.0333 (>1000.0)  4,983.3650 (427.32)   334.6597 (344.30)       1;13     175.9726 (0.00)        131           1
test_many_tests_autoload_meta                 9,500.5250 (926.97)    16,799.5220 (164.47)   10,245.4877 (836.02)    1,166.4413 (411.78)   9,867.5980 (846.14)   267.6975 (275.41)       9;16      97.6039 (0.00)         88           1
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

Benchmark results on this branch

-------------------------------------------------------------------------------------------------------- benchmark: 10 tests ---------------------------------------------------------------------------------------------------------
Name (time in us)                                    Min                     Max                  Mean                 StdDev                Median                 IQR            Outliers          OPS            Rounds  Iterations
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
test_context_explicit_meta_config_actions        10.3290 (1.0)        2,262.8430 (6.69)        12.4424 (1.0)          21.8495 (2.74)        11.5820 (1.0)        0.7020 (1.0)       40;1393  80,370.4021 (1.0)       13371           1
test_context_explicit_meta                       10.3390 (1.00)         338.3950 (1.0)         13.1573 (1.06)          7.9888 (1.0)         12.1420 (1.05)       1.9240 (2.74)       26;330  76,003.6978 (0.95)       2832           1
test_context_autoload_meta                       21.0900 (2.04)       1,410.2950 (4.17)        26.2959 (2.11)         20.7144 (2.59)        23.0830 (1.99)       1.3230 (1.88)     406;1323  38,028.7317 (0.47)       9708           1
test_full_state                                  43.2210 (4.18)         614.7830 (1.82)        46.5991 (3.75)         14.9854 (1.88)        45.6150 (3.94)       1.3230 (1.88)       26;147  21,459.6533 (0.27)       3636           1
test_run_no_observer                            938.7000 (90.88)      4,585.6890 (13.55)    1,134.1832 (91.15)       390.6378 (48.90)    1,073.5930 (92.69)     95.5240 (136.08)       4;10     881.6917 (0.01)        156           1
test_run_observed                             1,081.8780 (104.74)    77,891.7310 (230.18)   1,376.8241 (110.66)    2,717.4299 (340.16)   1,180.6090 (101.93)    81.1320 (115.58)       3;84     726.3092 (0.01)        814           1
test_deferred_events                          1,170.1640 (113.29)   308,802.2290 (912.55)   1,966.3871 (158.04)   13,236.6985 (>1000.0)  1,290.1495 (111.39)    93.6805 (133.45)       1;41     508.5469 (0.01)        540           1
test_lots_of_logs                             2,709.6710 (262.34)   270,928.1870 (800.63)   3,983.4068 (320.15)   15,818.0428 (>1000.0)  2,916.2080 (251.79)   162.9275 (232.10)       1;39     251.0414 (0.00)        287           1
test_many_tests_explicit_meta                 3,411.3160 (330.27)    14,443.7620 (42.68)    4,206.8873 (338.11)    1,626.4021 (203.59)   3,732.5740 (322.27)   501.5410 (714.46)       8;17     237.7054 (0.00)        130           1
test_many_tests_autoload_meta                 4,092.3530 (396.20)   201,823.7970 (596.41)   5,941.3242 (477.51)   14,641.9881 (>1000.0)  4,527.3335 (390.89)   533.8110 (760.43)       1;23     168.3126 (0.00)        182           1
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
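To make the comparison between the two tables concrete, the following sketch computes speedup factors from the Median columns for three of the benchmarks. The values are copied from the tables above; the dictionary and variable names are just for illustration. On these numbers, test_many_tests_autoload_meta is roughly 2.2x faster on this branch and test_context_autoload_meta roughly 40x faster, though a single benchmark run like this is noisy.

```python
# Median times in microseconds, copied from the two benchmark tables above.
median_main_us = {
    "test_context_autoload_meta": 917.5610,
    "test_run_no_observer": 2072.2960,
    "test_many_tests_autoload_meta": 9867.5980,
}
median_branch_us = {
    "test_context_autoload_meta": 23.0830,
    "test_run_no_observer": 1073.5930,
    "test_many_tests_autoload_meta": 4527.3335,
}

# Speedup factor = time on main / time on this branch (higher is better).
speedups = {
    name: median_main_us[name] / median_branch_us[name]
    for name in median_main_us
}

for name, factor in speedups.items():
    print(f"{name}: {factor:.2f}x")
```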

Resolves: #1498

tonyandrewmeyer

I like most of these changes - need to review the performance numbers a bit. I'm a bit concerned about charms not being able to read the YAML files. I assume that's going to come up in the discussion today.

testing/src/scenario/state.py

dimaqq · 2026-03-04T00:56:45Z

Stand-up comments: a more useful metric would be whether a super-tox run is faster, and by how much, assuming that the same number of tests are run and pass.

Merge branch 'main' into 26-03+feat+testing-yaml-optimisations

james-garner-canonical added 15 commits March 3, 2026 17:03

perf: cache yaml load when autoloading charm spec

dbb9b0f

fix: use absolute path to charm root for caching load

3a56a78

test: add debugging asserts to _load_yaml

b604b03

test: make unit tests verbose

95c07dd

test: add logging

1bdfd19

fix: avoid accidental mutation

2262cfc

fix: handle else case

b841ae4

chore: restore lru_cache use now that we've debugged things

d895b81

style: format

04adbce

perf: use already loaded _CharmSpec to create CharmMeta

55c9f66

chore: revert workflow change

5585982

perf: don't write metadata files out at all

4c49ceb

refactor: clean up unused code to make diff clearer

455c2b4

refactor: tidy up since we have less to do now

f848e1a

fix: don't use a context manager, keep using cleanup explicitly

547765d

tonyandrewmeyer reviewed Mar 4, 2026


dimaqq mentioned this pull request Mar 4, 2026

Changes to consider for next major release (4.x) #2350

Open

9 tasks

james-garner-canonical mentioned this pull request Mar 5, 2026

os.getcwd() is not the charm root in ops[testing] #2045

Open

chore: merge main

f874a6e

Merge branch 'main' into 26-03+feat+testing-yaml-optimisations

james-garner-canonical wants to merge 16 commits into canonical:main from james-garner-canonical:26-03+feat+testing-yaml-optimisations
