rust: implement `ListRuns` RPC #4385

wchargin · 2020-11-24T08:56:07Z

Summary:
This patch connects the //tensorboard/data/server main module to the
Commit and LogdirLoader infrastructure, and implements ListRuns as
a proof of concept.

For now, we just take a logdir path as a positional command line
argument. A future patch will bring nicer command-line argument parsing.

Test Plan:
Unit tests included. Note that testing these data provider methods is
nice and easy: you just call them. No mocking or networking required.

For a manual grpc_cli test, start just the data server, and run:

$ grpc_cli --channel_creds_type=insecure \
>   --protofiles tensorboard/data/proto/data_provider.proto \
>   call localhost:6806 TensorBoardDataProvider.ListRuns '' 2>/dev/null
runs {
  name: "lr_1E-03,conv=1,fc=2"
  start_time: 1563406327
}
// etc.

For an end-to-end test, cherry-pick #4356, then run both the data server
and a copy of TensorBoard concurrently:

bazel run //tensorboard/data/server -- ~/tensorboard_data/mnist &
bazel run //tensorboard -- \
    --bind_all \
    --generic_data true \
    --grpc_data_provider localhost:6806 \
    ;

Then navigate to localhost:6006/data/runs and note that the response
includes a correct list of runs.

wchargin-branch: rust-listruns

Summary: This module performs conversions from legacy on-disk data formats. It will correspond to both the `data_compat` and `dataclass_compat` modules from Python TensorBoard. For now, we have just one function, to determine the initial summary metadata of a time series. And, for now, we only implement this for scalars, to keep things simple. Test Plan: Unit tests included. wchargin-branch: rust-data-compat wchargin-source: ef510b422b00a66a23efc1406947844fa8d1a798

Summary: Our errors now have nice string formatting, easier `From` conversions, and `std::error::Error` implementations. Test Plan: Included some tests for the `Display` implementations. They’re often not necessary—one benefit of deriving traits is that you can be confident in the implementation without manually testing it. But sometimes, if the format string is non-trivial, it can be nice to actually see the full text written out. wchargin-branch: rust-use-thiserror wchargin-source: 0f8009c3b96619f2eb4649353487dc996b45a712

Summary: We’d defined a `Step` [newtype] in the `reservoir` module, but that type will be useful more broadly. We thus move it into a new `types` module, and define `WallTime` and `Tag` siblings, which we’ll need shortly. [newtype]: https://doc.rust-lang.org/rust-by-example/generics/new_types.html Test Plan: Most behavior is just `#[derive]`d; unit tests included for the rest. wchargin-branch: rust-types-module wchargin-source: b0f5475b247ef8f9cb8bf2e0bf94d2b648b8009e

…rigin/wchargin-rust-use-thiserror' and 'origin/wchargin-rust-types-module' into HEAD

wchargin-branch: rust-run-loader wchargin-source: 35359d6e233139973edac8b2c1440cc48ed4e710

wchargin-branch: rust-data-compat wchargin-source: 467974fa4e0e55dea0872bb1394a768a38811ac5

wchargin-branch: rust-run-loader wchargin-source: d190e3d45c304d976a3fc14a73d99572a6ad377f

Summary: This patch refactors `run`’s `StagePayload` type into an `EventValue` type in `data_compat`. This way, we can define enriching conversions from `EventValue`s to committed values in a layer that’s not specific to the run loader. We also implement `initial_metadata` for the `GraphDef` variant, for symmetry and to make the abstraction clearer. The run loader still doesn’t handle graphs. Test Plan: Unit tests expanded for the new code, and continue to cover the old. wchargin-branch: rust-promote-eventvalue wchargin-source: 96af44556af064cf9ef933154ca3a522a8829798

wchargin-branch: rust-run-loader wchargin-source: 170c77aae53df6716520df779e4838d3fc3b93b9 # Conflicts: # tensorboard/data/server/data_compat.rs

wchargin-branch: rust-run-loader wchargin-source: 170c77aae53df6716520df779e4838d3fc3b93b9

wchargin-branch: rust-promote-eventvalue wchargin-source: 106b4519aba1a966d8273b6b2509f1798131dc5a

Summary: This patch introduces the *commit*, which is the link between loading threads and serving threads. The commit structure is designed for concurrency: many threads can read from it at once, and when the commit is updated, a loader acquires exclusive access only to its own run. This improves latency for both readers and writers. The commit itself doesn’t have much logic; it’s mostly just the correct composition of substructures, most notably the reservoir basins. See also the RustBoard technical design doc (#4350) for discussion. Test Plan: There’s one method on `commit::TimeSeries`; it has tests. wchargin-branch: rust-commit-type wchargin-source: d69b4e0034b68d1d8dadc6aa0f6dc3cc48efccde

Summary: This patch integrates `RunLoader` and `Commit`. A run loader now commits all changes at the end of its load cycle. This isn’t sufficient—we also need it to periodically commit while it’s still loading. But it’s a good start for now. Much of the weight is actually in `data_compat`, which owns the logic for “enriching” raw events into their class-specific representations. This way, that logic can be reused: e.g., for an alternate consumer that exposes a stream of all events, without downsampling. Test Plan: The new enrichment functionality has dedicated tests, and existing tests for the run loader have been updated to operate on a `Commit` rather than poking into the internals. wchargin-branch: rust-commit-run-loader wchargin-source: 1e0ae52eb9cd9ead6f53c9716306a5fcce178e06

wchargin-branch: rust-run-loader wchargin-source: 793ff3e287c2571fd1bc06d5c1940ed3b3bdc796

wchargin-branch: rust-promote-eventvalue wchargin-source: 432923653efb6fe768bdd507acac833672ec3e06

wchargin-branch: rust-commit-type wchargin-source: 1850f510bac1f515fa4b7f73c1ec9a7a47fea283

wchargin-branch: rust-commit-run-loader wchargin-source: 14baca16a82293a24a1b7f58869305f3b4ff7a6f # Conflicts: # tensorboard/data/server/run.rs

wchargin-branch: rust-commit-run-loader wchargin-source: 14baca16a82293a24a1b7f58869305f3b4ff7a6f

wchargin-branch: rust-promote-eventvalue wchargin-source: dd7917e2bd9084f8ff048657d94572168a221a7e # Conflicts: # tensorboard/data/server/data_compat.rs # tensorboard/data/server/run.rs

wchargin-branch: rust-promote-eventvalue wchargin-source: dd7917e2bd9084f8ff048657d94572168a221a7e

wchargin-branch: rust-commit-type wchargin-source: 38179a4b9af7538f133ee04f771ac895a8db21a9

wchargin-branch: rust-commit-run-loader wchargin-source: f84d5842282dba99385fba34ece75f6ac7f82d2e

wchargin-branch: rust-commit-type wchargin-source: b62b0669a50b5086a1f6af18effe62fbfd1e5431

wchargin-branch: rust-commit-run-loader wchargin-source: a8130d52ce4b02636f59cab02ebef5cb27055fc9

wchargin-branch: rust-commit-run-loader wchargin-source: 869544fd17ac2d740e60f1501b9540490d06906b # Conflicts: # tensorboard/data/server/commit.rs

wchargin-branch: rust-commit-run-loader wchargin-source: 869544fd17ac2d740e60f1501b9540490d06906b

wchargin-branch: rust-commit-run-loader wchargin-source: e0ee69aea5631ccc6f2197922249d766aaa84cb6

wchargin-branch: rust-commit-run-loader wchargin-source: 3f18bbb73a892785e6dfe5255b2831289ac3e789 # Conflicts: # tensorboard/data/server/run.rs

wchargin-branch: rust-commit-run-loader wchargin-source: 3f18bbb73a892785e6dfe5255b2831289ac3e789

Summary: A new `LogdirLoader` type owns `RunLoader`s for each run and wires them up to a shared `Commit`. It is the logdir loader that finds event files in the filesystem and feeds them to the run loaders. For ease of review, this patch implements event file discovery but does not actually load data. Stay tuned. Test Plan: Unit tests included. One test covers an edge case involving directories whose names are not valid Unicode. This test only runs on Unix. We could make it run on Windows, too, but I don’t have a Windows box to test on, and the code under test is portable: the only platform-specific code is the test setup. So run tests on a `cfg(unix)` platform, such as Linux or macOS, for best coverage. wchargin-branch: rust-logdir-loader-discovery wchargin-source: dfb479587d0b1361dc39840dc7f535d3f944166d

Summary: This patch connects `LogdirLoader` to `RunLoader` by actually calling the `RunLoader::reload` method. Test Plan: Tests adapted and expanded. wchargin-branch: rust-logdir-loader-data wchargin-source: 8a363898cc9970a816da0898f1ce0ea78bfd98ea

wchargin-branch: rust-logdir-loader-discovery wchargin-source: 761b320491045f4ceb6ced9eb596242ba860e7a9 # Conflicts: # tensorboard/data/server/data_compat.rs

wchargin-branch: rust-logdir-loader-discovery wchargin-source: 761b320491045f4ceb6ced9eb596242ba860e7a9

wchargin-branch: rust-logdir-loader-data wchargin-source: 5f6ccd3318e76347bc3f07d78c459070fe15e65d

Summary: This patch connects the `//tensorboard/data/server` main module to the `Commit` and `LogdirLoader` infrastructure, and implements `ListRuns` as a proof of concept. For now, we just take a logdir path as a positional command line argument. A future patch will bring nicer command-line argument parsing. Test Plan: Run both the data server and a copy of TensorBoard concurrently: ``` bazel run //tensorboard/data/server -- ~/tensorboard_data/mnist & bazel run //tensorboard -- \ --bind_all \ --generic_data true \ --grpc_data_provider localhost:6806 \ ; ``` Then navigate to `localhost:6006/data/runs` and note that the response includes a correct list of runs. Or, for a manual `grpc_cli` test, start just the data server, and run: ``` $ grpc_cli --channel_creds_type=insecure \ > --protofiles tensorboard/data/proto/data_provider.proto \ > call localhost:6806 TensorBoardDataProvider.ListRuns '' 2>/dev/null runs { name: "lr_1E-03,conv=1,fc=2" start_time: 1563406327 } // etc. ``` Unit tests also included. Note that testing these data provider methods is nice and easy: you just call them. No mocking or networking required. wchargin-branch: rust-listruns wchargin-source: eb471eca310905eb45d730936279b57bdae70927

wchargin · 2020-11-24T09:11:57Z

Cf.: #4387 for grpc_cli tips; #4356 to cherry-pick in if you want to
run the end-to-end test.

wchargin-branch: rust-logdir-loader-discovery wchargin-source: 44cd7a803ba04324ce11a892d8a8414792dae95c

wchargin-branch: rust-logdir-loader-data wchargin-source: 29289362b8ff8294883bd381401613a6269e2ebc

wchargin-branch: rust-listruns wchargin-source: 8187059ac28210897b1c90e4a31cd8237993ec4b

wchargin-branch: rust-logdir-loader-data wchargin-source: ab88209f4c0f5874b7c0e80a7c670a8f7cef810d # Conflicts: # tensorboard/data/server/logdir.rs

wchargin-branch: rust-logdir-loader-data wchargin-source: ab88209f4c0f5874b7c0e80a7c670a8f7cef810d

wchargin-branch: rust-listruns wchargin-source: bf3865887dddcdf516bc73806ea40f1f51ca0738

wchargin-branch: rust-listruns wchargin-source: 805e652280c9c72e2f1ea5cd0be2ae74de1538bb

wchargin-branch: rust-listruns wchargin-source: 4314f515f85db55f912bccc1ea0eb22078e6d783

nfelt · 2020-11-24T23:09:49Z

tensorboard/data/server/server.rs

+                .map(|(Run(name), start_time)| data::Run {
+                    name,
+                    start_time: start_time.into(),
+                    ..Default::default()


I gather the point of this line is to be future-proof in case we add more fields to Run? Is there any mechanism that enforces this by default or is just a convention we'll need to uphold ourselves?

Yeah—specifically, so that proto updates that add new fields don’t
introduce compilation errors.

I was considering adding a tiny “style guide” to DEVELOPMENT.md with
things like this, so I’ll probably go ahead and do that.

There is a #[non_exhaustive] attribute that you can add to structs to
indicate that they may have more fields later, preventing callers
outside the crate from initializing them directly but still permitting
them to access public fields. This doesn’t suffice because (a) we’re in
the same crate (though we could change that) and (b) it doesn’t permit
functional record updates (..Default::default()):

error[E0639]: cannot create non-exhaustive struct using struct expression --> main.rs:3:13 | 3 | let s = sub::S { | _____________^ 4 | | a: 1, 5 | | b: 2, 6 | | ..Default::default() 7 | | }; | |_____^

The RFC threads have a lot of comments, and from skimming them it’s not
obvious to me whether this was “intended, by design” or merely “artifact
of how #[non_exhaustive] desugars to a hidden private field”. (It was
definitely known at the time.)

Another potential check for this would be in Clippy: “lint if you
construct a value at a type that implements prost::Message without
using a functional record update”. There is no such check currently.

Reading through the next PR, it's really rather noisy to have the ..Default::default() everywhere we create a proto, especially in tests :/ Given the fact that we control these protos ourselves, I wonder if on net maybe it's easier for places where we don't expect to add fields to just relax this and accept that we'll need to make modifications to the code if we update the proto?

The other alternative that came to mind would be to initialize structs to default-empty, then mutate to populate fields; needing mutation is a downside although I think in some of the more convoluted multi-layer struct cases this may be significantly more concise.

I can see why neither of these is ideal, it just feels distracting IMO to have every single proto struct need the extra syntax. Would be nice if there was some way to specify on the struct definition that they can be initialized a subset of fields if the rest are Default (basically like optional arguments), or if at least maybe the functional update syntax supported plain .. as sugar for ..Default::default().

Just seeing this. For posterity, further discussion at #4399.

nfelt · 2020-11-25T05:19:51Z

tensorboard/data/server/server.rs

+        commit
+    }
+
+    fn sample_handler() -> DataProviderHandler {


I'll leave it up to you (and would also be fine to do in a follow up or punt for now), but I try to avoid data fixtures unless it's really pretty complicated or expensive to set up the right data. In this case it seems like it wouldn't be so bad to 1) create a commit, 2) explicitly populate it with the data we want via test helpers, and then 3) wrap a handler around it and do the assertions. I just find the assertions to be a lot more meaningful when I can see the data they depend on in the same test method vs having to go dig up the fixture.

Okay; noted, thanks!. I’m certainly sympathetic to this. I think I might
punt it for now since a shared fixture suffices for this set of changes.
But I’m happy to go back and reswizzle things after that.

Done in #4433.

@nfelt

Summary: In #4385, @nfelt suggested using test helpers to easily populate a commit rather than using a single fixture across all test cases. This patch implements such test helpers and uses them in the server tests. The net diff to the test code (not the helpers) is −32 lines. Test Plan: Existing tests pass. The new test helpers mostly don’t have explicit tests, but that’s okay since they’re exercised by the main test code. wchargin-branch: rust-test-data-builders wchargin-source: 31164b4d2b4a1fb9bbabc584249e1d3715c17a3e

@nfelt

Summary: In #4385, @nfelt suggested using test helpers to easily populate a commit rather than using a single fixture across all test cases. This patch implements such test helpers and uses them in the server tests. The net diff to the test code (not the helpers) is −32 lines. Test Plan: Existing tests pass. The new test helpers mostly don’t have explicit tests, but that’s okay since they’re exercised by the main test code. wchargin-branch: rust-test-data-builders

wchargin added 30 commits November 17, 2020 13:39

Merge remote-tracking branches 'origin/wchargin-rust-data-compat', 'o…

5e16ea3

…rigin/wchargin-rust-use-thiserror' and 'origin/wchargin-rust-types-module' into HEAD

[update patch]

997ddfa

wchargin-branch: rust-run-loader wchargin-source: 35359d6e233139973edac8b2c1440cc48ed4e710

[update diffbase]

103ccbd

wchargin-branch: rust-data-compat wchargin-source: 467974fa4e0e55dea0872bb1394a768a38811ac5

[update diffbase]

734d780

wchargin-branch: rust-run-loader wchargin-source: d190e3d45c304d976a3fc14a73d99572a6ad377f

[update diffbase]

76bcca2

wchargin-branch: rust-run-loader wchargin-source: 170c77aae53df6716520df779e4838d3fc3b93b9 # Conflicts: # tensorboard/data/server/data_compat.rs

[update patch]

9ba9726

wchargin-branch: rust-run-loader wchargin-source: 170c77aae53df6716520df779e4838d3fc3b93b9

[update diffbase]

aef800e

wchargin-branch: rust-promote-eventvalue wchargin-source: 106b4519aba1a966d8273b6b2509f1798131dc5a

[update patch]

532ea1a

wchargin-branch: rust-run-loader wchargin-source: 793ff3e287c2571fd1bc06d5c1940ed3b3bdc796

[update diffbase]

c577558

wchargin-branch: rust-promote-eventvalue wchargin-source: 432923653efb6fe768bdd507acac833672ec3e06

[update diffbase]

8b11660

wchargin-branch: rust-commit-type wchargin-source: 1850f510bac1f515fa4b7f73c1ec9a7a47fea283

[update diffbase]

b8121ac

wchargin-branch: rust-commit-run-loader wchargin-source: 14baca16a82293a24a1b7f58869305f3b4ff7a6f # Conflicts: # tensorboard/data/server/run.rs

[update patch]

f5d7542

wchargin-branch: rust-commit-run-loader wchargin-source: 14baca16a82293a24a1b7f58869305f3b4ff7a6f

[update diffbase]

e7a6f6b

wchargin-branch: rust-promote-eventvalue wchargin-source: dd7917e2bd9084f8ff048657d94572168a221a7e # Conflicts: # tensorboard/data/server/data_compat.rs # tensorboard/data/server/run.rs

[update patch]

b41f47f

wchargin-branch: rust-promote-eventvalue wchargin-source: dd7917e2bd9084f8ff048657d94572168a221a7e

[update diffbase]

3c7e13c

wchargin-branch: rust-commit-type wchargin-source: 38179a4b9af7538f133ee04f771ac895a8db21a9

[update diffbase]

587e1af

wchargin-branch: rust-commit-run-loader wchargin-source: f84d5842282dba99385fba34ece75f6ac7f82d2e

[update diffbase]

ec0cf2c

wchargin-branch: rust-commit-type wchargin-source: b62b0669a50b5086a1f6af18effe62fbfd1e5431

[update diffbase]

d9f9f19

wchargin-branch: rust-commit-run-loader wchargin-source: a8130d52ce4b02636f59cab02ebef5cb27055fc9

[update patch]

d39710a

wchargin-branch: rust-commit-run-loader wchargin-source: a8130d52ce4b02636f59cab02ebef5cb27055fc9

[update diffbase]

5025feb

wchargin-branch: rust-commit-run-loader wchargin-source: 869544fd17ac2d740e60f1501b9540490d06906b # Conflicts: # tensorboard/data/server/commit.rs

[update patch]

6480fdf

wchargin-branch: rust-commit-run-loader wchargin-source: 869544fd17ac2d740e60f1501b9540490d06906b

[update diffbase]

d1aa9a4

wchargin-branch: rust-commit-run-loader wchargin-source: e0ee69aea5631ccc6f2197922249d766aaa84cb6

[update patch]

4c28c91

wchargin-branch: rust-commit-run-loader wchargin-source: e0ee69aea5631ccc6f2197922249d766aaa84cb6

[update diffbase]

d2b3bf4

wchargin-branch: rust-commit-run-loader wchargin-source: 3f18bbb73a892785e6dfe5255b2831289ac3e789 # Conflicts: # tensorboard/data/server/run.rs

wchargin added 7 commits November 23, 2020 13:54

[update patch]

d5f2ae0

wchargin-branch: rust-commit-run-loader wchargin-source: 3f18bbb73a892785e6dfe5255b2831289ac3e789

rust: reload data in log directories

f7a7cc0

Summary: This patch connects `LogdirLoader` to `RunLoader` by actually calling the `RunLoader::reload` method. Test Plan: Tests adapted and expanded. wchargin-branch: rust-logdir-loader-data wchargin-source: 8a363898cc9970a816da0898f1ce0ea78bfd98ea

[update diffbase]

6a38bb6

wchargin-branch: rust-logdir-loader-discovery wchargin-source: 761b320491045f4ceb6ced9eb596242ba860e7a9 # Conflicts: # tensorboard/data/server/data_compat.rs

[update patch]

5bedfd8

wchargin-branch: rust-logdir-loader-discovery wchargin-source: 761b320491045f4ceb6ced9eb596242ba860e7a9

[update diffbase]

293f887

wchargin-branch: rust-logdir-loader-data wchargin-source: 5f6ccd3318e76347bc3f07d78c459070fe15e65d

wchargin added the core:rustboard //tensorboard/data/server/... label Nov 24, 2020

google-cla bot added the cla: yes label Nov 24, 2020

wchargin requested a review from nfelt November 24, 2020 08:56

wchargin added 6 commits November 24, 2020 09:45

[update patch]

cf7d310

wchargin-branch: rust-logdir-loader-discovery wchargin-source: 44cd7a803ba04324ce11a892d8a8414792dae95c

[update diffbase]

d96e53d

wchargin-branch: rust-logdir-loader-data wchargin-source: 29289362b8ff8294883bd381401613a6269e2ebc

[update diffbase]

1c625f0

wchargin-branch: rust-listruns wchargin-source: 8187059ac28210897b1c90e4a31cd8237993ec4b

[update diffbase]

46ac8fd

wchargin-branch: rust-logdir-loader-data wchargin-source: ab88209f4c0f5874b7c0e80a7c670a8f7cef810d # Conflicts: # tensorboard/data/server/logdir.rs

[update patch]

34ce4d2

wchargin-branch: rust-logdir-loader-data wchargin-source: ab88209f4c0f5874b7c0e80a7c670a8f7cef810d

[update diffbase]

8e58aca

wchargin-branch: rust-listruns wchargin-source: bf3865887dddcdf516bc73806ea40f1f51ca0738

Base automatically changed from wchargin-rust-logdir-loader-data to master November 24, 2020 21:26

wchargin added 2 commits November 24, 2020 13:27

[update diffbase]

758ae4d

wchargin-branch: rust-listruns wchargin-source: 805e652280c9c72e2f1ea5cd0be2ae74de1538bb

[update diffbase]

17eeeb7

wchargin-branch: rust-listruns wchargin-source: 4314f515f85db55f912bccc1ea0eb22078e6d783

nfelt approved these changes Nov 25, 2020

View reviewed changes

wchargin merged commit 5f3b099 into master Nov 25, 2020

wchargin deleted the wchargin-rust-listruns branch November 25, 2020 17:22

nfelt mentioned this pull request Nov 30, 2020

rust: add short style guide #4399

Merged

wchargin mentioned this pull request Dec 7, 2020

rust: replace static fixture with test helpers #4433

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

rust: implement `ListRuns` RPC #4385

rust: implement `ListRuns` RPC #4385

Uh oh!

wchargin commented Nov 24, 2020 •

edited

Loading

Uh oh!

wchargin commented Nov 24, 2020

Uh oh!

nfelt Nov 24, 2020

Uh oh!

wchargin Nov 25, 2020

Uh oh!

nfelt Nov 25, 2020

Uh oh!

wchargin Dec 1, 2020

Uh oh!

nfelt Nov 25, 2020

Uh oh!

wchargin Nov 25, 2020

Uh oh!

wchargin Dec 9, 2020

Uh oh!

Uh oh!

rust: implement ListRuns RPC #4385

rust: implement ListRuns RPC #4385

Uh oh!

Conversation

wchargin commented Nov 24, 2020 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

wchargin commented Nov 24, 2020

Uh oh!

nfelt Nov 24, 2020

Choose a reason for hiding this comment

Uh oh!

wchargin Nov 25, 2020

Choose a reason for hiding this comment

Uh oh!

nfelt Nov 25, 2020

Choose a reason for hiding this comment

Uh oh!

wchargin Dec 1, 2020

Choose a reason for hiding this comment

Uh oh!

nfelt Nov 25, 2020

Choose a reason for hiding this comment

Uh oh!

wchargin Nov 25, 2020

Choose a reason for hiding this comment

Uh oh!

wchargin Dec 9, 2020

Choose a reason for hiding this comment

Uh oh!

Uh oh!

rust: implement `ListRuns` RPC #4385

rust: implement `ListRuns` RPC #4385

wchargin commented Nov 24, 2020 •

edited

Loading