Skip to content

rust: create multi-file run loader #4343

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 10 commits into from
Nov 19, 2020
Merged

rust: create multi-file run loader #4343

merged 10 commits into from
Nov 19, 2020

Conversation

wchargin
Copy link
Contributor

Summary:
This patch adds a RunLoader that reads from multiple event files
within a single run. For now, it only keeps track of scalars, and it
doesn’t do anything with the data that it reads. But it persists
everything that we’ll need for the rest of the pipeline.

Test Plan:
Unit tests included: they create a test log directory and write some
event files, then ensure that they can be read back properly.

wchargin-branch: rust-run-loader

Summary:
This module performs conversions from legacy on-disk data formats. It
will correspond to both the `data_compat` and `dataclass_compat` modules
from Python TensorBoard. For now, we have just one function, to
determine the initial summary metadata of a time series. And, for now,
we only implement this for scalars, to keep things simple.

Test Plan:
Unit tests included.

wchargin-branch: rust-data-compat
wchargin-source: ef510b422b00a66a23efc1406947844fa8d1a798
Summary:
Our errors now have nice string formatting, easier `From` conversions,
and `std::error::Error` implementations.

Test Plan:
Included some tests for the `Display` implementations. They’re often not
necessary—one benefit of deriving traits is that you can be confident in
the implementation without manually testing it. But sometimes, if the
format string is non-trivial, it can be nice to actually see the full
text written out.

wchargin-branch: rust-use-thiserror
wchargin-source: 0f8009c3b96619f2eb4649353487dc996b45a712
Summary:
We’d defined a `Step` [newtype] in the `reservoir` module, but that type
will be useful more broadly. We thus move it into a new `types` module,
and define `WallTime` and `Tag` siblings, which we’ll need shortly.

[newtype]: https://doc.rust-lang.org/rust-by-example/generics/new_types.html

Test Plan:
Most behavior is just `#[derive]`d; unit tests included for the rest.

wchargin-branch: rust-types-module
wchargin-source: b0f5475b247ef8f9cb8bf2e0bf94d2b648b8009e
…rigin/wchargin-rust-use-thiserror' and 'origin/wchargin-rust-types-module' into HEAD
wchargin-branch: rust-run-loader
wchargin-source: 35359d6e233139973edac8b2c1440cc48ed4e710
@wchargin wchargin added the core:rustboard //tensorboard/data/server/... label Nov 18, 2020
@google-cla google-cla bot added the cla: yes label Nov 18, 2020
@wchargin
Copy link
Contributor Author

wchargin commented Nov 18, 2020

This patch is against the combined diffbase of #4340, #4341, and #4342.
Please ignore diffbase commits. (Currently, the only commit under review
is 997ddfa.) Thanks!

edit: Two of these have merged, so I’ve linearized.

@wchargin wchargin requested a review from stephanwlee November 18, 2020 05:22
Copy link
Contributor Author

@wchargin wchargin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A couple tour-guide comments…

}

#[test]
fn test() -> Result<(), Box<dyn std::error::Error>> {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On this line, dyn std::error::Error opts us into dynamic dispatch: the
error may be any value implementing Error, along with its vtable.
This isn’t useful for programmatic callers, since it’s hard to usefully
inspect the cause, but it’s really convenient for main functions or test
functions, like this one.

/// will all map to [`EventFile::Dead`].
fn update_file_set(&mut self, filenames: Vec<PathBuf>) {
// Remove any discarded files.
let new_file_set: HashSet<&PathBuf> = filenames.iter().collect();
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Of some interest: you have probably seen us write .collect() before to
turn an iterator into a vector. But here we’re turning one into a hash
set! In fact, collect can generate any collection type, as long as
it implements FromIterator. The specific type in any instance is
determined by context, and the compiler will complain if there’s not
enough information to uniquely identify a type. This is sometimes called
target typing.

wchargin-branch: rust-data-compat
wchargin-source: 467974fa4e0e55dea0872bb1394a768a38811ac5
@wchargin wchargin changed the base branch from master to wchargin-rust-data-compat November 18, 2020 23:06
wchargin-branch: rust-run-loader
wchargin-source: d190e3d45c304d976a3fc14a73d99572a6ad377f
Base automatically changed from wchargin-rust-data-compat to master November 19, 2020 01:56
wchargin-branch: rust-run-loader
wchargin-source: 170c77aae53df6716520df779e4838d3fc3b93b9

# Conflicts:
#	tensorboard/data/server/data_compat.rs
wchargin-branch: rust-run-loader
wchargin-source: 170c77aae53df6716520df779e4838d3fc3b93b9
let new_file_set: HashSet<&PathBuf> = filenames.iter().collect();
for (k, v) in self.files.iter_mut() {
if !new_file_set.contains(k) {
*v = EventFile::Dead;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

just confirming an obvious. what happens to the old reference to v? Does compiler essentially deallocates it since the owner disavowed it?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep. We say that the old value is dropped here, and so the compiler
will call the appropriate destructors and free the associated memory.


// Open readers for any new files.
for filename in filenames {
use std::collections::btree_map;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it kosher to "import" mid module? If so, when is this pattern used? I suspect that a good compiler will do the right thing no matter how you write this but I am still curious as to when I should prefer this.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, this is fine!

First, note that Rust imports are always pure. Modules do not have side
effects. So you can freely reorder imports (rustfmt sorts them) or
declare them at any scope that you like without worrying about bugs.

So, then, it’s all about scope. The general Rust convention is to import
most types and traits that you use at top level, and modules if you
want, too. Importing bare functions is rarer, I think, but not unheard
of. I like to be able to see data_compat::initial_metadata(...) to
clarify that it’s not just a helper function in this module.

So I could have imported btree_map::Entry at top level. But I think
that that would be a wee bit confusing, because if you see Entry in
some arbitrary place in the code, it’s not obvious what that refers to
(as opposed to, e.g., HashMap).

I also could have imported btree_map at top level, and that would also
have been reasonable imho. I don’t really have strong opinions here.

Some examples of my conventions:

// OK: types and traits that are clear from context
use std::collections::{HashMap, HashSet};
use std::io::{Read, Write};

// OK: modules, aliases
use std::io;
use std::path;
use crate::proto::tensorboard as pb;

// Avoid: types not clear from context
// use std::io::Error;  // a fn returns "Error"---what kind of error?

// Weak-avoid: standalone functions
// use std::iter::once;  // seeing "iter::once" would be clearer

Comment on lines +294 to +297
let f1_name = logdir.path().join("tfevents.123");
let f2_name = logdir.path().join("tfevents.456");
let mut f1 = BufWriter::new(File::create(&f1_name)?);
let mut f2 = BufWriter::new(File::create(&f2_name)?);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No action required.

Interesting; I thought the module would consider an inevitable abstraction of the file system and operate on testable EventFile or something akin to that.

I assume this is actually writing these events out to the temp directory (at sync_all call in L315)?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That was certainly my plan, to parameterize the reader:

pub struct RunLoader<R> {
    files: HashMap<PathBuf, EventFile<R>>,
    // ...
}
impl<R: Read> for RunLoader { /* ... */ }

This works swimmingly for reading, but the problem is that Read is
just a stream type, and doesn’t actually give you a way to open a
file. So I would have needed to define something like:

trait Filesystem {
    type File: Read + Seek;
    fn open(p: &Path) -> io::Result<Self::File>;
}

struct RealFilesystem;
impl Filesystem for RealFilesystem {
    type File = std::fs::File;
    fn open(p: &Path) -> io::Result<Self::File> { std::fs::File::open(p) }
}

And this works perfectly fine, and we probably will want it
eventually, but I didn’t really want to bother implementing my own fake
in-memory file system when the real filesystem will do just fine.

I assume this is actually writing these events out to the temp
directory (at sync_all call in L315)?

Yep. The sync_all is an fsync(2). Without it, the files would still
be written when they’re dropped, but here we needed to keep a reference
to them later, so this flushes the BufWriter’s buffer.

// Start time should be that of the file version event, even though that didn't correspond
// to any time series.
assert_eq!(loader.start_time, Some(WallTime::new(1234.0).unwrap()));
assert_eq!(loader.time_series.keys().collect::<Vec<_>>(), vec![&tag]);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

<_>! One day I will see a code with my favorite emoji, >_<

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm… I’m having trouble thinking of a piece of standard Rust syntax that
contains >_<, since _ is not a valid identifier by itself. :-(
But this time, macros can save the day:

macro_rules! swlmoji {
    (<_>) => { "meep" };
    (>_<) => { "mrrp" };
}
fn main() {
    println!("{}", swlmoji!(<_>));
    println!("{}", swlmoji!(>_<));
}

Copy link
Contributor Author

@wchargin wchargin Nov 19, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(oh, and the _ in Vec<_> means “please infer type parameter” :-) )

Comment on lines 354 to 357
(Step(0), WallTime::new(1235.0).unwrap(), 0.25),
(Step(1), WallTime::new(1236.0).unwrap(), 0.50),
(Step(2), WallTime::new(2346.0).unwrap(), 0.75),
(Step(3), WallTime::new(2347.0).unwrap(), 1.00),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Testing my understanding: I assume the loader.time_series does not actually sort/order events by step, right? It is just happens to be sorted since we wrote as such in L311-L314. Don't think I've read the reservoir module thoroughly to know that on top of my head, but it looks like rsv.offer simply pushes onto the vec.

Copy link
Contributor Author

@wchargin wchargin Nov 19, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is sorted because the reservoir preempts any non-preceding records
before pushing onto the vector, so when it pushes, the step of the
new record is always larger than the steps of all other records.

Including a preemption in this test is probably a pretty reasonable
idea. I’ll go ahead and do that.

wchargin-branch: rust-run-loader
wchargin-source: 793ff3e287c2571fd1bc06d5c1940ed3b3bdc796
@wchargin wchargin merged commit cc30043 into master Nov 19, 2020
@wchargin wchargin deleted the wchargin-rust-run-loader branch November 19, 2020 16:58
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
cla: yes core:rustboard //tensorboard/data/server/...
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants