Skip to content

[compiletest] Ignore known paths when abbreviating output #96551

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 8 commits into from
Jun 6, 2022
Merged
Show file tree
Hide file tree
Changes from 1 commit
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
47 changes: 29 additions & 18 deletions src/tools/compiletest/src/read2.rs
Original file line number Diff line number Diff line change
Expand Up @@ -9,7 +9,7 @@ use std::io::{self, Write};
use std::mem::replace;
use std::process::{Child, Output};

pub fn read2_abbreviated(mut child: Child, exclude_from_len: &[String]) -> io::Result<Output> {
pub fn read2_abbreviated(mut child: Child, filter_paths_from_len: &[String]) -> io::Result<Output> {
let mut stdout = ProcOutput::new();
let mut stderr = ProcOutput::new();

Expand All @@ -18,7 +18,7 @@ pub fn read2_abbreviated(mut child: Child, exclude_from_len: &[String]) -> io::R
child.stdout.take().unwrap(),
child.stderr.take().unwrap(),
&mut |is_stdout, data, _| {
if is_stdout { &mut stdout } else { &mut stderr }.extend(data, exclude_from_len);
if is_stdout { &mut stdout } else { &mut stderr }.extend(data, filter_paths_from_len);
data.clear();
},
)?;
Expand All @@ -29,23 +29,30 @@ pub fn read2_abbreviated(mut child: Child, exclude_from_len: &[String]) -> io::R

const HEAD_LEN: usize = 160 * 1024;
const TAIL_LEN: usize = 256 * 1024;
const EXCLUDED_PLACEHOLDER_LEN: isize = 32;

// Whenever a path is filtered when counting the length of the output, we need to add some
// placeholder length to ensure a compiler emitting only filtered paths doesn't cause a OOM.
//
// 32 was chosen semi-arbitrarily: it was the highest power of two that still allowed the test
// suite to pass at the moment of implementing path filtering.
const FILTERED_PATHS_PLACEHOLDER_LEN: usize = 32;

enum ProcOutput {
Full { bytes: Vec<u8>, excluded_len: isize },
Full { bytes: Vec<u8>, filtered_len: usize },
Abbreviated { head: Vec<u8>, skipped: usize, tail: Box<[u8]> },
}

impl ProcOutput {
fn new() -> Self {
ProcOutput::Full { bytes: Vec::new(), excluded_len: 0 }
ProcOutput::Full { bytes: Vec::new(), filtered_len: 0 }
}

fn extend(&mut self, data: &[u8], exclude_from_len: &[String]) {
fn extend(&mut self, data: &[u8], filter_paths_from_len: &[String]) {
let new_self = match *self {
ProcOutput::Full { ref mut bytes, ref mut excluded_len } => {
ProcOutput::Full { ref mut bytes, ref mut filtered_len } => {
let old_len = bytes.len();
bytes.extend_from_slice(data);
*filtered_len += data.len();

// We had problems in the past with tests failing only in some environments,
// due to the length of the base path pushing the output size over the limit.
Expand All @@ -58,21 +65,25 @@ impl ProcOutput {
// The compiler emitting only excluded strings is addressed by adding a
// placeholder size for each excluded segment, which will eventually reach
// the configured threshold.
for pattern in exclude_from_len {
let pattern_bytes = pattern.as_bytes();
// We start matching `pattern_bytes - 1` into the previously loaded data,
// to account for the fact a pattern might be included across multiple
// `extend` calls. Starting from `- 1` avoids double-counting patterns.
let matches = (&bytes[(old_len.saturating_sub(pattern_bytes.len() - 1))..])
.windows(pattern_bytes.len())
.filter(|window| window == &pattern_bytes)
for path in filter_paths_from_len {
let path_bytes = path.as_bytes();
// We start matching `path_bytes - 1` into the previously loaded data,
// to account for the fact a path_bytes might be included across multiple
// `extend` calls. Starting from `- 1` avoids double-counting paths.
let matches = (&bytes[(old_len.saturating_sub(path_bytes.len() - 1))..])
.windows(path_bytes.len())
.filter(|window| window == &path_bytes)
.count();
*excluded_len += matches as isize
* (EXCLUDED_PLACEHOLDER_LEN - pattern_bytes.len() as isize);
*filtered_len -= matches * path_bytes.len();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is going to run into some problems if we have overlapping matches (e.g., one of the patterns is /home/mark and the other is /home/mark/rust, then we'll subtract out twice).

I think src_base and build_base won't do that in practice so we're OK for now, but ideally we'd either (a) assert that it doesn't happen or (b) somehow fix it. I'm not quite sure what the best approach to fixing it would be -- trying to detect this kind of overlap seems annoying.


// We can't just remove the length of the filtered path from the output lenght,
// otherwise a compiler emitting only filtered paths would OOM compiletest. Add
// a fixed placeholder length for each path to prevent that.
*filtered_len += matches * FILTERED_PATHS_PLACEHOLDER_LEN;
}

let new_len = bytes.len();
if (new_len as isize + *excluded_len) as usize <= HEAD_LEN + TAIL_LEN {
if *filtered_len <= HEAD_LEN + TAIL_LEN {
return;
}

Expand Down
12 changes: 6 additions & 6 deletions src/tools/compiletest/src/runtest.rs
Original file line number Diff line number Diff line change
Expand Up @@ -1736,25 +1736,25 @@ impl<'test> TestCx<'test> {
}

fn read2_abbreviated(&self, child: Child) -> Output {
let mut exclude_from_len = Vec::new();
let mut filter_paths_from_len = Vec::new();
let mut add_path = |path: &Path| {
let path = path.display().to_string();
let windows = path.replace("\\", "\\\\");
if windows != path {
exclude_from_len.push(windows);
filter_paths_from_len.push(windows);
}
exclude_from_len.push(path);
filter_paths_from_len.push(path);
};

// List of strings that will not be measured when determining whether the output is larger
// List of paths that will not be measured when determining whether the output is larger
// than the output truncation threshold.
//
// Note: avoid adding a subdirectory of an already excluded directory here, otherwise the
// Note: avoid adding a subdirectory of an already filtered directory here, otherwise the
// same slice of text will be double counted and the truncation might not happen.
add_path(&self.config.src_base);
add_path(&self.config.build_base);

read2_abbreviated(child, &exclude_from_len).expect("failed to read output")
read2_abbreviated(child, &filter_paths_from_len).expect("failed to read output")
}

fn compose_and_run(
Expand Down