-
Notifications
You must be signed in to change notification settings - Fork 1.7k
Improve speed of cargo dev fmt
#14862
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I like the speed improvement.
I am a bit concerned that some of the Clippy maintainers or some of the users might use stray Rust files while working on Clippy, or test directories under the Clippy root. I wonder if it we should be more conservative and start from known roots.
clippy_dev/src/fmt.rs
Outdated
.as_os_str() | ||
.as_encoded_bytes() | ||
.get(2..) | ||
.is_none_or(|x| x != "target".as_bytes() && x != ".git".as_bytes()) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
At least one Clippy maintainer uses .jj
which will have the same issue as .git
:
.is_none_or(|x| x != "target".as_bytes() && x != ".git".as_bytes()) | |
.is_none_or(|x| !matches!(x, b"target" | b".git" | b".jj")) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not really a problem, but I agree it should be skipped as well.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If this is not a colocated git repo, all the git files will be under .jj/repo/store/git
, which would not be filtered away, hence the importance of including it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah. That is indeed a problem.
clippy_dev/src/fmt.rs
Outdated
) | ||
.map(|mut cmd| match cmd.spawn() { | ||
Ok(x) => x, | ||
Err(ref e) => panic_action(&e, ErrAction::Run, "rustfmt".as_ref()), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It would be more ergonomic to have panic_action()
take a impl AsRef<Path>
(and call .as_ref()
in it) and to use &str
directly:
Err(ref e) => panic_action(&e, ErrAction::Run, "rustfmt".as_ref()), | |
Err(e) => panic_action(&e, ErrAction::Run, "rustfmt"), |
clippy_dev/src/fmt.rs
Outdated
@@ -353,25 +330,17 @@ fn run_rustfmt(clippy: &ClippyInfo, update_mode: UpdateMode) { | |||
{ | |||
eprintln!("{s}"); | |||
} | |||
panic_action(&e, ErrAction::Run, name.as_ref()); | |||
panic_action(&e, ErrAction::Run, "rustfmt".as_ref()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
panic_action(&e, ErrAction::Run, "rustfmt".as_ref()); | |
panic_action(&e, ErrAction::Run, "rustfmt"); |
clippy_dev/src/fmt.rs
Outdated
}, | ||
}, | ||
Err(ref e) => panic_action(e, ErrAction::Run, name.as_ref()), | ||
Err(ref e) => panic_action(e, ErrAction::Run, "rustfmt".as_ref()), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Err(ref e) => panic_action(e, ErrAction::Run, "rustfmt".as_ref()), | |
Err(ref e) => panic_action(e, ErrAction::Run, "rustfmt"), |
for arg in self.args.by_ref().take(self.batch_size) { | ||
cmd.arg(arg.as_ref()); | ||
cmd_len += arg.as_ref().len(); | ||
cmd_len += 8; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why 8? Maybe add a small comment ("arbitrary safety gap"?)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Eight is for unix based things since the pointers in argv
are part of the limit IIRC. Doesn't really matter since Windows has a way lower limit.
Windows is unfortunately not consistent here since it's one space plus whatever is needed to escape spaces within arguments. I can add a comment about all this mess.
clippy_dev/src/utils.rs
Outdated
cmd_len += arg.as_ref().len(); | ||
cmd_len += 8; | ||
|
||
// Windows has a command length limit of 32767; stop before we hit that. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
// Windows has a command length limit of 32767; stop before we hit that. | |
// Windows has a command length limit of 32767; stop before we hit that. | |
// Unix-like systems typically have at least 256k bytes (`getconf ARG_MAX`) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Unix is actually more complicated than that, but it will almost always be larger than Windows (except cygwin because Windows).
clippy_dev/src/utils.rs
Outdated
|
||
// Windows has a command length limit of 32767; stop before we hit that. | ||
if cmd_len > 30000 { | ||
self.batch_size = (self.args.len() / self.thread_count).max(32); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You should probably use .div_ceil()
here (see below), and import min_batch_size
into the struct instead of using the hardcoded 32.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
div_ceil
is correct.
clippy_dev/src/utils.rs
Outdated
if len != 0 { | ||
run_cmd(&mut cmd); | ||
let thread_count = thread::available_parallelism().map_or(1, NonZero::get); | ||
let batch_size = (args.len() / thread_count).max(min_batch_size); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think you want to use div_ceil()
for this kind of operations, you want you have 2 threads and 65 files for a min batch size of 32, you would want batches of 33+32, not 32+32+1.
let batch_size = (args.len() / thread_count).max(min_batch_size); | |
let batch_size = args.len().div_ceil(thread_count).max(min_batch_size); |
This is only a problem if people keep extra |
All dot files and target directories are skipped now. Should be good enough to avoid any issues. |
I seem to remember having seen recently in a Zulip thread people saying that they have some I wonder if matching the beginning of the walkdir path against a list of expected prefixes would have any real impact performance (using, e.g., a radix-tree, or even an anchored Aho-Corasick but this might be overkill). |
After thinking more about this, the worst that can happen is that people will get their temporary test code reformatted. That may be for the best. |
This stops using
cargo fmt
and instead calls rustfmt directly with the list of all files.All
cargo fmt
does is find the crate roots and passes the edition fromcargo.toml
. Since the edition is set inrustfmt.toml
for the test files and we're already iterating through all the files this is not needed.--skip-children
is used since we already pass all the files, so the automatic detection isn't buying us anything other than running slower.Second commit(part of the first commit now) is a change to only use theignore
option inrustfmt.toml
rather than having a way incargo dev fmt
to ignore files.r? @samueltardieu
changelog: none