Skip to content

new restriction lint: pointer_format #14792

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
Jun 1, 2025

Conversation

llogiq
Copy link
Contributor

@llogiq llogiq commented May 13, 2025

I read a blog post about kernel security, and how various features might get lost while porting to Rust. In kernel C, they have some guardrails against divulging pointers. An easy way to replicate that in Rust is a lint for pointer formatting. So that's what this lint does.


changelog: new [pointer_format] lint

@rustbot
Copy link
Collaborator

rustbot commented May 13, 2025

r? @samueltardieu

rustbot has assigned @samueltardieu.
They will have a look at your PR within the next two weeks and either review your PR or reassign to another reviewer.

Use r? to explicitly pick a reviewer

@rustbot rustbot added the S-waiting-on-review Status: Awaiting review from the assignee but also interested parties label May 13, 2025
@samueltardieu
Copy link
Contributor

Lint example needs to be fixed. FCP thread created on Zulip.

@rustbot label +S-final-comment-period

@rustbot rustbot added the S-final-comment-period Status: final comment period it will be merged unless new objections are raised (~1 week) label May 13, 2025
@tgross35
Copy link
Contributor

Is there any chance this could be adapted to check for any fmt trait being used on a pointer, rather than just the :p formatting? @nbdd0121 pointed out that a likely accident is printing via Debug, possibly through derives.

@llogiq
Copy link
Contributor Author

llogiq commented May 13, 2025

Do you mean debug-formatting a function pointer? That should be workable for the direct case but might become pretty hard for the general case.

@tgross35
Copy link
Contributor

I think a specific example would be something like this:

#[derive(Debug)]
pub struct S {
    pub p: *const u32,
}

pub fn foo(a: *const u32) {
    println!("{a:?}");
}

pub fn bar(a: S) {
    println!("{a:?}");
}

fn main() {
    foo(&0u32);
    bar(S{p: &0u32});
}

I'm not sure how this would be done in clippy since it would need to check whether a crate calls the Pointer / Debug implementations for raw pointers in the crate. If this isn't easy then checking directly in the format string still does help.

@llogiq
Copy link
Contributor Author

llogiq commented May 13, 2025

A thing we might do there is to walk the types of the arguments (as far as they derive Debug, we should not go into manual implementations) and see if there's any raw pointer or reference.

Comment on lines 202 to 205
/// In kernel context, this might be vulnerable to misuse for exfiltrating
/// stack or kernel function addresses.
Copy link
Contributor

@kpreid kpreid May 13, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

“…misuse for exfiltrating…” makes it sound like this is protection against attempts to insert malicious code into a project. Is auditing such code within scope for clippy? If not, it might be good to change the wording.

Also, “kernel context” isn’t very clear to people not familiar with what it’s talking about and sounds like it might be a formal or common Rust term, which it is not. How about something more like “In (projects|codebases|crates) such as operating system kernels,”?

@llogiq
Copy link
Contributor Author

llogiq commented May 16, 2025

Yeah, I'm working on some code to add that. I may not be able to look through manual Debug impls, though, so it's a best-effort thing at best.

@ojeda
Copy link
Contributor

ojeda commented May 18, 2025

Ideally, there would be a way to customize all pointer printing, i.e. including :p and :?, so that we can get them hashed as expected, and thus at that point we wouldn't need to lint for it. Meanwhile, this lint can be useful (and may be for other projects that do not want any printing whatsoever, too).

@samueltardieu samueltardieu added S-waiting-on-author Status: This is awaiting some action from the author. (Use `@rustbot ready` to update this status) and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties labels May 23, 2025
@llogiq
Copy link
Contributor Author

llogiq commented May 24, 2025

I'm currently trying to do that, but I hit a wall trying to find the Debug impl to check if it was derived.

@Jarcho
Copy link
Contributor

Jarcho commented May 24, 2025

TyCtxt::is_automatically_derived should return true on the impl's DefId. It's not necessarily std's derive, but it should work for an initial implementation.

@llogiq
Copy link
Contributor Author

llogiq commented May 24, 2025

Yeah, the problem is to get the impl's DefId.

@Jarcho
Copy link
Contributor

Jarcho commented May 24, 2025

Whoops. Slightly misread your comment. Instance::try_resolve with the Debug trait as the DefId and the expression's type as the generic arguments should get you that. You might need to use Debug::fmt instead, not sure.

@llogiq
Copy link
Contributor Author

llogiq commented May 25, 2025

Ok, thanks to @compiler-errors' help, I was able to fix the problem. I will need to rebase and also I should add a lot of caching, because we're currently giving trait lookup a real workout.

@rustbot

This comment has been minimized.

@llogiq
Copy link
Contributor Author

llogiq commented May 25, 2025

Now I at least cached the result of the Debug derive check. There may be more perf wins to be had by caching the result of the check and percolating it upward across the type tree, but that requires more bookkeeping, so I'm leaving it for later.

@llogiq
Copy link
Contributor Author

llogiq commented May 25, 2025

@rustbot ready

@rustbot rustbot added S-waiting-on-review Status: Awaiting review from the assignee but also interested parties and removed S-waiting-on-author Status: This is awaiting some action from the author. (Use `@rustbot ready` to update this status) labels May 25, 2025
@samueltardieu
Copy link
Contributor

Now I at least cached the result of the Debug derive check. There may be more perf wins to be had by caching the result of the check and percolating it upward across the type tree, but that requires more bookkeeping, so I'm leaving it for later.

Maybe has_derived_debug and has_pointer_debug could become compiler queries.

@llogiq llogiq force-pushed the pointer-format branch 2 times, most recently from 201883c to 9568bea Compare May 25, 2025 18:51
@llogiq
Copy link
Contributor Author

llogiq commented May 25, 2025

I normalized all types and added some simple caching of the pointer debug check (doesn't take in-between types into account yet, for that I would need to keep another stack).

@llogiq llogiq requested a review from Jarcho May 25, 2025 18:53
Comment on lines 635 to 638
ty::Ref(_, t, _) | ty::Slice(t) | ty::Array(t, _) => {
open.push(tcx.normalize_erasing_regions(typing_env, *t));
},
ty::Tuple(ts) => open.extend(ts.iter().map(|ty| tcx.normalize_erasing_regions(typing_env, ty))),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since the type is already normalized you don't need to normalize the contained types here. It's only ADT fields that normalization doesn't affect.

Copy link
Contributor

@Jarcho Jarcho left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think a recursive solution is better for this since it will allow you to stop as soon as you get a cache hit. Basically:

match cache.entry(ty) {
  Vacant(e) => e.insert(false),
  Occupied(res) => return res,
}
let res = calculate(ty);
cache.insert(res);
res

Right now your most common cache hit (false) still needs to walk the type.

@llogiq
Copy link
Contributor Author

llogiq commented May 25, 2025

Which cache hit do you mean? I have two caches: One is the cache that stores whether a type has derived Debug. The other is whether a debug format shows a pointer. For the latter, I only store the outermost type. Doing all types would require storing the indices of the "stack frames" in our open stack. With that, I could walk the stack, adding the result to the last element of each frame.

@Jarcho
Copy link
Contributor

Jarcho commented May 25, 2025

The cache for whether a pointer is printed.

@llogiq
Copy link
Contributor Author

llogiq commented May 26, 2025

Ok, I'll do the bookkeeping approach. I'd also like to note that I switched from the recursive variant to the loop because I got a stack overflow in dogfood, even with the visited set. Apparently there are some very deep types in the Rust compiler.

@Jarcho
Copy link
Contributor

Jarcho commented May 26, 2025

That's weird. You shouldn't be nesting anywhere near deep enough to do that.

If you weren't normalizing then that would probably be why. It's possible to end up with an infinite number of different projections that all resolve to the same type.

@llogiq
Copy link
Contributor Author

llogiq commented May 26, 2025

Ok, I reverted to a recursive solution.

@llogiq llogiq dismissed Jarcho’s stale review May 26, 2025 18:50

Implemented the suggestions

@y21
Copy link
Member

y21 commented May 26, 2025

I think there's still some stack overflows possible with truly infinitely recursive types (although that's not recursion at fault, it'd still lead to a hang with a loop), e.g. this crashes

#![warn(clippy::pointer_format)]
use core::marker::PhantomData;
use core::fmt::Debug;

trait Foo { type Assoc: Foo + Debug; }

#[derive(Debug)]
struct S<T: Foo + 'static>(&'static S<T::Assoc>, PhantomData<T>);

fn f<T: Foo + Debug + 'static>(s: &S<T>) {
    format!("{s:?}");
}

(Admittedly it's a weird/arbitrary repro, but the root issue is similar to a crash that was reported before: #13544)

Other lints have solved this using tcx.recursion_limit() and a depth parameter and just bailing out with false if recursing too deeply

Copy link
Contributor

@samueltardieu samueltardieu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you make sure to include the example in #14792 (comment) and use the tcx.recursion_limit() to stop the infinite looping?

@rustbot rustbot added S-waiting-on-author Status: This is awaiting some action from the author. (Use `@rustbot ready` to update this status) and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties labels May 28, 2025
@llogiq
Copy link
Contributor Author

llogiq commented May 29, 2025

Ok, the recursion limit is added.

@llogiq llogiq dismissed samueltardieu’s stale review May 29, 2025 08:23

applied requested changes

@samueltardieu
Copy link
Contributor

LGTM, just waiting for another team member which is not you or me to +1 the FCP.

@samueltardieu samueltardieu added this pull request to the merge queue Jun 1, 2025
Merged via the queue into rust-lang:master with commit 1652187 Jun 1, 2025
11 checks passed
@llogiq llogiq deleted the pointer-format branch June 1, 2025 13:05
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
S-final-comment-period Status: final comment period it will be merged unless new objections are raised (~1 week) S-waiting-on-author Status: This is awaiting some action from the author. (Use `@rustbot ready` to update this status)
Projects
None yet
Development

Successfully merging this pull request may close these issues.

8 participants