-
-
Notifications
You must be signed in to change notification settings - Fork 5.8k
add push secrets detector #34226
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
add push secrets detector #34226
Conversation
use git diff based one
fix linting
I think this is the right place to detect. But the detection may take much time. It should have a queue to handle them and not return the detection result immediately. |
The entire point of the feature is push rejection if the change contains secrets and if that gets pushed to queue to do later it nullfies the point of it being a pre-recieve hook. |
@wxiaoguang Mind also taking a look when you have a moment? You usually have good ideas and insights about things that would be better to do before a change/feature and a more native way to do it. btw. I'm aware of the fault in the diffing when refs are added removed - the end behavior should be:
I'm in progress of figuring out how to do 2, without scanning the entire history if not required. |
What if I'd like to store secret-like data in git repo? For example: I have a repo that contains encrypted passwords (of course, the master key is managed separately, not in the repo). |
That's why it would be off by default which is in TODOs. If you'd like to keep it on and store secrets, you'd have to add them to allowlist in config. I don't think gitleaks will classify encrypted passwords as keys but that's a good case to test for. |
Oh I see the "repo config" TODO now. Maybe 2 approaches:
I didn't know how other forges do so I can't tell which one is better .....
TBH I am not quite familiar with the "git hook" related code, can't really tell the differences between these cases (how the result would be affected) .... or maybe make it configurable to let repo admin decide? Or the same question as above: how do other forges do? |
One example of similar hook I found is this: https://github.com/github/platform-samples/blob/master/pre-receive-hooks/block_confidentials.sh As to how others do it: Github doesn't specify much but notes that it logs bypasses in security tab https://docs.github.com/en/code-security/secret-scanning/introduction/about-push-protection As for more details on how they allow configs: Mostly via web UI updates. I'd argue that making the ruleset part of the repository is better as it has benefits of history tracking, messages why was the rule added and usual git benefits. |
I am neutral about these, while I think I could understand why "mostly via web UI updates": it is more friendly to end users. Not everyone is git expert or could commit the ruleset file into a git repo easily and debug it. (Hmm, TBH, I have a little more preference for "web UI update") |
Honestly one doesn't exclude the other. This is a git server after all so we can have the config displayed in UI but stored in repo (gerrit like).
Never mind - I was printing newCommitId instead of the commit from finding and assumed it was an issue. It's not. |
add finding base for unknown starting point add bypass for pushing
Okay I think the PR by now it's out of it's PoC phase and into "good enough for experimental releases" so if someone wants to try and mess with the feature please do. I think the approach of blocking until the scan completes is correct. I've pushed linux 0af2f6be1b42..6fea5fabd332 Apart of UI issues, error handling in the function and repo level feature flag, is there anything I'm missing which should be added/addressed in v1 of this feature? There are also tests to make but currently I don't have a clue how to approach that. I was thinking about redoing hooks more in gitaly direction (grpc streaming stdout,stderr) but that's an idea for later. |
How much overhead does token scanning add during a git push? |
base the attempt on a change between default branch and new reference
I've tested pushing linux master to new repo (so my worst case scenario - a lot of commits which are not in default branch) - found two bugs, one with gitea UI other in my implementation. One thing that could be optimized is to avoid storing the output and parsing it on the fly though I had no luck getting that working so far. Or storing it in a temp file and letting git write to it while parser reads it in the background (how well that would work I have no idea). The diff parser runs in a go routine and writes to a channel and gitleaks consumes that channel so ideally it would be about even but the issue is Linux repo also might not be the best benchmark for this due to the size, but I think it's worth trying in the unlikely event this will be somehow used during migrations. Ideas welcome because I'm currently out - on |
adds a step to pre-commit hook which scans commit diffs for secrets
Very WIP/PoC, more looking for a feedback on the approach.
TODO: