Centralised pre-commit/pre-push hook repo design covering the token leak root causes: unactivated hooksPath and insufficient regex coverage.
CircuitForge Hooks — Secret & PII Scanning Design
Date: 2026-03-07
Scope: All CircuitForge repos (Peregrine first; others on public release)
Status: Approved, ready for implementation
Problem
A live Forgejo API token was committed in docs/plans/2026-03-03-feedback-button-plan.md
and required emergency history scrubbing via git-filter-repo. Root causes:
- core.hooksPath was never configured, so the existing .githooks/pre-commit ran on zero commits
- The token format (FORGEJO_API_TOKEN=<hex>) matched none of the hook's three regexes
- No pre-push safety net existed
Solution
Centralised hook repo (circuitforge-hooks) shared across all products.
Each repo activates it with one command. The heavy lifting is delegated to
gitleaks — an actively-maintained binary with 150+ built-in secret patterns,
native Forgejo/Gitea token detection, and a clean allowlist system.
Repository Structure
/Library/Development/CircuitForge/circuitforge-hooks/
├── hooks/
│ ├── pre-commit # gitleaks --staged scan (fast, every commit)
│ ├── commit-msg # conventional commits enforcement
│ └── pre-push # gitleaks full-branch scan (safety net)
├── gitleaks.toml # shared base config
├── install.sh # wires core.hooksPath in the calling repo
├── tests/
│ └── test_hooks.sh # migrated + extended from Peregrine
└── README.md
Forgejo remote: git.opensourcesolarpunk.com/pyr0ball/circuitforge-hooks
Hook Behaviour
pre-commit
- Runs gitleaks protect --staged, which scans only the staged diff
- Sub-second on typical commits
- Blocks commit and prints redacted match on failure
- Merges per-repo .gitleaks.toml allowlist if present
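The behaviour above could be wired up roughly as follows. This is a minimal sketch, not the actual hook: the names CF_SHARED_CONFIG, resolve_config and scan_staged are illustrative, and it assumes the per-repo .gitleaks.toml extends the shared base, so passing it alone is enough.

```shell
#!/usr/bin/env bash
# Sketch of hooks/pre-commit. CF_SHARED_CONFIG, resolve_config and
# scan_staged are illustrative names, not taken from the real hook.

CF_SHARED_CONFIG="${CF_SHARED_CONFIG:-/Library/Development/CircuitForge/circuitforge-hooks/gitleaks.toml}"

# Print the config gitleaks should use: the per-repo .gitleaks.toml when the
# repo has one (assumed to extend the shared base), else the shared base.
resolve_config() {
    local repo_root="$1"
    if [ -f "$repo_root/.gitleaks.toml" ]; then
        printf '%s\n' "$repo_root/.gitleaks.toml"
    else
        printf '%s\n' "$CF_SHARED_CONFIG"
    fi
}

# Scan only the staged diff; a non-zero exit from gitleaks aborts the commit.
scan_staged() {
    local repo_root
    repo_root="$(git rev-parse --show-toplevel)" || return 1
    gitleaks protect --staged --redact --config "$(resolve_config "$repo_root")"
}
```

The hook file would end with a call to scan_staged. Newer gitleaks releases document gitleaks git --pre-commit --staged as the equivalent of protect --staged, so it is worth checking which form the installed version supports.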
pre-push
- Runs gitleaks git, which scans full branch history not yet on remote
- Catches anything committed with --no-verify or before hooks were wired
- Same config resolution as pre-commit
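One way to limit the scan to history the remote does not have yet is to build a git log range from the ref lines git passes to pre-push on stdin. The sketch below assumes that approach; push_range, is_zero_sha and scan_push are hypothetical names, and it relies on gitleaks forwarding --log-opts to git log.

```shell
#!/usr/bin/env bash
# Sketch of hooks/pre-push. push_range, is_zero_sha and scan_push are
# illustrative names, not taken from the real hook.

# Succeeds when the argument is git's all-zero object name (no such object).
is_zero_sha() {
    [[ "$1" =~ ^0+$ ]]
}

# Print the git log options selecting commits the remote does not have yet.
# Prints nothing when the ref is being deleted.
push_range() {
    local local_sha="$1" remote_sha="$2"
    if is_zero_sha "$local_sha"; then
        return 0                                    # deletion: nothing to scan
    elif is_zero_sha "$remote_sha"; then
        printf '%s\n' "$local_sha --not --remotes"  # new branch
    else
        printf '%s\n' "$remote_sha..$local_sha"     # existing branch
    fi
}

# Git feeds one line per pushed ref on stdin:
#   <local ref> <local sha> <remote ref> <remote sha>
scan_push() {
    local config="$1" local_ref local_sha remote_ref remote_sha range
    while read -r local_ref local_sha remote_ref remote_sha; do
        range="$(push_range "$local_sha" "$remote_sha")"
        [ -n "$range" ] || continue
        # stdin is redirected so gitleaks cannot consume the remaining ref lines
        gitleaks git --redact --config "$config" --log-opts="$range" </dev/null || return 1
    done
}
```

For a brand-new branch the remote side is all zeros, so the sketch falls back to "everything reachable from the local tip that no remote-tracking ref already contains".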
commit-msg
- Enforces conventional commits format (type(scope): subject)
- Migrated unchanged from peregrine/.githooks/commit-msg
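The migrated hook is not reproduced in this document, so the following only illustrates what a type(scope): subject check can look like. The type list, the optional scope and "!" marker, and the function names are all assumptions rather than the Peregrine hook's actual rules.

```shell
#!/usr/bin/env bash
# Sketch of a commit-msg check. The type list and function names are
# assumptions; the real hook is copied from peregrine/.githooks/commit-msg.

CF_COMMIT_TYPES='feat|fix|docs|style|refactor|perf|test|build|ci|chore|revert'

# Succeeds when the subject line looks like "type(scope): subject";
# the "(scope)" part and a "!" breaking-change marker are optional here.
is_conventional() {
    local subject="$1"
    local re='^('"$CF_COMMIT_TYPES"')(\([a-z0-9._/-]+\))?!?: .+'
    [[ "$subject" =~ $re ]]
}

# Git passes the path of the message file as the hook's first argument.
check_message_file() {
    local subject
    # Read the first line; tolerate a file without a trailing newline.
    IFS= read -r subject < "$1" || [ -n "$subject" ] || return 1
    if ! is_conventional "$subject"; then
        echo "commit-msg: expected 'type(scope): subject', got: $subject" >&2
        return 1
    fi
}
```

A hook built on this would end with check_message_file "$1", so a non-matching subject line aborts the commit.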
gitleaks Config
Shared base (circuitforge-hooks/gitleaks.toml)
title = "CircuitForge secret + PII scanner"
[extend]
useDefault = true # inherit all 150+ built-in rules
[[rules]]
id = "cf-generic-env-token"
description = "Generic KEY=<token> in env-style assignment"
regex = '''(?i)(token|secret|key|password|passwd|pwd|api_key)\s*[=:]\s*['\"]?[A-Za-z0-9\-_]{20,}['\"]?'''
[rules.allowlist]
regexes = ['api_key:\s*ollama', 'api_key:\s*any']
[[rules]]
id = "cf-phone-number"
description = "US phone number in source or config"
regex = '''\b(\+1[\s\-.]?)?\(?\d{3}\)?[\s\-.]?\d{3}[\s\-.]?\d{4}\b'''
[rules.allowlist]
regexes = ['555-\d{4}', '555\.\d{4}', '5550', '1234567890', '0000000000']
[[rules]]
id = "cf-personal-email"
description = "Personal email address in source/config (not .example files)"
regex = '''[a-zA-Z0-9._%+\-]+@(gmail|yahoo|icloud|hotmail|outlook|proton)\.(com|me)'''
[rules.allowlist]
paths = ['.*\.example$', '.*test.*', '.*docs/.*']
[allowlist]
description = "CircuitForge global allowlist"
paths = [
'.*\.example$',
'docs/reference/.*',
'gitleaks\.toml$',
]
regexes = [
'sk-abcdefghijklmnopqrstuvwxyz',
'your-forgejo-api-token-here',
]
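Since the original leak came down to a token format that no regex matched, it can help to sanity-check sample lines against a rule before relying on it. The helper below approximates cf-generic-env-token in POSIX ERE for a quick local check; it is not how gitleaks evaluates rules (gitleaks uses Go regex syntax and applies allowlists), and matches_env_token is a hypothetical name.

```shell
#!/usr/bin/env bash
# Approximation of the cf-generic-env-token rule in POSIX ERE. gitleaks
# itself uses Go regex syntax; matches_env_token is an illustrative helper
# and ignores the rule's allowlist.

CF_ENV_TOKEN_RE="(token|secret|key|password|passwd|pwd|api_key)[[:space:]]*[=:][[:space:]]*['\"]?[A-Za-z0-9_-]{20,}['\"]?"

# Succeeds when the line would be flagged by the approximated rule.
matches_env_token() {
    printf '%s\n' "$1" | grep -Eiq -- "$CF_ENV_TOKEN_RE"
}
```

For example, matches_env_token 'FORGEJO_API_TOKEN=0123456789abcdef0123456789abcdef01234567' succeeds, which is the shape of assignment the old hook missed, while matches_env_token 'timeout = 30' fails.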
Per-repo override (e.g. peregrine/.gitleaks.toml)
[extend]
path = "/Library/Development/CircuitForge/circuitforge-hooks/gitleaks.toml"
[allowlist]
regexes = [
'\d{10}\.html', # Craigslist listing IDs (10-digit, look like phone numbers)
]
Activation Per Repo
Each repo's setup.sh or manage.sh calls:
bash /Library/Development/CircuitForge/circuitforge-hooks/install.sh
install.sh does exactly one thing:
git config core.hooksPath /Library/Development/CircuitForge/circuitforge-hooks/hooks
For Heimdall live deploys (/devl/<repo>/), the same line goes in the deploy
script / post-receive hook.
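Because an unset core.hooksPath was the first root cause, a setup script may want to verify activation as well as perform it. The sketch below shows both steps; install_hooks, hooks_active and CF_HOOKS_DIR are illustrative names, and the verification step is an addition that the design above does not require.

```shell
#!/usr/bin/env bash
# Sketch of install.sh plus a verification helper. install_hooks,
# hooks_active and CF_HOOKS_DIR are illustrative names.

CF_HOOKS_DIR="${CF_HOOKS_DIR:-/Library/Development/CircuitForge/circuitforge-hooks/hooks}"

# Point the given repo (default: current directory) at the shared hooks.
install_hooks() {
    git -C "${1:-.}" config core.hooksPath "$CF_HOOKS_DIR"
}

# Succeeds only when the repo's core.hooksPath is set to the shared hooks.
hooks_active() {
    [ "$(git -C "${1:-.}" config --get core.hooksPath)" = "$CF_HOOKS_DIR" ]
}
```

A setup.sh could call install_hooks and then fail loudly if hooks_active does not succeed, so a repo never silently runs without hooks.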
Migration from Peregrine
- peregrine/.githooks/pre-commit → replaced by gitleaks wrapper
- peregrine/.githooks/commit-msg → copied verbatim to hooks repo
- peregrine/tests/test_hooks.sh → migrated and extended in hooks repo
- peregrine/.githooks/ directory → kept temporarily, then removed after cutover
Rollout Order
1. circuitforge-hooks repo: create, implement, test
2. peregrine: activate (highest priority, already public)
3. circuitforge-license (heimdall): activate before any public release
4. All subsequent repos: activate as part of their public-release checklist
Testing
tests/test_hooks.sh covers:
- Staged file with live-format token → blocked
- Staged file with phone number → blocked
- Staged file with personal email in source → blocked
- .example file with placeholders → allowed
- Craigslist URL with 10-digit ID → allowed (Peregrine allowlist)
- Valid conventional commit message → accepted
- Non-conventional commit message → rejected
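Each case above reduces to "run a hook, compare its exit status with the expected one". A small helper along these lines could drive the cases; expect_status and the counters are hypothetical names, not the ones in the real test_hooks.sh.

```shell
#!/usr/bin/env bash
# Sketch of an assertion helper for tests/test_hooks.sh. expect_status and
# the counters are illustrative names, not the real script's.

CF_PASSED=0
CF_FAILED=0

# Run a command and compare its exit status with the expected one.
# Usage: expect_status <label> <expected-status> <command> [args...]
expect_status() {
    local label="$1" expected="$2" actual=0
    shift 2
    "$@" >/dev/null 2>&1 || actual=$?
    if [ "$actual" -eq "$expected" ]; then
        CF_PASSED=$((CF_PASSED + 1))
    else
        CF_FAILED=$((CF_FAILED + 1))
        echo "FAIL: $label (expected exit $expected, got $actual)" >&2
    fi
}
```

A "blocked" case would stage a fixture file in a scratch repo and then call something like expect_status "token blocked" 1 hooks/pre-commit, assuming gitleaks keeps its default exit code of 1 when leaks are found.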
What This Does Not Cover
- Scanning existing history on new repos (run gitleaks git manually before making any repo public; add to the public-release checklist)
- CI/server-side enforcement (future: Forgejo Actions job on push to main)
- Binary files or encrypted secrets at rest