Tuesday, September 30, 2014

Review entire git commit history of a file, for security issues, passwords, sensitive information, etc.

Say you've been working on a git repo privately and want to push it somewhere publicly viewable. How do you review not just the current working copy / HEAD for sensitive information, but the whole history of files that might contain details you don't want to make public? Something like this seems reasonably efficient:

git log --patch --reverse path/to/file.py \
  | grep '\(^+\|^commit\)' \
  | sed 's/^+//'
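Keeping only the lines that start with "+" or "commit" shows you just what each commit added to the file. Because --reverse lists commits oldest first, you start at the beginning of the history, so every line that ever existed in the file shows up as an addition at some point; the deletions (lines starting with "-") and context (everything else) don't matter.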
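One wrinkle with the filter above: each diff's "+++ b/path/to/file.py" header also starts with "+", so it lands in the output as "++ b/path/to/file.py". A minimal sketch of a variant that drops those headers might look like this. The function name added_lines is made up for illustration, and it assumes git's default "b/" path prefix:

```shell
# Hypothetical helper (not from the original post): read `git log --patch`
# output on stdin and print only commit headers and added lines.
added_lines() {
  # Keep commit headers and every line that starts with '+' ...
  grep -e '^commit ' -e '^+' \
    | grep -v -e '^+++ b/' -e '^+++ /dev/null' \
    | sed 's/^+//'
  # ... then drop the "+++" file headers and strip the leading '+'.
  # Caveat: an added line whose own text starts with "++ b/" is dropped too.
}
```

It would be used as: git log --patch --reverse path/to/file.py | added_lines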

Deleting the '+' at the beginning of each line means you can dump the output from the above into your favorite editor to get syntax highlighting, which perhaps makes it easier to read. You might need to use a temporary file with an appropriate extension, e.g.:

git log --patch --reverse path/to/file.py \
  | grep '\(^+\|^commit\)' \
  | sed 's/^commit/#commit/; s/^+//' \
  >delme.py
(Adding the # in front of the commit lines turns them into comments, which is nice with syntax highlighting in Python.)
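For files in other languages, the same idea can be wrapped in a small function. This is only a sketch under stated assumptions: the name patch_to_review_file, the "history" file name, and the default "#" comment marker are all invented here, and the marker must not contain "|", "&" or "\" because it is pasted into the sed expression.

```shell
# Hypothetical helper (not from the original post): read `git log --patch`
# output on stdin, write a reviewable copy to a temp file with extension $1,
# and print that file's name. $2 is the comment marker (assumed default: '#').
# The body runs in a subshell, so its variables stay local.
patch_to_review_file() (
  ext=$1
  marker=${2:-#}
  dir=$(mktemp -d) || exit 1
  out="$dir/history.$ext"
  # Comment out commit headers first, then strip the leading '+', so an added
  # line that itself starts with "commit" is not turned into a comment.
  grep -e '^+' -e '^commit ' \
    | sed "s|^commit |$marker commit |; s|^+||" \
    > "$out"
  printf '%s\n' "$out"
)
```

For example, git log --patch --reverse path/to/file.py | patch_to_review_file py prints the name of a .py file you can open in your editor; pass a second argument such as // for languages that use a different comment marker.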