git - getting ALL previous versions of a specific file/folder

I want to retrieve all previous versions of a specific file in a git repository.

I see it is possible to get one specific version with the checkout command, but I want them all. And the git clone command with the depth option doesn't seem to let me clone a subfolder ("not valid repository name").

Do you know if it is possible and how?

Thank you

Answers


The OP wanted to retrieve all versions, but the other answers do not deliver that, especially if the file has hundreds of revisions (all the suggestions are too manual). The only half-working solution was proposed by @Tobias in the comments, but the suggested bash loop wrote files in random order and generated hundreds of empty files when used against our repos. One of the reasons is that "rev-list --all --objects" lists several kinds of objects (trees included, which are useless for our purpose).

I started with Tobias's solution, added counters, cleaned it up a bit, and ended up reinventing the wheel in the form of the bash script listed below.

The script will:

  • extract all file versions to /tmp/all_versions_exported
  • take 1 argument: the relative path to the file inside the git repo
  • give the result filenames a numeric prefix (sortable)
  • mention the inspected filename in the result files (to tell apples apart from oranges :)
  • mention the commit date in the result filename (see the output example below)
  • not create empty result files

cat /usr/local/bin/git_export_all_file_versions

#!/bin/bash

# we'll write all git versions of the file to this folder:
EXPORT_TO=/tmp/all_versions_exported

# take relative path to the file to inspect
GIT_PATH_TO_FILE=$1

# ---------------- don't edit below this line --------------

USAGE="Please cd to the root of your git project and specify the path to the file you wish to inspect (example: $0 some/path/to/file)"

# check if got argument
if [ "${GIT_PATH_TO_FILE}" == "" ]; then
    echo "error: no arguments given. ${USAGE}" >&2
    exit 1
fi

# check if file exists (quoted so that paths with spaces work)
if [ ! -f "${GIT_PATH_TO_FILE}" ]; then
    echo "error: File '${GIT_PATH_TO_FILE}' does not exist. ${USAGE}" >&2
    exit 1
fi

# extract just a filename from given relative path (will be used in result file names)
GIT_SHORT_FILENAME=$(basename "${GIT_PATH_TO_FILE}")

# create folder to store all revisions of the file
if [ ! -d "${EXPORT_TO}" ]; then
    echo "creating folder: ${EXPORT_TO}"
    mkdir -p "${EXPORT_TO}"
fi

## uncomment next line to clear export folder each time you run script
#rm ${EXPORT_TO}/*

# reset counter
COUNT=0

# iterate all revisions; rev-list --objects also lists trees and blobs, so
# entries without a 'Date:' line in the first three lines of `git show`
# output (trees, for example) are skipped
git rev-list --all --objects -- "${GIT_PATH_TO_FILE}" | \
    cut -d ' ' -f1 | \
while read -r h; do
     COUNT=$((COUNT + 1))
     COUNT_PRETTY=$(printf "%04d" "$COUNT")
     COMMIT_DATE=$(git show "$h" | head -3 | grep 'Date:' | awk '{print $4"-"$3"-"$6}')
     if [ "${COMMIT_DATE}" != "" ]; then
         git cat-file -p "${h}:${GIT_PATH_TO_FILE}" > "${EXPORT_TO}/${COUNT_PRETTY}.${COMMIT_DATE}.${h}.${GIT_SHORT_FILENAME}"
     fi
done

# return success code
echo "result stored to ${EXPORT_TO}"
exit 0

Usage example:

cd /home/myname/my-git-repo

git_export_all_file_versions docs/howto/readme.txt
    result stored to /tmp/all_versions_exported

ls /tmp/all_versions_exported
    0001.17-Oct-2016.ee0a1880ab815fd8f67bc4299780fc0b34f27b30.readme.txt
    0002.3-Oct-2016.d305158b94bedabb758ff1bb5e1ad74ed7ccd2c3.readme.txt
    0003.29-Sep-2016.7414a3de62529bfdd3cb1dd20ebc1a977793102f.readme.txt
    0004.28-Sep-2016.604cc0a34ec689606f7d3b2b5bbced1eece7483d.readme.txt
    0005.28-Sep-2016.198043c219c81d776c6d8a20e4f36bd6d8a57825.readme.txt
    0006.9-Sep-2016.5aea5191d4b86aec416b031cb84c2b78603a8b0f.readme.txt
    <and so on and on . . .>

edit: if you see errors like this:

fatal: Not a valid object name 3e93eba38b31b8b81905ceaa95eb47bbaed46494:readme.txt

it means you've started the script not from the root folder of your git project.
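
If running from the repository root is inconvenient, the path can be resolved against the root instead. The following is only a sketch, not part of the script above: show_version_from_cwd is a hypothetical helper name, and it assumes the file lives in or below the current directory.

```shell
# Hypothetical helper (not part of the original script): print one revision of
# a file, given a path relative to the current directory instead of the root.
show_version_from_cwd() {
    rev=$1
    rel_path=$2
    # --show-prefix prints the current directory relative to the repository
    # root: empty at the root, otherwise ending in a slash (e.g. "docs/").
    prefix=$(git rev-parse --show-prefix) || return 1
    git cat-file -p "${rev}:${prefix}${rel_path}"
}
```

For example, from inside docs/howto, calling show_version_from_cwd HEAD readme.txt would look up HEAD:docs/howto/readme.txt.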


The script provided by Dmitry does actually solve the problem, but it had a few issues that led me to adapt it to be more suitable for my needs. Specifically:

  1. The use of git show broke because of my default date-format settings.
  2. I wanted the results sorted in date order, not reverse-date order.
  3. I wanted to be able to run it against a file that had been deleted from the repo.
  4. I didn't want all revisions on all branches; I just wanted the revisions reachable from HEAD.
  5. I wanted it to error if it wasn't in a git repo.
  6. I didn't want to have to edit the script to adjust certain options.
  7. The way it worked was inefficient.
  8. I didn't need the numbering in the output filenames. (A suitably-formatted date serves the same purpose.)
  9. I wanted safer handling of paths with spaces.

You can see the latest version of my modifications in my GitHub repo; here is the version as of this writing:

#!/bin/sh

# based on script provided by Dmitry Shevkoplyas at http://stackoverflow.com/questions/12850030/git-getting-all-previous-version-of-a-specific-file-folder

set -e

if ! git rev-parse --show-toplevel >/dev/null 2>&1 ; then
    echo "Error: you must run this from within a git working directory" >&2
    exit 1
fi

if [ "$#" -lt 1 ] || [ "$#" -gt 2 ]; then
    echo "Usage: $0 <relative path to file> [<output directory>]" >&2
    exit 2
fi

FILE_PATH="$1"

EXPORT_TO=/tmp/all_versions_exported
if [ -n "$2" ]; then
    EXPORT_TO="$2"
fi

FILE_NAME="$(basename "$FILE_PATH")"

if [ ! -d "$EXPORT_TO" ]; then
    echo "Creating directory '$EXPORT_TO'"
    mkdir -p "$EXPORT_TO"
fi

echo "Writing files to '$EXPORT_TO'"
# "--" separates the path from revisions, so a file that has been deleted
# from the working tree is still accepted as a path
git log --diff-filter=d --date-order --reverse --format="%ad %H" --date=iso-strict -- "$FILE_PATH" | \
    while read -r COMMIT_DATE COMMIT_SHA; do
        printf '.'
        git cat-file -p "$COMMIT_SHA:$FILE_PATH" > "$EXPORT_TO/$COMMIT_DATE.$COMMIT_SHA.$FILE_NAME"
    done
echo

exit 0

An example of the output:

$ git_export_all_file_versions bin/git_export_all_file_versions /tmp/stackoverflow/demo
Creating directory '/tmp/stackoverflow/demo'
Writing files to '/tmp/stackoverflow/demo'
...

$ ls -1 /tmp/stackoverflow/demo/
2017-05-02T15:52:52-04:00.c72640ed968885c3cc86812a2e1aabfbc2bc3b2a.git_export_all_file_versions
2017-05-02T16:58:56-04:00.bbbcff388d6f75572089964e3dc8d65a3bdf7817.git_export_all_file_versions
2017-05-02T17:05:50-04:00.67cbdeab97cd62813cec58d8e16d7c386c7dae86.git_export_all_file_versions

git rev-list --all --objects -- path/to/file.txt

lists the commits that touch the given path, along with the tree and blob objects for that path.

To get a specific version of a file:

git cat-file -p commitid:path/to/file.txt

commitid can be any of:

  • symbolic ref (branch, tag names; remote too)
  • a commit hash
  • a revision spec like HEAD~3, branch1@{4} etc.
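
As a rough sketch of how the command above could be wrapped for scripting: print_version is a hypothetical helper name, and the check with git cat-file -e (which only tests that the object exists) is an addition of this sketch, not something the answer itself proposes.

```shell
# Hypothetical helper: print the file as it was at the given revision, or
# report on stderr that the revision does not contain it.
print_version() {
    rev=$1
    path=$2
    # cat-file -e exits non-zero when "<rev>:<path>" does not name an object.
    if git cat-file -e "${rev}:${path}" 2>/dev/null; then
        git cat-file -p "${rev}:${path}"
    else
        echo "no '${path}' at revision '${rev}'" >&2
        return 1
    fi
}
```

Any of the revision forms listed above should work as the first argument, for example print_version HEAD~3 path/to/file.txt.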

Sometimes old versions of a file are only available through git reflog. I recently had a situation where I needed to dig through all the commits, even ones that were no longer part of the log because of an accidental overwriting during interactive rebasing.

I wrote this Ruby script to output all the previous versions of the file to find the orphaned commit. It was easy enough to grep the output of this to track down my missing file. Hope it helps someone.

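#!/usr/bin/env ruby
# set this to the path of the file, relative to the repository root
path_to_file = ""
`git reflog`.split("\n").each do |log|
  # the first field of each reflog line is the abbreviated commit hash
  puts commit = log.split(" ").first
  puts `git show #{commit}:#{path_to_file}`
  puts
end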

The same thing could be done with git log.
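
A minimal shell sketch of that idea, assuming git log -g (which walks the reflog) is an acceptable stand-in for git reflog. The function name dump_reflog_versions is made up for illustration, and reflog entries whose commit does not contain the file are skipped silently.

```shell
# Hypothetical helper: print every version of a file reachable from the HEAD
# reflog, including commits orphaned by a rebase or reset.
dump_reflog_versions() {
    path=$1
    # -g walks the reflog instead of the commit ancestry; awk drops repeated
    # hashes while keeping the reflog order.
    git log -g --format=%H | awk '!seen[$0]++' | while read -r commit; do
        if git cat-file -e "${commit}:${path}" 2>/dev/null; then
            echo "== ${commit}"
            git cat-file -p "${commit}:${path}"
            echo
        fi
    done
}
```

The output can then be searched with grep, as described above, to find the commit that still holds the missing content.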


All the versions of a file are already in the git repo when you git clone it. You can create branches associated with the checkout of a particular commit:

git checkout -b branchname {commit#}

This might suffice for a quick and dirty manual comparison of changes:

  • Check out the branches
  • Copy the file to an editor buffer

This might be OK if you only have a few versions to be concerned with and don't mind a bit of manual work, albeit with built-in git commands.
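
For a comparison between just two versions, a sketch like the following may avoid switching branches at all. It swaps the checkout-and-copy approach for git diff restricted to one path; compare_versions is a hypothetical name.

```shell
# Hypothetical helper: show how one file differs between two revisions,
# without checking either of them out.
compare_versions() {
    old_rev=$1
    new_rev=$2
    path=$3
    # "--" separates the revisions from the path being compared.
    git diff "${old_rev}" "${new_rev}" -- "${path}"
}
```

For example, compare_versions HEAD~3 HEAD path/to/file.txt prints a unified diff of that one file.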

For scripted solutions, see the other answers.

