June 11, 2018
How to rebase a repo to a subtree and keep it in sync


I just ran into a little problem.
While working on a project that embeds some third-party code, I created a repository to manage that code, since its lead developer didn't provide one.
I imported the source directly at the root of that repository.
Later, the developer published a public repository with just one little difference: all the code lives in a subfolder.
I wanted to keep my repository in sync directly from his, but all the git mechanisms I found only work with a descendant subtree, so I wrote the following script:

#!/bin/bash
# this script will help you to rebase a repository to a subtree of it

# check for provided directory
if [ -z "$1" ]
then
    echo "Need the directory to get content rebased !"
    exit 1
fi
# strip a trailing slash from the directory name
DIR="${1%/}"
BRANCH=$(git rev-parse --abbrev-ref HEAD)
# target branch, named after the source branch
NEWBRANCH="new-$BRANCH-rebased"

# check provided directory exists at the root of the source branch
if ! git ls-tree -d --name-only "$BRANCH" | grep -qxF -- "$DIR"
then
    echo "$DIR does not seem to exist on branch : $BRANCH !"
    exit 1
fi

# check for existing target branch
if ! git show-ref --verify --quiet "refs/heads/$NEWBRANCH"
then
    HASH="$(git rev-parse "$BRANCH")"
    # checkout the target tree in a new branch
    git checkout -b "$NEWBRANCH" "$HASH"
else
    # rebase the branch
    # (recent Git versions dropped -p in favor of --rebase-merges)
    git checkout "$NEWBRANCH" && git rebase -p "$BRANCH"
fi

# check we are on the right branch
if [[ $(git rev-parse --abbrev-ref HEAD) == "$NEWBRANCH" ]]
then
    # in every commit, move the folder content to the root and drop the empty folder
    FILTER="find $DIR/  -maxdepth 1 -type f | xargs -I{} -e mv {} . ; mv $DIR/* .; rmdir $DIR"
    git filter-branch -f --prune-empty --tree-filter "$FILTER" -- && git gc --aggressive
fi

How to use

  1. copy & paste this code into a bash script (don't forget to make it executable).
  2. change directory to the cloned repository and check out the branch you want to use as the source
  3. run the script, passing the folder whose content should end up at the root of a new branch
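
The three steps above can be sketched as a short shell session. Everything named here is an assumption made for illustration: the script location in SCRIPT, the throwaway repository that stands in for your real clone, and the folder name "vendor". The script call is skipped when no file is found at that location, so the snippet can be tried safely.

```shell
#!/bin/bash
# Hypothetical names: script path, demo repository and "vendor" folder.
SCRIPT="${SCRIPT:-$HOME/bin/git-rebasetosubtree.sh}"

# stand-in for the cloned repository, with the code lying in a subfolder
REPO=$(mktemp -d)
git -C "$REPO" init -q
git -C "$REPO" config user.email "demo@example.com"
git -C "$REPO" config user.name "Demo"
mkdir "$REPO/vendor"
echo "hello" > "$REPO/vendor/lib.txt"
git -C "$REPO" add vendor
git -C "$REPO" commit -q -m "code in a subfolder"

# step 1: make the script executable
[ -f "$SCRIPT" ] && chmod +x "$SCRIPT"

# step 2: go to the repository and check out the source branch
cd "$REPO"
SOURCE=$(git rev-parse --abbrev-ref HEAD)
git checkout -q "$SOURCE"

# step 3: run the script with the folder to lift to the root
if [ -x "$SCRIPT" ]; then
    "$SCRIPT" vendor
    git log --oneline "new-$SOURCE-rebased"
else
    echo "script not found at $SCRIPT, skipping"
fi
```

On a real clone, only the chmod, the checkout and the script call are needed.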

What this script does

It creates a new tree based on the folder subtree and publishes it in a new branch named 'new-SOURCE_BRANCH-rebased', where SOURCE_BRANCH is the name of the branch you were on before running this script.
Then it rewrites all the original log information on that branch to keep the history consistent.

If you run this script a second time from the source branch, it will update the target branch.
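
To see the effect described above on a small scale, here is a minimal sketch on a throwaway repository. It does not use the --tree-filter from the script: it swaps in git's built-in --subdirectory-filter, which also makes a folder the new project root. The demo repository, the "vendor" folder and the file name are hypothetical; only the branch naming follows the script's convention.

```shell
#!/bin/bash
# Sketch only: lifts a subfolder to the root of a new branch with
# --subdirectory-filter (not the --tree-filter used by the script).
# The demo repository, "vendor" and lib.txt are hypothetical names.
set -e

DEMO=$(mktemp -d)
cd "$DEMO"
git init -q .
git config user.email "demo@example.com"
git config user.name "Demo"

# one commit with the code lying in a subfolder
mkdir vendor
echo "hello" > vendor/lib.txt
git add vendor
git commit -q -m "import code in a subfolder"

SOURCE=$(git rev-parse --abbrev-ref HEAD)
TARGET="new-$SOURCE-rebased"

# create the target branch, then rewrite only that branch
git checkout -q -b "$TARGET"
FILTER_BRANCH_SQUELCH_WARNING=1 git filter-branch -f --prune-empty \
    --subdirectory-filter vendor -- "$TARGET" > /dev/null 2>&1

# the source branch still has the folder, the target has its content at the root
git ls-tree --name-only "$SOURCE"
git ls-tree --name-only "$TARGET"
```

The source branch is left untouched, so it can keep following the upstream repository while the target branch carries the flattened layout.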


EDIT: the latest version of this script is available on GitHub here: git-rebasetosubtree.sh