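Rewrite history with git filter-branch to create / split into submodules / subprojects

Time: 2012-11-16 22:17:59

Tags: git rewrite git-submodules git-filter-branch subproject

I am currently importing a CVS project into git. After the import, I want to rewrite the history so that an existing directory is moved into a separate submodule.

Suppose I have a structure like this: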

file1
file2
file3
dir1
dir2
library

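Now I want to rewrite the history so that the directory library is always a git submodule. In other words, split the specified directories into their own submodules/subprojects.

This is my current code:

File rewrite-submodule: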

cd project
git filter-branch --tree-filter $PWD/../$0-tree-filter --tag-name-filter cat -- --all

File rewrite-submodule-tree-filter:

    #!/bin/bash

    function gitCommit()
    {
        unset GIT_DIR
        unset GIT_WORK_TREE
        git add -A
        if [ -n "$(git diff --cached --name-only)" ]
        then
            # something to commit
            git commit -F $_msg
        fi
    }

    _git_dir=$GIT_DIR
    _git_work_tree=$GIT_WORK_TREE
    unset GIT_DIR
    unset GIT_WORK_TREE
    _dir=$PWD

    if [ -d "library" ]
    then
        _msg=$(tempfile)
        git log ${GIT_COMMIT}^! --format="%B" > $_msg
        git rm -r --cached library
        cd library
        if [ -d ".git" ]
        then
            gitCommit
        else
            git init
            gitCommit
        fi
        cd ..
        export GIT_DIR=$_git_dir
        export GIT_WORK_TREE=$_git_work_tree
        git submodule add -f ./library
    fi

    GIT_DIR=$_git_dir
    GIT_WORK_TREE=$_git_work_tree
    

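This code creates the .gitmodules file, but it does not create the submodule commit entry in the main repository (the line Subproject commit <sha1-hash> in the output of git diff). The files in the directory library are still versioned in the main repository rather than in the subproject repository.

Thanks in advance for any hints.

The .gitmodules file looks like this: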
    

    [submodule "library"]
        path = library
        url = ./library
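
For reference, a real submodule is recorded in a commit as a gitlink entry with mode 160000, while a normally tracked directory is a tree with mode 040000. The following is a minimal sketch, on a throwaway repository, of how to check which of the two a rewrite produced. The helper name entry_kind and the demo paths are made up for illustration and are not part of the scripts in this thread.

```shell
#!/bin/bash
set -eu

# Hypothetical helper: report how <path> is recorded in <rev>.
entry_kind() {
    local mode
    mode=$(git ls-tree "$1" "$2" | cut -d' ' -f1)
    case "$mode" in
        160000) echo submodule ;;   # gitlink: the "Subproject commit" entry
        040000) echo directory ;;   # ordinary tree, files live in the parent repo
        "")     echo missing ;;
        *)      echo file ;;
    esac
}

# Throwaway repository with library/ tracked as plain files.
demo=$(mktemp -d)
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
git init -q "$demo/project"
cd "$demo/project"
mkdir library
echo "int x;" > library/lib.c
git add library
git commit -q -m "library as plain files"
before=$(entry_kind HEAD library)

# Record the same content as a gitlink instead: make a commit from the
# subtree, then swap the index entry for a mode-160000 pointer to it.
sub=$(git commit-tree -m "library import" "HEAD:library")
git rm -rq --cached library
git update-index --add --cacheinfo "160000,$sub,library"
git commit -q -m "library as submodule entry"
after=$(entry_kind HEAD library)

echo "before: $before, after: $after"
```

Running entry_kind against a few rewritten commits is a quick way to see whether the filter actually replaced the directory or only added .gitmodules.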
    

4 Answers:

Answer 0 (score: 2)

I solved my own problem; here is the solution:

git-submodule-split library another_library

The script git-submodule-split:

    #!/bin/bash

    set -eu

    if [ $# -eq 0 ]
    then
        echo "Usage: $0 submodules-to-split" >&2
        exit 1
    fi

    export _tmp=$(mktemp -d)
    export _libs="$@"
    for i in $_libs
    do
        mkdir -p $_tmp/$i
    done

    git filter-branch --commit-filter '
    function gitCommit()
    {
        git add -A
        if [ -n "$(git diff --cached --name-only)" ]
        then
            git commit -F $_msg
        fi
    } >/dev/null

    # from git-filter-branch
    git checkout-index -f -u -a || die "Could not checkout the index"
    # files that $commit removed are now still in the working tree;
    # remove them, else they would be added again
    git clean -d -q -f -x

    _git_dir=$GIT_DIR
    _git_work_tree=$GIT_WORK_TREE
    _git_index_file=$GIT_INDEX_FILE
    unset GIT_DIR
    unset GIT_WORK_TREE
    unset GIT_INDEX_FILE

    _msg=$(mktemp)
    cat /dev/stdin > $_msg
    for i in $_libs
    do
        if [ -d "$i" ]
        then
            unset GIT_DIR
            unset GIT_WORK_TREE
            unset GIT_INDEX_FILE
            cd $i
            if [ -d ".git" ]
            then
                gitCommit
            else
                git init >/dev/null
                gitCommit
            fi
            cd ..
            rsync -a -rtu $i/.git/ $_tmp/$i/.git/
            export GIT_DIR=$_git_dir
            export GIT_WORK_TREE=$_git_work_tree
            export GIT_INDEX_FILE=$_git_index_file
            git rm -q -r --cached $i
            git submodule add ./$i >/dev/null
            git add $i
        fi
    done
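    export GIT_DIR=$_git_dir
    export GIT_WORK_TREE=$_git_work_tree
    export GIT_INDEX_FILE=$_git_index_file

    if [ -f ".gitmodules" ]
    then
        git add .gitmodules
    fi

    _new_rev=$(git write-tree)
    shift
    # stdin was already consumed into $_msg above, so feed the saved
    # message back to commit-tree, then remove the temp file
    git commit-tree "$_new_rev" "$@" < "$_msg"
    rm -f "$_msg"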
    ' --tag-name-filter cat -- --all

    for i in $_libs
    do
        if [ -d "$_tmp/$i/.git" ]
        then
            rsync -a -i -rtu $_tmp/$i/.git/ $i/.git/
            cd $i
            git reset --hard
            cd ..
        fi
    done
    rm -r $_tmp

    git for-each-ref refs/original --format="%(refname)" | while read i; do git update-ref -d $i; done

    git reflog expire --expire=now --all
    git gc --aggressive --prune=now
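
The last three commands of the script make the rewrite irreversible, so it may help to see what they remove. The sketch below is only an illustration on a throwaway repository: it runs a trivial message rewrite (not the submodule split), shows that git filter-branch keeps the old branch tips under refs/original, and then deletes them with the same loop as above. The repository path and the message filter are assumptions made up for the demo.

```shell
#!/bin/bash
set -eu

demo=$(mktemp -d)
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
export FILTER_BRANCH_SQUELCH_WARNING=1   # newer git: skip the warning pause

git init -q "$demo/repo"
cd "$demo/repo"
echo one > file
git add file
git commit -q -m "first"
old_head=$(git rev-parse HEAD)

# Any rewrite will do for the demo; this one only edits commit messages.
git filter-branch --msg-filter 'sed "s/^/rewritten: /"' -- --all >/dev/null 2>&1
new_head=$(git rev-parse HEAD)

# filter-branch keeps the pre-rewrite tips under refs/original/.
backups_before=$(git for-each-ref --format="%(refname)" refs/original | wc -l | tr -d ' ')
backup_target=$(git for-each-ref --format="%(objectname)" refs/original)

# Same loop as in the script above: drop every backup ref.
git for-each-ref --format="%(refname)" refs/original |
while read -r ref; do git update-ref -d "$ref"; done
backups_after=$(git for-each-ref --format="%(refname)" refs/original | wc -l | tr -d ' ')

echo "backups before: $backups_before, after: $backups_after"
```

Until the backup refs, the reflog and the unreachable objects are gone, the old history can still be recovered, which is a good reason to run the whole script on a fresh clone first.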

    

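Answer 1 (score: 1)

I had a project with a utils library that started to be useful in other projects, and I wanted to split its history off into a submodule. I didn't think to look on SO first, so I wrote my own. It builds the history locally, so it is a good deal faster. Afterwards, if you want, you can set up the git submodule helper command's .gitmodules file and so on, and push the submodule history itself anywhere you like.

The stripped command itself is here; the documentation is in the comments of the unstripped version that follows it. Run it as its own command with subdir set, for example subdir=utils git split-submodule if you are splitting the utils directory. It is hacky because it was a one-off, but I tested it on the Documentation subdirectory of the Git history.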

#!/bin/bash
# put this or the commented version below in e.g. ~/bin/git-split-submodule
${GIT_COMMIT-exec git filter-branch --index-filter "subdir=$subdir; ${debug+debug=$debug;} $(sed 1,/SNIP/d "$0")" "$@"}
${debug+set -x}
fam=(`git rev-list --no-walk --parents $GIT_COMMIT`)
pathcheck=(`printf "%s:$subdir\\n" ${fam[@]} \
    | git cat-file --batch-check='%(objectname)' | uniq`)
[[ $pathcheck = *:* ]] || {
    subfam=($( set -- ${fam[@]}; shift;
        for par; do tpar=`map $par`; [[ $tpar != $par ]] &&
            git rev-parse -q --verify $tpar:"$subdir"
        done
    ))
    git rm -rq --cached --ignore-unmatch  "$subdir"
    if (( ${#pathcheck[@]} == 1 && ${#fam[@]} > 1 && ${#subfam[@]} > 0)); then
        git update-index --add --cacheinfo 160000,$subfam,"$subdir"
    else
        subnew=`git cat-file -p $GIT_COMMIT | sed 1,/^$/d \
            | git commit-tree $GIT_COMMIT:"$subdir" $(
                ${subfam:+printf ' -p %s' ${subfam[@]}}) 2>&-
            ` &&
        git update-index --add --cacheinfo 160000,$subnew,"$subdir"
    fi
}
${debug+set +x}

#!/bin/bash
# Git filter-branch to split a subdirectory into a submodule history.

# In each commit, the subdirectory tree is replaced in the index with an
# appropriate submodule commit.
# * If the subdirectory tree has changed from any parent, or there are
#   no parents, a new submodule commit is made for the subdirectory (with
#   the current commit's message, which should presumably say something
#   about the change). The new submodule commit's parents are the
#   submodule commits in any rewrites of the current commit's parents.
# * Otherwise, the submodule commit is copied from a parent.

# Since the new history includes references to the new submodule
# history, the new submodule history isn't dangling, it's incorporated.
# Branches for any part of it can be made casually and pushed into any
# other repo as desired, so hooking up the `git submodule` helper
# command's conveniences is easy, e.g.
#     subdir=utils git split-submodule master
#     git branch utils $(git rev-parse master:utils)
#     git clone -sb utils . ../utilsrepo
# and you can then submodule add from there in other repos, but really,
# for small utility libraries and such, just fetching the submodule
# histories into your own repo is easiest. Setup on cloning a
# project using "incorporated" submodules like this is:
#   setup:  utils/.git
#
#   utils/.git:
#       @if _=`git rev-parse -q --verify utils`; then \
#           git config submodule.utils.active true \
#           && git config submodule.utils.url "`pwd -P`" \
#           && git clone -s . utils -nb utils \
#           && git submodule absorbgitdirs utils \
#           && git -C utils checkout $$(git rev-parse :utils); \
#       fi
# with `git config -f .gitmodules submodule.utils.path utils` and
# `git config -f .gitmodules submodule.utils.url ./`; cloners don't
# have to do anything but `make setup`, and `setup` should be a prereq
# on most things anyway.

# You can test that a commit and its rewrite put the same tree in the
# same place with this function:
# testit ()
# {
#     tree=($(git rev-parse `git rev-parse $1`: refs/original/refs/heads/$1));
#     echo $tree `test $tree != ${tree[1]} && echo ${tree[1]}`
# }
# so e.g. `testit make~95^2:t` will print the `t` tree there and if
# the `t` tree at ~95^2 from the original differs it'll print that too.

# To run it, say `subdir=path/to/it git split-submodule` with whatever
# filter-branch args you want.

# $GIT_COMMIT is set if we're already in filter-branch, if not, get there:
${GIT_COMMIT-exec git filter-branch --index-filter "subdir=$subdir; ${debug+debug=$debug;} $(sed 1,/SNIP/d "$0")" "$@"}

${debug+set -x}
fam=(`git rev-list --no-walk --parents $GIT_COMMIT`)
pathcheck=(`printf "%s:$subdir\\n" ${fam[@]} \
    | git cat-file --batch-check='%(objectname)' | uniq`)

[[ $pathcheck = *:* ]] || {
    subfam=($( set -- ${fam[@]}; shift;
        for par; do tpar=`map $par`; [[ $tpar != $par ]] &&
            git rev-parse -q --verify $tpar:"$subdir"
        done
    ))

    git rm -rq --cached --ignore-unmatch  "$subdir"
    if (( ${#pathcheck[@]} == 1 && ${#fam[@]} > 1 && ${#subfam[@]} > 0)); then
        # one id same for all entries, copy mapped mom's submod commit
        git update-index --add --cacheinfo 160000,$subfam,"$subdir"
    else
        # no mapped parents or something changed somewhere, make new
        # submod commit for current subdir content.  The new submod
        # commit has all mapped parents' submodule commits as parents:
        subnew=`git cat-file -p $GIT_COMMIT | sed 1,/^$/d \
            | git commit-tree $GIT_COMMIT:"$subdir" $(
                ${subfam:+printf ' -p %s' ${subfam[@]}}) 2>&-
            ` &&
        git update-index --add --cacheinfo 160000,$subnew,"$subdir"
    fi
}
${debug+set +x}
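
The core idea, making a submodule commit straight from a subdirectory tree with git commit-tree and chaining it onto the previous one, can be sketched for the simple case of a linear history. This is not the script above: it ignores merges, does not rewrite the parent history, and the names utils-history, prev_commit and prev_tree are made up for the illustration.

```shell
#!/bin/bash
set -eu

# Throwaway repository: three commits, two of which change utils/.
demo=$(mktemp -d)
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
git init -q "$demo/project"
cd "$demo/project"
mkdir utils
echo v1 > utils/util.txt
echo main > main.txt
git add .
git commit -q -m "add utils"
echo main2 > main.txt
git commit -q -am "touch main only"
echo v2 > utils/util.txt
git commit -q -am "update utils"

subdir=utils
prev_commit=""   # tip of the synthetic submodule history so far
prev_tree=""     # subdirectory tree that tip was made from
for c in $(git rev-list --reverse HEAD); do
    # Skip commits that do not contain the subdirectory at all.
    tree=$(git rev-parse -q --verify "$c:$subdir") || continue
    # Skip commits that left the subdirectory untouched.
    [ "$tree" = "$prev_tree" ] && continue
    # Reuse the original commit message; parent is the previous submodule commit.
    prev_commit=$(git log -1 --format=%B "$c" |
        git commit-tree "$tree" ${prev_commit:+-p "$prev_commit"})
    prev_tree=$tree
done

# Give the new history a name so it is reachable and can be pushed elsewhere.
git branch utils-history "$prev_commit"
git log --oneline utils-history
```

Because the submodule commits are created in the same object database, nothing has to be copied between repositories, which is why this approach is faster than checking out every revision.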

Answer 2 (score: 0)

Note: submodule entries are only created when you run the following from the parent repo:

git submodule init
git submodule update

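You don't need these commands in your rewrite-submodule-tree-filter script, since that script is only about setting up the .gitmodules file content correctly.

Those "git submodule" commands are only run when you use the parent repo for the first time: see "Cloning a Project with Submodules".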
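
A minimal local sketch of that first use, assuming two throwaway repositories created under a temporary directory (all paths and names here are made up for the demo). The protocol.file.allow override is only there because recent git versions refuse to clone submodules from local paths by default; it would not be needed for a submodule hosted on a server.

```shell
#!/bin/bash
set -eu

demo=$(mktemp -d)
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
allow_file="protocol.file.allow=always"   # local-path demo only

# A stand-alone library repository...
git init -q "$demo/library"
( cd "$demo/library" && echo lib > lib.txt && git add lib.txt &&
  git commit -q -m "library" )

# ...and a parent repository that records it as a submodule.
git init -q "$demo/parent"
( cd "$demo/parent" && echo parent > README && git add README &&
  git commit -q -m "initial" &&
  git -c "$allow_file" submodule --quiet add "$demo/library" library &&
  git commit -q -m "add library submodule" )

# A plain clone gets .gitmodules and the gitlink, but library/ stays empty.
git clone -q "$demo/parent" "$demo/clone"
cd "$demo/clone"
files_before=$(ls library | wc -l | tr -d ' ')

# init copies the submodule settings into .git/config, update checks it out.
git submodule --quiet init
git -c "$allow_file" submodule --quiet update
files_after=$(ls library)

echo "before: $files_before entries, after: $files_after"
```

Cloning with git clone --recurse-submodules performs the init and update steps in one go.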

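Answer 3 (score: 0)

Here is an updated answer that works on Mac OS X. The main change is to use pushd / popd to change directories, so that the submodule can be a nested path such as module/glop and not just glop.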

#!/bin/bash

set -eu

if [ $# -eq 0 ]
then
    echo "Usage: $0 submodules-to-split" >&2
    exit 1
fi

export _tmp=$(mktemp -d /tmp/git-submodule-split.XXXXXX)
export _libs="$@"
for i in $_libs
do
    mkdir -p $_tmp/$i
done

git filter-branch --commit-filter '
function gitCommit()
{
    git add -A
    if [ -n "$(git diff --cached --name-only)" ]
    then
        git commit -F $_msg
    fi
} >/dev/null

# from git-filter-branch
git checkout-index -f -u -a || die "Could not checkout the index"
# files that $commit removed are now still in the working tree;
# remove them, else they would be added again
git clean -d -q -f -x >&2

_git_dir=$GIT_DIR
_git_work_tree=$GIT_WORK_TREE
_git_index_file=$GIT_INDEX_FILE
unset GIT_DIR
unset GIT_WORK_TREE
unset GIT_INDEX_FILE

_msg=$(mktemp /tmp/git-submodule-split-msg.XXXXXX)
cat /dev/stdin > $_msg
for i in $_libs
do
    if [ -d "$i" ]
    then
        unset GIT_DIR
        unset GIT_WORK_TREE
        unset GIT_INDEX_FILE
        pushd $i > /dev/null
        if [ -d ".git" ]
        then
            gitCommit
        else
            git init >/dev/null
            gitCommit
        fi
        popd > /dev/null
        mkdir -p $_tmp/$i
        rsync -a -rtu $i/.git/ $_tmp/$i/.git/
        export GIT_DIR=$_git_dir
        export GIT_WORK_TREE=$_git_work_tree
        export GIT_INDEX_FILE=$_git_index_file
        git rm -q -r --cached $i >&2
        git submodule add ./$i $i >&2
        git add $i >&2
    fi
done
export GIT_DIR=$_git_dir
export GIT_WORK_TREE=$_git_work_tree
export GIT_INDEX_FILE=$_git_index_file

if [ -f ".gitmodules" ]
then
    git add .gitmodules >&2
fi

_new_rev=$(git write-tree)
shift
git commit-tree -F $_msg "$_new_rev" $@;
rm -f $_msg
' --tag-name-filter cat -- --all

for i in $_libs
do
    if [ -d "$_tmp/$i/.git" ]
    then
        rsync -a -i -rtu $_tmp/$i/.git/ $i/.git/
        pushd $i
        git reset --hard
        popd
    fi
done
rm -rf $_tmp

git for-each-ref refs/original --format="%(refname)" | while read i; do git update-ref -d $i; done

git reflog expire --expire=now --all
git gc --aggressive --prune=now
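
Why pushd / popd matters for nested paths can be seen in a small sketch. It assumes bash (pushd and popd are bash builtins) and uses a made-up directory layout under a temporary directory.

```shell
#!/bin/bash
set -eu

demo=$(mktemp -d)
mkdir -p "$demo/modules/glop"
cd "$demo"
start=$PWD

# "cd <path>; cd .." climbs only one level, so with a nested path the
# script ends up in the intermediate directory instead of where it began.
cd modules/glop
cd ..
after_cd=$PWD
cd "$start"

# pushd remembers the starting directory and popd returns to it,
# however many levels deep the path is.
pushd modules/glop >/dev/null
popd >/dev/null
after_popd=$PWD

echo "cd/cd ..:   ${after_cd#"$start"/}"
echo "pushd/popd: back at start? $([ "$after_popd" = "$start" ] && echo yes || echo no)"
```

With the cd / cd .. variant, every later relative path in the filter (for example $i/.git/) would be resolved from the wrong directory.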