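I am facing the problem of merging several repositories into a single one, with miscellaneous files being moved around in the process.
Based on some research on SO (how to merge repositories), I came up with the following sketch: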
user=some_user
new_superproj=new_proj # new repository, will include old repositories
hosting=bitbucket.org # github etc
r1=repo1 # repo 1 to merge
r2=repo2
...
# clone to the new place. These are throw-away (!!!) directories
git clone git@${hosting}:${user}/${r1}.git
git clone git@${hosting}:${user}/${r2}.git
...
mkdir ${new_superproj} && cd ${new_superproj}
# dummy commit so we can merge
git init
dir > deleteme.txt
git add .
git commit -m "Initial dummy commit"
git rm ./deleteme.txt
git commit -m "Clean up initial file"
# repeat for all source repositories
repo=${r1}
pushd .
cd ../${repo}
# In the throw-away repository, move to the subfolder and rewrite log
git filter-branch --index-filter '
git ls-files -s |
sed "s,\t,&'"${repo}"'/," |
GIT_INDEX_FILE=$GIT_INDEX_FILE.new git update-index --index-info &&
mv $GIT_INDEX_FILE.new $GIT_INDEX_FILE
' HEAD
popd
# now bring data in to the new repository
git remote add -f ${repo} ../${repo}
git merge --allow-unrelated-histories ${repo}/master -m "Merging repo ${repo} in"
# remove remote to throw-away repo
git remote rm ${repo}
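So far so good, unless we also want to move files around while preserving their log. Git handles moves/renames poorly, and the log-rewriting fragment above is not well adapted to this: it rewrites paths uniformly and recursively for the whole directory.
The idea is that when files are moved, we know there are no other changes in the repository, only renames and moves. So how can the following part be rewritten so that it is canonical per file? It is taken from the official git filter-branch documentation: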
git filter-branch --index-filter \
'git ls-files -s | sed "s-\t\"*-&newsubdir/-" |
GIT_INDEX_FILE=$GIT_INDEX_FILE.new \
git update-index --index-info &&
mv "$GIT_INDEX_FILE.new" "$GIT_INDEX_FILE"' HEAD
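I have a hard time understanding the part after 'sed' and how it applies to git filter-branch.
I would like to run a script (bash, python, etc.) that does the following: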
for each file in repository get moved/renamed
...
# in the loop, moved/renamed file found
old_file="..." # e.g. a/b/c/old_name.txt
new_file="..." # e.g. a/b/f/g/new_name.txt; at this point it is known that old_file and new_file are the same file
update_log_paths(old_file, new_file) # <--- this part is needed
Any ideas?
Answer 0 (score: 1)
It turns out, taking a hint from Move file-by-file in git, that it is as simple as this (pseudocode):
move_files
cd repo_root
git add . # so changes detected as moves, vs added/deleted
repo_moves=collect_moves_data()
git reset HEAD && git checkout . && git clean -df . # undo all moves
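The collect_moves_data() step is left abstract in the pseudocode. One way it might be filled in, as a sketch only: once the moves are staged, `git diff --cached -M --name-status` reports each detected rename as a line of the form `R<score><TAB><old><TAB><new>`. The helper names `parse_moves` and `collect_moves_data` below are hypothetical, and the sketch assumes paths contain no tabs, newlines, or characters that git would quote.

```shell
#!/usr/bin/env bash
# Hypothetical helper: read `git diff --cached -M --name-status` output on
# stdin, keep only rename records (status R<score>), print "old<TAB>new".
parse_moves() {
    awk -F'\t' '$1 ~ /^R[0-9]*$/ { print $2 "\t" $3 }'
}

# Hypothetical wrapper: list the renames currently staged in the repository
# the shell is in. Must be called after `git add .` and before the reset.
collect_moves_data() {
    git diff --cached -M --name-status | parse_moves
}
```

Called inside a repository with staged moves, `collect_moves_data` would print one pair per line, which a loop can consume with `while IFS=$'\t' read -r old_file new_file`.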
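The biggest misunderstanding I found in many related SO questions is the expectation that "git log --follow" or other "stronger" options would help. They do not:
git log --follow <file>
does not show the log from before the move, even though the file was committed unchanged.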
for each_move in repo_moves
old_file, new_file=deduct_old_new_name(each_move)
new_dir=${new_file%/*}
filter="$filter \n\
if [ -e \"${old_file}\" ]; then \n\
echo \n\
if [ ! -e \"${new_dir}\" ]; then \n\
mkdir --parents \"${new_dir}\" && echo \n\
fi \n\
mv \"${old_file}\" \"${new_file}\" \n\
fi \n\
"
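# The filter moves files on disk, so it needs --tree-filter, which checks out each commit;
# --index-filter does not check out a working tree, so the -e tests would never succeed
git filter-branch -f --tree-filter "$(printf '%b' "$filter")"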
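As an alternative sketch that edits paths in the index instead of moving files on disk, the `update_log_paths(old_file, new_file)` placeholder from the question could be approximated by adapting the `git ls-files -s | ... | git update-index --index-info` idiom from the git filter-branch documentation. Each line of `git ls-files -s` has the form `<mode> <object> <stage><TAB><path>`, so renaming one file amounts to replacing the text after the tab. The names `AWK_RENAME`, `OLD_FILE`, `NEW_FILE`, `rewrite_index_paths` and `update_log_paths` are hypothetical, and the sketch assumes paths without tabs, newlines, or characters that git would quote.

```shell
#!/usr/bin/env bash
# awk program shared by the helper and the filter: if the path field equals
# OLD_FILE, replace it with NEW_FILE. Appending "" forces string comparison.
AWK_RENAME='BEGIN { OFS = FS }
($2 "") == (ENVIRON["OLD_FILE"] "") { $2 = ENVIRON["NEW_FILE"] }
{ print }'
export AWK_RENAME   # the filter runs in a child shell and reads it from the environment

# Hypothetical helper: rewrite one path in `git ls-files -s` output (stdin).
rewrite_index_paths() {
    OLD_FILE="$1" NEW_FILE="$2" awk -F'\t' "$AWK_RENAME"
}

# Hypothetical helper: rewrite history so old_file appears as new_file in
# every commit reachable from HEAD. Run it only in a throw-away clone.
update_log_paths() {
    OLD_FILE="$1" NEW_FILE="$2" git filter-branch -f --index-filter '
        git ls-files -s |
        awk -F"\t" "$AWK_RENAME" |
        GIT_INDEX_FILE=$GIT_INDEX_FILE.new git update-index --index-info &&
        mv "$GIT_INDEX_FILE.new" "$GIT_INDEX_FILE"' HEAD
}
```

A call such as `update_log_paths a/b/c/old_name.txt a/b/f/g/new_name.txt` would then be issued once per collected move. Like the documentation snippet it is based on, this does not handle commits whose index is empty.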
If you need to go back:
git pull # with merge
git reset --hard <hash> # hash of your origin/master (origin/HEAD); it should be HEAD~2, but I'd check it manually and copy/paste the hash