Export the git history of a file and a subfolder into another, existing git repo?

Time: 2019-05-27 13:14:05

Tags: git

Please consider the Bash script testgit.sh, pasted at the end of this post, which rebuilds the example repositories used here.

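So, I have an oldrepo_git repository, which contains some files and folders, and a newrepo_git repository, which has just one commit (for README). This is how gitk --all sees these repositories:

[Image: gitk --all views of oldrepo_git and newrepo_git]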

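Basically, I want to export the entire git history of the file a.txt and of the whole aa subfolder (so, the aa/aa.txt and aa/ab.txt files, but not the b.txt or README files in the oldrepo_git repository), and import it into the newrepo_git repository, if possible with correct timestamps and branching/merging information.

Since the file named README in oldrepo_git is not part of this operation, and since newrepo_git contains nothing but its README file, I do not expect any file conflicts. However, I am not sure which commands to use for this: I know there is git filter-branch, but as far as I can tell it changes the history of oldrepo_git "in place"; it does not "import" that history into newrepo_git.

In other words, if the history of oldrepo_git is: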

$ git log --oneline --graph
*   7e26890 (HEAD -> master) Merge branch 'testbranch'
|\
| * 56ef109 (testbranch) change 5 made
| * 1a78db3 change 4 made
| * d98b4cf change 3 made
| * e5e49af change 2 made
| * 8704c24 change 1 made
|/
* f318d97 added a.txt
* 252bf7f Initial commit

... then, after the process is done, I would like to see this as the history of newrepo_git:

$ git log --oneline --graph
*   XXXYYGG (HEAD -> master) Merge branch 'testbranch'
|\
| * XXXYYFF (testbranch) change 5 made
| * XXXYYEE change 4 made
| * XXXYYDD change 3 made
| * XXXYYCC change 2 made
| * XXXYYBB change 1 made
|/
* XXXYYAA added a.txt
* 8e99c2d Initial commit by Bob

How can I do this?


Bash script testgit.sh:

#!/usr/bin/env bash

rm -rf oldrepo_git newrepo_git
mkdir oldrepo_git newrepo_git

cd oldrepo_git
git init
git config user.name tester
git config user.email tester@example.com
echo "# README" >> README
git add README
GIT_COMMITTER_DATE="1558960260" git commit --date "1558960260" -m "Initial commit"
echo "Testing" >> a.txt
git add a.txt
GIT_COMMITTER_DATE="1558960270" git commit --date "1558960270" -m "added a.txt"
git checkout -b testbranch
mkdir aa bb
for ix in 1 2 3 4 5; do
  echo $ix >> a.txt
  echo $ix >> b.txt
  echo $ix >> aa/aa.txt
  echo $ix >> aa/ab.txt
  git add .
  newts="$((1558960270+ix*10))"
  GIT_COMMITTER_DATE="$newts" git commit --date "$newts" -m "change $ix made"
done
git checkout master
ix="$((ix+1))"; newts="$((1558960270+ix*10))"
GIT_COMMITTER_DATE="$newts" GIT_AUTHOR_DATE="$newts" git merge --no-ff --no-edit testbranch

cd ../newrepo_git
git init
git config user.name bob
git config user.email bob@example.com
echo "# Bob's README" >> README
git add README
GIT_COMMITTER_DATE="1558960260" git commit --date "1558960260" -m "Initial commit by Bob"

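1 answer:

Answer 0 (score: 2)

EDIT: You may want to add an extra echo $ix >> bb/bb.txt to the for loop of the testgit.sh script in the OP, so that the output in this post matches.


OK, I think this is what should be done, at least as far as the OP is concerned (we have no remote repos yet). First, make a copy of oldrepo: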

cp -a oldrepo_git oldrepo_filt_git

Then, apparently, we have to delete everything we do not need from the copied oldrepo by combining git filter-branch with git rm (I found part of this command in another post):

cd oldrepo_filt_git
git filter-branch --index-filter "git rm --cached --ignore-unmatch -r $(bash -O extglob -c 'ls -xd !(a*)')" --prune-empty -- --all

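Note that since here we are telling git rm what to remove, we have to specify what we do NOT want to keep, which is the opposite of what we want to keep. I want to keep the a.txt file and the aa folder, so the glob matching what I keep would be a*. To negate that match you need bash's extglob feature; so, if the entire listing is: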

$ ls
a.txt  aa  b.txt  bb  README

... then this extglob command gives us only the file/folder names we want to delete:

$ bash -O extglob -c 'ls -xd !(a*)'
b.txt  bb  README
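
The negated glob can be tried out safely outside of any repository. The following is only a minimal sketch in a throwaway directory; the scratch location and the use of echo instead of ls -xd are my own assumptions, not part of the original commands:

```shell
#!/usr/bin/env bash
# Sketch: build the "everything except a*" list in a throwaway directory.
set -euo pipefail
export LC_ALL=C                      # fixed sort order for the glob expansion

scratch="$(mktemp -d)"               # hypothetical scratch location
cd "$scratch"
mkdir aa bb
touch a.txt b.txt README aa/aa.txt bb/bb.txt

# !(a*) matches every top-level name that does NOT start with "a".
to_remove="$(bash -O extglob -c 'echo !(a*)')"
echo "would remove: $to_remove"
```

Keep in mind that in the filter-branch command the $(...) is expanded once, against the currently checked-out working tree, before filter-branch starts. A path that existed only in older commits would therefore not be in the list and would survive the filtering.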

So, after running the git filter-branch command:

$ git filter-branch --index-filter "git rm --cached --ignore-unmatch -r $(bash -O extglob -c 'ls -xd !(a*)')" --prune-empty -- --all
Rewrite 252bf7ff5f385dad880240d5d80e68f24ae09b59 (1/8) (0 seconds passed, remaining 0 predicted)    rm 'README'
Rewrite f318d9712cd7aacdb5dd45febbcdbbce6b741e08 (2/8) (1 seconds passed, remaining 3 predicted)    rm 'README'
Rewrite 00b62e7da8784d45850d7483cbea88fdc4aa844c (2/8) (1 seconds passed, remaining 3 predicted)    rm 'README'
rm 'b.txt'
rm 'bb/bb.txt'
Rewrite c618eff47d38412c54a8381a5bacc921bddefe2d (2/8) (1 seconds passed, remaining 3 predicted)    rm 'README'
rm 'b.txt'
rm 'bb/bb.txt'
Rewrite 2cada8d822d83f37bdc4a37bcfb03047c1cc1ded (5/8) (3 seconds passed, remaining 1 predicted)    rm 'README'
rm 'b.txt'
rm 'bb/bb.txt'
Rewrite 7b296b70018f4105f190d06ed4d9c58e3f80532f (5/8) (3 seconds passed, remaining 1 predicted)    rm 'README'
rm 'b.txt'
rm 'bb/bb.txt'
Rewrite 18a1ad1d35cd8573c39485d0a29b630325f9727d (7/8) (5 seconds passed, remaining 0 predicted)    rm 'README'
rm 'b.txt'
rm 'bb/bb.txt'
Rewrite 2ffbbf03d51363f1ced3aaaf000d5921c9d8b919 (7/8) (5 seconds passed, remaining 0 predicted)    rm 'README'
rm 'b.txt'
rm 'bb/bb.txt'

Ref 'refs/heads/master' was rewritten
Ref 'refs/heads/testbranch' was rewritten

... we have:

$ git log --oneline --graph --stat
*   31cd8b5 (HEAD -> master) Merge branch 'testbranch'
|\
| * 42b153d (testbranch) change 5 made
| |  a.txt     | 1 +
| |  aa/aa.txt | 1 +
| |  aa/ab.txt | 1 +
| |  3 files changed, 3 insertions(+)
| * ff1be9d change 4 made
| |  a.txt     | 1 +
| |  aa/aa.txt | 1 +
| |  aa/ab.txt | 1 +
| |  3 files changed, 3 insertions(+)
| * 90f050c change 3 made
| |  a.txt     | 1 +
| |  aa/aa.txt | 1 +
| |  aa/ab.txt | 1 +
| |  3 files changed, 3 insertions(+)
| * d2d2136 change 2 made
| |  a.txt     | 1 +
| |  aa/aa.txt | 1 +
| |  aa/ab.txt | 1 +
| |  3 files changed, 3 insertions(+)
| * ab237ac change 1 made
|/
|    a.txt     | 1 +
|    aa/aa.txt | 1 +
|    aa/ab.txt | 1 +
|    3 files changed, 3 insertions(+)
* ea0a32d added a.txt
   a.txt | 1 +
   1 file changed, 1 insertion(+)

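... which confirms that this is the filtered state of the repository that I want, and, so I thought, I now want to merge it into my newrepo_git.


Well, it turns out I do not want to "merge" into newrepo_git, I want to "join"; I found most of the information for this in Detach many subdirectories into a new, separate Git repository

So, first, we change directory to newrepo: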
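
Instead of reading the --stat output by eye, the surviving paths can also be listed mechanically. This is only a sketch on a small throwaway repository (demo_git and its contents are hypothetical stand-ins, not the repositories from this post):

```shell
#!/usr/bin/env bash
# Sketch: filter a throwaway repo, then list every path that survives in its history.
set -euo pipefail
export LC_ALL=C
export FILTER_BRANCH_SQUELCH_WARNING=1   # silences the filter-branch warning on newer Git
export GIT_AUTHOR_NAME=tester GIT_AUTHOR_EMAIL=tester@example.com
export GIT_COMMITTER_NAME=tester GIT_COMMITTER_EMAIL=tester@example.com

scratch="$(mktemp -d)"                   # hypothetical scratch location
cd "$scratch"
git init -q demo_git
cd demo_git
mkdir aa
echo 1 > a.txt; echo 1 > b.txt; echo 1 > aa/aa.txt; echo "# README" > README
git add .
git commit -q -m "first"
echo 2 >> a.txt; echo 2 >> b.txt
git add .
git commit -q -m "second"

# Drop README and b.txt from the index of every commit.
git filter-branch -f --index-filter \
  'git rm --cached --ignore-unmatch -r -q README b.txt' \
  --prune-empty -- --all >/dev/null 2>&1

# Every path still touched by a commit reachable from HEAD.
# (--all is avoided here: refs/original/ still points at the unfiltered history.)
surviving_paths="$(git log --pretty=format: --name-only HEAD | sed '/^$/d' | sort -u | xargs)"
echo "surviving paths: $surviving_paths"
```

If anything other than the paths you meant to keep shows up in that list, the filter missed it.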

cd ../newrepo_git

Note that at this point, most online resources will recommend:

git remote add oldrepo ../oldrepo_filt_git/
git pull oldrepo master --allow-unrelated-histories

... but this results in a history with two roots, which is not what I want:

$ git log --oneline --graph --stat
*   845c81e (HEAD -> master) Merge branch 'master' of ../oldrepo_filt_git
|\
| *   31cd8b5 (oldrepo/master) Merge branch 'testbranch'
| |\
| | * 42b153d (oldrepo/testbranch) change 5 made
| | |  a.txt     | 1 +
| | |  aa/aa.txt | 1 +
| | |  aa/ab.txt | 1 +
| | |  3 files changed, 3 insertions(+)
| | * ff1be9d change 4 made
| | |  a.txt     | 1 +
| | |  aa/aa.txt | 1 +
| | |  aa/ab.txt | 1 +
| | |  3 files changed, 3 insertions(+)
| | * 90f050c change 3 made
| | |  a.txt     | 1 +
| | |  aa/aa.txt | 1 +
| | |  aa/ab.txt | 1 +
| | |  3 files changed, 3 insertions(+)
| | * d2d2136 change 2 made
| | |  a.txt     | 1 +
| | |  aa/aa.txt | 1 +
| | |  aa/ab.txt | 1 +
| | |  3 files changed, 3 insertions(+)
| | * ab237ac change 1 made
| |/
| |    a.txt     | 1 +
| |    aa/aa.txt | 1 +
| |    aa/ab.txt | 1 +
| |    3 files changed, 3 insertions(+)
| * ea0a32d added a.txt
|    a.txt | 1 +
|    1 file changed, 1 insertion(+)
* 8e99c2d Initial commit by Bob
   README | 1 +
   1 file changed, 1 insertion(+)
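
The "two roots" situation can also be checked without reading the graph, by counting parentless commits with git rev-list --max-parents=0. A minimal sketch on two throwaway repositories (the names one and two are hypothetical), using fetch plus merge in place of pull:

```shell
#!/usr/bin/env bash
# Sketch: merging unrelated histories leaves two root commits behind.
set -euo pipefail
export GIT_AUTHOR_NAME=tester GIT_AUTHOR_EMAIL=tester@example.com
export GIT_COMMITTER_NAME=tester GIT_COMMITTER_EMAIL=tester@example.com

scratch="$(mktemp -d)"               # hypothetical scratch location
cd "$scratch"

git init -q one
cd one
echo "# README" > README
git add README
git commit -q -m "root of one"
cd ..

git init -q two
cd two
echo Testing > a.txt
git add a.txt
git commit -q -m "root of two"
cd ../one

# Equivalent in spirit to "git pull ... --allow-unrelated-histories".
git fetch -q ../two HEAD
git merge -q --allow-unrelated-histories --no-edit FETCH_HEAD

# Commits without parents that are reachable from HEAD.
root_count="$(git rev-list --max-parents=0 HEAD | wc -l | tr -d ' ')"
echo "root commits: $root_count"
```

A properly "joined" history, as wanted here, should report exactly one root commit.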

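What I want is for the commit ea0a32d added a.txt to follow (stem from) 8e99c2d Initial commit by Bob; that would be the "join" of the repositories mentioned earlier.

Note also that you could run git format-patch --root HEAD -o ../ from within oldrepo_git, and then import the patches into newrepo_git with for ix in ../*.patch; do echo $ix; git am -k < $ix; done, but this does NOT preserve the merge history (all of the history gets flattened)!

So, to do a proper "join", I first do a fetch: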

$ git remote add old-repo ../oldrepo_filt_git

$ git fetch old-repo
warning: no common commits
remote: Enumerating objects: 29, done.
remote: Counting objects: 100% (29/29), done.
remote: Compressing objects: 100% (17/17), done.
remote: Total 29 (delta 2), reused 0 (delta 0)
Unpacking objects: 100% (29/29), done.
From ../oldrepo_filt_git
 * [new branch]      master     -> old-repo/master
 * [new branch]      testbranch -> old-repo/testbranch

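... then I add and rename branches as recommended in that post (and save the timestamps in /tmp/hashlist), and then cherry-pick the first commit of the old repo: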

$ git branch oldrepo-head old-repo/master
Branch 'oldrepo-head' set up to track remote branch 'master' from 'old-repo'.

$ git branch oldrepo-root $(git log oldrepo-head --reverse --pretty=%H | head -n 1)

$ git log --pretty='%T %ct' ..oldrepo-head > /tmp/hashlist

$ git branch -m master new-master

$ git cherry-pick --strategy-option=theirs oldrepo-root
[new-master 427cf77] added a.txt
 Author: tester <tester@example.com>
 Date: Mon May 27 14:31:10 2019 +0200
 1 file changed, 1 insertion(+)
 create mode 100644 a.txt
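
As a side note, the root commit can also be looked up directly with git rev-list --max-parents=0 instead of reversing the log. The sketch below compares both on a throwaway repository (demo_git is hypothetical); it assumes a history with a single root, which is the case in this example:

```shell
#!/usr/bin/env bash
# Sketch: two ways of finding the root commit of a single-root history.
set -eu
export GIT_AUTHOR_NAME=tester GIT_AUTHOR_EMAIL=tester@example.com
export GIT_COMMITTER_NAME=tester GIT_COMMITTER_EMAIL=tester@example.com

scratch="$(mktemp -d)"               # hypothetical scratch location
cd "$scratch"
git init -q demo_git
cd demo_git
for n in 1 2 3; do
  echo "$n" >> a.txt
  git add a.txt
  git commit -q -m "change $n made"
done

# Approach used above: oldest entry of the reversed log.
root_via_log="$(git log --reverse --pretty=%H | head -n 1)"
# Alternative: ask directly for commits that have no parents.
root_via_rev_list="$(git rev-list --max-parents=0 HEAD)"

echo "log:      $root_via_log"
echo "rev-list: $root_via_rev_list"
```

With several roots, rev-list prints one line per root, so the two approaches are only interchangeable when there is exactly one.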

At this point, the state of the repo is:

$ git log --oneline --graph
* 427cf77 (HEAD -> new-master) added a.txt
* 8e99c2d Initial commit by Bob

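Now we can do the rebase here. Note that in the referenced post they got an error at this step, but for this particular example it seems to go through without errors: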

$ git rebase --preserve-merges --onto new-master --root oldrepo-head
Successfully rebased and updated refs/heads/oldrepo-head.
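
Note that --preserve-merges has since been deprecated and then removed from Git, with --rebase-merges as its replacement. The following is only a sketch of the same join on two throwaway repositories (old_demo_git and new_demo_git are hypothetical stand-ins); it skips the cherry-pick step and assumes a reasonably recent Git in which --rebase-merges together with --root and --onto replays the old root on top of the given commit:

```shell
#!/usr/bin/env bash
# Sketch: join two unrelated throwaway repos with --rebase-merges.
set -euo pipefail
export GIT_AUTHOR_NAME=tester GIT_AUTHOR_EMAIL=tester@example.com
export GIT_COMMITTER_NAME=bob GIT_COMMITTER_EMAIL=bob@example.com

scratch="$(mktemp -d)"               # hypothetical scratch location
cd "$scratch"

# Stand-in for the filtered old repo: a root, one branch commit, one --no-ff merge.
git init -q old_demo_git
cd old_demo_git
echo Testing > a.txt
git add a.txt
git commit -q -m "added a.txt"
main_branch="$(git symbolic-ref --short HEAD)"
git checkout -q -b testbranch
echo 1 >> a.txt
git commit -q -am "change 1 made"
git checkout -q "$main_branch"
git merge -q --no-ff --no-edit testbranch
cd ..

# Stand-in for the new repo with a single README commit.
git init -q new_demo_git
cd new_demo_git
echo "# Bob's README" > README
git add README
git commit -q -m "Initial commit by Bob"
new_tip="$(git rev-parse HEAD)"
git fetch -q ../old_demo_git HEAD
git branch oldrepo-head FETCH_HEAD

# Replay the whole old history, merges included, on top of the README commit.
git rebase -q --rebase-merges --onto "$new_tip" --root oldrepo-head

root_count="$(git rev-list --max-parents=0 HEAD | wc -l | tr -d ' ')"
merge_count="$(git rev-list --merges HEAD | wc -l | tr -d ' ')"
echo "roots: $root_count, merges: $merge_count"
```

As with --preserve-merges, the replayed commits get fresh committer dates, so the timestamp fix discussed next is still needed.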

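At this point, the history of newrepo is almost there; the only problem is that the committer timestamps differ from the author timestamps: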

$ git log --graph --pretty=fuller
*   commit 61fbe54721a9432e91e48917ed036f55da4105a4 (HEAD -> oldrepo-head)
|\  Merge: 427cf77 f8e8f8a
| | Author:     tester <tester@example.com>
| | AuthorDate: Mon May 27 14:32:10 2019 +0200
| | Commit:     bob <bob@example.com>
| | CommitDate: Tue May 28 12:57:00 2019 +0200
| |
| |     Merge branch 'testbranch'
| |
| * commit f8e8f8aedaa7bc999bdfdd49542c9ee04edb770c
| | Author:     tester <tester@example.com>
| | AuthorDate: Mon May 27 14:32:00 2019 +0200
| | Commit:     bob <bob@example.com>
| | CommitDate: Tue May 28 12:56:58 2019 +0200
| |
| |     change 5 made
| |
| * commit b084029040d6596e0795e7567b2684dc59c02241
| | Author:     tester <tester@example.com>
| | AuthorDate: Mon May 27 14:31:50 2019 +0200
| | Commit:     bob <bob@example.com>
| | CommitDate: Tue May 28 12:56:56 2019 +0200
| |
| |     change 4 made
| |
| * commit b62dabca3a46efbe76edb10591935db136f74aaa
| | Author:     tester <tester@example.com>
| | AuthorDate: Mon May 27 14:31:40 2019 +0200
| | Commit:     bob <bob@example.com>
| | CommitDate: Tue May 28 12:56:54 2019 +0200
| |
| |     change 3 made
| |
| * commit 252f3e9697b87b4f59cd0a74681ef25401340fcf
| | Author:     tester <tester@example.com>
| | AuthorDate: Mon May 27 14:31:30 2019 +0200
| | Commit:     bob <bob@example.com>
| | CommitDate: Tue May 28 12:56:51 2019 +0200
| |
| |     change 2 made
| |
| * commit c382c8a713489ca0e5dc106bed29fdce379952b0
|/  Author:     tester <tester@example.com>
|   AuthorDate: Mon May 27 14:31:20 2019 +0200
|   Commit:     bob <bob@example.com>
|   CommitDate: Tue May 28 12:56:49 2019 +0200
|
|       change 1 made
|
* commit 427cf77417a2406db5dd6a0e9bd4fb60542f2ee1 (new-master)
| Author:     tester <tester@example.com>
| AuthorDate: Mon May 27 14:31:10 2019 +0200
| Commit:     bob <bob@example.com>
| CommitDate: Tue May 28 12:55:43 2019 +0200
|
|     added a.txt
|
* commit 8e99c2d71048b4999d012b33d34386351d6d0fef
  Author:     bob <bob@example.com>
  AuthorDate: Mon May 27 14:31:00 2019 +0200
  Commit:     bob <bob@example.com>
  CommitDate: Mon May 27 14:31:00 2019 +0200

      Initial commit by Bob
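
For longer histories, the commits whose two dates disagree can be picked out mechanically by comparing %at (author timestamp) with %ct (committer timestamp). A minimal sketch on a throwaway repository (demo_git and the commit messages are hypothetical):

```shell
#!/usr/bin/env bash
# Sketch: list commits whose author and committer timestamps differ.
set -eu
export GIT_AUTHOR_NAME=tester GIT_AUTHOR_EMAIL=tester@example.com
export GIT_COMMITTER_NAME=bob GIT_COMMITTER_EMAIL=bob@example.com

scratch="$(mktemp -d)"               # hypothetical scratch location
cd "$scratch"
git init -q demo_git
cd demo_git

echo 1 > a.txt
git add a.txt
GIT_AUTHOR_DATE="1558960270 +0200" GIT_COMMITTER_DATE="1558960270 +0200" \
  git commit -q -m "dates agree"
echo 2 >> a.txt
GIT_AUTHOR_DATE="1558960280 +0200" GIT_COMMITTER_DATE="1559041000 +0200" \
  git commit -q -am "dates differ"

# Field 1 is the author timestamp, field 2 the committer timestamp, the rest is the subject.
mismatched="$(git log --pretty='%at %ct %s' | awk '$1 != $2' | cut -d' ' -f3-)"
echo "$mismatched"
```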

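They had the same problem in the referenced post, where the recommendation is to use filter-branch to rewrite the committer timestamps so that they match the author timestamps: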

$ git filter-branch --env-filter 'export GIT_COMMITTER_DATE=$(fgrep -m 1 $(git log -1 --pretty=%T $GIT_COMMIT) /tmp/hashlist | cut -d" " -f2)' new-master..oldrepo-head
Rewrite 61fbe54721a9432e91e48917ed036f55da4105a4 (3/6) (1 seconds passed, remaining 1 predicted)
Ref 'refs/heads/oldrepo-head' was rewritten

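... however, this did not work for me, because by this point the hashes had already changed from what was saved in /tmp/hashlist (the list stores tree hashes via %T, and the rebased trees now also contain README).

So I used a simpler approach: just let filter-branch read the author timestamp of each commit and re-apply it as the committer date (note that I use -f here because of the backup left behind by the previous filter-branch run; otherwise I get "Cannot create a new backup. ... Force overwriting the backup with -f"):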

$ git filter-branch -f --env-filter 'export GIT_COMMITTER_DATE=$(git log -1 --pretty=%at $GIT_COMMIT)' new-master..oldrepo-head
Rewrite f2b2385d85c74dbf0cbf8fabc02ec30cb50d8f2a (3/6) (1 seconds passed, remaining 1 predicted)
Ref 'refs/heads/oldrepo-head' was rewritten
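
A slightly shorter variant of the same idea relies on the fact that filter-branch already exports GIT_AUTHOR_DATE for the commit being rewritten before it evaluates the env-filter, so no extra git log call is needed. This is only a sketch on a throwaway repository (demo_git is hypothetical), not the command that was run above:

```shell
#!/usr/bin/env bash
# Sketch: copy the author date onto the committer date with an env-filter.
set -euo pipefail
export FILTER_BRANCH_SQUELCH_WARNING=1   # silences the filter-branch warning on newer Git
export GIT_AUTHOR_NAME=tester GIT_AUTHOR_EMAIL=tester@example.com
export GIT_COMMITTER_NAME=bob GIT_COMMITTER_EMAIL=bob@example.com

scratch="$(mktemp -d)"                   # hypothetical scratch location
cd "$scratch"
git init -q demo_git
cd demo_git
echo Testing > a.txt
git add a.txt
# Author date and committer date deliberately differ.
GIT_AUTHOR_DATE="1558960270 +0200" GIT_COMMITTER_DATE="1559041000 +0200" \
  git commit -q -m "added a.txt"

# Inside the filter, GIT_AUTHOR_DATE holds the original commit's author date.
git filter-branch -f --env-filter 'export GIT_COMMITTER_DATE="$GIT_AUTHOR_DATE"' \
  >/dev/null 2>&1

author_ts="$(git log -1 --pretty=%at)"
committer_ts="$(git log -1 --pretty=%ct)"
echo "author: $author_ts committer: $committer_ts"
```

Unlike %at, which is a bare epoch value, this form also carries the author's time zone over to the committer date.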

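At this point, we can see that the state of the repo is almost what I need, except that the first oldrepo commit did not get its committer timestamp changed; so I try again: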

sd@DESKTOP-RO11QOC MSYS /c/Users/sd/AppData/Local/Temp/newrepo_git
$ git filter-branch -f --env-filter 'export GIT_COMMITTER_DATE=$(git log -1 --pretty=%at $GIT_COMMIT)' 427cf77417a
You must specify a ref to rewrite.

sd@DESKTOP-RO11QOC MSYS /c/Users/sd/AppData/Local/Temp/newrepo_git
$ git filter-branch -f --env-filter 'export GIT_COMMITTER_DATE=$(git log -1 --pretty=%at $GIT_COMMIT)' new-master
Rewrite 427cf77417a2406db5dd6a0e9bd4fb60542f2ee1 (2/2) (0 seconds passed, remaining 0 predicted)
Ref 'refs/heads/new-master' was rewritten

... but the log still shows the same difference between the timestamps:

$ git log --graph --stat --pretty=fuller
*   commit cdaa4b82f3833770a9051a2490487548603e3af8 (HEAD -> oldrepo-head)
|\  Merge: 427cf77 9bfc6cd
| | Author:     tester <tester@example.com>
| | AuthorDate: Mon May 27 14:32:10 2019 +0200
| | Commit:     bob <bob@example.com>
| | CommitDate: Mon May 27 14:32:10 2019 +0200
| |
| |     Merge branch 'testbranch'
| |
...
* commit 427cf77417a2406db5dd6a0e9bd4fb60542f2ee1 (refs/original/refs/heads/new-master)
| Author:     tester <tester@example.com>
| AuthorDate: Mon May 27 14:31:10 2019 +0200
| Commit:     bob <bob@example.com>
| CommitDate: Tue May 28 12:55:43 2019 +0200
|
|     added a.txt
|
|  a.txt | 1 +
|  1 file changed, 1 insertion(+)
...

Anyway, now we should do the "cleanup" as recommended in that post:

$ git branch -m oldrepo-head master
$ git branch -D oldrepo-root
Deleted branch oldrepo-root (was ea0a32d).
$ git branch -D new-master
Deleted branch new-master (was 4ac225e).
$ rm .git/refs/original/refs/heads/new-master
$ git remote remove old-repo
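
Rather than deleting the file under .git/refs/original by hand (which misses refs that Git has packed), the backup refs can be removed through Git itself with git for-each-ref and git update-ref -d. A minimal sketch on a throwaway repository (demo_git is hypothetical):

```shell
#!/usr/bin/env bash
# Sketch: remove the refs/original/ backups that filter-branch leaves behind.
set -euo pipefail
export FILTER_BRANCH_SQUELCH_WARNING=1   # silences the filter-branch warning on newer Git
export GIT_AUTHOR_NAME=tester GIT_AUTHOR_EMAIL=tester@example.com
export GIT_COMMITTER_NAME=bob GIT_COMMITTER_EMAIL=bob@example.com

scratch="$(mktemp -d)"                   # hypothetical scratch location
cd "$scratch"
git init -q demo_git
cd demo_git
echo Testing > a.txt
git add a.txt
GIT_AUTHOR_DATE="1558960270 +0200" GIT_COMMITTER_DATE="1559041000 +0200" \
  git commit -q -m "added a.txt"

# Any rewrite that actually changes a commit creates a backup under refs/original/.
git filter-branch -f --env-filter 'export GIT_COMMITTER_DATE="$GIT_AUTHOR_DATE"' \
  >/dev/null 2>&1
backups_before="$(git for-each-ref --format='%(refname)' refs/original/ | wc -l | tr -d ' ')"

# Delete every backup ref, whether it is stored loose or packed.
git for-each-ref --format='%(refname)' refs/original/ |
  while read -r ref; do
    git update-ref -d "$ref"
  done
backups_after="$(git for-each-ref --format='%(refname)' refs/original/ | wc -l | tr -d ' ')"
echo "backup refs before: $backups_before, after: $backups_after"
```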

Finally, I managed to overwrite the committer timestamp of commit 427cf774 by adding a temporary branch here (filter-branch needs a ref; it does not seem to work with a commit hash directly), and using it to specify tmp^..tmp as the filter range:

$ git branch tmp 427cf774
$ git filter-branch -f --env-filter 'export GIT_COMMITTER_DATE=$(git log -1 --pretty=%at $GIT_COMMIT)' tmp^..tmp
Rewrite 427cf77417a2406db5dd6a0e9bd4fb60542f2ee1 (1/1) (0 seconds passed, remaining 0 predicted)
Ref 'refs/heads/tmp' was rewritten
$ git log --graph --stat --pretty=fuller tmp
* commit 4ac225e308e280e3a96be0168c6e9dece44d4979 (tmp)
| Author:     tester <tester@example.com>
| AuthorDate: Mon May 27 14:31:10 2019 +0200
| Commit:     bob <bob@example.com>
| CommitDate: Mon May 27 14:31:10 2019 +0200
|
|     added a.txt
|
|  a.txt | 1 +
|  1 file changed, 1 insertion(+)
|
...
$ git branch -D tmp
Deleted branch tmp (was 4ac225e).

... and finally, I can see that newrepo contains the oldrepo commits as I envisioned:

$ git log --graph --stat --pretty=fuller
*   commit cdaa4b82f3833770a9051a2490487548603e3af8
|\  Merge: 427cf77 9bfc6cd
| | Author:     tester <tester@example.com>
| | AuthorDate: Mon May 27 14:32:10 2019 +0200
| | Commit:     bob <bob@example.com>
| | CommitDate: Mon May 27 14:32:10 2019 +0200
| |
| |     Merge branch 'testbranch'
| |
| * commit 9bfc6cde58be9102102f839e5cc0fe8f25f0f78c
| | Author:     tester <tester@example.com>
| | AuthorDate: Mon May 27 14:32:00 2019 +0200
| | Commit:     bob <bob@example.com>
| | CommitDate: Mon May 27 14:32:00 2019 +0200
| |
| |     change 5 made
| |
| |  a.txt     | 1 +
| |  aa/aa.txt | 1 +
| |  aa/ab.txt | 1 +
| |  3 files changed, 3 insertions(+)
| |
| * commit 485ae0f50054610b6a41098fb695e59d194cc856
| | Author:     tester <tester@example.com>
| | AuthorDate: Mon May 27 14:31:50 2019 +0200
| | Commit:     bob <bob@example.com>
| | CommitDate: Mon May 27 14:31:50 2019 +0200
| |
| |     change 4 made
| |
| |  a.txt     | 1 +
| |  aa/aa.txt | 1 +
| |  aa/ab.txt | 1 +
| |  3 files changed, 3 insertions(+)
| |
| * commit b6804b6e8e313b5c4766568a287f0785503e3a11
| | Author:     tester <tester@example.com>
| | AuthorDate: Mon May 27 14:31:40 2019 +0200
| | Commit:     bob <bob@example.com>
| | CommitDate: Mon May 27 14:31:40 2019 +0200
| |
| |     change 3 made
| |
| |  a.txt     | 1 +
| |  aa/aa.txt | 1 +
| |  aa/ab.txt | 1 +
| |  3 files changed, 3 insertions(+)
| |
| * commit 8b463423d2a99929a6a248e38ba1368a56d3769d
| | Author:     tester <tester@example.com>
| | AuthorDate: Mon May 27 14:31:30 2019 +0200
| | Commit:     bob <bob@example.com>
| | CommitDate: Mon May 27 14:31:30 2019 +0200
| |
| |     change 2 made
| |
| |  a.txt     | 1 +
| |  aa/aa.txt | 1 +
| |  aa/ab.txt | 1 +
| |  3 files changed, 3 insertions(+)
| |
| * commit 3bc0bed30ebea1498a15711825b2ea8347cc374d
|/  Author:     tester <tester@example.com>
|   AuthorDate: Mon May 27 14:31:20 2019 +0200
|   Commit:     bob <bob@example.com>
|   CommitDate: Mon May 27 14:31:20 2019 +0200
|
|       change 1 made
|
|    a.txt     | 1 +
|    aa/aa.txt | 1 +
|    aa/ab.txt | 1 +
|    3 files changed, 3 insertions(+)
|
* commit 427cf77417a2406db5dd6a0e9bd4fb60542f2ee1
| Author:     tester <tester@example.com>
| AuthorDate: Mon May 27 14:31:10 2019 +0200
| Commit:     bob <bob@example.com>
| CommitDate: Tue May 28 12:55:43 2019 +0200
|
|     added a.txt
|
|  a.txt | 1 +
|  1 file changed, 1 insertion(+)
|
* commit 8e99c2d71048b4999d012b33d34386351d6d0fef
  Author:     bob <bob@example.com>
  AuthorDate: Mon May 27 14:31:00 2019 +0200
  Commit:     bob <bob@example.com>
  CommitDate: Mon May 27 14:31:00 2019 +0200

      Initial commit by Bob

   README | 1 +
   1 file changed, 1 insertion(+)

Easy, isn't it?


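But I am not quite sure whether this is the right procedure, so it would be great if someone more knowledgeable could confirm it, or point out an easier way ...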