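I'd like to know the exact algorithm (or something close to it) behind 'git merge'. Answers to at least these sub-questions would help:
But a description of the whole algorithm would be much better.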
Answer 0 (score: 40)
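You might be best off looking for a description of a 3-way merge algorithm. A high-level description would go something like this:
1. Find a suitable merge base B - a version of the file that is an ancestor of both of the new versions (X and Y), and usually the most recent such base (although there are cases where it will have to go back further, which is one of the features of git's default recursive merge).
2. Perform diffs of X with B and Y with B.
3. Walk through the change blocks identified in the two diffs. If both sides introduce the same change in the same spot, accept either one; if one introduces a change and the other leaves that region alone, introduce the change in the final; if both introduce changes in a spot, but they don't match, mark a conflict to be resolved manually.
The full algorithm deals with this in a lot more detail, and even has some documentation (/usr/share/doc/git-doc/technical/trivial-merge.txt for one, along with the git help XXX pages, where XXX is one of merge-base, merge-file, merge, merge-one-file and possibly a few others). If that's not deep enough, there's always source code...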
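To make the steps in this answer concrete, here is a minimal Python sketch of a line-based 3-way merge. It is not git's implementation (git ships its own xdiff code for this); it leans on Python's difflib for the two diffs, and the names merge3 and match_map, as well as the ours/theirs marker labels, are my own choices for illustration.

```python
from difflib import SequenceMatcher


def match_map(base, other):
    """Map each base line index that the diff keeps unchanged to its index in other."""
    mapping = {}
    matcher = SequenceMatcher(None, base, other, autojunk=False)
    for a, b, size in matcher.get_matching_blocks():
        for k in range(size):
            mapping[a + k] = b + k
    return mapping


def merge3(base, ours, theirs):
    """Merge two edited line lists against their common base.

    Returns (merged_lines, had_conflict).
    """
    in_ours = match_map(base, ours)      # diff of X with B
    in_theirs = match_map(base, theirs)  # diff of Y with B

    # Base lines kept by both sides are synchronisation points.
    anchors = [(i, in_ours[i], in_theirs[i])
               for i in range(len(base))
               if i in in_ours and i in in_theirs]
    anchors.append((len(base), len(ours), len(theirs)))  # sentinel closes the last chunk

    merged, conflict = [], False
    b = o = t = 0  # start of the current change block in base / ours / theirs
    for bi, oi, ti in anchors:
        base_chunk, our_chunk, their_chunk = base[b:bi], ours[o:oi], theirs[t:ti]
        if our_chunk == their_chunk:       # same change on both sides, or no change
            merged += our_chunk
        elif our_chunk == base_chunk:      # only their side changed this region
            merged += their_chunk
        elif their_chunk == base_chunk:    # only our side changed this region
            merged += our_chunk
        else:                              # both changed it differently
            conflict = True
            merged += ["<<<<<<< ours", *our_chunk,
                       "=======", *their_chunk, ">>>>>>> theirs"]
        if bi < len(base):                 # copy the stable line itself
            merged.append(base[bi])
        b, o, t = bi + 1, oi + 1, ti + 1
    return merged, conflict


if __name__ == "__main__":
    lines, conflict = merge3(["a", "c"], ["a", "b", "c"], ["a", "c"])
    print(lines, conflict)
```

With base a, c and only one side inserting b, the change is taken cleanly. With an empty base, the same two sides have no stable lines to synchronise on, so the sketch reports one conflict covering both files in full.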
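Answer 1 (score: 8)
How does git perform when there are multiple common bases for merging branches?
This article was very helpful: http://codicesoftware.blogspot.com/2011/09/merge-recursive-strategy.html (there is also a part 2).
The recursive strategy uses diff3 recursively to generate a virtual branch, which is then used as the ancestor.
E.g.: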
(A)----(B)----(C)-----(F)
        |      |       |
        |      |   +---+
        |      |   |
        |      +-------+
        |          |   |
        |      +---+   |
        |      |       |
        +-----(D)-----(E)
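Then:
git checkout E
git merge F
There are two best common ancestors (common ancestors that are not ancestors of any other common ancestor), C and D. Git merges them into a new virtual branch V, and then uses V as the base.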
(A)----(B)----(C)--------(F)
        |      |          |
        |      |      +---+
        |      |      |
        |      +----------+
        |      |      |   |
        |      +--(V) |   |
        |          |  |   |
        |      +---+  |   |
        |      |      |   |
        |      +------+   |
        |      |          |
        +-----(D)--------(E)
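I suppose Git would just continue in the same way if there were more best common ancestors, merging V with the next one.
The article says that if there is a merge conflict while generating the virtual branch, Git just leaves the conflict markers where they are and continues.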
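The virtual-base idea can be sketched in a few lines of Python. This is a toy model, not git's code: the parents map is transcribed from the diagram above, the file contents in the example are invented, merge_files merges whole files instead of running diff3 on lines, and the V1 name for the virtual commit is hypothetical.

```python
from itertools import count


def ancestors(parents, commit):
    """Every commit reachable from `commit`, including itself."""
    seen, stack = set(), [commit]
    while stack:
        c = stack.pop()
        if c not in seen:
            seen.add(c)
            stack.extend(parents[c])
    return seen


def best_common_ancestors(parents, x, y):
    """Common ancestors that are not an ancestor of any other common ancestor."""
    common = ancestors(parents, x) & ancestors(parents, y)
    return {c for c in common
            if not any(c != d and c in ancestors(parents, d) for d in common)}


def merge_files(base, ours, theirs):
    """Whole-file 3-way merge of {path: content} snapshots (stand-in for diff3)."""
    merged = {}
    for path in sorted(set(base) | set(ours) | set(theirs)):
        b, o, t = base.get(path), ours.get(path), theirs.get(path)
        if o == t or t == b:
            result = o      # both sides agree, or only our side changed
        elif o == b:
            result = t      # only their side changed
        else:               # both changed: leave markers in place and carry on
            result = f"<<<<<<<\n{o or ''}=======\n{t or ''}>>>>>>>\n"
        if result is not None:  # None means the file does not exist
            merged[path] = result
    return merged


def merge_recursive(parents, trees, x, y, ids=None):
    """Merge x and y, first folding multiple best bases into one virtual base."""
    ids = ids if ids is not None else count(1)
    bases = sorted(best_common_ancestors(parents, x, y))
    base = bases[0]  # assumes the two histories are related
    for other in bases[1:]:
        virtual = f"V{next(ids)}"  # hypothetical name for the virtual commit
        trees[virtual] = merge_recursive(parents, trees, base, other, ids)
        parents[virtual] = [base, other]
        base = virtual
    return merge_files(trees[base], trees[x], trees[y])


if __name__ == "__main__":
    parents = {"A": [], "B": ["A"], "C": ["B"], "D": ["B"],
               "E": ["D", "C"], "F": ["C", "D"]}
    shared = {"f": "1\n", "g": "b\n"}
    trees = {"A": {"f": "1\n"}, "B": dict(shared),
             "C": {**shared, "h": "c\n"}, "D": {**shared, "i": "d\n"},
             "E": {**shared, "h": "c\n", "i": "d\n", "e": "e\n"},
             "F": {**shared, "h": "c\n", "i": "d\n", "x": "f\n"}}
    print(sorted(best_common_ancestors(parents, "E", "F")))
    print(merge_recursive(parents, trees, "E", "F"))
```

For the criss-cross history in the diagram, the best common ancestors of E and F come out as C and D, and the sketch records a virtual commit whose parents are C and D before doing the final 3-way merge.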
What happens when I merge multiple branches at once?
As @Nevik Rehnel explained, it depends on the strategy; the MERGE STRATEGIES section of man git-merge explains it well.
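Only octopus and ours support merging multiple branches at once (theirs is an -X option of recursive rather than a strategy); recursive, for example, does not.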
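octopus refuses to merge if there would be conflicts, and ours is a trivial merge, so there can be no conflicts.
Those commands generate a new commit that has more than 2 parents.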
I did one merge -s octopus on Git 1.8.5 without conflicts to see how it goes.
Initial state:
   +--B
   |
A--+--C
   |
   +--D
Action:
git checkout B
git merge -s octopus C D
New state:
   +--B--+
   |     |
A--+--C--+--E
   |     |
   +--D--+
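As expected, E has 3 parents.
TODO: how exactly octopus operates on modifications to a single file. Recursive two-by-two 3-way merges?
How does git perform when there is no common base for merging branches?
@Torek mentions that since 2.9, the merge fails without --allow-unrelated-histories.
I tried it out empirically on Git 1.8.5: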
git init
printf 'a\nc\n' > a
git add .
git commit -m a
git checkout --orphan b
printf 'a\nb\nc\n' > a
git add .
git commit -m b
git merge master
a contains:
a
<<<<<<< ours
b
=======
>>>>>>> theirs
c
Then:
git checkout --conflict=diff3 -- .
a contains:
<<<<<<< ours
a
b
c
||||||| base
=======
a
c
>>>>>>> theirs
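Interpretation:
The base is empty (nothing appears between ||||||| base and =======), so both sides look like additions of the whole file. In a 3-way merge with a\nc\n as the base, the same change would be resolved as a single-line addition.
Answer 2 (score: 5)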
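I'm interested too. I don't know the answer, but...
A complex system that works is invariably found to have evolved from a simple system that worked
I think git's merging is highly sophisticated and will be very difficult to understand, but one way to approach it is through its precursors, focusing on the heart of your concern. That is, given two files that don't have a common ancestor, how does git merge work out how to merge them, and where the conflicts are?
Let's try to find some precursors. From git help merge-file: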
git merge-file is designed to be a minimal clone of RCS merge; that is,
it implements all of RCS merge's functionality which is needed by
git(1).
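From Wikipedia: http://en.wikipedia.org/wiki/Git_%28software%29 -> http://en.wikipedia.org/wiki/Three-way_merge#Three-way_merge -> http://en.wikipedia.org/wiki/Diff3 -> http://www.cis.upenn.edu/~bcpierce/papers/diff3-short.pdf
That last link is a PDF of a paper describing the diff3 algorithm in detail. A Google PDF-viewer version is also available. It's only 12 pages long, and the algorithm itself takes only a couple of pages, but it is a full-on mathematical treatment. This might seem a bit too formal, but if you want to understand git's merge, you'll need to understand the simpler version first. I haven't checked yet, but with a name like diff3, you'll probably also need to understand diff (which uses a longest common subsequence algorithm). However, there may be a more intuitive explanation of diff3 out there, if you have a google...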
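Now, I just did an experiment comparing diff3 and git merge-file. They take the same three input files, version1 oldversion version2, and mark conflicts the same way, with <<<<<<< version1, =======, >>>>>>> version2 (diff3 also has ||||||| oldversion), showing their common heritage.
I used an empty file for oldversion, and near-identical files for version1 and version2, with just one extra line added to version2.
Result: git merge-file identified the single changed line as the conflict, but diff3 treated the whole two files as a conflict. Thus, sophisticated as diff3 is, git's merge is even more sophisticated, even for this simplest of cases.
Here are the actual results (I used @twalberg's answer for the text). Note the options needed (see the respective manpages).
$ git merge-file -p fun1.txt fun0.txt fun2.txt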
You might be best off looking for a description of a 3-way merge algorithm. A
high-level description would go something like this:
Find a suitable merge base B - a version of the file that is an ancestor of
both of the new versions (X and Y), and usually the most recent such base
(although there are cases where it will have to go back further, which is one
of the features of gits default recursive merge) Perform diffs of X with B and
Y with B. Walk through the change blocks identified in the two diffs. If both
sides introduce the same change in the same spot, accept either one; if one
introduces a change and the other leaves that region alone, introduce the
change in the final; if both introduce changes in a spot, but they don't match,
mark a conflict to be resolved manually.
<<<<<<< fun1.txt
=======
THIS IS A BIT DIFFERENT
>>>>>>> fun2.txt
The full algorithm deals with this in a lot more detail, and even has some
documentation (/usr/share/doc/git-doc/technical/trivial-merge.txt for one,
along with the git help XXX pages, where XXX is one of merge-base, merge-file,
merge, merge-one-file and possibly a few others). If that's not deep enough,
there's always source code...
$ diff3 -m fun1.txt fun0.txt fun2.txt
<<<<<<< fun1.txt
You might be best off looking for a description of a 3-way merge algorithm. A
high-level description would go something like this:
Find a suitable merge base B - a version of the file that is an ancestor of
both of the new versions (X and Y), and usually the most recent such base
(although there are cases where it will have to go back further, which is one
of the features of gits default recursive merge) Perform diffs of X with B and
Y with B. Walk through the change blocks identified in the two diffs. If both
sides introduce the same change in the same spot, accept either one; if one
introduces a change and the other leaves that region alone, introduce the
change in the final; if both introduce changes in a spot, but they don't match,
mark a conflict to be resolved manually.
The full algorithm deals with this in a lot more detail, and even has some
documentation (/usr/share/doc/git-doc/technical/trivial-merge.txt for one,
along with the git help XXX pages, where XXX is one of merge-base, merge-file,
merge, merge-one-file and possibly a few others). If that's not deep enough,
there's always source code...
||||||| fun0.txt
=======
You might be best off looking for a description of a 3-way merge algorithm. A
high-level description would go something like this:
Find a suitable merge base B - a version of the file that is an ancestor of
both of the new versions (X and Y), and usually the most recent such base
(although there are cases where it will have to go back further, which is one
of the features of gits default recursive merge) Perform diffs of X with B and
Y with B. Walk through the change blocks identified in the two diffs. If both
sides introduce the same change in the same spot, accept either one; if one
introduces a change and the other leaves that region alone, introduce the
change in the final; if both introduce changes in a spot, but they don't match,
mark a conflict to be resolved manually.
THIS IS A BIT DIFFERENT
The full algorithm deals with this in a lot more detail, and even has some
documentation (/usr/share/doc/git-doc/technical/trivial-merge.txt for one,
along with the git help XXX pages, where XXX is one of merge-base, merge-file,
merge, merge-one-file and possibly a few others). If that's not deep enough,
there's always source code...
>>>>>>> fun2.txt
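If you are truly interested in this, it's a bit of a rabbit hole. To me, it seems as deep as regular expressions, the longest common subsequence algorithm of diff, context-free grammars, or relational algebra. If you want to get to the bottom of it, I think you can, but it will take some determined study.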
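One plausible reading of the experiment above, offered as an assumption rather than something this answer established: after a plain 3-way merge flags a region as conflicting, the two sides can be diffed against each other so that lines they agree on drop out of the conflict. The Python sketch below shows that refinement step with difflib; refine_conflict and the ours/theirs labels are names I made up, and git's real logic lives in its own xdiff code.

```python
from difflib import SequenceMatcher


def refine_conflict(ours, theirs):
    """Shrink one conflicting region by keeping lines both sides share.

    ours and theirs are the line lists of a region that a plain
    3-way merge marked as conflicting in full.
    """
    out = []
    matcher = SequenceMatcher(None, ours, theirs, autojunk=False)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            out += ours[i1:i2]  # both sides agree: no conflict here
        else:
            out += ["<<<<<<< ours", *ours[i1:i2],
                    "=======", *theirs[j1:j2], ">>>>>>> theirs"]
    return out


if __name__ == "__main__":
    # The two files from the unrelated-histories experiment in answer 1.
    for line in refine_conflict(["a", "b", "c"], ["a", "c"]):
        print(line)
```

Fed the a/b/c and a/c files from the empty-base experiment in answer 1, this prints a, a conflict containing only b on one side, then c, which has the same shape as the narrowed conflict git produced there.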
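Answer 3 (score: 2)
Here is the original implementation:
http://git.kaarsemaker.net/git/blob/857f26d2f41e16170e48076758d974820af685ff/git-merge-recursive.py
Basically, you create a list of common ancestors for the two commits and then recursively merge them, either fast-forwarding them or creating virtual commits that get used as the basis of a three-way merge on the files.
Answer 4 (score: 1)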
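How does git detect the context of a particular non-conflicting change? How does git find out that there is a conflict in these exact lines?
If the same line has changed on both sides of the merge, it's a conflict; if not, the change from one side (if there is one) is accepted.
Which things does git auto-merge?
Changes that do not conflict (see above).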
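How does git perform when there are multiple common bases for merging branches?
By default git merge-base reports only one (a best common ancestor); when there are several, as in a criss-cross merge, git merge-base --all lists them all.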
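What happens when I merge multiple branches at once?
That depends on the merge strategy (only the octopus and ours strategies support merging more than two branches; theirs is an -X option, not a strategy).
What is the difference between merge strategies?
This is explained in the git merge manpage.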