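How does 'git merge' work in detail?

Time: 2013-02-19 15:35:14

Tags: git merge conflict

I want to know the exact algorithm (or something close to it) behind 'git merge'. Answers to at least these sub-questions would be helpful: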

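  • How does git detect the context of a particular non-conflicting change?
  • How does git find out that there is a conflict in these exact lines?
  • Which things does git auto-merge?
  • How does git perform when there is no common base for merging branches?
  • How does git perform when there are multiple common bases for merging branches?
  • What happens when I merge multiple branches at once?
  • What is the difference between merge strategies?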

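But a description of the whole algorithm would be much better.

5 answers:

Answer 0: (score: 40)

You might be best off looking for a description of a 3-way merge algorithm. A high-level description would go something like this: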

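  1. Find a suitable merge base B: a version of the file that is an ancestor of both of the new versions (X and Y), and usually the most recent such base (although there are cases where it will have to go back further, which is one of the features of git's default recursive merge).
  2. Perform diffs of X with B and Y with B.
  3. Walk through the change blocks identified in the two diffs. If both sides introduce the same change in the same spot, accept either one; if one introduces a change and the other leaves that region alone, introduce the change in the final result; if both introduce changes in a spot, but they don't match, mark a conflict to be resolved manually.

The full algorithm deals with this in a lot more detail, and even has some documentation (/usr/share/doc/git-doc/technical/trivial-merge.txt for one, along with the git help XXX pages, where XXX is one of merge-base, merge-file, merge, merge-one-file and possibly a few others). If that's not deep enough, there's always the source code...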
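The three steps above can be sketched in a few lines of Python. This is only an illustration of the general diff3 idea, not git's actual code: it leans on Python's difflib rather than git's own diff machinery, and the names matched_lines and merge3 and the X/Y marker labels are made up for this example. Base lines that both sides left untouched act as sync points, and each chunk between two sync points is resolved by the rule from step 3.

```python
import difflib


def matched_lines(base, other):
    """Map the index of each base line kept unchanged to its index in other."""
    mapping = {}
    matcher = difflib.SequenceMatcher(None, base, other, autojunk=False)
    for base_start, other_start, size in matcher.get_matching_blocks():
        for offset in range(size):
            mapping[base_start + offset] = other_start + offset
    return mapping


def merge3(base, x, y):
    """Three-way merge of line lists; returns (merged_lines, conflict_count)."""
    in_x, in_y = matched_lines(base, x), matched_lines(base, y)
    merged, conflicts = [], 0
    b0 = x0 = y0 = 0  # start of the pending (possibly changed) chunk
    for b in range(len(base) + 1):
        if b < len(base):
            if b not in in_x or b not in in_y:
                continue  # changed on at least one side: keep growing the chunk
            x1, y1 = in_x[b], in_y[b]
        else:
            x1, y1 = len(x), len(y)  # sentinel: flush the trailing chunk
        base_chunk, x_chunk, y_chunk = base[b0:b], x[x0:x1], y[y0:y1]
        if x_chunk == base_chunk:
            merged += y_chunk  # only Y changed here (or nobody did)
        elif y_chunk == base_chunk or x_chunk == y_chunk:
            merged += x_chunk  # only X changed, or both made the same change
        else:
            conflicts += 1  # both changed the same spot differently
            merged += ["<<<<<<< X", *x_chunk, "=======", *y_chunk, ">>>>>>> Y"]
        if b < len(base):
            merged.append(base[b])  # the sync line itself is unchanged
        b0, x0, y0 = b + 1, x1 + 1, y1 + 1
    return merged, conflicts


if __name__ == "__main__":
    base = ["a", "b", "c", "d", "e"]
    lines, count = merge3(base, ["a", "B", "c", "d", "e"], ["a", "b", "c", "d", "E"])
    print(count, lines)
```

Note that in this sketch two edits that touch adjacent lines end up in the same chunk and are reported as one conflict, because there is no untouched base line between them.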

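Answer 1: (score: 8)

How does git perform when there are multiple common bases for merging branches?

This article was very helpful: http://codicesoftware.blogspot.com/2011/09/merge-recursive-strategy.html (there is also a part 2).

The recursive strategy applies diff3 recursively to generate a virtual branch, which is then used as the ancestor.

E.g.: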

(A)----(B)----(C)-----(F)
        |      |       |
        |      |   +---+
        |      |   |
        |      +-------+
        |          |   |
        |      +---+   |
        |      |       |
        +-----(D)-----(E)

Then:

git checkout E
git merge F

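There are 2 best common ancestors (common ancestors that are not ancestors of any other common ancestor), C and D. Git merges them into a new virtual branch V, and then uses V as the base.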

(A)----(B)----(C)--------(F)
        |      |          |
        |      |      +---+
        |      |      |
        |      +----------+
        |      |      |   |
        |      +--(V) |   |
        |          |  |   |
        |      +---+  |   |
        |      |      |   |
        |      +------+   |
        |      |          |
        +-----(D)--------(E)

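I suppose Git would just carry on the same way if there were more best common ancestors, merging V with the next one.

The article says that if there is a merge conflict while generating the virtual branch, Git just leaves the conflict markers where they are and continues.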
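To make "best common ancestors" concrete, here is a small Python sketch over the example graph above. It is a brute-force illustration, not how git computes merge bases internally; the PARENTS table is my transcription of the diagram (E has parents D and C, F has parents C and D), and the function names are invented for the example.

```python
# Parent links transcribed from the criss-cross diagram above (an assumption
# of this sketch: E merges D and C, F merges C and D).
PARENTS = {
    "A": [],
    "B": ["A"],
    "C": ["B"],
    "D": ["B"],
    "E": ["D", "C"],
    "F": ["C", "D"],
}


def ancestors(parents, commit):
    """Return the commit itself plus everything reachable via parent links."""
    seen, stack = set(), [commit]
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(parents.get(current, []))
    return seen


def merge_bases(parents, x, y):
    """Common ancestors of x and y that are not ancestors of another one."""
    common = ancestors(parents, x) & ancestors(parents, y)
    return {
        c for c in common
        if not any(c != d and c in ancestors(parents, d) for d in common)
    }


if __name__ == "__main__":
    print(sorted(merge_bases(PARENTS, "E", "F")))
```

For E and F this yields both C and D, which is exactly the situation in which the recursive strategy first merges C and D (their own base being B) into the virtual commit V.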

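What happens when I merge multiple branches at once?

As @Nevik Rehnel explained, it depends on the strategy; this is well explained in the MERGE STRATEGIES section of man git-merge.

Only octopus and ours / theirs support merging multiple branches at once; recursive, for example, does not.

octopus refuses to merge if there would be conflicts, and ours is a trivial merge, so there can be no conflicts.

Those commands generate a new commit that has more than 2 parents.

I did one merge -X octopus on Git 1.8.5 without conflicts to see how it goes.

Initial state: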

   +--B
   |
A--+--C
   |
   +--D

Action:

git checkout B
git merge -Xoctopus C D

New state:

   +--B--+
   |     |
A--+--C--+--E
   |     |
   +--D--+

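As expected, E has 3 parents.

TODO: how exactly octopus operates on modifications to a single file. Recursive two-by-two 3-way merges?

How does git perform when there is no common base for merging branches?

@Torek mentions that since 2.9, the merge fails without --allow-unrelated-histories.

I tried it out empirically on Git 1.8.5: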

git init
printf 'a\nc\n' > a
git add .
git commit -m a

git checkout --orphan b
printf 'a\nb\nc\n' > a
git add .
git commit -m b
git merge master

a contains:

a
<<<<<<< ours
b
=======
>>>>>>> theirs
c

Then:

git checkout --conflict=diff3 -- .

a contains:

<<<<<<< ours
a
b
c
||||||| base
=======
a
c
>>>>>>> theirs

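Interpretation:

  • the base is empty
  • when the base is empty, it is not possible to resolve any modification to a single file; only things like new file additions can be resolved. The above conflict would be resolved as a single line addition by a 3-way merge with base a\nc\n
  • I think that a 3-way merge without a base file is called a 2-way merge, which is just a diff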
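The "just a diff" reading can be sketched like this in Python. It is only a guess at the shape of the behaviour, written with Python's difflib and a made-up function name two_way_merge; it is not taken from git's source. Lines common to both sides are kept, and every differing region becomes a conflict, because without a base there is no way to tell which side made the change.

```python
import difflib


def two_way_merge(ours, theirs):
    """Merge two line lists with no base: shared lines kept, the rest conflicts."""
    merged = []
    matcher = difflib.SequenceMatcher(None, ours, theirs, autojunk=False)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            merged += ours[i1:i2]
        else:
            # No base to consult, so any difference is reported as a conflict.
            merged += ["<<<<<<< ours", *ours[i1:i2],
                       "=======", *theirs[j1:j2], ">>>>>>> theirs"]
    return merged


if __name__ == "__main__":
    print("\n".join(two_way_merge(["a", "b", "c"], ["a", "c"])))
```

Fed the two versions of a from the experiment, this sketch gives the same layout as the first conflict listing above: the shared lines a and c stay outside the markers and only b is in dispute.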

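Answer 2: (score: 5)

I'm interested too. I don't know the answer, but...

A complex system that works is invariably found to have evolved from a simple system that worked

I think git's merging is highly sophisticated and will be very difficult to understand - but one way to approach it is from its precursors, focusing on the heart of your concern. That is, given two files that don't have a common ancestor, how does git merge work out how to merge them, and where the conflicts are?

Let's try to find some precursors. From git help merge-file: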

git merge-file is designed to be a minimal clone of RCS merge; that is,
       it implements all of RCS merge's functionality which is needed by
       git(1).

From Wikipedia: http://en.wikipedia.org/wiki/Git_%28software%29 -> http://en.wikipedia.org/wiki/Three-way_merge#Three-way_merge -> http://en.wikipedia.org/wiki/Diff3 -> http://www.cis.upenn.edu/~bcpierce/papers/diff3-short.pdf

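That last link is a pdf of a paper describing the diff3 algorithm in detail. It's only 12 pages long, and the algorithm itself takes only a couple of pages - but it is a full-on mathematical treatment. That might seem a bit too formal, but if you want to understand git's merge, you'll need to understand the simpler version first. I haven't checked yet, but with a name like diff3, you'll probably also need to understand diff (which uses a longest common subsequence algorithm). However, there may be a more intuitive explanation of diff3 out there, if you have a google...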


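Now, I just did an experiment comparing diff3 and git merge-file. They take the same three input files, version1 oldversion version2, and mark conflicts the same way, with <<<<<<< version1, =======, >>>>>>> version2 (diff3 also has ||||||| oldversion), showing their common heritage.

I used an empty file for oldversion, and near-identical files for version1 and version2, with just one extra line added to version2.

Result: git merge-file identified the single changed line as the conflict, but diff3 treated the whole two files as a conflict. Thus, sophisticated as diff3 is, git's merge is even more sophisticated, even for this simplest of cases.

Here are the actual results (I used @twalberg's answer for the text). Note the options needed (see the respective manpages).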

$ git merge-file -p fun1.txt fun0.txt fun2.txt

You might be best off looking for a description of a 3-way merge algorithm. A
high-level description would go something like this:

    Find a suitable merge base B - a version of the file that is an ancestor of
both of the new versions (X and Y), and usually the most recent such base
(although there are cases where it will have to go back further, which is one
of the features of gits default recursive merge) Perform diffs of X with B and
Y with B.  Walk through the change blocks identified in the two diffs. If both
sides introduce the same change in the same spot, accept either one; if one
introduces a change and the other leaves that region alone, introduce the
change in the final; if both introduce changes in a spot, but they don't match,
mark a conflict to be resolved manually.
<<<<<<< fun1.txt
=======
THIS IS A BIT DIFFERENT
>>>>>>> fun2.txt

The full algorithm deals with this in a lot more detail, and even has some
documentation (/usr/share/doc/git-doc/technical/trivial-merge.txt for one,
along with the git help XXX pages, where XXX is one of merge-base, merge-file,
merge, merge-one-file and possibly a few others). If that's not deep enough,
there's always source code...

$ diff3 -m fun1.txt fun0.txt fun2.txt

<<<<<<< fun1.txt
You might be best off looking for a description of a 3-way merge algorithm. A
high-level description would go something like this:

    Find a suitable merge base B - a version of the file that is an ancestor of
both of the new versions (X and Y), and usually the most recent such base
(although there are cases where it will have to go back further, which is one
of the features of gits default recursive merge) Perform diffs of X with B and
Y with B.  Walk through the change blocks identified in the two diffs. If both
sides introduce the same change in the same spot, accept either one; if one
introduces a change and the other leaves that region alone, introduce the
change in the final; if both introduce changes in a spot, but they don't match,
mark a conflict to be resolved manually.

The full algorithm deals with this in a lot more detail, and even has some
documentation (/usr/share/doc/git-doc/technical/trivial-merge.txt for one,
along with the git help XXX pages, where XXX is one of merge-base, merge-file,
merge, merge-one-file and possibly a few others). If that's not deep enough,
there's always source code...
||||||| fun0.txt
=======
You might be best off looking for a description of a 3-way merge algorithm. A
high-level description would go something like this:

    Find a suitable merge base B - a version of the file that is an ancestor of
both of the new versions (X and Y), and usually the most recent such base
(although there are cases where it will have to go back further, which is one
of the features of gits default recursive merge) Perform diffs of X with B and
Y with B.  Walk through the change blocks identified in the two diffs. If both
sides introduce the same change in the same spot, accept either one; if one
introduces a change and the other leaves that region alone, introduce the
change in the final; if both introduce changes in a spot, but they don't match,
mark a conflict to be resolved manually.
THIS IS A BIT DIFFERENT

The full algorithm deals with this in a lot more detail, and even has some
documentation (/usr/share/doc/git-doc/technical/trivial-merge.txt for one,
along with the git help XXX pages, where XXX is one of merge-base, merge-file,
merge, merge-one-file and possibly a few others). If that's not deep enough,
there's always source code...
>>>>>>> fun2.txt

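If you are truly interested in this, it's a bit of a rabbit hole. To me, it seems as deep as regular expressions, the longest common subsequence algorithm of diff, context-free grammars, or relational algebra. If you want to get to the bottom of it, I think you can, but it will take some determined study.

Answer 3: (score: 2)

Here is the original implementation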

http://git.kaarsemaker.net/git/blob/857f26d2f41e16170e48076758d974820af685ff/git-merge-recursive.py

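Basically, you create a list of common ancestors for the two commits and then recursively merge them, either fast-forwarding them or creating virtual commits that are used as the basis of a three-way merge on the files.

Answer 4: (score: 1)

  

How does git detect the context of a particular non-conflicting change?   How does git find out that there is a conflict in these exact lines?

If the same line(s) have changed on both sides of the merge, it's a conflict; if they haven't, the change from one side (if there is one) is accepted.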

  

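Which things does git auto-merge?

Changes that are non-conflicting (see above)

  

How does git perform when there are multiple common bases for merging branches?

By the definition of a Git merge-base, there is only ever one (the latest common ancestor).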

  

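What happens when I merge multiple branches at once?

That depends on the merge strategy (only the octopus and ours / theirs strategies support merging more than two branches).

  

What is the difference between merge strategies?

This is explained in the git merge manpage.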