Question

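I am trying to merge sort a fairly large doubly linked list (DLL) in C on a key stored in its elements. The list has about 100,000 elements. Here is the structure of a DLL element: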

struct Pore {
    int ns;    /* voxel number */
    int radius;  /* effective radius of porosity surrounding a pore */
    struct Pore *next;
    struct Pore *prev;
};

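After searching for algorithms, I found that the most commonly used one consists of three functions: mergeSort, merge and split. I include them here. Please excuse the multiple printf calls in the merge function; I have been trying to debug a segmentation fault that occurs on the 4097592nd recursive entry into merge. Recur01 and Recur02 are global variables I defined to help with debugging.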

void mergeSort(struct Pore **head)
{
    Recur01++;

    /* Base case: 0 or 1 pore */
    if ((*head) == NULL) {
        printf("\nEnter mergeSort %ld, list head is NULL ",Recur01);
        fflush(stdout);
        return;
    }
    if ((*head)->next == NULL) {
        printf("\nEnter mergeSort %ld, list head next is NULL ",Recur01);
        fflush(stdout);
        return;
    }

    printf("\nEnter mergeSort %ld",Recur01);
    fflush(stdout);

    /* Split head into 'a' and 'b' sublists */
    struct Pore *a = *head;
    struct Pore *b = NULL;
    split(*head, &a, &b);

    /* Recursively sort the sublists */
    mergeSort(&a);
    mergeSort(&b);

    /* Merge the two sorted halves */
    *head = merge(a,b);

    printf("\nExit mergeSort %ld",Recur01);
    fflush(stdout);
    return;
}

void split(struct Pore *head, struct Pore **a, struct Pore **b)
{
    int count = 0;
    int lngth = 1;
    struct Pore *slow = head;
    struct Pore *fast = head->next;
    struct Pore *temp;

    temp = head;
    while (temp->next != NULL) {
        lngth++;
        /*
        printf("\n Length = %d",lngth);
        fflush(stdout);
        */
        if (temp->next) {
            temp = temp->next;
        }
    }

    while (fast != NULL) {
        printf("\nCount = %d",count);
        fflush(stdout);
        fast = fast->next;
        if (fast != NULL) {
            slow = slow->next;
            fast = fast->next;
        }
        count++;
    }

    printf("\nDone with while loop, final count = %d",count);
    fflush(stdout);

    *b = slow->next;
    slow->next = NULL;
    printf("\nExit split");
    fflush(stdout);
    return;
}

struct Pore *merge(struct Pore *a, struct Pore *b)
{
    Recur02++;
    if (Recur02 >= 4097591) {
        printf("\nEnter merge %ld",Recur02);
        fflush(stdout);
    }

    /** If first linked list is empty, return the second list */

    /* Base cases */
    if (a == NULL) return b;
    if (b == NULL) return a;

    if (Recur02 >= 4097591) {
        printf("\n Made it 01");
        fflush(stdout);
    }

    /* Pick the larger key */
    if (a->radius > b->radius) {
        if (Recur02 >= 4097591) {
            printf("\n Made it 02 a is bigger, Recur02 = %ld",Recur02);
            fflush(stdout);
            printf(" a->next->ns = %d",a->next->ns);
            fflush(stdout);
            printf(" b->ns = %d",b->ns);
            fflush(stdout);
        }
        a->next = merge(a->next,b);
        a->next->prev = a;
        a->prev = NULL;
        if (Recur02 >= 4097591) {
            printf("\nExit merge a %ld",Recur02);
            fflush(stdout);
        }
        return a;
    } else {
        if (Recur02 >= 4097591) {
            printf("\n Made it 02 b is bigger, Recur02 = %ld",Recur02);
            fflush(stdout);
            printf(" b->next->ns = %d",b->next->ns);
            fflush(stdout);
            printf(" a->ns = %d",a->ns);
            fflush(stdout);
        }
        b->next = merge(a,b->next);
        b->next->prev = b;
        b->prev = NULL;
        if (Recur02 >= 4097591) {
            printf("\nExit merge b %ld",Recur02);
            fflush(stdout);
        }
        return b;
    }
}

As I said, the code runs fine until the 4097592nd entry into merge. I put a printf right before the function call and another immediately after entering the function. I also print the keys of the elements passed as function arguments, and they look fine too. I am not sure what else to try to get to the bottom of this. Here are the last few lines of output:

Exit mergeSort 529095
Exit mergeSort 529095
Enter merge 4097591
    Made it 01
    Made it 02 a is bigger, Recur02 = 4097591      a->next->ns = 156692      b->ns = 20
Enter merge 4097591
Enter merge 4097592
    Made it 01
    Made it 02 a is bigger, Recur02 = 4097592      a->next->ns = 156693      b->ns = 20

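That is the last line flushed from the buffer before the segmentation fault. I have run out of ideas for debugging this, so any help would be appreciated.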

Answer 1

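The segmentation fault is caused by the recursive merge, which calls itself once for every node merged, so its recursion depth grows with the length of the lists being merged. The top-down main code is fine, since its stack space complexity is O(log2(n)), but the merge function needs to be iterative.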
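To make the point about an iterative merge concrete, here is a minimal sketch of a loop-based replacement for the question's merge. The name merge_iter and the stack-allocated sentinel node are illustrative choices, not part of the original code. It keeps the question's descending order on radius, but on equal keys it takes the node from a first (the question's code takes b), which is an assumption made here to keep the sort stable.

```c
#include <stddef.h>

struct Pore {
    int ns;              /* voxel number */
    int radius;          /* effective radius of porosity surrounding a pore */
    struct Pore *next;
    struct Pore *prev;
};

/* Iteratively merge two lists that are each sorted by descending radius.
   Stack usage is constant, no matter how long the lists are.
   merge_iter is a hypothetical name used for this sketch. */
struct Pore *merge_iter(struct Pore *a, struct Pore *b)
{
    struct Pore dummy;              /* sentinel; never part of the result */
    struct Pore *tail = &dummy;
    struct Pore *rest;

    dummy.next = NULL;
    while (a != NULL && b != NULL) {
        /* Pick the larger key; ties go to 'a' (assumption: stable order) */
        struct Pore **pick = (a->radius >= b->radius) ? &a : &b;
        struct Pore *node = *pick;

        *pick = node->next;         /* advance the list we took from */
        tail->next = node;          /* append the node to the output */
        node->prev = tail;
        tail = node;
    }

    /* One list is exhausted: append whatever is left of the other */
    rest = (a != NULL) ? a : b;
    tail->next = rest;
    if (rest != NULL)
        rest->prev = tail;

    /* The real head must not point back at the sentinel */
    if (dummy.next != NULL)
        dummy.next->prev = NULL;
    return dummy.next;
}
```

With this in place, mergeSort can call merge_iter(a, b) instead of merge(a, b); the remaining recursion in mergeSort is only about log2(n) levels deep.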

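As for "most commonly used": the original implementation of std::list::sort() was a bottom-up merge sort for linked lists that uses a small array (25 to 32 entries) of lists (or of pointers or iterators to the first nodes of lists).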

https://en.wikipedia.org/wiki/Merge_sort#Bottom-up_implementation_using_lists
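The bottom-up scheme described above can be sketched in C for the question's struct Pore roughly as follows. This is only an illustration under stated assumptions: the names merge_runs, sort_pores_bottom_up and NBINS are made up here, 32 bins is one value from the 25 to 32 range mentioned above, the order is descending radius as in the question, and the prev pointers are ignored during sorting and rebuilt in one final pass.

```c
#include <stddef.h>

struct Pore {
    int ns;              /* voxel number */
    int radius;          /* effective radius of porosity surrounding a pore */
    struct Pore *next;
    struct Pore *prev;
};

#define NBINS 32  /* bins[i] holds a sorted run of 2^i nodes, or NULL
                     (the last bin may grow beyond that) */

/* Iteratively merge two runs by descending radius, using next only.
   Ties take the node from 'a', which holds the earlier elements. */
static struct Pore *merge_runs(struct Pore *a, struct Pore *b)
{
    struct Pore *head = NULL;
    struct Pore **link = &head;     /* where the next node gets attached */

    while (a != NULL && b != NULL) {
        if (a->radius >= b->radius) {
            *link = a; link = &a->next; a = a->next;
        } else {
            *link = b; link = &b->next; b = b->next;
        }
    }
    *link = (a != NULL) ? a : b;
    return head;
}

/* Bottom-up merge sort: no recursion, no splitting of the list. */
void sort_pores_bottom_up(struct Pore **head)
{
    struct Pore *bins[NBINS] = { NULL };
    struct Pore *node = *head;
    struct Pore *result = NULL;
    struct Pore *prev = NULL;
    size_t i;

    while (node != NULL) {
        struct Pore *next = node->next;

        node->next = NULL;          /* detach a run of length 1 */
        /* Carry upward through occupied bins, like binary addition */
        for (i = 0; i < NBINS && bins[i] != NULL; i++) {
            node = merge_runs(bins[i], node);
            bins[i] = NULL;
        }
        if (i == NBINS)             /* all bins were full: reuse the last */
            i--;
        bins[i] = node;
        node = next;
    }

    /* Merge the remaining runs, smallest bins first */
    for (i = 0; i < NBINS; i++)
        result = merge_runs(bins[i], result);

    /* Rebuild the prev pointers in a single pass */
    for (node = result; node != NULL; node = node->next) {
        node->prev = prev;
        prev = node;
    }
    *head = result;
}
```

A call such as sort_pores_bottom_up(&head) would replace mergeSort(&head). Because every loop is iterative, stack usage does not depend on the list length.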

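Most implementations of std::list::sort were probably bottom-up until Visual Studio 2015, which switched from using an array of lists to using iterators (to avoid issues such as a missing default allocator and to provide exception safety). This was asked about in a prior thread, and at first I simply accepted the change, assuming that the switch to iterators required the change to top-down. The question came up again later, so I looked into it and determined that there was no need to switch to a top-down merge sort. My main regret is not having looked into this when the original question was asked. I did update my answer to show a standalone iterator-based bottom-up merge sort, as well as a replacement for std::list::sort in the VS2019 include file.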

`std::list<>::sort()` - why the sudden switch to top-down strategy?

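In most cases, provided there is enough memory, it is faster to copy the list to an array (or vector), sort the array, and create a new sorted list. If the nodes of a large linked list are randomly scattered in memory, nearly every node access translates into a cache miss. Once the list has been moved to an array, a merge sort accesses the runs in the array sequentially, which is much more cache friendly. This is how Java's native sort for linked lists is implemented, although that is partly because a common Collections.sort() is used for multiple container types, including linked lists, whereas the C++ standard library's std::list is an independent container type with its own list-specific member functions.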
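As a rough illustration of the copy-to-array approach for the question's list, the sketch below copies each key together with a pointer to its node into a temporary array, sorts the array, and relinks the nodes in sorted order. The names Entry, cmp_entry_desc and sort_pores_via_array are hypothetical. Note two assumptions: it uses the standard qsort, which is not guaranteed to be a merge sort or to be stable, and it relinks the existing nodes rather than allocating a new list.

```c
#include <stdlib.h>

struct Pore {
    int ns;              /* voxel number */
    int radius;          /* effective radius of porosity surrounding a pore */
    struct Pore *next;
    struct Pore *prev;
};

/* Key plus node pointer, so comparisons only touch the array */
struct Entry {
    int radius;
    struct Pore *node;
};

/* qsort comparator: larger radius first (descending order) */
static int cmp_entry_desc(const void *x, const void *y)
{
    const struct Entry *a = x;
    const struct Entry *b = y;

    return (a->radius < b->radius) - (a->radius > b->radius);
}

/* Sort the list through a temporary array.
   Returns 0 on success, or -1 if the array cannot be allocated,
   in which case the list is left unchanged. */
int sort_pores_via_array(struct Pore **head)
{
    size_t n = 0, i;
    struct Pore *p;
    struct Entry *arr;

    for (p = *head; p != NULL; p = p->next)
        n++;
    if (n < 2)
        return 0;                   /* nothing to sort */

    arr = malloc(n * sizeof *arr);
    if (arr == NULL)
        return -1;

    /* Copy keys and node pointers into contiguous memory */
    for (i = 0, p = *head; p != NULL; p = p->next, i++) {
        arr[i].radius = p->radius;
        arr[i].node = p;
    }

    qsort(arr, n, sizeof *arr, cmp_entry_desc);

    /* Relink the nodes in sorted order */
    for (i = 0; i < n; i++) {
        arr[i].node->prev = (i > 0) ? arr[i - 1].node : NULL;
        arr[i].node->next = (i + 1 < n) ? arr[i + 1].node : NULL;
    }
    *head = arr[0].node;
    free(arr);
    return 0;
}
```

For about 100,000 elements the temporary array is small (on the order of a megabyte or two on typical 64-bit systems), so the extra memory is unlikely to be a concern here.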

Answer 2

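@VladfromMoscow suggested using a non-recursive sorting algorithm, because recursive ones are not suitable for long lists. So I tried adapting the iterative version of merge sort here to my doubly linked list. It works like a charm. At least in this case, it seems the recursion really was too deep for such a long list.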

Segmentation fault in merge sort algorithm in C

2 answers: