Question

在类似Lisp的系统中，缺点是将元素预置到列表的常规方法。附加到列表的函数要贵得多，因为它们将列表移到最后，然后使用对附加项的引用替换最后的null。 IOW（pseudoLisp）

(prepend list item) = (cons item list) = cheap!
(append list item) = (cond ((null? list) (cons item null))
                           (#t (cons (car list (append (cdr list) item)))))

问题是Mathemtica的情况是否相似？在大多数情况下，Mathematica的列表似乎像lisp的列表一样单独链接，如果是这样，我们可以假设Append [list，item]比Prepend [list，item]贵得多。但是，我无法在Mathematica文档中找到任何解决此问题的内容。如果Mathematica的列表是双重链接或更巧妙地实现，比如说，在堆中或只是保持指向最后一个，那么插入可能具有完全不同的性能配置文件。

任何建议或经验都将不胜感激。

Answer 1

Mathematica的列表不像Common Lisp那样是单链表。最好将mathematica列表视为数组或矢量类结构。插入速度为O（n），但检索速度不变。

查看page的Data structures and Efficient Algorithms in Mathematica，其中详细介绍了mathematica列表。

另外，请查看链接列表上的this Stack Overflow问题及其在mathematica中的表现。

Answer 2

如上所述，由于Mathematica列表是作为数组实现的，因此Append和Prepend等操作会在每次添加元素时复制列表。一种更有效的方法是预先分配一个列表并填充它，但是我的下面的实验并没有像我预期的那样显示出很大的差异。显然，更好的是链表方法，我将不得不进行调查。

Needs["PlotLegends`"]
test[n_] := Module[{startlist = Range[1000]},
   datalist = RandomReal[1, n*1000];
   appendlist = startlist;
   appendtime = 
    First[AbsoluteTiming[AppendTo[appendlist, #] & /@ datalist]];
   preallocatedlist = Join[startlist, Table[Null, {Length[datalist]}]];
   count = -1;
   preallocatedtime = 
    First[AbsoluteTiming[
      Do[preallocatedlist[[count]] = datalist[[count]]; 
       count--, {Length[datalist]}]]];
   {{n, appendtime}, {n, preallocatedtime}}];
results = test[#] & /@ Range[26];
ListLinePlot[Transpose[results], Filling -> Axis, 
 PlotLegend -> {"Appending", "Preallocating"}, 
 LegendPosition -> {1, 0}]

比较AppendTo与预分配的时序图。（运行时间：82秒）

timing chart

修改

使用nixeagle的建议修改大大改善了预分配时间，即使用preallocatedlist = Join[startlist, ConstantArray[0, {Length[datalist]}]];

第二次修改

{{{startlist}，data1}，data2}形式的链接列表效果更好，并且具有很大的优势，即不需要事先知道大小，就像预分配一样。

Needs["PlotLegends`"] test[n_] := Module[{startlist = Range[1000]}, datalist = RandomReal[1, n*1000]; linkinglist = startlist; linkedlisttime = First[AbsoluteTiming[ Do[linkinglist = {linkinglist, datalist[[i]]}, {i, Length[datalist]}]; linkedlist = Flatten[linkinglist];]]; preallocatedlist = Join[startlist, ConstantArray[0, {Length[datalist]}]]; count = -1; preallocatedtime = First[AbsoluteTiming[ Do[preallocatedlist[[count]] = datalist[[count]]; count--, {Length[datalist]}]]]; {{n, preallocatedtime}, {n, linkedlisttime}}]; results = test[#] & /@ Range[26]; ListLinePlot[Transpose[results], Filling -> Axis, PlotLegend -> {"Preallocating", "Linked-List"}, LegendPosition -> {1, 0}]

链表与预分配的时序比较。（运行时间：6秒）

Answer 3

作为一个小小的补充，这里是M -

中“AppendTo”的有效替代品

myBag = Internal`Bag[]
Do[Internal`StuffBag[myBag, i], {i, 10}]
Internal`BagPart[myBag, All]

Answer 4

如果您知道结果将包含多少元素，并且您是否可以计算元素，则不需要整个Append，AppendTo，Linked-List等。在Chris的速度测试中，预分配只能起作用，因为他事先知道元素的数量。对datelist的访问操作代表当前元素的虚拟计算。

如果情况是这样的话，我绝不会使用这种方法。一个简单的表与Join相结合是更快的地狱。让我重用Chris的代码：我将预分配添加到时间测量中，因为当使用Append或链表时，也会测量内存分配。此外，我真的使用结果列表并检查它们是否相等，因为聪明的解释器可能会识别简单，无用的命令并优化这些。

Needs["PlotLegends`"]
test[n_] := Module[{
    startlist = Range[1000],
    datalist, joinResult, linkedResult, linkinglist, linkedlist, 
    preallocatedlist, linkedlisttime, preallocatedtime, count, 
    joinTime, preallocResult},


   datalist = RandomReal[1, n*1000];
   linkinglist = startlist;
   {linkedlisttime, linkedResult} = 
    AbsoluteTiming[
     Do[linkinglist = {linkinglist, datalist[[i]]}, {i, 
       Length[datalist]}];
     linkedlist = Flatten[linkinglist]
     ];

   count = -1;
   preallocatedtime = First@AbsoluteTiming[
      (preallocatedlist = 
        Join[startlist, ConstantArray[0, {Length[datalist]}]];
       Do[preallocatedlist[[count]] = datalist[[count]];
        count--, {Length[datalist]}]
       )
      ];

   {joinTime, joinResult} =
    AbsoluteTiming[
     Join[startlist, 
      Table[datalist[[i]], {i, 1, Length[datalist]}]]];
   PrintTemporary[
    Equal @@@ Tuples[{linkedResult, preallocatedlist, joinResult}, 2]];
   {preallocatedtime, linkedlisttime, joinTime}];

results = test[#] & /@ Range[40];
ListLinePlot[Transpose[results], PlotStyle -> {Black, Gray, Red}, 
 PlotLegend -> {"Prealloc", "Linked", "Joined"}, 
 LegendPosition -> {1, 0}]

enter image description here

在我看来，有趣的情况是，当你事先不知道元素的数量时，你必须决定是否必须附加/前置某些内容。在那些情况下，Reap []和Sow []可能值得一看。一般来说，我会说，AppendTo是邪恶的，在使用之前，请看看替代方案：

n = 10.^5 - 1;
res1 = {};
t1 = First@AbsoluteTiming@Table[With[{y = Sin[x]},
      If[y > 0, AppendTo[res1, y]]], {x, 0, 2 Pi, 2 Pi/n}
     ];

{t2, res2} = AbsoluteTiming[With[{r = Release@Table[
        With[{y = Sin[x]},
         If[y > 0, y, Hold@Sequence[]]], {x, 0, 2 Pi, 2 Pi/n}]},
    r]];

{t3, res3} = AbsoluteTiming[Flatten@Table[
     With[{y = Sin[x]},
      If[y > 0, y, {}]], {x, 0, 2 Pi, 2 Pi/n}]];

{t4, res4} = AbsoluteTiming[First@Last@Reap@Table[With[{y = Sin[x]},
        If[y > 0, Sow[y]]], {x, 0, 2 Pi, 2 Pi/n}]];

{res1 == res2, res2 == res3, res3 == res4}
{t1, t2, t3, t4}

给出{5.151575,0.250336,0.128624,0.148084}。构造

Flatten@Table[ With[{y = Sin[x]}, If[y > 0, y, {}]], ...]

幸运的是真的可读且快速。

备注

小心在家里尝试这最后一个例子。在我的Ubuntu 64bit和Mma 8.0.4上，n = 10 ^ 5的AppendTo需要10GB的内存。 n = 10 ^ 6占用我所有的32GB内存来创建一个包含15MB数据的数组。有趣的。

Mathematica中的Prepend vs. Append perf

4 个答案:

备注