Question

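I have the following problem.

I am developing a stochastic simulator that samples configurations of a system at random and stores statistics on how many times each configuration has been visited at certain time instances. Roughly, the code works like this: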

f[_Integer][{_Integer..}] :=0
...
someplace later in the code, e.g.,
index = get index;
c = get random configuration (i.e. a tuple of integers, say a pair {n1, n2}); 
f[index][c] = f[index][c] + 1;
which tags that configuration c has occurred once more in the simulation at time instance index.

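After the code has finished, there is a list of definitions for f that looks something like this (I typed it by hand just to emphasize the most important parts):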

?f
f[1][{1, 2}] = 112
f[1][{3, 4}] = 114
f[2][{1, 6}] = 216
f[2][{2, 7}] = 227
...
f[index][someconfiguration] = some value
...
f[_Integer][{_Integer..}] :=0

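Note that the pattern-free definitions, which come first, can be rather sparse. There is also no way of knowing in advance which indices and configurations will be picked.

The problem is to efficiently extract the values for a desired index. For example, issuing something like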

result = ExtractConfigurationsAndOccurences[f, 2]

should give a list with the structure

result = {list1, list2}

where

list1 = {{1, 6}, {2, 7}} (* the list of configurations that occurred during the simulation*)
list2 = {216, 227} (* how many times each of them occurred *)

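The problem is that ExtractConfigurationsAndOccurences should be very fast. The only solution I could come up with is to use SubValues[f] (which gives the full list) and filter it with a Cases statement. I realize that this procedure should be avoided at all costs, since there will be exponentially many configurations (definitions) to test, which slows the code down considerably.

Is there a natural way to do this quickly in Mathematica?

I was hoping that Mathematica would treat f[2] as a single head with many down values, but DownValues[f[2]] returns nothing, and SubValues[f[2]] results in an error.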

Answer 1

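This is a complete rewrite of my previous answer. It turns out that in my previous attempts I overlooked a much simpler method based on a combination of packed arrays and sparse arrays. It is much faster and more memory-efficient than all the previous methods (at least in the range of sample sizes where I tested it), while only minimally changing the original SubValues-based method. Since the question was about the most efficient method, I will remove the other ones from the answer (they are quite a bit more complex and take up a lot of space; those who would like to see them can inspect past revisions of this answer).

The original `SubValues`-based approach

We start by introducing a function that generates test samples of configurations for us. Here it is: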

Clear[generateConfigurations];
generateConfigurations[maxIndex_Integer, maxConfX_Integer, maxConfY_Integer, 
  nconfs_Integer] :=
Transpose[{
  RandomInteger[{1, maxIndex}, nconfs],
  Transpose[{
     RandomInteger[{1, maxConfX}, nconfs],
     RandomInteger[{1, maxConfY}, nconfs]
  }]}];

We can generate a small sample to illustrate:

In[3]:= sample  = generateConfigurations[2,2,2,10]
Out[3]= {{2,{2,1}},{2,{1,1}},{1,{2,1}},{1,{1,2}},{1,{1,2}},
          {1,{2,1}},{2,{1,2}},{2,{2,2}},{1,{2,2}},{1,{2,1}}}

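Here we have only 2 indices, and configurations whose "x" and "y" numbers each vary only from 1 to 2; the sample contains 10 such configurations.

The following function will help us imitate the accumulation of frequencies for configurations, as we increment SubValues-based counters for repeatedly occurring ones: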

Clear[testAccumulate];
testAccumulate[ff_Symbol, data_] :=
  Module[{},
   ClearAll[ff];
   ff[_][_] = 0;
   Do[
     doSomeStuff;
     ff[#1][#2]++ & @@ elem;
     doSomeMoreStaff;
   , {elem, data}]];

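The doSomeStuff and doSomeMoreStaff symbols here stand for some code that may precede or follow the counting code. The data parameter should be a list of the form produced by generateConfigurations. For example: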

In[6]:= 
testAccumulate[ff,sample];
SubValues[ff]

Out[7]= {HoldPattern[ff[1][{1,2}]]:>2,HoldPattern[ff[1][{2,1}]]:>3,
   HoldPattern[ff[1][{2,2}]]:>1,HoldPattern[ff[2][{1,1}]]:>1,
   HoldPattern[ff[2][{1,2}]]:>1,HoldPattern[ff[2][{2,1}]]:>1,
   HoldPattern[ff[2][{2,2}]]:>1,HoldPattern[ff[_][_]]:>0}

The following function will extract the resulting data (indices, configurations, and their frequencies) from the list of SubValues:

Clear[getResultingData];
getResultingData[f_Symbol] :=
   Transpose[{#[[All, 1, 1, 0, 1]], #[[All, 1, 1, 1]], #[[All, 2]]}] &@
        Most@SubValues[f, Sort -> False];

For example:

In[10]:= result = getResultingData[ff]
Out[10]= {{2,{2,1},1},{2,{1,1},1},{1,{2,1},3},{1,{1,2},2},{2,{1,2},1},
{2,{2,2},1},{1,{2,2},1}}
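
The Part specification inside getResultingData is compact, so here is a small sketch of what each piece picks out of a single SubValues rule. The head g and the values are made up for illustration and are not part of the original answer.

```mathematica
(* One rule of the shape that SubValues returns; g is a hypothetical, undefined head *)
rule = HoldPattern[g[2][{1, 6}]] :> 216;

(* Path 1, 1, 0, 1: HoldPattern[...] -> g[2][{1, 6}] -> its head g[2] -> the index *)
idx = rule[[1, 1, 0, 1]]    (* 2 *)

(* Path 1, 1, 1: first argument of g[2][...], i.e. the configuration *)
conf = rule[[1, 1, 1]]      (* {1, 6} *)

(* Part 2 is the right-hand side of the rule, i.e. the stored count *)
freq = rule[[2]]            (* 216 *)
```

Applying the same three paths with All in the first position, as getResultingData does, extracts the three columns for every rule at once.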

To complete the data-processing cycle, here is a straightforward function that extracts the data for a fixed index, based on Select:

Clear[getResultsForFixedIndex];
getResultsForFixedIndex[data_, index_] := 
  If[# === {}, {}, Transpose[#]] &[
    Select[data, First@# == index &][[All, {2, 3}]]];

For our test example,

In[13]:= getResultsForFixedIndex[result,1]
Out[13]= {{{2,1},{1,2},{2,2}},{3,2,1}}

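This is presumably close to what @zorank tried in code.

A faster solution based on packed arrays and sparse arrays

As @zorank noted, this becomes slow for larger samples with more indices and configurations. We will now generate a large sample to illustrate that (note! This requires about 4-5 Gb of RAM, so you may want to reduce the number of configurations if this exceeds the available RAM):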

In[14]:= 
largeSample = generateConfigurations[20,500,500,5000000];
testAccumulate[ff,largeSample];//Timing

Out[15]= {31.89,Null}

We will now extract the full data from the SubValues of ff:

In[16]:= (largeres = getResultingData[ff]); // Timing
Out[16]= {10.844, Null}

This takes some time, but it only has to be done once. However, when we start extracting data for a fixed index, we see that it is quite slow:

In[24]:= getResultsForFixedIndex[largeres,10]//Short//Timing
Out[24]= {2.687,{{{196,26},{53,36},{360,43},{104,144},<<157674>>,{31,305},{240,291},
{256,38},{352,469}},{<<1>>}}}

The main idea we will use here to speed this up is to pack the individual lists inside largeres: those for indices, combinations, and frequencies. While the full list cannot be packed, those parts individually can:

In[18]:= Timing[
   subIndicesPacked = Developer`ToPackedArray[largeres[[All,1]]];
   subCombsPacked =  Developer`ToPackedArray[largeres[[All,2]]];
   subFreqsPacked =  Developer`ToPackedArray[largeres[[All,3]]];
]
Out[18]= {1.672,Null}

This also takes some time, but it is again a one-time operation.

The following functions will then be used to extract the results for a fixed index much more efficiently:

Clear[extractPositionFromSparseArray];
extractPositionFromSparseArray[HoldPattern[SparseArray[u___]]] := {u}[[4, 2, 2]]

Clear[getCombinationsAndFrequenciesForIndex];
getCombinationsAndFrequenciesForIndex[packedIndices_, packedCombs_, 
    packedFreqs_, index_Integer] :=
With[{positions = 
         extractPositionFromSparseArray[
               SparseArray[1 - Unitize[packedIndices - index]]]},
  {Extract[packedCombs, positions],Extract[packedFreqs, positions]}];

Now, we have:

In[25]:= getCombinationsAndFrequenciesForIndex[subIndicesPacked,subCombsPacked,subFreqsPacked,10]
   //Short//Timing
Out[25]= {0.094,{{{196,26},{53,36},{360,43},{104,144},<<157674>>,{31,305},{240,291},
{256,38},{352,469}},{<<1>>}}}

We get a 30-fold speed-up with respect to the naive Select-based approach.
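
To make the masking trick concrete, here is a toy-sized sketch with made-up data (the names indices, combs, and freqs are illustrative). It swaps in the "NonzeroPositions" property of SparseArray to obtain the positions, rather than the answer's extractPositionFromSparseArray, which reads them from the internal layout of the SparseArray expression. No claim is made about the relative speed of the two.

```mathematica
(* Hypothetical toy data: one index, one configuration, one count per entry *)
indices = {2, 1, 2, 3, 1};
combs   = {{1, 6}, {3, 4}, {2, 7}, {5, 5}, {1, 2}};
freqs   = {216, 114, 227, 7, 112};

(* indices - 2 is zero exactly where the index equals 2; Unitize maps nonzero to 1 *)
mask = 1 - Unitize[indices - 2]                      (* {1, 0, 1, 0, 0} *)

(* positions of the nonzero mask entries, in the form Extract expects *)
positions = SparseArray[mask]["NonzeroPositions"]    (* {{1}, {3}} *)

result = {Extract[combs, positions], Extract[freqs, positions]}
(* {{{1, 6}, {2, 7}}, {216, 227}} *)
```

All steps are vectorized operations on whole lists, which is why packing the three columns first pays off.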

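Some notes on complexity

Note that the second solution is faster because it uses optimized data structures, but its complexity is the same as that of the Select-based one: linear in the length of the total list of unique combinations for all indices. Therefore, in theory, the previously discussed solutions based on nested hash tables and the like may be asymptotically better. The problem is that, in practice, we will probably hit memory limitations long before that. For the 10 million configurations sample, the above code was still 2-3 times faster than the fastest solution I posted before.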
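
For readers curious what a hash-table-style layout could look like, the following is a minimal sketch using Association, GroupBy, and Counts. These functions were introduced in Mathematica 10, so this is an assumption about a newer version and not one of the earlier solutions this answer refers to. It uses the small 10-element sample shown earlier, and the name grouped is illustrative.

```mathematica
(* The small sample from above: {index, configuration} pairs *)
sample = {{2, {2, 1}}, {2, {1, 1}}, {1, {2, 1}}, {1, {1, 2}}, {1, {1, 2}},
          {1, {2, 1}}, {2, {1, 2}}, {2, {2, 2}}, {1, {2, 2}}, {1, {2, 1}}};

(* Outer association keyed by index; inner association maps configuration -> count *)
grouped = GroupBy[sample, First -> Last, Counts];

(* Looking up one index is a single key lookup, independent of the other indices *)
byIndex = {Keys[#], Values[#]} &[grouped[1]]
(* {{{2, 1}, {1, 2}, {2, 2}}, {3, 2, 1}} *)
```

This builds the table in one batch from a finished sample. An incremental simulation would need to update the inner associations as it runs, which this sketch does not show.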

EDIT

The following modification:

Clear[getCombinationsAndFrequenciesForIndex];
getCombinationsAndFrequenciesForIndex[packedIndices_, packedCombs_, 
    packedFreqs_, index_Integer] :=
With[{positions = 
         extractPositionFromSparseArray[
               SparseArray[Unitize[packedIndices - index], Automatic, 1]]},
  {Extract[packedCombs, positions], Extract[packedFreqs, positions]}];

makes the code twice as fast again. Moreover, for sparser indices (say, calling the sample-generation function with parameters like generateConfigurations[2000, 500, 500, 5000000]), the speed-up with respect to the Select-based function is about 100 times.

Answer 2

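I'd probably use SparseArrays here (see the update below), but if you insist on using functions and *Values to store and retrieve values, one approach would be to replace the first part (f[2] etc.) with a symbol that you create on the fly, like: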

Table[Symbol["f" <> IntegerString[i, 10, 3]], {i, 11}]
(* ==> {f001, f002, f003, f004, f005, f006, f007, f008, f009, f010, f011} *)

Symbol["f" <> IntegerString[56, 10, 3]]
(* ==> f056 *)

Symbol["f" <> IntegerString[56, 10, 3]][{3, 4}] = 12;
Symbol["f" <> IntegerString[56, 10, 3]][{23, 18}] = 12;

Symbol["f" <> IntegerString[56, 10, 3]] // Evaluate // DownValues
(* ==> {HoldPattern[f056[{3, 4}]] :> 12, HoldPattern[f056[{23, 18}]] :> 12} *)

f056 // DownValues
(* ==> {HoldPattern[f056[{3, 4}]] :> 12, HoldPattern[f056[{23, 18}]] :> 12} *)

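I personally prefer Leonid's solution, as it is more elegant, but YMMV.

Update

At the OP's request, about using SparseArrays:
Large SparseArrays take up a fraction of the size of standard nested lists. We can make f a large (100,000 entries) sparse array of sparse arrays: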

f = SparseArray[{_} -> 0, 100000];
f // ByteCount
(* ==> 672 *)

(* initialize f with sparse arrays, takes a few seconds with f this large *)
Do[  f[[i]] = SparseArray[{_} -> 0, {100, 110}], {i,100000}] // Timing//First
(* ==> 18.923 *)

(* this takes about 2.5% of the memory that a normal array would take: *)
f // ByteCount
(* ==>  108000040 *)

ConstantArray[0, {100000, 100, 100}] // ByteCount
(* ==> 4000000176 *)

(* counting phase *)
f[[1]][[1, 2]]++;
f[[1]][[1, 2]]++;
f[[1]][[42, 64]]++;
f[[2]][[100, 11]]++;

(* reporting phase *)
f[[1]] // ArrayRules
f[[2]] // ArrayRules
f // ArrayRules

(* 
 ==>{{1, 2} -> 2, {42, 64} -> 1, {_, _} -> 0}
 ==>{{100, 11} -> 1, {_, _} -> 0}
 ==>{{1, 1, 2} -> 2, {1, 42, 64} -> 1, {2, 100, 11} ->  1, {_, _, _} -> 0}
*)

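As you can see, ArrayRules produces a nice list of configurations and their counts. This can be done for each f[i] separately or for the whole bunch together (last line).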
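
To get from ArrayRules to the {configurations, counts} structure asked for in the question, one can drop the trailing default rule and split the remaining rules. This is a minimal sketch on a small stand-alone sparse array; the names s, rules, and configsAndCounts are illustrative.

```mathematica
(* A small sparse array holding the same counts as f[[1]] in the example above *)
s = SparseArray[{{1, 2} -> 2, {42, 64} -> 1}, {100, 110}];

(* ArrayRules ends with the default rule {_, _} -> 0; Most drops it *)
rules = Most[ArrayRules[s]];

(* Left-hand sides are the configurations, right-hand sides are the counts *)
configsAndCounts = {rules[[All, 1]], rules[[All, 2]]}
(* {{{1, 2}, {42, 64}}, {2, 1}} *)
```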

Answer 3

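In some cases (depending on the performance needed for generating the values), the following simple solution using an auxiliary list (f[i,0]) may be useful: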

f[_Integer][{_Integer ..}] := 0;
f[_Integer, 0] := Sequence @@ {};

Table[
  r = RandomInteger[1000, 2];
  f[h = RandomInteger[100000]][r] = RandomInteger[10];
  f[h, 0] = Union[f[h, 0], {r}];
  , {i, 10^6}];

ExtractConfigurationsAndOccurences[f_, i_] := {f[i, 0], f[i][#] & /@ f[i, 0]};

Timing@ExtractConfigurationsAndOccurences[f, 10]

Out[252]= {4.05231*10^-15, {{{172, 244}, {206, 115}, {277, 861}, {299,
 862}, {316, 194}, {361, 164}, {362, 830}, {451, 306}, {614, 
769}, {882, 159}}, {5, 2, 1, 5, 4, 10, 4, 4, 1, 8}}}

Answer 4

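Thanks a lot to everyone for the help provided. I have been thinking about everybody's input, and I believe that in the simulation setup the following is the optimal solution: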

SetAttributes[linkedList, HoldAllComplete];

temporarySymbols = linkedList[];

SetAttributes[bookmarkSymbol, Listable];

bookmarkSymbol[symbol_]:= 
   With[{old = temporarySymbols}, temporarySymbols= linkedList[old,symbol]];

registerConfiguration[index_]:=registerConfiguration[index]=
  Module[
   {
    cs = linkedList[],
    bookmarkConfiguration,
    accumulator
    },
    (* remember the symbols we generate so we can remove them later *)
   bookmarkSymbol[{cs,bookmarkConfiguration,accumulator}];
   getCs[index] := List @@ Flatten[cs, Infinity, linkedList];
   getCsAndFreqs[index] := {getCs[index],accumulator /@ getCs[index]};
   accumulator[_]=0;
   bookmarkConfiguration[c_]:=bookmarkConfiguration[c]=
     With[{oldCs=cs}, cs = linkedList[oldCs, c]];
   Function[c,
    bookmarkConfiguration[c];
    accumulator[c]++;
    ]
   ]

pattern = Verbatim[RuleDelayed][Verbatim[HoldPattern][HoldPattern[registerConfiguration [_Integer]]],_];

clearSimulationData :=
 Block[{symbols},
  DownValues[registerConfiguration]=DeleteCases[DownValues[registerConfiguration],pattern];
  symbols = List @@ Flatten[temporarySymbols, Infinity, linkedList];
  (*Print["symbols to purge: ", symbols];*)
  ClearAll /@ symbols;
  temporarySymbols = linkedList[];
  ]

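It is based on one of the solutions from Leonid's earlier post, augmented with belsairus' suggestion to keep an extra index of the configurations that have been processed. The previous approaches are adapted so that configurations can be registered and extracted more or less naturally using the same code. This kills two birds with one stone, since bookkeeping and retrieval are strongly interrelated.

This approach works better when one wants to add simulation data incrementally (all curves are normally noisy, so one has to add runs incrementally to obtain good plots). The sparse array approach works better when the data is generated in one go and then analysed, but I do not recall personally being in a situation where I had to do that.

Also, I was rather naive in thinking that data extraction and data generation could be treated separately. In this particular case it seems that both perspectives have to be kept in mind. I apologise for bluntly dismissing any previous suggestions in this direction (there were a few implicit ones).

There are some open, minor problems that I do not know how to handle. For example, when clearing the symbols I cannot clear heads like accumulator$164; I can only clear the sub values associated with them. I do not know why. Also, if With[{oldCs=cs}, cs = linkedList[oldCs, c]]; is changed to something like cs = linkedList[cs, c]; the configurations are not stored. I do not know why the second option does not work. But these minor problems are well-defined satellite issues that can be addressed in the future. By and large, the problem has been solved thanks to the generous help of everyone involved.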
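
A possible explanation for the second issue, offered as a sketch and not as a confirmed diagnosis of the code above: linkedList carries the HoldAllComplete attribute, so in cs = linkedList[cs, c] the inner cs is never evaluated and the old contents are not carried over. With substitutes the already evaluated old value into the body before the assignment runs. The names heldList, a, and b below are hypothetical stand-ins.

```mathematica
ClearAll[heldList, a, b];
SetAttributes[heldList, HoldAllComplete];

(* Direct self-reference: the inner a stays an unevaluated symbol *)
a = heldList[];
a = heldList[a, 1];
(* a is now heldList[a, 1]; the previous value heldList[] was never inserted *)

(* With injects the old value of b before the new assignment is made *)
b = heldList[];
With[{old = b}, b = heldList[old, 1]];
With[{old = b}, b = heldList[old, 2]];
(* b is now heldList[heldList[heldList[], 1], 2] *)

stored = List @@ Flatten[b, Infinity, heldList]    (* {1, 2} *)
```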

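Thanks again to everyone for the help.

Regards, Zoran

P.S. To give a feeling for what is going on, I am appending the code used for benchmarking. In short, the idea is to generate lists of configurations and simply Map through them by invoking registerConfiguration. This essentially simulates the data-generation process. Here is the code used for testing: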

fillSimulationData[sampleArg_] :=MapIndexed[registerConfiguration[#2[[1]]][#1]&, sampleArg,{2}];

sampleForIndex[index_]:=
  Block[{nsamples,min,max},
   min = Max[1,Floor[(9/10)maxSamplesPerIndex]];
   max =  maxSamplesPerIndex;
   nsamples = RandomInteger[{min, max}];
   RandomInteger[{1,10},{nsamples,ntypes}]
   ];

generateSample := 
  Table[sampleForIndex[index],{index, 1, nindexes}];

measureGetCsTime :=((First @ Timing[getCs[#]])& /@ Range[1, nindexes]) // Max

measureGetCsAndFreqsTime:=((First @ Timing[getCsAndFreqs[#]])& /@ Range[1, nindexes]) // Max

reportSampleLength[sampleArg_] := StringForm["Total number of confs = ``, smallest accumulator length ``, largest accumulator length = ``", Sequence@@ {Total[#],Min[#],Max[#]}& [Length /@ sampleArg]]

The first example is relatively mild:

clearSimulationData;

nindexes=100;maxSamplesPerIndex = 1000; ntypes = 2;

largeSample1 = generateSample;

reportSampleLength[largeSample1];

Total number of confs = 94891, smallest accumulator length 900, largest accumulator length = 1000;

First @ Timing @ fillSimulationData[largeSample1]

gives 1.375 seconds, which I think is fast.

With[{times = Table[measureGetCsTime, {50}]}, 
 ListPlot[times, Joined -> True, PlotRange -> {0, Max[times]}]]

gives times of around 0.016 seconds, and

With[{times = Table[measureGetCsAndFreqsTime, {50}]}, 
 ListPlot[times, Joined -> True, PlotRange -> {0, Max[times]}]]

gives the same times. Now the real killer:

nindexes = 10; maxSamplesPerIndex = 100000; ntypes = 10;
largeSample3 = generateSample;
largeSample3 // Short
{{{2,2,1,5,1,3,7,9,8,2},92061,{3,8,6,4,9,9,7,8,7,2}},8,{{4,10,1,5,9,8,8,10,8,6},95498,{3,8,8}}}

reported as

Total number of confs = 933590, smallest accumulator length 90760, largest accumulator length = 96876

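The generation time is about 1.969 - 2.016 seconds, which is unbelievably fast. I mean, this is like going through a gigantic list of a million elements and applying a function to each element.

The extraction times for the configs and for {configs, freqs} are roughly 0.015 and 0.03 seconds, respectively.

To me this is a mind-blowing speed that I would never have expected from Mathematica!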

Algorithm to pick pattern-free downvalues from a sparse definition list
