优化零件提取

时间:2011-05-01 17:29:35

标签: optimization wolfram-mathematica

我有这个特定的功能来提取表单中的部分列表:Give[list, elem]返回 list 的部分,该部分对应于 elem 的位置全局$Reference变量(如果已定义)。我在整个代码中大量使用此函数,因此我决定对其进行优化。这是我设法到目前为止的地方,但坦率地说,我不知道如何前进。

ClearAll[Give, $Reference, set];

Give::noref = "No, non-list or empty $Reference was defined to refer to by Give.";
Give::noelem = "Element (or some of the elements in) `1` is is not part of the reference set `2`.";
Give::nodepth = "Give cannot return all the elements corresponding to `1` as the list only has depth `2`.";

give[list_, elem_List, ref_] := Flatten[Pick[list, ref, #] & /@ elem, 1];
give[list_, elem_, ref_] := First@Pick[list, ref, elem];

Options[Give] = {Reference :> $Reference}; (* RuleDelayed is necessary, for it is possible that $Reference changes between two subsequent Give calls, and without delaying its assignment, ref would use previous value of $Reference instead of actual one. *)
Give[list_List, elem___, opts___?OptionQ] := Module[{ref, pos},
   ref = Reference /. {opts} /. Options@Give;
   Which[
      Or[ref === {}, Head@ref =!= List], Message[Give::noref]; {},
      Complement[Union@Flatten@{elem}, ref] =!= {}, Message[Give::noelem, elem, ref]; {},
      Length@{elem} > Depth@list - 1, Message[Give::nodepth, {elem}, Depth@list]; {},
      True, Fold[give[#1, #2, ref] &, list, {elem}]
]];



In[106]:= $Reference = {"A", "B", "C"};
set = {{1, 2, 3}, {4, 5, 6}, {7, 8, 9}};

Give[set, "B"](* return specified row *)
Out[108]= {4, 5, 6}

In[109]:= Give[set, "B", "A"] (* return entry at specified row & column *)
Out[109]= 4

In[110]:= Give[set, {"B", "A"}] (* return multiple rows *)
Out[110]= {{4, 5, 6}, {1, 2, 3}}

我决定删除不同的签名函数调用,因为列表版本可能会调用非列表版本,这意味着必须多次执行错误处理(对于列表中的每个元素)。遗憾的是,错误处理不能被丢弃。如果改进的版本更加健壮(例如可以处理更多维度),那不是问题,但上面的示例就足够了。

In[139]:= First@Timing[Give[set, RandomChoice[$Reference, 10000]]] (* 1D test *)

Out[139]= 0.031

In[138]:= First@Timing[Table[Give[set, Sequence @@ RandomChoice[$Reference, 2]], {10000}]] (* 2d test *)

Out[138]= 0.499

我确定这不是有效的代码,所以请随意改进它。任何帮助都是值得赞赏的,即使它仅减少了几纳秒。

4 个答案:

答案 0 :(得分:3)

大型列表的主要效率问题似乎来自映射Pick。如果用这个替换give的相应定义,则可以避免这种情况:

give[list_, elem_List, ref_] := 
    list[[elem /. Dispatch[Thread[ref -> Range[Length[ref]]]]]];

这是我的测试代码:

In[114]:= 
  Block[{$Reference = Range[100000],set = Range[100000]^2,rnd,ftiming,stiming},
      rnd = RandomSample[$Reference,10000];
      ftiming = First@Timing[res1 = Give[set,rnd]];
      Block[{give},
        give[list_,elem_List,ref_]:=list[[elem/.Dispatch[Thread[ref->Range[Length[ref]]]]]];
        give[list_,elem_,ref_]:=First@Pick[list,ref,elem];
        stiming = First@Timing[res2 = Give[set,rnd]];];
   {ftiming,stiming,res1===res2}
]

Out[114]= {1.703,0.188,True}

对于这个用例,你的速度提高了10倍。我没有测试2D,但猜测它也应该有帮助。

修改

您可以通过在$Reference正文的开头缓存Dispatch[Thread[ref->Range[Length[$Reference]]]Give)的已调度表格来进一步提高效果,然后将其传递给give (明确地或通过使give成为内部函数 - 通过Module变量 - 这将引用它),以便在您调用give时不必重新计算它多次通过Fold。您也可以有条件地执行此操作,比如在elem中有大量元素列表,以证明创建调度表所需的时间。

答案 1 :(得分:3)

这是基于我索引实数的问题的另一个解决方案。它使用延迟评估来显示错误消息(如果需要的话)(我在这个网站上学到的一个技巧!感谢所有人的奉献精神,在这里学习新东西总是很愉快!)

ListToIndexFunction[list_List,precision_:0.00001]:=
   Module[{numbersToIndexFunction},

      numbersToIndexFunction::indexNotFound="Index of `1` not found.";

      MapThread[(numbersToIndexFunction[#1]=#2)&,{Round[list,precision],Range[Length@list]}];
      numbersToIndexFunction[x_]/;(Message[numbersToIndexFunction::indexNotFound,x];False):=Null;

      numbersToIndexFunction[Round[#,precision]]&
   ];

Test: 
f=ListToIndexFunction[{1.23,2.45666666666,3}]
f[2.456666]
f[2.456665]

答案 2 :(得分:2)

这与列昂尼德的答案类似,但是以我自己的风格。

我使用相同的Dispatch表格,我建议尽可能将其作为外部表格。为此,我建议在$Rules更改时更新新符号$Reference。例如:

$Reference = RandomSample["A"~CharacterRange~"Z"];

$Rules = Dispatch@Thread[$Reference -> Range@Length@$Reference];

如果经常这样做(问),这可以自动为方便起作用。

除此之外,我的完整代码:

ClearAll[Give, $Reference, Reference, $Rules];

Give::noref = "No, non-list or empty $Reference was defined to refer to by Give.";
Give::noelem = "Element (or some of the elements in) `1` is is not part of the reference set `2`.";
Give::nodepth = "Give cannot return all the elements corresponding to `1` as the list only has depth `2`.";

Options[Give] = {Reference :> $Reference};

Give[list_List, elem___, opts : OptionsPattern[]] := 
  Module[{ref, pos, rls},
   ref = OptionValue[Reference];
   rls = If[{opts} == {}, $Rules, Dispatch@Thread[ref -> Range@Length@ref]];
   Which[
    ref === {} || Head@ref =!= List,
        Message[Give::noref]; {},
    Complement[Union@Flatten@{elem}, ref] =!= {},
        Message[Give::noelem, elem, ref]; {},
    Length@{elem} > Depth@list - 1, 
        Message[Give::nodepth, {elem}, Depth@list]; {},
    True,
        list[[##]] & @@ ({elem} /. rls)
   ]
  ];

答案 3 :(得分:2)

这是我让这段代码休息2年后得到的。它会记住给定引用集的调度表,并使用Part - 类型语法。我删除了所有错误消息,并删除了全局$Reference符号。非常不喜欢Mathematica ,我从不喜欢它。

dispatch[ref_] := dispatch@ref = (Dispatch@Thread[ref -> Range@Length@ref]);
give[list_, elem__, ref_] := list[[Sequence @@ ({elem} /. dispatch@ref)]];

Memoization确保给定ref的调度表仅计算一次。在内存中维护多个调度表不是问题,因为它们通常很小。

ref = Reference = {"A", "B", "C"};
set = {{1, 2, 3}, {4, 5, 6}, {7, 8, 9}};

give[set, "B", ref]          (* ==> {4, 5, 6}              *)
give[set, "B", "A", ref]     (* ==> 4                      *)
give[set, {"B", "A"}, ref]   (* ==> {{4, 5, 6}, {1, 2, 3}} *)

定时:

n = 20000;
{
First@Timing[give[set, #, ref] & /@ RandomChoice[ref, n]],
First@Timing[give[set, RandomChoice[ref, n], ref]],
First@Timing[Table[give[set, Sequence @@ RandomChoice[ref, 2], ref], {n}]]
}
{0.140401, 0., 0.202801}

将其与原始功能的时间进行比较:

{
First@Timing[Give[set, #] & /@ RandomChoice[ref, n]],
First@Timing[Give[set, RandomChoice[ref, n]]],
First@Timing[Table[Give[set, Sequence @@ RandomChoice[ref, 2]], {n}]]
}
{0.780005, 0.015600, 1.029607}