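I have this particular function to extract parts of a list. In the form Give[list, elem] it returns the part of list that corresponds to the position of elem in the global $Reference variable (if defined). I use this function heavily throughout my code, so I decided to optimize it. This is as far as I have managed to get, but frankly, I have no idea how to go further.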
ClearAll[Give, $Reference, set];
Give::noref = "No, non-list or empty $Reference was defined to refer to by Give.";
Give::noelem = "Element (or some of the elements in) `1` is not part of the reference set `2`.";
Give::nodepth = "Give cannot return all the elements corresponding to `1` as the list only has depth `2`.";
give[list_, elem_List, ref_] := Flatten[Pick[list, ref, #] & /@ elem, 1];
give[list_, elem_, ref_] := First@Pick[list, ref, elem];
Options[Give] = {Reference :> $Reference}; (* RuleDelayed is necessary, for it is possible that $Reference changes between two subsequent Give calls, and without delaying its assignment, ref would use previous value of $Reference instead of actual one. *)
Give[list_List, elem___, opts___?OptionQ] := Module[{ref, pos},
ref = Reference /. {opts} /. Options@Give;
Which[
Or[ref === {}, Head@ref =!= List], Message[Give::noref]; {},
Complement[Union@Flatten@{elem}, ref] =!= {}, Message[Give::noelem, elem, ref]; {},
Length@{elem} > Depth@list - 1, Message[Give::nodepth, {elem}, Depth@list]; {},
True, Fold[give[#1, #2, ref] &, list, {elem}]
]];
In[106]:= $Reference = {"A", "B", "C"};
set = {{1, 2, 3}, {4, 5, 6}, {7, 8, 9}};
Give[set, "B"](* return specified row *)
Out[108]= {4, 5, 6}
In[109]:= Give[set, "B", "A"] (* return entry at specified row & column *)
Out[109]= 4
In[110]:= Give[set, {"B", "A"}] (* return multiple rows *)
Out[110]= {{4, 5, 6}, {1, 2, 3}}
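I decided to drop the calls between differently-signatured functions, because the list version could end up calling the non-list version, which would mean that error handling has to be performed multiple times (once for each element in the list). Sadly, the error handling cannot be discarded. If the improved version is more robust (e.g. it can handle more dimensions), that is not a problem, but the examples above are sufficient.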
In[139]:= First@Timing[Give[set, RandomChoice[$Reference, 10000]]] (* 1D test *)
Out[139]= 0.031
In[138]:= First@Timing[Table[Give[set, Sequence @@ RandomChoice[$Reference, 2]], {10000}]] (* 2d test *)
Out[138]= 0.499
I'm sure this is not efficient code, so feel free to improve it. Any help is appreciated, even if it only shaves off a few nanoseconds.
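Answer 0 (score: 3)
The main efficiency problem for large lists seems to come from mapping Pick over the elements. You can avoid this by replacing the corresponding definition of give with this one: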
give[list_, elem_List, ref_] :=
list[[elem /. Dispatch[Thread[ref -> Range[Length[ref]]]]]];
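To make the mechanism concrete, here is a minimal sketch using the small example from the question. The names refDemo, setDemo, posRules, positions and rows are illustrative only and do not appear in the original code. The dispatch table translates each label into its position, and a single Part call then extracts all requested rows at once, so nothing is mapped element by element.

```mathematica
(* Illustrative names; the values mirror the example in the question *)
refDemo = {"A", "B", "C"};
setDemo = {{1, 2, 3}, {4, 5, 6}, {7, 8, 9}};

(* Build the label -> position rules once *)
posRules = Dispatch[Thread[refDemo -> Range[Length[refDemo]]]];

(* Labels are translated to positions ... *)
positions = {"B", "A"} /. posRules   (* {2, 1} *)

(* ... and one Part call picks all rows at once *)
rows = setDemo[[positions]]          (* {{4, 5, 6}, {1, 2, 3}} *)
```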
Here is my test code:
In[114]:=
Block[{$Reference = Range[100000],set = Range[100000]^2,rnd,ftiming,stiming},
rnd = RandomSample[$Reference,10000];
ftiming = First@Timing[res1 = Give[set,rnd]];
Block[{give},
give[list_,elem_List,ref_]:=list[[elem/.Dispatch[Thread[ref->Range[Length[ref]]]]]];
give[list_,elem_,ref_]:=First@Pick[list,ref,elem];
stiming = First@Timing[res2 = Give[set,rnd]];];
{ftiming,stiming,res1===res2}
]
Out[114]= {1.703,0.188,True}
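For this use case you get roughly a tenfold speedup (1.703 s versus 0.188 s in the timing above). I did not test the 2D case, but my guess is that it should help there as well.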
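EDIT
You can improve performance further by caching the dispatched table for $Reference (Dispatch[Thread[ref -> Range[Length[ref]]]]) at the start of the body of Give, and then passing it to give (either explicitly, or by making give an inner function, defined through Module variables, that refers to it), so that it is not recomputed several times when give is called through Fold. You could also do this conditionally, say only when elem contains large lists of elements, so that the time needed to create the dispatch table is justified.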
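A minimal sketch of the explicit-passing variant might look like the following. The names giveWithRules and GiveCached are hypothetical, the reference set is passed as the last argument instead of through an option, and the error checks of the original Give are left out for brevity, so treat this as an illustration of the caching idea and not as a drop-in replacement.

```mathematica
ClearAll[giveWithRules, GiveCached];

(* Hypothetical helper: receives prebuilt rules instead of the reference list.
   Works for a single label and for a list of labels alike. *)
giveWithRules[list_, elem_, rules_] := list[[elem /. rules]];

(* Hypothetical wrapper: error handling omitted in this sketch *)
GiveCached[list_List, elem___, ref_List] :=
  Module[{rules},
    (* Build the dispatch table once per call ... *)
    rules = Dispatch[Thread[ref -> Range[Length[ref]]]];
    (* ... and reuse it for every step of the Fold *)
    Fold[giveWithRules[#1, #2, rules] &, list, {elem}]
  ];

refDemo = {"A", "B", "C"};
setDemo = {{1, 2, 3}, {4, 5, 6}, {7, 8, 9}};
GiveCached[setDemo, "B", "A", refDemo]     (* 4 *)
GiveCached[setDemo, {"B", "A"}, refDemo]   (* {{4, 5, 6}, {1, 2, 3}} *)
```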
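Answer 1 (score: 3)
Here is another solution, based on my question about indexing real numbers. It uses delayed evaluation to show an error message only when one is needed (a trick I learned on this site! Thanks to everybody for their dedication; it is always a pleasure to learn new things here!)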
ListToIndexFunction[list_List,precision_:0.00001]:=
Module[{numbersToIndexFunction},
numbersToIndexFunction::indexNotFound="Index of `1` not found.";
MapThread[(numbersToIndexFunction[#1]=#2)&,{Round[list,precision],Range[Length@list]}];
numbersToIndexFunction[x_]/;(Message[numbersToIndexFunction::indexNotFound,x];False):=Null;
numbersToIndexFunction[Round[#,precision]]&
];
Test:
f=ListToIndexFunction[{1.23,2.45666666666,3}]
f[2.456666]
f[2.456665]
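The delayed-evaluation trick can be isolated in a smaller sketch. The symbol lookupDemo and its message name are made up for illustration. The catch-all definition carries a condition that issues the message and then returns False, so the rule never actually applies; the message appears only for keys that no specific definition matched, and the expression is left unevaluated.

```mathematica
ClearAll[lookupDemo];
lookupDemo::notFound = "Key `1` not found.";

(* A specific definition, tried before the general one *)
lookupDemo["a"] = 1;

(* Catch-all: the condition prints the message, then rejects the match *)
lookupDemo[x_] /; (Message[lookupDemo::notFound, x]; False) := Null;

lookupDemo["a"]   (* 1, no message *)
lookupDemo["z"]   (* issues lookupDemo::notFound and stays as lookupDemo["z"] *)
```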
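Answer 2 (score: 2)
This is similar to Leonid's answer, but in my own style.
I use the same Dispatch table, and I recommend keeping it external whenever possible. To that end, I suggest a new symbol $Rules that is updated whenever $Reference is changed. For example: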
$Reference = RandomSample["A"~CharacterRange~"Z"];
$Rules = Dispatch@Thread[$Reference -> Range@Length@$Reference];
If this is done often, it could be automated for convenience (ask if you would like to see how).
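One simple way to automate this, sketched under the assumption that the reference set is always changed through a single helper, is a setter that updates both symbols together. The name setReference is hypothetical and not part of the answer's code.

```mathematica
ClearAll[setReference];

(* Hypothetical helper: change $Reference only through this function
   so that $Rules can never go stale *)
setReference[ref_List] := (
  $Reference = ref;
  $Rules = Dispatch@Thread[ref -> Range@Length@ref];
  ref
);

setReference[{"A", "B", "C"}];
{"C", "A"} /. $Rules   (* {3, 1} *)
```

The trade-off is that a direct assignment to $Reference bypasses the helper and leaves $Rules out of date, so this only works if the helper is used consistently.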
Apart from that, here is my complete code:
ClearAll[Give, $Reference, Reference, $Rules];
Give::noref = "No, non-list or empty $Reference was defined to refer to by Give.";
Give::noelem = "Element (or some of the elements in) `1` is not part of the reference set `2`.";
Give::nodepth = "Give cannot return all the elements corresponding to `1` as the list only has depth `2`.";
Options[Give] = {Reference :> $Reference};
Give[list_List, elem___, opts : OptionsPattern[]] :=
Module[{ref, pos, rls},
ref = OptionValue[Reference];
rls = If[{opts} == {}, $Rules, Dispatch@Thread[ref -> Range@Length@ref]];
Which[
ref === {} || Head@ref =!= List,
Message[Give::noref]; {},
Complement[Union@Flatten@{elem}, ref] =!= {},
Message[Give::noelem, elem, ref]; {},
Length@{elem} > Depth@list - 1,
Message[Give::nodepth, {elem}, Depth@list]; {},
True,
list[[##]] & @@ ({elem} /. rls)
]
];
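Answer 3 (score: 2)
This is what I ended up with after letting this code rest for 2 years. It memoizes the dispatch table for a given reference set and uses Part-like syntax. I removed all the error messages and also got rid of the global $Reference symbol, which was very un-Mathematica-like; I never liked it.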
dispatch[ref_] := dispatch@ref = (Dispatch@Thread[ref -> Range@Length@ref]);
give[list_, elem__, ref_] := list[[Sequence @@ ({elem} /. dispatch@ref)]];
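Memoization ensures that the dispatch table for a given ref is computed only once. Keeping several dispatch tables in memory is not a problem, as they are usually small.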
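The effect of the memoization can be observed by counting the stored definitions. The sketch below uses a separate symbol, dispatchDemo (an illustrative name), so that it does not interfere with dispatch above: each new reference set adds one cached definition, and a repeated call with the same set reuses it.

```mathematica
ClearAll[dispatchDemo];
dispatchDemo[ref_] := dispatchDemo[ref] = Dispatch@Thread[ref -> Range@Length@ref];

Length@DownValues[dispatchDemo]   (* 1: only the general rule *)

dispatchDemo[{"A", "B", "C"}];    (* first call builds and stores the table *)
Length@DownValues[dispatchDemo]   (* 2: general rule plus one cached table *)

dispatchDemo[{"A", "B", "C"}];    (* second call hits the cached definition *)
Length@DownValues[dispatchDemo]   (* still 2 *)
```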
ref = Reference = {"A", "B", "C"};
set = {{1, 2, 3}, {4, 5, 6}, {7, 8, 9}};
give[set, "B", ref] (* ==> {4, 5, 6} *)
give[set, "B", "A", ref] (* ==> 4 *)
give[set, {"B", "A"}, ref] (* ==> {{4, 5, 6}, {1, 2, 3}} *)
Timing:
n = 20000;
{
First@Timing[give[set, #, ref] & /@ RandomChoice[ref, n]],
First@Timing[give[set, RandomChoice[ref, n], ref]],
First@Timing[Table[give[set, Sequence @@ RandomChoice[ref, 2], ref], {n}]]
}
{0.140401, 0., 0.202801}
Compare this with the timings of the original function:
{
First@Timing[Give[set, #] & /@ RandomChoice[ref, n]],
First@Timing[Give[set, RandomChoice[ref, n]]],
First@Timing[Table[Give[set, Sequence @@ RandomChoice[ref, 2]], {n}]]
}
{0.780005, 0.015600, 1.029607}