在julia中分析并行代码的合适方法是什么?我跑的时候
@profile foo(...)
foo是我的功能,我得到了
julia> Profile.print()
1234 task.jl; anonymous; line: 23
4 multi.jl; remotecall_fetch; line: 695
2 multi.jl; send_msg_; line: 172
2 serialize.jl; serialize; line: 74
2 serialize.jl; serialize; line: 299
2 serialize.jl; serialize; line: 130
2 serialize.jl; serialize; line: 299
1 dict.jl; serialize; line: 369
1 serialize.jl; serialize_type; line: 278
1 serialize.jl; serialize; line: 199
1 serialize.jl; serialize; line: 227
1 serialize.jl; serialize; line: 160
1 serialize.jl; serialize; line: 160
1 serialize.jl; serialize; line: 299
1 serialize.jl; serialize; line: 294
1 io.jl; write; line: 47
1 ./iobuffer.jl; write; line: 234
1 ./iobuffer.jl; ensureroom; line: 151
1 ./array.jl; resize!; line: 503
2 multi.jl; send_msg_; line: 178
2 stream.jl; write; line: 724
1230 multi.jl; remotecall_fetch; line: 696
1230 ./multi.jl; wait_full; line: 595
1230 ./task.jl; wait; line: 189
1229 ./task.jl; wait; line: 269
1229 ./stream.jl; process_events; line: 529
1 ./task.jl; wait; line: 282
1 ./stream.jl; process_events; line: 529
402 task.jl; anonymous; line: 95
402 REPL.jl; eval_user_input; line: 53
401 profile.jl; anonymous; line: 14
401 ...mba/src/model/mcmc.jl; mcmc; line: 314
401 ./task.jl; sync_end; line: 306
401 task.jl; wait; line: 48
401 ./task.jl; wait; line: 189
401 ./task.jl; wait; line: 269
401 ./stream.jl; process_events; line: 529
1 profile.jl; anonymous; line: 16
217 task.jl; task_done_hook; line: 83
217 ./task.jl; wait; line: 269
217 ./stream.jl; process_events; line: 529
答案 0 :(得分:1)
不知道它是否可行,但你可能想考虑让工作人员调用类似这样的函数:
function profile_function(func, args...)
Profile.clear()
ret = @profile apply(func, args)
pdata = Profile.retrieve()
ret, pdata
end
pdata
应包含该工作人员的分析数据。您应该能够在ProfileView中查看它。
答案 1 :(得分:0)
您可以尝试VTune Amplifier(https://software.intel.com/en-us/intel-vtune-amplifier-xe)在功能级别分析Julia代码,如https://software.intel.com/en-us/blogs/2013/10/10/profiling-julia-code-with-intel-vtune-amplifier所述。您可能还需要将修补程序应用于LLVM(https://gist.github.com/ArchRobison/d3601433d160b05ed5ee)以解决源级别性能信息错误,以便在源代码级别获取正确的数据