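I am processing CFD data (applying a rotation to the coordinates). To do that, I:
- read a file
- store the data in a structured array
- manipulate the data (do the calculations)
- write a new file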
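For illustration, the manipulation step can be sketched as below. This is only a minimal sketch: the field names "x", "y", "z" and "p", the sample values, and the choice of a rotation about the z axis are all assumptions, not taken from the actual script.

```python
import numpy as np

# Hypothetical node array: the field names and values are assumptions.
node = np.zeros(4, dtype=[("x", "f8"), ("y", "f8"), ("z", "f8"), ("p", "f8")])
node["x"] = [1.0, 0.0, -1.0, 2.0]
node["y"] = [0.0, 1.0, 0.0, 2.0]


def rotate_about_z(node, angle_rad):
    """Rotate the x/y coordinates of every node in place about the z axis."""
    c, s = np.cos(angle_rad), np.sin(angle_rad)
    x = node["x"].copy()  # copy, because node["x"] is overwritten below
    y = node["y"].copy()
    node["x"] = c * x - s * y
    node["y"] = s * x + c * y
    return node


rotate_about_z(node, np.pi / 2)  # quarter turn, counter-clockwise
print(node["x"], node["y"])
```

Working on whole columns like this keeps the calculation vectorized, so it is unlikely to be the slow part.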
It works, but it takes about 7 seconds per file, and I have (15000 * 4) files to process...
The output folder already exists. The data in it will be erased
StartReading B--0.000018_tec.dat in progress. - 0.001s elapsed
EndReading B--0.000018_tec.dat in progress. - 0.433s elapsed
StartWriting B--0.000018_tec.dat in progress. - 0.435s elapsed
EndWriting B--0.000018_tec.dat in progress. - 7.585s elapsed
StartReading B--0.000036_tec.dat in progress. - 7.586s elapsed
EndReading B--0.000036_tec.dat in progress. - 7.697s elapsed
StartWriting B--0.000036_tec.dat in progress. - 7.697s elapsed
EndWriting B--0.000036_tec.dat in progress. - 13.472s elapsed
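Do you have any ideas for improving this? I have thought about the writing step, but I am not sure what could be improved there.
The log above is an example of the reading/computing/writing times.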
The script and a sample file, so that you can try it yourself:
http://s000.tinyupload.com/index.php?file_id=80589646527340633700
Answer 0 (score: 1)
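The problem is not the writing itself, but how the data is prepared and formatted before it is written.
If you profile the script with something like
python -m cProfile -s cumtime Plane_modifier_rev4-multiple_files.py > out.txt
you will see that most of the time is spent formatting arrays. In the output below, write_tecplot accounts for 21.9 s of the 22.3 s total, and almost all of that (21.5 s) is spent in NumPy's array_str, which runs every time str() is called on an array slice: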
ncalls tottime percall cumtime percall filename:lineno(function)
1 0.003 0.003 22.297 22.297 Plane_modifier_rev4-multiple_files.py:6(<module>)
2 0.282 0.141 21.881 10.941 ASCII_TEC.py:101(write_tecplot)
77424/48512 0.091 0.000 21.527 0.000 numeric.py:1681(array_str)
77424/48512 0.424 0.000 21.477 0.000 arrayprint.py:343(array2string)
48512 0.928 0.000 21.149 0.000 arrayprint.py:233(_array2string)
145536 0.360 0.000 12.532 0.000 arrayprint.py:533(__init__)
145536 5.891 0.000 12.172 0.000 arrayprint.py:547(fillFormat)
48512 0.219 0.000 7.922 0.000 arrayprint.py:700(__init__)
48512 0.620 0.000 5.623 0.000 arrayprint.py:465(_formatArray)
170236 2.416 0.000 4.413 0.000 arrayprint.py:598(__call__)
631546 1.300 0.000 2.933 0.000 numeric.py:2428(seterr)
434430 2.310 0.000 2.310 0.000 {method 'reduce' of 'numpy.ufunc' objects}
315773 0.337 0.000 1.941 0.000 numeric.py:2813(__enter__)
143356 0.234 0.000 1.814 0.000 fromnumeric.py:1772(any)
315773 0.359 0.000 1.689 0.000 numeric.py:2818(__exit__)
48512 0.473 0.000 1.268 0.000 arrayprint.py:639(__init__)
143356 0.157 0.000 1.163 0.000 {method 'any' of 'numpy.ndarray' objects}
631546 0.967 0.000 1.034 0.000 numeric.py:2524(geterr)
143356 0.092 0.000 1.006 0.000 _methods.py:37(_any)
443944 0.763 0.000 0.944 0.000 arrayprint.py:632(_digits)
143358 0.166 0.000 0.418 0.000 numeric.py:464(asanyarray)
145536 0.410 0.000 0.410 0.000 {method 'compress' of 'numpy.ndarray' objects}
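The same kind of report can also be collected from inside Python with the standard cProfile and pstats modules. The sketch below is only an illustration: format_rows is a hypothetical stand-in for the formatting loop, not the code from ASCII_TEC.py.

```python
import cProfile
import io
import pstats

import numpy as np


def format_rows(values):
    """Hypothetical stand-in for the slow part: str() on 5-element slices."""
    return [str(values[i:i + 5])[1:-1] for i in range(0, len(values), 5)]


profiler = cProfile.Profile()
profiler.enable()
rows = format_rows(np.linspace(0.0, 1.0, 5000))
profiler.disable()

# Sort by cumulative time, as "-s cumtime" does on the command line,
# and keep only the ten most expensive entries.
report = io.StringIO()
stats = pstats.Stats(profiler, stream=report)
stats.sort_stats("cumulative").print_stats(10)
print(report.getvalue())
```

As in the command-line output, the cumtime column is the one to read: it includes the time spent in everything a function calls.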
For example, this:
for name in names:
    for col_index in range(0,N,5): #The tecplot data for each variable are saved within 5 columns
        f.write(str(Data["node"][name][col_index:col_index+5])[1:-1]+"\n")
    f.write("\n"+"\n")
can be rewritten (and it should be considerably faster) as:
for name in names:
    n = Data["node"][name]
    for col_index in range(0,N,5): #The tecplot data for each variable are saved within 5 columns
        vs = n[col_index:col_index+5]
        f.write(",".join([str(v) for v in vs])+"\n")
    f.write("\n"+"\n")
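To check the difference on your own machine, the two approaches can be timed side by side. This is a rough sketch on random data: the function names with_array_str and with_join are made up for the comparison, and the timings will vary. Note that the two variants do not produce identical text: str() on an array slice separates values with spaces and uses NumPy's print precision (8 digits by default), while str() on each scalar keeps the full representation.

```python
import timeit

import numpy as np

values = np.random.default_rng(0).random(5000)


def with_array_str(values):
    # str() on an array slice goes through NumPy's array printing machinery.
    return [str(values[i:i + 5])[1:-1] for i in range(0, len(values), 5)]


def with_join(values):
    # str() on each scalar, joined by hand.
    return [",".join([str(v) for v in values[i:i + 5]])
            for i in range(0, len(values), 5)]


t_array = timeit.timeit(lambda: with_array_str(values), number=3)
t_join = timeit.timeit(lambda: with_join(values), number=3)
print(f"array str: {t_array:.3f}s, join: {t_join:.3f}s")
```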
Edit
Some changes to write_tecplot:
def write_tecplot(outfile,Data):
    """
    The expected Data is a dictionary with one structured array: node and one simple array: face
    """
    N = Data["node"].shape[0] #N is the number of nodes
    E = Data["face"].shape[0] #E is the number of faces
    #Create the file and the main names
    with open(outfile+'.dat', 'w') as f:
        # Write HEADER
        f.write('TITLE = \"title\"\n')
        f.write('VARIABLES = ')
        #initialize
        names = Data["node"].dtype.names
        #write variable names
        f.write(u'"'+'\",\"'.join(names)+'"\n')
        f.write('ZONE T="tecdata", N=%s, E=%s, ET=QUADRILATERAL, F=FEBLOCK\n\n'%(N,E))
        # Data_number = len(Data["node"]) #Data_number is the
        # WRITE DATA
        #Write node data
        for name in names:
            n = Data["node"][name]
            for col_index in range(0,N,5): #The tecplot data for each variable are saved within 5 columns
                f.write(",".join([str(v) for v in n[col_index:col_index+5]])+"\n")
            f.write("\n\n")
        #Write face data
        face = Data["face"]
        for col_index in range(0,E,1): #One face (one row of the face array) per line
            f.write(",".join([str(v) for v in face[col_index]])+"\n")
        f.write("\n\n")
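If the per-value str() calls are still too slow, one further option (not part of the changes above, only a possible sketch) is to let np.savetxt format whole rows at once. The helper name write_block and the fixed "%.9e" format are assumptions; a fixed format also changes the exact text that is written, so check that it suits your files.

```python
import io

import numpy as np


def write_block(f, column, per_line=5, fmt="%.9e"):
    """Write a 1-D array as rows of `per_line` values using np.savetxt."""
    column = np.asarray(column)
    full = (column.size // per_line) * per_line
    if full:
        # Reshape the part that fills complete rows and write it in one call.
        np.savetxt(f, column[:full].reshape(-1, per_line),
                   fmt=fmt, delimiter=",")
    if full < column.size:
        # Leftover values (fewer than per_line) go on a final, shorter line.
        np.savetxt(f, column[full:].reshape(1, -1), fmt=fmt, delimiter=",")
    f.write("\n\n")


# Demo on an in-memory buffer; an open text file works the same way.
buf = io.StringIO()
write_block(buf, np.arange(12, dtype=float))
print(buf.getvalue())
```

Inside write_tecplot, the inner loop over col_index would then become a single call such as write_block(f, Data["node"][name]).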