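I am trying to read large oscilloscope .trc files and plot them. Plotting a single file works, but as soon as I put the script in a loop to plot all files (one file per iteration), I get a MemoryError.
Code: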
import os
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import readTrc #external file, same location as script
foldername = 'trc_folder'
folder = os.listdir(foldername)
path = os.path.dirname(os.path.realpath(__file__))
for filenumber, i in enumerate(folder):
    trc = path + '/' + foldername + '/' + i
    print('reading trc file ' + str(filenumber))
    datX, datY, m = readTrc.readTrc(trc)
    srx, sry = pd.Series(datX * 1000), pd.Series(datY * 1000)
    df_oszi = pd.concat([srx, sry], axis = 1)
    df_oszi.set_index(0, inplace = True)
    #ERROR APPEARS with xticks argument
    #removing xticks does not help, because then errorpath changes to
    #/usr/local/lib/python3.6/dist-packages/pandas/plotting/_core.py
    df_oszi.plot(grid = 1,
                 color = 'blue',
                 linewidth = 0.5,
                 figsize = (9,5),
                 legend = False,
                 xticks = np.arange(df_oszi.index[0], df_oszi.index[-1], 1))
    print('plotting file ' + str(filenumber))
    plt.savefig('Plot_' + str(filenumber) + '.png', dpi = 300)
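The problem seems to lie in the external module readTrc. It took me quite a while to figure this out, because Python raised the errors in Matplotlib and Pandas rather than in readTrc, which appears to be an unofficial script for reading .trc files. I found it online while looking for a way to read .trc files in Python. If you know a better way to read oscilloscope files, please let me know.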
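I zipped everything needed to run the script into the following folder: folder (it is very large, 582 MB, because each .trc file is about 200 MB). In it you will find the script, a folder containing the .trc files, and an external Python file (module), readTrc, which is required to read the .trc files. Running the script should plot the first file, but at least on my Ubuntu machine it throws a MemoryError while plotting/constructing the second file. What confuses me is that I only get this MemoryError on Ubuntu (18.04), not on Windows 10.
I would greatly appreciate your help so that I can continue with my project. If you need any further information, please let me know.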
Edit:
Single download of readTrc.py
Single download of Script.py
print(type(datX))
returns:
<class 'numpy.ndarray'>
Printing datX returns an object with 50 million values:
[-0.005 -0.005 -0.005 ... 0.005 0.005 0.005]
The individual values print as valid numbers with print(), for example:
-0.004999999906663635
-0.004999999806663634
-0.004999999706663633
-0.004999999606663631
-0.00499999950666363
Edit 2:
To run the code with the new version of readTrc, make the following changes:
import os
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import readTrc
foldername = 'trc_folder'
folder = os.listdir(foldername)
path = os.path.dirname(os.path.realpath(__file__))
for filenumber, i in enumerate(folder):
    trc = path + '/' + foldername + '/' + i
    print('reading trc file ' + str(filenumber))
    datX, datY, d = readTrc.Trc().open(trc)
    srx, sry = pd.Series(datX * 1000), pd.Series(datY * 1000)
    df_oszi = pd.concat([srx, sry], axis = 1)
    df_oszi.set_index(0, inplace = True)
    df_oszi.plot(grid = 1,
                 color = 'blue',
                 linewidth = 0.5,
                 figsize = (9,5),
                 legend = False,
                 xticks = np.arange(df_oszi.index[0], df_oszi.index[-1], 1))
    print('plotting file ' + str(filenumber))
    plt.savefig('Plot_' + str(filenumber) + '.png', dpi = 300)
MemoryError:
Traceback (most recent call last):
  File "/home/artur/Desktop/zip_original/Script.py", line 27, in <module>
    xticks = np.arange(df_oszi.index[0], df_oszi.index[-1], 1))
  File "/usr/local/lib/python3.6/dist-packages/pandas/plotting/_core.py", line 2941, in __call__
    sort_columns=sort_columns, **kwds)
  File "/usr/local/lib/python3.6/dist-packages/pandas/plotting/_core.py", line 1977, in plot_frame
    **kwds)
  File "/usr/local/lib/python3.6/dist-packages/pandas/plotting/_core.py", line 1804, in _plot
    plot_obj.generate()
  File "/usr/local/lib/python3.6/dist-packages/pandas/plotting/_core.py", line 260, in generate
    self._make_plot()
  File "/usr/local/lib/python3.6/dist-packages/pandas/plotting/_core.py", line 985, in _make_plot
    **kwds)
  File "/usr/local/lib/python3.6/dist-packages/pandas/plotting/_core.py", line 1001, in _plot
    lines = MPLPlot._plot(ax, x, y_values, style=style, **kwds)
  File "/usr/local/lib/python3.6/dist-packages/pandas/plotting/_core.py", line 615, in _plot
    return ax.plot(*args, **kwds)
  File "/usr/local/lib/python3.6/dist-packages/matplotlib/__init__.py", line 1805, in inner
    return func(ax, *args, **kwargs)
  File "/usr/local/lib/python3.6/dist-packages/matplotlib/axes/_axes.py", line 1604, in plot
    self.add_line(line)
  File "/usr/local/lib/python3.6/dist-packages/matplotlib/axes/_base.py", line 1891, in add_line
    self._update_line_limits(line)
  File "/usr/local/lib/python3.6/dist-packages/matplotlib/axes/_base.py", line 1913, in _update_line_limits
    path = line.get_path()
  File "/usr/local/lib/python3.6/dist-packages/matplotlib/lines.py", line 945, in get_path
    self.recache()
  File "/usr/local/lib/python3.6/dist-packages/matplotlib/lines.py", line 649, in recache
    self._xy = np.column_stack(np.broadcast_arrays(x, y)).astype(float)
MemoryError
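For a sense of scale, the last frame of the traceback builds an (N, 2) float array from x and y. The following is only a back-of-the-envelope sketch: it assumes 64-bit floats and roughly 50 million samples per trace, and the helper name xy_array_bytes is made up for illustration.

```python
# Rough memory estimate for the (N, 2) float64 array that the last
# traceback frame builds with np.column_stack(...).astype(float).
BYTES_PER_FLOAT64 = 8


def xy_array_bytes(n_points: int) -> int:
    """Bytes needed for one (n_points, 2) float64 array (hypothetical helper)."""
    return n_points * 2 * BYTES_PER_FLOAT64


n_points = 50_000_000  # assumption: about the number of samples in one trace
one_copy = xy_array_bytes(n_points)
print(f"one (N, 2) copy: {one_copy / 1e6:.0f} MB")
```

Since astype() returns a new array by default, more than one such copy can exist for a moment, on top of the original arrays, the scaled pd.Series objects, and the DataFrame. If the figure from the previous iteration is still alive, its copies count as well.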
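Edit 3:
Sampling the dataset seems to discard too many data values. I tried sampling = 1, sampling = 10, and sampling = 100 with:
srx, sry = pd.Series(datX[::sampling] * 1000), pd.Series(datY[::sampling] * 1000)
The reason is that ultra-high-frequency (UHF) waves have extremely short pulse durations, so each pulse may consist of only a few data values. Reducing the number of values considered therefore loses a lot of information. Although this solution makes the code run, it also greatly reduces the data values.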
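Answer 0: (score: 2)
Oh wow, I couldn't see the wood for the trees.
You are trying to plot far too many data points (namely 100000002; I think the printout would be about 4 km long at 600 dpi). This can be addressed by sampling:
sampling=100
srx, sry = pd.Series(datX[::sampling] * 1000), pd.Series(datY[::sampling] * 1000)
or by selectively looking at a specific range:
srx, sry = pd.Series(datX[0:50000] * 1000), pd.Series(datY[0:50000] * 1000)
or a combination of both.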
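Plain striding keeps every n-th sample, so a pulse that is only a few samples wide can fall between the kept points. One possible alternative, which is a different technique from the slicing above, is min/max decimation: keep the smallest and largest y value in each block of samples. This is only a sketch; minmax_decimate is a hypothetical helper (not part of readTrc, pandas, or Matplotlib), and the synthetic trace stands in for real .trc data.

```python
import numpy as np


def minmax_decimate(x, y, bucket):
    """Keep the min and max of y in each block of `bucket` samples.

    Hypothetical helper for illustration. Trailing samples that do not
    fill a whole bucket are dropped.
    """
    n = (len(y) // bucket) * bucket
    yb = y[:n].reshape(-1, bucket)
    xb = x[:n].reshape(-1, bucket)
    rows = np.arange(yb.shape[0])
    # Column index of the min and the max in each bucket, sorted so the
    # two kept samples stay in time order.
    idx = np.sort(np.stack([yb.argmin(axis=1), yb.argmax(axis=1)], axis=1), axis=1)
    return xb[rows[:, None], idx].ravel(), yb[rows[:, None], idx].ravel()


# Synthetic trace: a flat signal with a single one-sample pulse.
x = np.arange(1000, dtype=float)
y = np.zeros(1000)
y[537] = 1.0

xs, ys = minmax_decimate(x, y, bucket=100)
print(len(ys), ys.max())   # two points per bucket; the pulse survives
print(y[::100].max())      # plain striding misses the pulse
```

The result has two points per bucket, so bucket=100 reduces the point count by a factor of 50 while keeping the envelope of the signal. Whether that is good enough for UHF pulses depends on what needs to be read off the plot.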
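Answer 1: (score: 0)
It took a lot of time, but I managed to get the MemoryError under control. I not only had to put gc.collect() at the end of each loop iteration, but also plt.close() at the end of the loop body. Only then did the error stop. Sorry for the confusion. I learned a lot from this.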
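A minimal sketch of that loop structure follows. It makes several assumptions: small synthetic arrays stand in for the readTrc output, the Agg backend is chosen so that it runs without a display, and the plots go to a temporary folder. pyplot keeps a reference to every figure until it is closed, which is why plt.close() matters here.

```python
import gc
import os
import tempfile

import matplotlib
matplotlib.use("Agg")  # assumption: headless rendering, files only
import matplotlib.pyplot as plt
import numpy as np

out_dir = tempfile.mkdtemp()  # stand-in output folder for this sketch

for filenumber in range(3):
    # Stand-in for the arrays returned by readTrc: small synthetic data.
    dat_x = np.linspace(-0.005, 0.005, 10_000)
    dat_y = np.sin(2 * np.pi * 1e3 * dat_x)

    fig, ax = plt.subplots(figsize=(9, 5))
    ax.plot(dat_x * 1000, dat_y * 1000, color="blue", linewidth=0.5)
    ax.grid(True)
    fig.savefig(os.path.join(out_dir, f"Plot_{filenumber}.png"), dpi=100)

    plt.close(fig)   # release the figure that pyplot keeps registered
    del dat_x, dat_y
    gc.collect()     # ask Python to free unreachable objects now

print(plt.get_fignums())  # list of figures still open
```

With df_oszi.plot(...) as in the question, each call creates a new figure, so calling plt.close() after plt.savefig() should have the same effect as plt.close(fig) here.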