I am trying to read a txt file with numpy. I have the following code:
import numpy as np

def parsefile(filename):
    return np.genfromtxt(filename,
                         delimiter=dly_delimiter,
                         usecols=dly_usecols,
                         dtype=dly_dtype,
                         skip_header=4,
                         names=dly_names)

dly_delimiter = [7,5,9,13,10,13,9,13,10,13,10]
dly_usecols = [1,7]
dly_dtype = [np.int32, np.float64]
dly_names = ['node number', 'z_force']

force = parsefile('nodfor')
print(force)
This is what gets printed:
[(-1, nan) (-1, nan) (-1, nan) ..., (-1, nan) (-1, nan) (-1, nan)]
But what I am interested in is the number after 'nd#' and the number after 'zforce='.
Here is a snippet of the txt file (two lines):
nd# 39584 xforce= 0.0000E+00 yforce= 0.0000E+00 zforce= 0.0000E+00 energy= 0.0000E+00 setid = 1
nd# 39585 xforce= 0.0000E+00 yforce= 0.0000E+00 zforce= 0.0000E+00 energy= 0.0000E+00 setid = 1
This is a known issue. How can I solve it?
Answer 0 (score: 2)
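My first attempt lost most of the text because of the comment character. genfromtxt treats '#' as the start of a comment by default, and every line here begins with 'nd#':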
In [44]: txt = b""" nd# 39584 xforce= 0.0000E+00 yforce= 0.0000E+00 zforce= 0.0000E+00 energy= 0.0000E+00 setid = 1
...: nd# 39585 xforce= 0.0000E+00 yforce= 0.0000E+00 zforce= 0.0000E+00 energy= 0.0000E+00 setid = 1
...: """
In [45]: data=np.genfromtxt(txt.splitlines(),dtype=None)
In [46]: data
Out[46]:
array([b'nd', b'nd'],
dtype='|S2')
Turning comments off with comments=None:
In [53]: data=np.genfromtxt(txt.splitlines(),dtype=None, comments=None)
In [54]: data
Out[54]:
array([ (b'nd#', 39584, b'xforce=', 0., b'yforce=', 0., b'zforce=', 0., b'energy=', 0., b'setid', b'=', 1),
(b'nd#', 39585, b'xforce=', 0., b'yforce=', 0., b'zforce=', 0., b'energy=', 0., b'setid', b'=', 1)],
dtype=[('f0', 'S3'), ('f1', '<i4'), ('f2', 'S7'), ('f3', '<f8'), ('f4', 'S7'), ('f5', '<f8'), ('f6', 'S7'), ('f7', '<f8'), ('f8', 'S7'), ('f9', '<f8'), ('f10', 'S5'), ('f11', 'S1'), ('f12', '<i4')])
Adding usecols:
In [55]: dly_usecols = [1,7]
In [56]: data=np.genfromtxt(txt.splitlines(),dtype=None, comments=None,usecols=dly_usecols)
In [57]: data
Out[57]:
array([(39584, 0.), (39585, 0.)],
dtype=[('f0', '<i4'), ('f1', '<f8')])
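Applied to the function in the question, the two changes might look like the sketch below. It assumes the columns are separated by whitespace (so the fixed-width delimiter list is dropped), it uses node_number rather than 'node number' as the field name, and it reads from an in-memory copy of the two sample lines. For the real file you would presumably pass the filename and keep skip_header=4 as in the question. parse_nodfor and sample are illustrative names, not part of the original code.

```python
import io

import numpy as np

# In-memory stand-in for the 'nodfor' file (the two sample lines from the question).
sample = (
    "nd# 39584 xforce= 0.0000E+00 yforce= 0.0000E+00 "
    "zforce= 0.0000E+00 energy= 0.0000E+00 setid = 1\n"
    "nd# 39585 xforce= 0.0000E+00 yforce= 0.0000E+00 "
    "zforce= 0.0000E+00 energy= 0.0000E+00 setid = 1\n"
)


def parse_nodfor(source):
    """Hypothetical helper: read the node number and z force from each line."""
    return np.genfromtxt(
        source,
        comments=None,   # do not treat '#' in 'nd#' as the start of a comment
        usecols=(1, 7),  # column 1 follows 'nd#', column 7 follows 'zforce='
        dtype=[("node_number", np.int32), ("z_force", np.float64)],
    )


force = parse_nodfor(io.StringIO(sample))
print(force["node_number"].tolist())  # [39584, 39585]
print(force["z_force"].tolist())      # [0.0, 0.0]
```

Giving the two fields names in dtype makes it possible to pull each column out by name instead of by position.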
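The same thing happens with the positional (fixed-width) delimiter: the '#' is still treated as the start of a comment.
So the column delimiter does not override comments. The code probably strips comments from each line before it goes on to split the line with the delimiter.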
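To illustrate that order of operations, here is a simplified pure-Python sketch. split_fields is a hypothetical helper that only mimics the suspected behaviour (drop everything from the comment marker onward, then split by whitespace or by column widths). It is not numpy's actual implementation.

```python
def split_fields(line, comments="#", widths=None):
    """Hypothetical helper: strip comments first, then split into columns."""
    if comments is not None:
        # Everything from the comment marker onward is discarded first.
        line = line.split(comments)[0]
    line = line.strip("\r\n")
    if widths is None:
        # Whitespace-delimited columns.
        return line.split()
    # Fixed-width columns: slice whatever is left of the line by the widths.
    fields, start = [], 0
    for width in widths:
        fields.append(line[start:start + width])
        start += width
    return fields


line = "nd# 39584 zforce= 0.0000E+00"

print(split_fields(line))                 # ['nd']
print(split_fields(line, comments=None))  # ['nd#', '39584', 'zforce=', '0.0000E+00']
print(split_fields(line, widths=[3, 6]))  # ['nd', '']
print(split_fields(line, comments=None, widths=[3, 6]))  # ['nd#', ' 39584']
```

In this sketch the delimiter never sees the text after '#', whichever splitting mode is used, which matches the truncated output above.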