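I have five .xpt data files that I want to read in Python. I want to build a single array from the data in these files, for a machine learning project. I am trying to use the code below, but I get an error on the line "with xport.Reader(fname) as reader:". I am also not sure whether fixing this error will make the rest of the code work. Is there another approach I could use?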
# imports to read the xpt file data
import xport
import numpy as np
import os
# read the xpt file data
FN = ["BMX.XPT", "BMX_B.XPT", "BMX_C.XPT", "BMX_D.XPT", "BMX_E.XPT"]
def get_data(fname):
    Z = {}
    H = None
    with xport.Reader(fname) as reader:
        for row in reader:
            if H is None:
                H = row.keys()
                H.remove("SEQN")
                H.sort()
            Z[row["SEQN"]] = [row[k] for k in H]
    return Z, H
# call get_data method on each file
D,VN =[],[]
for fn in FN:
    fn_full = os.path.join("../Data/", fn)
    X, H = get_data(fn_full)
    s = fn.replace(".XPT", "")
    H = [s + ":" + x for x in H]
    D.append(X)
    VN += H
## The sequence numbers that are in all data sets
KY = set(D[0].keys())
for d in D[1:]:
    KY &= set(d.keys())
KY = list(KY)
KY.sort()
def to_float(x):
    try:
        return float(x)
    except ValueError:
        return float("nan")
## Merge the data
Z = []
for ky in KY:
    z = []
    map(z.extend, (d[ky] for d in D))
    ## equivalent to
    ## for d in D:
    ##     z.extend(d[ky])
    z = [to_float(a) for a in z]
    ## equivalent to
    ## map(to_float, z)
    Z.append(z)
Z = np.array(Z)
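A note on the merge step, in case this is run under Python 3: there, map returns a lazy iterator, so map(z.extend, ...) never actually calls z.extend and every z stays empty. Likewise, row.keys() returns a view that has no .remove() or .sort(). Below is a minimal sketch of the same merge written with an explicit loop. The helper name merge_rows and the sample dictionaries are made up for illustration and stand in for the parsed files; catching TypeError in to_float is also an addition, so that None is mapped to NaN.

```python
def to_float(x):
    """Convert x to float; unparseable or missing values become NaN."""
    try:
        return float(x)
    except (TypeError, ValueError):  # TypeError covers None (an added assumption)
        return float("nan")


def merge_rows(datasets):
    """Merge {SEQN: [values]} dicts on the keys common to all of them.

    Hypothetical helper: returns the sorted common keys and one merged
    row of floats per key.
    """
    # Sequence numbers that appear in every data set, in sorted order
    common = set(datasets[0])
    for d in datasets[1:]:
        common &= set(d)
    keys = sorted(common)

    merged = []
    for ky in keys:
        z = []
        for d in datasets:  # explicit loop; map() would be lazy in Python 3
            z.extend(d[ky])
        merged.append([to_float(a) for a in z])
    return keys, merged


# Made-up sample data standing in for two parsed files
d1 = {1.0: ["70.5", ""], 2.0: ["81", "3"], 3.0: ["60", "1"]}
d2 = {2.0: [170.0], 3.0: [None], 4.0: [155.0]}

keys, rows = merge_rows([d1, d2])
print(keys)
print(rows)
```

Passing rows to np.array then gives the two-dimensional array that the last line of the code is after.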
Answer 0: (score: 0)
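It seems that xport.Reader expects a file object as input, not a string. You should open the file yourself and pass the open file, rather than passing the file name as the argument.
For example: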
with open(fname, "rb") as f:  # XPT is a binary format, so open it in binary mode
    reader = xport.Reader(f)
    ....
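As for whether there is another way: pandas can read SAS transport files directly with pandas.read_sas, which avoids the row-by-row loop altogether. Below is a rough sketch, assuming pandas is installed, that every file has a SEQN column, and that the files live under ../Data/ as in the question. The helper names load_frames, merge_on_seqn, to_array and build_array are made up for illustration.

```python
import os

import pandas as pd

FN = ["BMX.XPT", "BMX_B.XPT", "BMX_C.XPT", "BMX_D.XPT", "BMX_E.XPT"]


def load_frames(data_dir, filenames):
    """Read each XPT file into a DataFrame indexed by SEQN."""
    frames = []
    for fn in filenames:
        df = pd.read_sas(os.path.join(data_dir, fn), format="xport")
        df = df.set_index("SEQN")
        # Prefix columns with the file stem, e.g. "BMX_B:" + column name
        stem = os.path.splitext(fn)[0]
        frames.append(df.add_prefix(stem + ":"))
    return frames


def merge_on_seqn(frames):
    """Keep only SEQN values present in every frame; join columns side by side."""
    return pd.concat(frames, axis=1, join="inner").sort_index()


def to_array(merged):
    """Convert to a float array; values that are not numeric become NaN."""
    return merged.apply(pd.to_numeric, errors="coerce").to_numpy(dtype=float)


def build_array(data_dir, filenames):
    """Hypothetical end-to-end helper: returns (array, column names)."""
    merged = merge_on_seqn(load_frames(data_dir, filenames))
    return to_array(merged), list(merged.columns)
```

Calling build_array("../Data/", FN) would then play the role of Z and VN in the question's code. The inner join mirrors the original intersection over SEQN, so rows missing from any one file are dropped.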