How do I manipulate data after numpy.loadtxt?

Time: 2013-10-14 13:44:55

Tags: python arrays numpy file-io matrix

I have raw data like the following: a text file whose first row holds the x labels and whose first column holds the y labels. Call the file '131014-data-xy-conv-1.txt'.

Y/X (mm),   0,  10, 20, 30, 40
686.6,  -5.02,  -0.417, 0,  100.627,    0
694.08, -5.02,  -4.529, -17.731,    -5.309, -3.535
701.56, 1.869,  -4.529, -17.731,    -5.309, -3.535
709.04, 1.869,  -4.689, -17.667,    -5.704, -3.482
716.52, 4.572,  -4.689, -17.186,    -5.704, -2.51 
724,    4.572,  -4.486, -17.186,    -5.138, -2.51
731.48, 6.323,  -4.486, -16.396,    -5.138, -1.933
738.96, 6.323,  -4.977, -16.396,    -5.319, -1.933
746.44, 7.007,  -4.251, -16.577,    -5.319, -1.688
753.92, 7.007,  -4.251, -16.577,    -5.618, -1.688
761.4,  7.338,  -3.514, -16.78, -5.618, -1.207
768.88, 7.338,  -3.514, -16.78, -4.657, -1.207
776.36, 7.263,  -3.877, -15.99, -4.657, -0.822

(Q1) As you can see, the raw data has the x labels in the first row and the y labels in the first column. If I use the numpy.loadtxt function, how do I split off "xs" and "ys"?

rawdata = numpy.loadtxt('131014-data-xy-conv-1.txt')
xs, ys, data = func(rawdata)

Do I have to implement extra logic myself, or is there a function for this?

3 Answers:

Answer 0: (score: 5)

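np.loadtxt cannot treat the first row separately on its own, so you have to do something clever. I'll give two ways: the first is shorter, but the second is more straightforward.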

1) You can use a "hack" and read the first row as the column names:

y_and_data = np.genfromtxt('131014-data-xy-conv-1.txt', names=True, delimiter=',')
x = np.array(y_and_data.dtype.names[1:], int)
y = y_and_data['YX_mm']
# np.float was removed in NumPy 1.24; the builtin float is equivalent here
data = y_and_data.view(float).reshape(-1, len(y_and_data.dtype))[:,1:]
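
To make the header "hack" easier to inspect, here is a minimal sketch on a three-row excerpt of the sample data. It reads from an in-memory string rather than the file, and it uses numpy.lib.recfunctions.structured_to_unstructured instead of the .view(...) trick; both choices are my substitutions for illustration, not part of the answer.

```python
import io

import numpy as np
from numpy.lib.recfunctions import structured_to_unstructured

# Three-row excerpt of the sample file, kept in memory for this sketch.
sample = """Y/X (mm),   0,  10, 20, 30, 40
686.6,  -5.02,  -0.417, 0,  100.627,    0
694.08, -5.02,  -4.529, -17.731,    -5.309, -3.535
701.56, 1.869,  -4.529, -17.731,    -5.309, -3.535
"""

# names=True turns the header row into the field names of a structured array.
table = np.genfromtxt(io.StringIO(sample), names=True, delimiter=',')
names = table.dtype.names  # genfromtxt sanitizes 'Y/X (mm)' into a valid field name

x = np.array([int(name) for name in names[1:]])   # remaining field names are the x labels
y = table[names[0]]                               # first field holds the y values
data = structured_to_unstructured(table)[:, 1:]   # plain 2-D float array without the y column
```

Printing names shows why the answer indexes the y column as 'YX_mm': the slash, space, and parentheses of the original label are not allowed in the sanitized field name.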

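2) But I recommend reading the first line separately, saving it, and then reading the rest with loadtxt (or with genfromtxt, which is what I use and recommend):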

with open('131014-data-xy-conv-1.txt', 'r') as f:
    x = np.array(f.readline().split(',')[1:], int)
    y_and_data = np.genfromtxt(f, delimiter=',')
y = y_and_data[:,0]
data = y_and_data[:,1:]

How it works: first open the file and call it f:

with open('131014-data-xy-conv-1.txt', 'r') as f:

    firstline = f.readline()           # read off the first line
    firstvalues = firstline.split(',') # split it on the comma
    xvalues = firstvalues[1:]          # and keep all but the first element
    x = np.array(xvalues, int)         # make it an array of integers (or float if you prefer)

Now that the first line has been read off with f.readline, the rest can be read with genfromtxt:

    y_and_data = np.genfromtxt(f, delimiter=',')

Now, as the other answers show, split the rest:

y = y_and_data[:,0]       # the first column is the y-values
data = y_and_data[:,1:]   # the remaining columns are the data

Here is the output:

In [58]: with open('131014-data-xy-conv-1.txt', 'r') as f:
   ....:     x = np.array(f.readline().split(',')[1:], int)
   ....:     y_and_data = np.genfromtxt(f, delimiter=',')
   ....: y = y_and_data[:,0]
   ....: data = y_and_data[:,1:]
   ....: 

In [59]: x
Out[59]: array([ 0, 10, 20, 30, 40])

In [60]: y
Out[60]: 
array([ 686.6 ,  694.08,  701.56,  709.04,  716.52,  724.  ,  731.48,
        738.96,  746.44,  753.92,  761.4 ,  768.88,  776.36])

In [61]: data
Out[61]: 
array([[  -5.02 ,   -0.417,    0.   ,  100.627,    0.   ],
       [  -5.02 ,   -4.529,  -17.731,   -5.309,   -3.535],
       [   1.869,   -4.529,  -17.731,   -5.309,   -3.535],
       [   1.869,   -4.689,  -17.667,   -5.704,   -3.482],
       [   4.572,   -4.689,  -17.186,   -5.704,   -2.51 ],
       [   4.572,   -4.486,  -17.186,   -5.138,   -2.51 ],
       [   6.323,   -4.486,  -16.396,   -5.138,   -1.933],
       [   6.323,   -4.977,  -16.396,   -5.319,   -1.933],
       [   7.007,   -4.251,  -16.577,   -5.319,   -1.688],
       [   7.007,   -4.251,  -16.577,   -5.618,   -1.688],
       [   7.338,   -3.514,  -16.78 ,   -5.618,   -1.207],
       [   7.338,   -3.514,  -16.78 ,   -4.657,   -1.207],
       [   7.263,   -3.877,  -15.99 ,   -4.657,   -0.822]])
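
The question asks for something shaped like xs, ys, data = func(...). As a rough sketch, the readline-then-genfromtxt approach can be wrapped in a small helper. The name split_labeled_grid is hypothetical, the in-memory two-row sample stands in for the real file, and the x labels are parsed as floats here (use int if you prefer).

```python
import io

import numpy as np


def split_labeled_grid(source):
    """Return (xs, ys, data) from an open text source whose first row holds
    the x labels and whose first column holds the y labels."""
    header = source.readline()
    # Skip the corner cell ('Y/X (mm)') and parse the remaining x labels.
    xs = np.array([float(token) for token in header.split(',')[1:]])
    # genfromtxt continues from the current position, i.e. after the header.
    body = np.atleast_2d(np.genfromtxt(source, delimiter=','))
    return xs, body[:, 0], body[:, 1:]


# Two-row excerpt of the sample data, kept in memory for this sketch.
sample = io.StringIO(
    "Y/X (mm),   0,  10, 20, 30, 40\n"
    "686.6,  -5.02,  -0.417, 0,  100.627,    0\n"
    "694.08, -5.02,  -4.529, -17.731,    -5.309, -3.535\n"
)
xs, ys, data = split_labeled_grid(sample)
```

With the real file, the same helper would be called inside with open('131014-data-xy-conv-1.txt') as f: so that the file is closed afterwards.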

Answer 1: (score: 1)

If you just want xs, ys, and data in separate arrays, you can do the following:

with open('131014-data-xy-conv-1.txt') as f:
    xs = np.array(f.readline().split(',')[1:], int)
# delimiter=',' is needed because the file is comma-separated
rawdata = np.loadtxt('131014-data-xy-conv-1.txt', delimiter=',', skiprows=1)
ys = rawdata[:, 0]
data = rawdata[:, 1:]

Note that the skiprows keyword makes loadtxt skip the first line of the file.
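
A small sketch of skiprows on an in-memory excerpt of the sample data. The io.StringIO stand-in is an assumption made so the snippet runs without the file. Because the sample is comma-separated, delimiter=',' is passed as well; by default loadtxt splits on whitespace.

```python
import io

import numpy as np

# Three-row excerpt of the sample file, kept in memory for this sketch.
sample = """Y/X (mm),   0,  10, 20, 30, 40
686.6,  -5.02,  -0.417, 0,  100.627,    0
694.08, -5.02,  -4.529, -17.731,    -5.309, -3.535
701.56, 1.869,  -4.529, -17.731,    -5.309, -3.535
"""

# skiprows=1 drops the header line, so only the numeric rows are parsed.
rawdata = np.loadtxt(io.StringIO(sample), delimiter=',', skiprows=1)
ys = rawdata[:, 0]      # first column: y labels
data = rawdata[:, 1:]   # remaining columns: measured values
```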

Answer 2: (score: 1)

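Adding to @bogatron's answer: passing unpack=True makes loadtxt return the transposed array, so the columns can be unpacked in one line. Note that xs sits on the skipped header row and still has to be read separately, and that unpacking into exactly three names would fail because this file has six columns:

ys, *cols = numpy.loadtxt('131014-data-xy-conv-1.txt', delimiter=',', skiprows=1, unpack=True)
data = numpy.array(cols).T   # reassemble the value columns into a 2-D array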
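
To check what unpack=True returns, the sketch below loads an in-memory excerpt of the sample data twice, once with and once without the flag. The io.StringIO stand-in and the variable names are assumptions made for illustration.

```python
import io

import numpy as np

# Three-row excerpt of the sample file, kept in memory for this sketch.
sample = """Y/X (mm),   0,  10, 20, 30, 40
686.6,  -5.02,  -0.417, 0,  100.627,    0
694.08, -5.02,  -4.529, -17.731,    -5.309, -3.535
701.56, 1.869,  -4.529, -17.731,    -5.309, -3.535
"""

rows = np.loadtxt(io.StringIO(sample), delimiter=',', skiprows=1)
cols = np.loadtxt(io.StringIO(sample), delimiter=',', skiprows=1, unpack=True)

# With unpack=True the result is transposed: one entry per column of the file.
ys, *value_columns = cols
data = np.column_stack(value_columns)   # back to one row per y value
```

Each entry of cols is one column of the file, so the number of names on the left-hand side has to match the number of columns unless a starred name collects the rest.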