How can I read the following sample text into Python arrays?

Time: 2016-01-12 22:28:49

Tags: python numpy

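I want to write Python code that reads a text file (sample provided below) and creates two arrays, x(a,b,i) and y(a,b,i), where

In the data:

  • The first number appears only once and gives the number of blocks in the file (3 blocks in the sample below).
  • The 2nd and 3rd lines are repeated before every block.

In addition, the 3rd line determines the number of rows in each block (see the loops over aa and bb).

Sample data: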

3
23.32
2 
  0.0000000e+00   0.0000000e+00
  5.9366029e-02   0.0000000e+00
 -3.9595528e-01  -1.1149379e-01
  1.3524440e-01   0.0000000e+00
  1.7891648e-01  -5.9942935e-02
  1.2513530e-01  -6.5704141e-02
20.58
2
  0.0000000e+00   0.0000000e+00
  1.2513530e-01  -6.5704141e-02
 -1.1417124e-01   3.4994752e-02
  2.8721574e-02  -4.8673200e-02
  5.7040237e-02  -5.2522356e-02
  1.5232250e-01   0.0000000e+00
18.79
2
  5.8818579e-02  -5.6853819e-03
  8.1824907e-03   0.0000000e+00
 -4.5727895e-02  -6.0582991e-02
  1.1766566e-01  -8.2278552e-02
 -5.8346102e-02  -1.7881959e-02
 -4.0660942e-02   9.1207909e-02

I tried the following code to store the data in a single array (rather than two), but I got stuck:

file = './sample.txt'
with open(file, mode = 'rt') as f: lines = f.readline()
bl = int( lines[0] ) # number of blocks
s = float(lines[1]) # the 1st number before every block
a = int(lines[2]) # the 2nd number right before every block. It determines how many lines every block has (see loops for aa & bb)

# s = np.zeros(bl)
# y = np.zeros(bl)
# x = np.zeros(bl,a+1,a)

for i in range(bl,0,-1): # reverse iteration (3 to 1)
    s[i] = float( lines[] ) # 
    y[i] = int( lines[] )

    for aa in range(a + 1):
        for bb in range(aa+1):
            x(aa,bb,i) = np.array([row.split() for row in lines[3+i*bl+(i*2):3+
            (i+1)*bl+(i*2)]],np.float128) # I know this isn't right; see below

Note: if I use the last line on its own with a specific value of i, it works for a single block:

i = 1
bl = int(lines[0])  # lines[0] is a string, so convert it before doing arithmetic
np.array([row.split() for row in lines[3+i*bl+(i*2):3+(i+1)*bl+(i*2)]],
         np.float128)

Update & solution: I solved my problem with the code below. I am new to NumPy, so my question was quite basic.

for i in np.arange(bl):
    tmp = np.array([row.split() for row in lines[3+i*bl+(i*2):3+(i+1)*bl+(i*2)]],np.float128)
    x[i,:,:] = np.reshape(tmp[:,0], (6,1))
    y[i,:,:] = np.reshape(tmp[:,1], (6,1))
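
For reference, a self-contained version of this loop might look like the sketch below. It assumes every block uses the same value of a and that the number of rows per block is (a + 1)(a + 2)/2, which is what the aa/bb loops in the attempt above imply (6 rows for a = 2, matching the sample). The function name read_blocks is made up for illustration, and plain float is used instead of np.float128, which is not available on every platform.

```python
import numpy as np

def read_blocks(lines):
    """Parse the block layout described in the question (hypothetical helper).

    Returns (s, x, y): the per-block header values and the two data columns.
    """
    lines = [ln.strip() for ln in lines if ln.strip()]  # drop blank lines
    n_blocks = int(lines[0])                  # first line: number of blocks
    s = np.zeros(n_blocks)
    x_blocks, y_blocks = [], []
    pos = 1                                   # index of the first block header
    for i in range(n_blocks):
        s[i] = float(lines[pos])              # 1st header line of the block
        a = int(lines[pos + 1])               # 2nd header line of the block
        n_rows = (a + 1) * (a + 2) // 2       # ASSUMPTION taken from the aa/bb loops
        rows = lines[pos + 2:pos + 2 + n_rows]
        data = np.array([row.split() for row in rows], dtype=float)
        x_blocks.append(data[:, 0])           # first column
        y_blocks.append(data[:, 1])           # second column
        pos += 2 + n_rows                     # jump to the next block header
    return s, np.array(x_blocks), np.array(y_blocks)
```

Calling read_blocks(f.readlines()) on the open file would then give x and y with one row per block; they can be reshaped afterwards if a different layout is needed.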

1 Answer:

Answer 0 (score: 1)

I would read the file as a list of lines, e.g.

with open('file') as f: txt=f.readlines()

txt should look something like this (simulated here by copy and paste):

In [4]: txt
Out[4]: 
['2',
 '252.6',
 '28',
 '  0.0000000e+00   0.0000000e+00',
 '  1.3524440e-01   0.0000000e+00',
 ...
 '  8.1824907e-03   0.0000000e+00',
 ' -4.5727895e-02  -6.0582991e-02']

I can find the number of blocks by converting the first line to an integer:

In [5]: int(txt[0])
Out[5]: 2

I can select the first block of data with:

In [7]: txt[3:3+14]
Out[7]: 
['  0.0000000e+00   0.0000000e+00',
 '  1.3524440e-01   0.0000000e+00',
 '  3.1416022e-02  -1.2261651e-01',
 '  5.9366029e-02   0.0000000e+00',
 '  1.7891648e-01  -5.9942935e-02',
 ' -3.9595528e-01  -1.1149379e-01',
 ' -2.6080093e-01   0.0000000e+00',
 ' -1.7136031e-02  -4.2415285e-02',
 ' -2.0362735e-01  -3.2264882e-02',
 '  1.2513530e-01  -6.5704141e-02',
 '  2.7092705e-02   0.0000000e+00',
 ' -2.5833845e-02  -1.2462721e-02',
 '  1.4509934e-01  -7.4327486e-02',
 ' -6.1786646e-02  -5.9974697e-03']

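I found the slice range by counting lines by hand, but in working code I would scan the lines, apply split to each, and count the lines that yield 2 strings.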
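
That scanning step could be sketched roughly as follows. The helper name split_blocks is made up for illustration, and the sketch assumes that header lines hold a single token while data rows hold exactly two, as in the sample shown in the question.

```python
import numpy as np

def split_blocks(lines):
    """Group consecutive two-token lines into float arrays (hypothetical helper)."""
    blocks, current = [], []
    for line in lines:
        tokens = line.split()
        if len(tokens) == 2:          # a data row: two numbers
            current.append(tokens)
        elif current:                 # a header (or blank) line ends the current block
            blocks.append(np.array(current, dtype=float))
            current = []
    if current:                       # the last block has no header after it
        blocks.append(np.array(current, dtype=float))
    return blocks
```

Each entry of the returned list is an (n_rows, 2) array, and len(blocks) can be compared with the block count given in the first line of the file as a sanity check.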

I can create an array from these lines by applying split to each one.

In [16]: x = np.array([row.split() for row in txt[3:3+14]],float)

In [17]: x
Out[17]: 
array([[ 0.        ,  0.        ],
       [ 0.1352444 ,  0.        ],
       [ 0.03141602, -0.12261651],
       [ 0.05936603,  0.        ],
       [ 0.17891648, -0.05994294],
       [-0.39595528, -0.11149379],
       [-0.26080093,  0.        ],
       [-0.01713603, -0.04241528],
       [-0.20362735, -0.03226488],
       [ 0.1251353 , -0.06570414],
       [ 0.02709271,  0.        ],
       [-0.02583385, -0.01246272],
       [ 0.14509934, -0.07432749],
       [-0.06178665, -0.00599747]])
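
Since the question asks for two separate arrays, the columns of such an array can then be sliced apart. A minimal sketch, using three of the rows shown earlier (the names rows, data, x_col and y_col are illustrative only):

```python
import numpy as np

rows = ['  0.0000000e+00   0.0000000e+00',
        '  1.3524440e-01   0.0000000e+00',
        '  3.1416022e-02  -1.2261651e-01']

# split each line on whitespace, then let NumPy parse the strings as floats
data = np.array([row.split() for row in rows], dtype=float)

x_col = data[:, 0]   # first column of every row
y_col = data[:, 1]   # second column of every row
```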

np.array is converting a list of lists of strings, like this:

[['0.0000000e+00', '0.0000000e+00'],
 ['1.3524440e-01', '0.0000000e+00'],
 ['3.1416022e-02', '-1.2261651e-01'],
 ....