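Constructing a list from a file using a generator

Time: 2018-10-02 18:52:45

Tags: python

I'm writing a script that needs to read a matrix from a specific location in a large file. The part of the file I'm interested in looks like this: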

VOLUME and BASIS-vectors are now :
 -----------------------------------------------------------------------------
  energy-cutoff  :      500.00
  volume of cell :      478.32
      direct lattice vectors                 reciprocal lattice vectors
     7.831488362  0.000000000  0.000000000     0.127689649  0.000000000  0.000000000
     0.000000000  7.773615667  0.000000000     0.000000000  0.128640268  0.000000000
     0.000000000  0.000000000  7.856881120     0.000000000  0.000000000  0.127276967

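I need the reciprocal lattice vectors. There are many ways to get these numbers, but the file is thousands of lines long, so I can't (or shouldn't) store the whole thing as a list of lines. That restriction makes it harder to extract the data I want. Here is what I have so far: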

with open('OUTCAR','r') as read_outcar:
    for line in read_outcar:
        if 'VOLUME' in line:
            for i in range(5):  #skip to line with data
                next(read_outcar)
            buffer = line.split()
            x = [float(buffer(i+3)) for i in buffer]
            next(read_outcar)
            buffer = line.split()
            y = [float(buffer(i+3)) for i in buffer]
            next(read_outcar)
            buffer = line.split()
            z = [float(buffer(i+3)) for i in buffer]
            break

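There are two problems here:

1.) I'm not sure whether my use of 'next' is correct or appropriate, but I don't know how else to get lines other than the current one from the file's iterator.

2.) My generator doesn't work. The interpreter raises a TypeError because I'm apparently trying to concatenate str and int types. What I want is a list of floats for each row of the reciprocal lattice matrix.

Any help with this would be greatly appreciated. Thanks in advance.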

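3 answers:

Answer 0: (score: 1)

There are a few problems with the code:

  • next returns the next item from the iterator. Since you split line, you need to capture that return value with line = next(read_outcar); otherwise line still holds the 'VOLUME' line.
  • buffer is a list, so it is indexed with square brackets (buffer[...]), not called with parentheses. In the original, i also iterates over the strings in buffer, so i+3 adds a str and an int, which is the source of the TypeError. Since you seem to be interested in the last three elements, you can access them with buffer[-3:].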
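To show the two points above in isolation, here is a minimal sketch. It uses io.StringIO as a stand-in for the open file (an assumption made only so the snippet runs without OUTCAR), and the variable names sample and reciprocal are illustrative.

```python
import io

# Stand-in for an open file: like a real file object, it is an iterator over lines.
sample = io.StringIO(
    "      direct lattice vectors                 reciprocal lattice vectors\n"
    "     7.831488362  0.000000000  0.000000000     0.127689649  0.000000000  0.000000000\n"
)

next(sample)            # return value discarded: the header line is simply skipped
line = next(sample)     # return value kept: line now holds the data row

buffer = line.split()   # a list of six strings, indexed with [] rather than ()
reciprocal = [float(b) for b in buffer[-3:]]  # the last three columns as floats
print(reciprocal)
```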

Here is the modified code:

with open('OUTCAR') as read_outcar:
    for line in read_outcar:
        if 'VOLUME' in line:
            for i in range(5):  #skip to line with data
                line = next(read_outcar)
            buffer = line.split()
            x = [float(b) for b in buffer[-3:]]
            line = next(read_outcar)
            buffer = line.split()
            y = [float(b) for b in buffer[-3:]]
            line = next(read_outcar)
            buffer = line.split()
            z = [float(b) for b in buffer[-3:]]
            print(f'x = {x}, y = {y}, z = {z}')
            break
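The three repeated next/split/float steps can also be folded into one expression. The following is only a sketch under the layout shown in the question (four lines between the 'VOLUME' line and the first data row, then three rows). The function name read_reciprocal_vectors and its parameters are hypothetical, and io.StringIO again stands in for the real OUTCAR file.

```python
import io
from itertools import islice

SAMPLE = """some earlier output
 VOLUME and BASIS-vectors are now :
 -----------------------------------------------------------------------------
  energy-cutoff  :      500.00
  volume of cell :      478.32
      direct lattice vectors                 reciprocal lattice vectors
     7.831488362  0.000000000  0.000000000     0.127689649  0.000000000  0.000000000
     0.000000000  7.773615667  0.000000000     0.000000000  0.128640268  0.000000000
     0.000000000  0.000000000  7.856881120     0.000000000  0.000000000  0.127276967

some later output
"""

def read_reciprocal_vectors(lines, marker='VOLUME', skip=4, rows=3):
    """Return the last three columns of the `rows` lines found `skip` lines after `marker`."""
    lines = iter(lines)  # make sure the loop and islice share one iterator
    for line in lines:
        if marker in line:
            # islice skips `skip` lines, then yields the next `rows` lines
            block = islice(lines, skip, skip + rows)
            return [[float(b) for b in row.split()[-3:]] for row in block]
    return None  # marker not found

matrix = read_reciprocal_vectors(io.StringIO(SAMPLE))
print(matrix)
```

With a real file, the same function could be called as read_reciprocal_vectors(read_outcar) inside the with block. Only one line is held in memory at a time.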

Answer 1: (score: 0)

It seems to me you could do something like this:

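starting_row = 5  # 0-based index of the first data row
filename = r"file.txt"
def make_me_a_generator(filename = None):
    with open(filename, 'r') as f:
        # iterate over the file object lazily; f.readlines() would load every line
        for index, line in enumerate(f):
            if index >= starting_row:
                # split() with no argument ignores repeated spaces and the newline
                row = line.split()[-3:]
                if len(row) < 3:
                    break  # blank or short line: the matrix has ended

                x = float(row[0])
                y = float(row[1])
                z = float(row[2])
                print(f'{x} {y} {z}')
                yield x, y, z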

This turns the file into a generator that you can consume as needed.

Answer 2: (score: 0)
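As a rough illustration of what consuming a generator "as needed" looks like, the sketch below uses a hypothetical stand-in generator, iter_rows, fed from io.StringIO rather than a real file. It is not the function from this answer, only the same idea in a self-contained form.

```python
import io

def iter_rows(lines):
    """Hypothetical stand-in: yield the last three columns of each line as floats."""
    for line in lines:
        fields = line.split()
        if len(fields) >= 3:  # ignore blank or short lines
            yield tuple(float(v) for v in fields[-3:])

data = io.StringIO(
    "     7.831488362  0.000000000  0.000000000     0.127689649  0.000000000  0.000000000\n"
    "     0.000000000  7.773615667  0.000000000     0.000000000  0.128640268  0.000000000\n"
    "     0.000000000  0.000000000  7.856881120     0.000000000  0.000000000  0.127276967\n"
)

rows = iter_rows(data)  # nothing has been read yet
first = next(rows)      # pulls exactly one row
rest = list(rows)       # drains whatever is left
print(first, rest)
```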

skip_lines = 0
read_lines = 0
with open('OUTCAR') as read_outcar:
    for line in read_outcar:
        if 'VOLUME' in line:
            skip_lines = 4
            read_lines = 3  # the matrix has three rows; a fourth read would hit the blank line
        elif skip_lines:
            skip_lines -= 1
        elif read_lines:
            read_lines -= 1

            buffer = line.split()
            x = [float(b) for b in buffer[-3:]]
            print(x)

Or use a while loop:

    while True:
        line = next(read_outcar, '')  # '' is returned once the file is exhausted
        if not line: break
        ...
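For reference, one way the while-loop variant might be completed is sketched below. The function name read_block and its parameters are hypothetical, the skip/read counters mirror the ones used earlier in this answer, and io.StringIO stands in for the real file. Blank lines in a file are "\n", which is truthy, so only the end of the file produces the '' sentinel.

```python
import io

SAMPLE = """ VOLUME and BASIS-vectors are now :
 -----------------------------------------------------------------------------
  energy-cutoff  :      500.00
  volume of cell :      478.32
      direct lattice vectors                 reciprocal lattice vectors
     7.831488362  0.000000000  0.000000000     0.127689649  0.000000000  0.000000000
     0.000000000  7.773615667  0.000000000     0.000000000  0.128640268  0.000000000
     0.000000000  0.000000000  7.856881120     0.000000000  0.000000000  0.127276967

trailing output
"""

def read_block(lines, marker='VOLUME', skip=4, rows=3):
    """Collect the last three columns of `rows` lines found `skip` lines after `marker`."""
    lines = iter(lines)
    result = []
    skip_lines = read_lines = 0
    while True:
        line = next(lines, '')  # '' signals that the iterator is exhausted
        if not line:
            break
        if marker in line:
            skip_lines, read_lines = skip, rows
        elif skip_lines:
            skip_lines -= 1
        elif read_lines:
            read_lines -= 1
            result.append([float(b) for b in line.split()[-3:]])
            if not read_lines:
                break  # all rows collected; stop reading the file
    return result

vectors = read_block(io.StringIO(SAMPLE))
print(vectors)
```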