Question

I am trying to read a column of numbers from a text file that looks like this:

Some text and numbers......, and then:

 q-pt=    1    0.000000  0.000000  0.000000      1.0000000000
   1      -0.066408              0.0000000                      
   2      -0.053094              0.0000000                      
   3      -0.037643              0.0000000 
   ...
   156    3107.735577            6.8945617
...more text file

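I am interested in reading the second column, the one containing -0.066408, -0.053094, and so on.
The code I tried somehow runs without giving any error, but it does not do what I want. I tried this: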

import re                                                                            
import sys                                                                           
from string import atof                                                              
from math import exp                                                                 
from numpy import *                                                                  

file1 = open('castepfreq.dat', 'w')                                                  
with open('xd_geo_Efield.phonon') as file:                                           
    File = file.readlines()                                                          
    p1 = re.compile("q-pt=    1    0.000000  0.000000  0.000000      1.0000000000")  
    for i in range(len(File)):                                                       
        m1 = p1.search(File[i])                                                      
        if m1:
            read = int(float(File[i+1][10:23]))      
            freq = (read)                                                            
    print >> file1, freq    
file1.close()

If anyone could help me, that would be great.

Answer 1

You can split each line on whitespace and then extract the second element:

with open('xd_geo_Efield.phonon') as f:
    col = [line.split()[1] for line in f]
    print(col)

If your input is:

q-pt=    1    0.000000  0.000000  0.000000      1.0000000000
1      -0.066408              0.0000000
2      -0.053094              0.0000000
3      -0.037643              0.0000000

The output will be (the leading '1' comes from the q-pt= header line):

['1', '-0.066408', '-0.053094', '-0.037643']
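To see why index 1 picks out the second column, it can help to look at what split() returns for a single line. The following is a small sketch using one data line copied from the question; the variable names are chosen only for illustration.

```python
# One data line from the question, including its leading and trailing spaces.
line = "   1      -0.066408              0.0000000                      \n"

# split() with no arguments splits on any run of whitespace and ignores
# leading and trailing whitespace, so no empty strings are produced.
fields = line.split()
print(fields)   # ['1', '-0.066408', '0.0000000']

# Index 1 is therefore the second column; float() turns the text into a number.
second = float(fields[1])
print(second)   # -0.066408
```

Because leading spaces are discarded by split(), the indentation of the data rows does not affect which index holds the second column.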

Or use itertools and transpose the rows, which yields the column as a single tuple inside a list:

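# Python 2: izip and imap are the lazy counterparts of zip and map.
# On Python 3, drop them from the import and use the built-in zip and map instead.
from itertools import izip, islice, imap
with open('xd_geo_Efield.phonon') as f:
    # transpose the rows into columns, then keep only the column at index 1
    col = islice(izip(*imap(str.split,f)), 1,2)
    print(list(col))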
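For readers on Python 3, where izip and imap no longer exist, the same transpose can be sketched with the built-in zip and map. The io.StringIO object below is only a stand-in for the opened file so that the example runs on its own; the sample lines are the ones used in this answer.

```python
import io
from itertools import islice

# Stand-in for the open file (an assumption made so the sketch is self-contained).
sample = io.StringIO(
    "q-pt=    1    0.000000  0.000000  0.000000      1.0000000000\n"
    "1      -0.066408              0.0000000\n"
    "2      -0.053094              0.0000000\n"
    "3      -0.037643              0.0000000\n"
)

# map(str.split, sample) yields one list of fields per line.
# zip(*rows) transposes rows into columns, truncated to the shortest row.
# islice(..., 1, 2) keeps only the column at index 1.
col = list(islice(zip(*map(str.split, sample)), 1, 2))
print(col)  # [('1', '-0.066408', '-0.053094', '-0.037643')]
```

Two things to keep in mind with this approach: the `*` unpacking reads every line into memory before transposing, and because zip stops at the shortest row, a single blank line in the input would make the result empty.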

If you want numbers rather than strings, cast each value to float:

col = [float(line.split()[1]) for line in f]

Also, if you want to skip the header line and ignore the 1, call next(f) on the file object before running the rest of the code, e.g.:

with open('xd_geo_Efield.phonon') as f:
    next(f)  # skip the header line
    col = [float(line.split()[1]) for line in f]
    print(col)

Which would output:

[-0.066408, -0.053094, -0.037643]

If you want to ignore the metadata and start only at the q-pt=... line, you can use itertools.dropwhile to skip the lines at the beginning:

from itertools import dropwhile

with open('xd_geo_Efield.phonon') as f:
    col = [float(line.split()[1]) for line in dropwhile(
           lambda x: not x.startswith("q-pt="), f)]
    print(col)

If you also want to skip that line itself, call next again, this time on the dropwhile object:

from itertools import dropwhile

with open('xd_geo_Efield.phonon') as f:
    dp = dropwhile(lambda x: not x.startswith("q-pt="), f)
    next(dp)
    col = [float(line.split()[1]) for line in dp]
    print(col)

So for the input:

some 1 1 1 1 1
meta 2 2 2 2 2
data 3 3 3 3 3
and 4 4 4 4 4
numbers 5 5 5 5 5
q-pt=    1    0.000000  0.000000  0.000000      1.0000000000
1      -0.066408              0.0000000
2      -0.053094              0.0000000
3      -0.037643              0.0000000
3      -0.037643              0.0000000

The output will be:

[-0.066408, -0.053094, -0.037643, -0.037643]

If the lines have leading whitespace, lstrip it off first:

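# Python 2 only: imap. On Python 3, import just dropwhile and takewhile
# and use the built-in map in place of imap.
from itertools import dropwhile, imap, takewhile

with open('xd_geo_Efield.phonon') as f:
    # strip leading whitespace so that startswith("q-pt=") can match
    dp = dropwhile(lambda x: not x.startswith("q-pt="), imap(str.lstrip,f))
    next(dp)  # skip the q-pt= line itself
    # stop at the first blank line
    col = [float(line.split(None,2)[1]) for line in takewhile(lambda x: x.strip() != "", dp)]
    print(col)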

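takewhile keeps taking lines until it reaches the first blank line (here, the one at the end of the file); if there is no blank line, it simply runs to the end of the file.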
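Putting the lstrip, dropwhile, next and takewhile steps together, a Python 3 version could look roughly like the sketch below. The function name second_column and its marker parameter are hypothetical names introduced here, and the io.StringIO sample is only a stand-in for the real file, modelled on the excerpt in the question.

```python
import io
from itertools import dropwhile, takewhile

def second_column(lines, marker="q-pt="):
    """Return column 2, as floats, of the block that follows the marker line.

    Hypothetical helper for illustration; `lines` is any iterable of strings.
    """
    stripped = map(str.lstrip, lines)                    # remove leading whitespace
    block = dropwhile(lambda s: not s.startswith(marker), stripped)
    next(block, None)                                    # discard the marker line, if any
    rows = takewhile(lambda s: s.strip() != "", block)   # stop at the first blank line
    return [float(row.split(None, 2)[1]) for row in rows]

# Stand-in for the file, with leading spaces as in the question.
sample = io.StringIO(
    "some header text 1 2 3\n"
    " q-pt=    1    0.000000  0.000000  0.000000      1.0000000000\n"
    "   1      -0.066408              0.0000000\n"
    "   2      -0.053094              0.0000000\n"
    "   3      -0.037643              0.0000000\n"
    "\n"
    "trailing text that is never parsed\n"
)

freqs = second_column(sample)
print(freqs)  # [-0.066408, -0.053094, -0.037643]
```

With the real file, the call would be second_column(f) inside `with open('xd_geo_Efield.phonon') as f:`. Note that this sketch relies on a blank line ending the block: if text follows the numbers directly, float() would raise a ValueError on the first non-numeric field.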

Reading numbers from a text file in Python
