I am trying to read a column of numbers from a text file that looks like this:
Some text and numbers ..., and then:
q-pt= 1 0.000000 0.000000 0.000000 1.0000000000
1 -0.066408 0.0000000
2 -0.053094 0.0000000
3 -0.037643 0.0000000
...
156 3107.735577 6.8945617
...more text file
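I am interested in reading the second column, which contains -0.066408, -0.053094, and so on.
The code I wrote somehow runs without giving any error. This is what I tried: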
import re
import sys
from string import atof
from math import exp
from numpy import *
file1 = open('castepfreq.dat', 'w')
with open('xd_geo_Efield.phonon') as file:
    File = file.readlines()
p1 = re.compile("q-pt= 1 0.000000 0.000000 0.000000 1.0000000000")
for i in range(len(File)):
    m1 = p1.search(File[i])
    if m1:
        read = int(float(File[i+1][10:23]))
        freq = (read)
        print >> file1, freq
file1.close()
It would be great if someone could help me.
Answer 0 (score: 1)
You can split each line on whitespace and then extract the second element:
with open('xd_geo_Efield.phonon') as f:
    col = [line.split()[1] for line in f]
print(col)
If your input is:
q-pt= 1 0.000000 0.000000 0.000000 1.0000000000
1 -0.066408 0.0000000
2 -0.053094 0.0000000
3 -0.037643 0.0000000
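The output will be (the leading '1' comes from the q-pt= header line):
['1', '-0.066408', '-0.053094', '-0.037643']
Or use itertools and transpose, which yields the same column wrapped in a single tuple, [('1', '-0.066408', '-0.053094', '-0.037643')]: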
# Python 2 only: izip and imap do not exist in Python 3 (use the built-in zip and map there)
from itertools import izip, islice, imap
with open('xd_geo_Efield.phonon') as f:
    col = islice(izip(*imap(str.split, f)), 1, 2)
    print(list(col))
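For readers on Python 3, the same transpose can be sketched with the built-in `zip` and `map`. This is only an illustration: the `io.StringIO` object is a stand-in for the real file so the snippet runs on its own, and the names `sample` and `second_column` are made up for this example.

```python
import io
from itertools import islice

# Stand-in for open('xd_geo_Efield.phonon'), using the sample rows shown above.
sample = io.StringIO(
    "q-pt= 1 0.000000 0.000000 0.000000 1.0000000000\n"
    "1 -0.066408 0.0000000\n"
    "2 -0.053094 0.0000000\n"
    "3 -0.037643 0.0000000\n"
)

# zip(*rows) turns rows into columns. zip stops at the shortest row,
# so only the first three columns survive for this input.
columns = zip(*map(str.split, sample))

# islice(columns, 1, 2) keeps just the column at index 1 (the second one).
second_column = list(islice(columns, 1, 2))
print(second_column)
```

Note that unpacking with `*` reads every line into memory before transposing, so for a large file the list comprehension is likely the lighter option.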
If you want numbers rather than strings, cast each value to float:
[float(line.split()[1]) for line in f]
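Also, if you want to skip the header so that the 1 is ignored, call next(f) on the file object before running the rest of the code, for example: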
with open('xd_geo_Efield.phonon') as f:
    next(f)  # skip the q-pt= header line
    col = [float(line.split()[1]) for line in f]
    print(list(col))
Which will output:
[-0.066408, -0.053094, -0.037643]
If you want to ignore the preceding data and start only from the q-pt=.. line, you can use itertools.dropwhile to discard the lines at the beginning:
from itertools import dropwhile
with open('xd_geo_Efield.phonon') as f:
    col = [float(line.split()[1]) for line in dropwhile(
        lambda x: not x.startswith("q-pt="), f)]
    print(list(col))
If you also want to ignore that line itself, you can call next again, but this time on the dropwhile object:
from itertools import dropwhile
with open('xd_geo_Efield.phonon') as f:
    dp = dropwhile(lambda x: not x.startswith("q-pt="), f)
    next(dp)  # discard the q-pt= line itself
    col = [float(line.split()[1]) for line in dp]
    print(list(col))
So for the input:
some 1 1 1 1 1
meta 2 2 2 2 2
data 3 3 3 3 3
and 4 4 4 4 4
numbers 5 5 5 5 5
q-pt= 1 0.000000 0.000000 0.000000 1.0000000000
1 -0.066408 0.0000000
2 -0.053094 0.0000000
3 -0.037643 0.0000000
3 -0.037643 0.0000000
The output will be:
[-0.066408, -0.053094, -0.037643, -0.037643]
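The dropwhile approach above can be checked without the real file. The sketch below wraps it in a helper and feeds it the sample input through `io.StringIO`. The function name `second_column_after_marker` and the `marker` parameter are hypothetical, and passing a default to `next` is an addition so that a file with no q-pt= line returns an empty list instead of raising StopIteration.

```python
import io
from itertools import dropwhile

def second_column_after_marker(lines, marker="q-pt="):
    """Return the second column, as floats, of the rows after the marker line."""
    # Discard everything before the first line that starts with the marker.
    rows = dropwhile(lambda line: not line.startswith(marker), lines)
    # Discard the marker line itself; the default avoids StopIteration
    # when the marker never appears.
    next(rows, None)
    return [float(line.split()[1]) for line in rows]

# Stand-in for the file, using the sample input shown above.
sample = io.StringIO(
    "some 1 1 1 1 1\n"
    "meta 2 2 2 2 2\n"
    "data 3 3 3 3 3\n"
    "and 4 4 4 4 4\n"
    "numbers 5 5 5 5 5\n"
    "q-pt= 1 0.000000 0.000000 0.000000 1.0000000000\n"
    "1 -0.066408 0.0000000\n"
    "2 -0.053094 0.0000000\n"
    "3 -0.037643 0.0000000\n"
    "3 -0.037643 0.0000000\n"
)
col = second_column_after_marker(sample)
print(col)
```

With a real file you would pass the open file object in place of `sample`.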
If the lines have leading whitespace, strip it off with lstrip:
from itertools import dropwhile, imap, takewhile
with open('xd_geo_Efield.phonon') as f:
    # for python3 just use map
    dp = dropwhile(lambda x: not x.startswith("q-pt="), imap(str.lstrip, f))
    next(dp)
    col = [float(line.split(None, 2)[1]) for line in takewhile(lambda x: x.strip() != "", dp)]
    print(list(col))
takewhile will keep taking lines until we hit the first empty line after the data, so any text that follows the block is never parsed.
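As a rough Python 3 sketch of this last variant, the helper below combines lstrip, dropwhile and takewhile. The name `read_block`, the `marker` parameter and the sample text (with invented leading spaces and a trailing line) are assumptions made for illustration, not taken from the real .phonon file.

```python
import io
from itertools import dropwhile, takewhile

def read_block(lines, marker="q-pt="):
    """Return the second column of the block that follows the marker line."""
    # Remove leading whitespace so startswith() can match indented lines.
    stripped = map(str.lstrip, lines)
    rows = dropwhile(lambda line: not line.startswith(marker), stripped)
    next(rows, None)  # skip the marker line itself, if it was found
    # Stop at the first blank line after the marker.
    block = takewhile(lambda line: line.strip() != "", rows)
    return [float(line.split(None, 2)[1]) for line in block]

# Hypothetical sample: indented rows, a blank line, then unrelated text.
sample = io.StringIO(
    "some header text\n"
    "     q-pt=    1    0.000000  0.000000  0.000000      1.0000000000\n"
    "       1      -0.066408          0.0000000\n"
    "       2      -0.053094          0.0000000\n"
    "\n"
    "trailing text that is never parsed\n"
)
freqs = read_block(sample)
print(freqs)
```

To get the output file from the question, one option would be to open 'castepfreq.dat' for writing and write one value of `freqs` per line.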