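I am trying to read specific data from the text file 10-10-1CNT_pot.pot_fmt. In this case, the data I need are a, b and c, plus the FFT grid size (25, 300, 300). The only approach I have come up with so far is to index into the file by line and column position. I don't like this, because it breaks easily if the file layout changes even slightly. Can anyone suggest another approach?
See the example text file below (along with my fragile code):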
BEGIN header
Real Lattice(A) Lattice parameters(A) Cell Angles
2.4675850 0.0000000 0.0000000 a = 2.467585 alpha = 90.000000
0.0000000 30.0000000 0.0000000 b = 30.000000 beta = 90.000000
0.0000000 0.0000000 30.0000000 c = 30.000000 gamma = 90.000000
1 ! nspins
25 300 300 ! fine FFT grid along <a,b,c>
END header: data is "<a b c> pot" in units of Hartrees
Code:
# Position-based parsing: every index below assumes the exact file layout.
with open("10-10-1CNT_pot.pot_fmt", 'r') as file:
    lines = file.readlines()
# Lattice parameters: the value is the sixth whitespace-separated field.
a = lines[3].split()[5]
b = lines[4].split()[5]
c = lines[5].split()[5]
# FFT grid size: the first three fields of the grid line.
width, height, depth = lines[8].split()[:3]
Answer 0: (score: 2)
You can use regular expressions for this:
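import re

# Read the whole file into one string.
with open('your_file_name', 'r') as myfile:
    s = myfile.read()

# Capture the label and number in "a = ...", "b = ..." and "c = ...".
# The word boundary keeps "alpha" and "beta" from matching.
lattice = re.findall(r'\b([abc])\s*=\s*([-+]?\d+(?:\.\d+)?)', s)
# Capture the text before each "!" comment marker.
# The first match is nspins, the second is the FFT grid line.
grid = re.findall(r'(.*?)\s*!', s)

print(grid[1].strip())
for name, value in lattice:
    print(name, '=', value)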
Output:
25 300 300
a = 2.467585
b = 30.000000
c = 30.000000
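If you want numbers rather than strings, and you don't want to depend on the order of the "!" lines, one option is to key each match on the label text itself and stop reading at "END header". The following is only a sketch under those assumptions: parse_header, LATTICE_RE and GRID_RE are hypothetical names, and it assumes the grid line always carries the "fine FFT grid" comment shown in the sample header.

```python
import re

# Hypothetical helper names; the patterns assume the header layout shown in the question.
LATTICE_RE = re.compile(r'\b([abc])\s*=\s*([-+]?\d+(?:\.\d+)?)')
GRID_RE = re.compile(r'^\s*(\d+)\s+(\d+)\s+(\d+)\s*!\s*fine FFT grid')


def parse_header(lines):
    """Collect a, b, c (floats) and the FFT grid (tuple of ints) from header lines."""
    result = {}
    for line in lines:
        if line.strip().startswith('END header'):
            break  # stop before the bulk potential data
        # Lattice parameters are found by their "a =", "b =", "c =" labels.
        for name, value in LATTICE_RE.findall(line):
            result[name] = float(value)
        # The grid line is found by its trailing comment, not by its position.
        grid = GRID_RE.match(line)
        if grid:
            result['grid'] = tuple(int(n) for n in grid.groups())
    return result


sample = '''BEGIN header
Real Lattice(A) Lattice parameters(A) Cell Angles
2.4675850 0.0000000 0.0000000 a = 2.467585 alpha = 90.000000
0.0000000 30.0000000 0.0000000 b = 30.000000 beta = 90.000000
0.0000000 0.0000000 30.0000000 c = 30.000000 gamma = 90.000000
1 ! nspins
25 300 300 ! fine FFT grid along <a,b,c>
END header: data is "<a b c> pot" in units of Hartrees
'''

header = parse_header(sample.splitlines())
print(header)
# {'a': 2.467585, 'b': 30.0, 'c': 30.0, 'grid': (25, 300, 300)}
```

For the real file you could pass the open file object directly, e.g. parse_header(open_file) inside a with block. Because the loop breaks at "END header", the data lines after the header are never read into memory.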