I have an .xyz file for H2S. If I read the file like this:
with open('H2S.xyz','r') as stream:
    for line in stream:
        print(line)
I get:
3
XYZ file of the hydrogen sulphide molecule
S 0.00000000 0.00000000 0.10224900
H 0.00000000 0.96805900 -0.81799200
H 0.00000000 -0.96805900 -0.81799200
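The first line gives the number of atoms, and the last three lines give the coordinates of those atoms.
I need to write some code that extracts the position of every atom in the molecule as a list, where each element is another list holding one atom's coordinates.
If I do this:
with open('H2S.xyz','r') as stream:
    new=list(stream)
new
I get each line as one element of the list, and if I do this,
with open('H2S.xyz','r') as stream:
    new_list=[]
    for line in stream:
        new_list=new_list+line.split()
new_list
I get every token as a separate element: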
['3','XYZ','file','of','the','hydrogen','sulphide','molecule','S',
'0.00000000','0.00000000','0.10224900','H','0.00000000','0.96805900',
'-0.81799200','H','0.00000000','-0.96805900','-0.81799200']
which is not what I want. The list I want looks like this:
[['0.00000000','0.00000000','0.10224900'],
['0.00000000','0.96805900','-0.81799200'],
['0.00000000','-0.96805900','-0.81799200']]
But I am not sure how to write the code for this.
Answer 0: (score: 0)
This function should give you the correct output.
def parse_xyz(file_name):
    output = []
    with open(file_name) as infile:
        data = infile.readlines()
    for row in data[2:]:  # Skip the atom count and the comment line
        fields = row.split()
        if fields:  # Ignore blank lines
            output.append(fields[1:])  # Drop the first column (element symbol)
    return output

result = parse_xyz('H2S.xyz')
print(result)
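Some notes on what it does:
for row in data[2:]:
uses list slicing, so nothing is captured from the first two lines (the atom count and the comment). Slice notation is used again inside the for loop, where [1:] drops the element symbol at the start of each recorded row.
Answer 1: (score: 0)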
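I would do something like this:
import re
with open("file.txt", "r") as f:
    # Keep only lines with exactly four fields: element symbol plus x, y, z
    rows = [re.split(r"\s+", x.strip()) for x in f]
    print([row for row in rows if len(row) == 4])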
[['S', '0.00000000', '0.00000000', '0.10224900'], ['H', '0.00000000', '0.96805900', '-0.81799200'], ['H', '0.00000000', '-0.96805900', '-0.81799200']]
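Note that the output above still contains the element symbol as the first item of each row, whereas the question asks for the coordinates only. A minimal sketch of dropping that column, assuming every atom line has exactly four whitespace-separated fields; the helper name `positions_from_lines` and the in-memory `sample` list are hypothetical and only there so the snippet can be run without a file:

```python
def positions_from_lines(lines):
    """Return the coordinate fields of every line that has exactly 4 fields."""
    positions = []
    for line in lines:
        fields = line.split()
        if len(fields) == 4:              # element symbol + x, y, z
            positions.append(fields[1:])  # drop the element symbol
    return positions


# Hypothetical sample standing in for the lines of H2S.xyz
sample = [
    "3\n",
    "XYZ file of the hydrogen sulphide molecule\n",
    "S 0.00000000 0.00000000 0.10224900\n",
    "H 0.00000000 0.96805900 -0.81799200\n",
    "H 0.00000000 -0.96805900 -0.81799200\n",
]
print(positions_from_lines(sample))
```

An open file object can be passed in place of `sample`, since iterating over it also yields one line at a time. A comment line that happens to contain exactly four words would slip through this filter, so skipping the first two lines by position is the safer choice when the layout is known.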
Answer 2: (score: 0)
Read all lines of the .xyz file, split each atom line into the element and its position, and append the position to a list.
H2S.xyz
3
XYZ file of the hydrogen sulphide molecule
S 0.00000000 0.00000000 0.10224900
H 0.00000000 0.96805900 -0.81799200
H 0.00000000 -0.96805900 -0.81799200
Code
with open('H2S.xyz') as data:
    lines = data.readlines()  # read all lines
new_list = []
for atom in lines[2:]:  # start from the third line
    position = atom.split()  # get the values
    new_list.append(position[1:])  # append only the positions
print(new_list)
Your list
[['0.00000000', '0.00000000', '0.10224900'],
['0.00000000', '0.96805900', '-0.81799200'],
['0.00000000', '-0.96805900', '-0.81799200']]
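If numeric values are needed rather than strings, each coordinate can be converted with float(). The following is only a sketch under the assumption that the file follows the layout shown in the question (atom count on the first line, a comment on the second, then one atom per line). The function name `read_xyz_positions` is hypothetical, and `io.StringIO` stands in for the real file so the example runs on its own:

```python
import io


def read_xyz_positions(stream):
    """Read atom positions as lists of floats from an open .xyz stream."""
    n_atoms = int(stream.readline())  # first line: number of atoms
    stream.readline()                 # second line: comment, ignored
    positions = []
    for _ in range(n_atoms):
        fields = stream.readline().split()
        # fields[0] is the element symbol; fields[1:4] are x, y, z
        positions.append([float(value) for value in fields[1:4]])
    return positions


# In-memory copy of the file contents from the question
xyz_text = """3
XYZ file of the hydrogen sulphide molecule
S 0.00000000 0.00000000 0.10224900
H 0.00000000 0.96805900 -0.81799200
H 0.00000000 -0.96805900 -0.81799200
"""
print(read_xyz_positions(io.StringIO(xyz_text)))
```

With a real file, the same function can be called as `read_xyz_positions(stream)` inside `with open('H2S.xyz') as stream:`. Reading exactly `n_atoms` lines uses the count from the first line instead of relying on where the file ends.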