Question

I'm trying to read an xyz file into Python but keep getting these error messages. I'm fairly new to Python, so I'd appreciate some help interpreting them!

def main():
    atoms = []
    coordinates = []
    name = input("Enter filename: ")
    xyz = open(name, 'r')
    n_atoms = xyz.readline()
    title = xyz.readline()
    for line in xyz:
        atom, x, y, z = line.split()
        atoms.append(atom)
        coordinates.append([float(x), float(y), float(z)])
    xyz.close()

    return atoms, coordinates


if __name__ == '__main__':
    main()

Error:
Traceback (most recent call last):
  File "Project1.py", line 25, in <module>
    main()
  File "Project1.py", line 16, in main
    atom, x, y, z = line.split()
ValueError: not enough values to unpack (expected 4, got 3)

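I think the ValueError is raised because, after a few lines, some lines contain only 3 values. But I'm not sure why I'm getting the return errors.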

Answer 1

A very important rule of thumb, especially in Python: don't reinvent the wheel; use existing libraries.

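The xyz file is one of the few universally standardized file formats in chemistry. So IMHO you don't need any logic to determine the length of your lines. The first line is an integer n_atoms giving the number of atoms, the second line is a comment line that is ignored, and the next n_atoms lines are [string, float, float, float], as you have already written in your code. A file that deviates from this is probably corrupted.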
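To make that layout concrete, here is a minimal standard-library sketch that reads exactly n_atoms records after the two header lines. The helper name parse_xyz and the sample water geometry are made up for illustration; they are not part of any library or of the asker's file.

```python
def parse_xyz(text):
    """Parse the contents of an xyz file given as a string.

    Hypothetical helper for illustration; not part of any library.
    """
    lines = text.splitlines()
    n_atoms = int(lines[0])   # line 1: number of atoms
    comment = lines[1]        # line 2: free-form comment, otherwise ignored
    atoms, coordinates = [], []
    # Read exactly n_atoms records; anything after them is ignored.
    for line in lines[2:2 + n_atoms]:
        atom, x, y, z = line.split()
        atoms.append(atom)
        coordinates.append([float(x), float(y), float(z)])
    if len(atoms) != n_atoms:
        raise ValueError(f"expected {n_atoms} atom lines, found {len(atoms)}")
    return comment, atoms, coordinates


# Made-up sample data, only to exercise the function.
sample = """3
water molecule
O  0.000  0.000  0.117
H  0.000  0.757 -0.471
H  0.000 -0.757 -0.471
"""
comment, atoms, coordinates = parse_xyz(sample)
print(atoms)
```

Because the loop is bounded by n_atoms, trailing blank lines or extra blocks after the atom records never reach the unpacking step. A malformed atom record inside the block still raises ValueError, which is consistent with treating such a file as corrupted.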

With the pandas library, you can simply write:

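import pandas as pd
# inputfile is the path to your xyz file.
# sep=r'\s+' splits on any run of whitespace; it replaces
# delim_whitespace=True, which is deprecated in recent pandas versions.
molecule = pd.read_table(inputfile, skiprows=2, sep=r'\s+',
                         names=['atom', 'x', 'y', 'z'])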

Or you can use the chemcoord package, which has its own Cartesian class representing molecules in Cartesian coordinates:

import chemcoord as cc
molecule = cc.Cartesian.read_xyz(inputfile)

Disclaimer: I am the author of chemcoord.

Answer 2

You are getting the error because you unpack the list in the line

atom, x, y, z = line.split()

which only makes sense if the line contains exactly 4 items.

You have to define the logic for what happens when a line contains only 3 items, like this (inside the for loop):

for line in xyz:
    line_data = line.split()
    if len(line_data) == 3:
        # Behavior when only 3 items in a line goes here!
        # Add your code here!
        continue

    atom, x, y, z = line_data
    atoms.append(atom)
    coordinates.append([float(x), float(y), float(z)])

What your program does when it encounters a line with only 3 items is up to you.
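As one possible choice among many, the sketch below skips any line that does not have exactly 4 items and records its line number, so the odd lines can be inspected afterwards. Note that it checks for "not 4" rather than "exactly 3", which is a slight variation on the loop above. The function name read_atom_lines and the sample rows are hypothetical.

```python
def read_atom_lines(lines):
    """Split atom records into parsed data and a list of rejected lines.

    Hypothetical helper: skipping and reporting is just one possible policy.
    """
    atoms, coordinates, skipped = [], [], []
    for number, line in enumerate(lines, start=1):
        line_data = line.split()
        if len(line_data) != 4:
            # Remember where the odd line was instead of crashing.
            skipped.append((number, line.rstrip("\n")))
            continue
        atom, x, y, z = line_data
        atoms.append(atom)
        coordinates.append([float(x), float(y), float(z)])
    return atoms, coordinates, skipped


# Made-up rows: the second one is missing its atom symbol.
rows = ["C 0.0 0.0 0.0\n", "1.0 1.0 1.0\n", "H 0.0 0.0 1.1\n"]
atoms, coordinates, skipped = read_atom_lines(rows)
print(skipped)
```

With this policy, blank lines are also reported in skipped, since they split into zero items. If a short line should instead count as an error, raising an exception that includes the line number would be a stricter alternative.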

Reading a file with variable columns
