Question

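I am new to Python, and especially to Biopython. I am trying to fetch an XML record with Entrez.efetch and then read some information from it. Last week this script worked fine: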

from Bio import Entrez

handle = Entrez.efetch(db="Protein", id="YP_008872780.1", retmode="xml")
records = Entrez.read(handle)

But now I get this error:

> Bio.Entrez.Parser.ValidationError: Failed to find tag 'GBSeq_xrefs' in
    the DTD. To skip all tags that are not represented in the DTD, please
    call Bio.Entrez.read or Bio.Entrez.parse with validate=False.

So I ran this instead:

records = Entrez.read(handle, validate=False)

But I still get an error:

TypeError: 'str' object does not support item assignment

After some research, I realized that NCBI published a new RefSeq release, which introduced new tags in the XML (GenPept) output.

Do I need to change something in the DTD so that these new tags are supported?
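
For readers who hit the same error and only need a few fields, one possible workaround is to read the XML with Python's standard library, which does not consult the DTD at all. The following is a minimal sketch, not the Bio.Entrez approach: the function name `summarize_gbset` is hypothetical, the sample document is a trimmed, made-up record, and the tag names (`GBSeq_locus`, `GBSeq_length`, `GBXref`, `GBXref_dbname`, `GBXref_id`) are assumed to match what efetch returns.

```python
import xml.etree.ElementTree as ET

# Hypothetical, heavily trimmed record for illustration only.
# The length and the BioProject id below are invented values.
sample_xml = """
<GBSet>
  <GBSeq>
    <GBSeq_locus>YP_008872780</GBSeq_locus>
    <GBSeq_length>123</GBSeq_length>
    <GBSeq_xrefs>
      <GBXref>
        <GBXref_dbname>BioProject</GBXref_dbname>
        <GBXref_id>PRJNA000000</GBXref_id>
      </GBXref>
    </GBSeq_xrefs>
  </GBSeq>
</GBSet>
"""


def summarize_gbset(xml_text):
    """Collect locus, length and cross-references for each GBSeq element."""
    root = ET.fromstring(xml_text.strip())
    records = []
    for seq in root.iter("GBSeq"):
        # Each GBXref holds one database name / identifier pair.
        xrefs = [
            (xref.findtext("GBXref_dbname"), xref.findtext("GBXref_id"))
            for xref in seq.iter("GBXref")
        ]
        records.append(
            {
                "locus": seq.findtext("GBSeq_locus"),
                "length": seq.findtext("GBSeq_length"),
                "xrefs": xrefs,
            }
        )
    return records


for record in summarize_gbset(sample_xml):
    print(record["locus"], record["length"], record["xrefs"])
```

With a real record, the text passed to `summarize_gbset` would presumably come from `handle.read()` on the efetch handle. Unknown tags are simply ignored here, so a new element such as `GBSeq_xrefs` does not cause a failure, but none of the type conversion that Bio.Entrez performs is applied.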

Answer 1

It looks like my DTD file is out of date. A newer version can be found here or here.
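
To confirm that an outdated DTD is really the cause, it may help to check whether a given DTD file declares the tag named in the error message. Below is a rough sketch under stated assumptions: the helper names `declared_elements` and `dtd_declares` are hypothetical, the two DTD snippets are invented excerpts rather than the real NCBI file, and the regular expression only looks for plain `<!ELEMENT name ...>` declarations (it does not expand parameter entities or skip commented-out declarations).

```python
import re

# Matches element declarations such as: <!ELEMENT GBSeq_xrefs (GBXref+)>
ELEMENT_DECL = re.compile(r"<!ELEMENT\s+([^\s>]+)")


def declared_elements(dtd_text):
    """Return the set of element names declared in the DTD text."""
    return set(ELEMENT_DECL.findall(dtd_text))


def dtd_declares(dtd_text, tag):
    """Return True if the DTD text declares an element called `tag`."""
    return tag in declared_elements(dtd_text)


# Invented excerpts standing in for an old and an updated DTD.
old_dtd = """
<!ELEMENT GBSeq_locus (#PCDATA)>
<!ELEMENT GBSeq_length (#PCDATA)>
"""
new_dtd = old_dtd + "<!ELEMENT GBSeq_xrefs (GBXref+)>\n"

print(dtd_declares(old_dtd, "GBSeq_xrefs"))
print(dtd_declares(new_dtd, "GBSeq_xrefs"))
```

In practice the text would be read from the DTD file that Biopython actually uses. As far as I know, the bundled copies live in a `DTDs` directory inside the installed `Bio/Entrez` package, but that location is an assumption worth verifying for your installation before replacing any file.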

Is NCBI's new RefSeq release compatible with Bio.Entrez.Parser?
