Question

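I'm trying to write a Goodreads Information Fetcher app in Python using the Goodreads API. I'm currently working on the app's first feature, which pulls information from the API; the API returns an XML file.

I parsed the XML file and converted it to JSON, and then converted that into a dictionary. But I still can't seem to extract information from it. I've looked at other posts here, but nothing has worked.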

main.py

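import json
import requests
import xmltodict

# `key` is the Goodreads API key, assumed to be defined elsewhere
def get_author_books(authorId):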
    url = "https://www.goodreads.com/author/list/{}?format=xml&key={}".format(authorId, key)
    r = requests.get(url)

    xml_file = r.content
    json_file = json.dumps(xmltodict.parse(xml_file))

    data = json.loads(json_file)
    print("Book Name: " + str(data[0]["GoodreadsResponse"]["author"]["books"]["book"]))

I expected the output to give me the title of the first book in the dictionary.

Here is a sample XML file provided by Goodreads.

Answer 1

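I think you have an incomplete understanding of how XML works, or at least of the format of the response you are getting.

The XML file you linked to has the following format: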

<GoodreadsResponse>
    <Request>...</Request>
    <author>
        <id>...</id>
        <name>...</name>
        <link>...</link>
        <books>
            <book> [some stuff about the first book] </book>
            <book> [some stuff about the second book] </book>
            [More books]
        </books>
    </author>
</GoodreadsResponse>

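This means that in your data object, data["GoodreadsResponse"]["author"]["books"]["book"] is the collection of all the books in the response (every element enclosed in <book> tags). So:

data["GoodreadsResponse"]["author"]["books"]["book"][0] is the first book.
data["GoodreadsResponse"]["author"]["books"]["book"][1] is the second book, and so on.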
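To make the shape concrete, here is a minimal sketch that uses a hand-written dictionary in place of the parsed response. The nesting follows the XML outline above, but the ids and titles are made-up placeholders, not real Goodreads data. It also shows how indexing the book list with a string produces the "list indices must be integers or slices, not str" error.

```python
# Hand-written stand-in for the parsed response; ids and titles are placeholders.
data = {
    "GoodreadsResponse": {
        "author": {
            "books": {
                "book": [
                    {"id": "1", "title": "First placeholder title"},
                    {"id": "2", "title": "Second placeholder title"},
                ]
            }
        }
    }
}

books = data["GoodreadsResponse"]["author"]["books"]["book"]
print(type(books).__name__)  # list
print(books[0]["title"])     # First placeholder title

# Indexing the list itself with a string key raises TypeError.
try:
    books["title"]
except TypeError as exc:
    error_message = str(exc)
    print(error_message)
```

The fix is always the same: pick one book by integer index first, then read its fields by name.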

Looking back at the XML, each book element has id, isbn, title, description, and other tags. So you can print the title of the first book with:

data["GoodreadsResponse"]["author"]["books"]["book"][0]["title"]
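One caveat worth guarding against, assuming xmltodict's default behaviour: a tag that occurs only once is parsed into a dict rather than a one-element list, and an empty tag is parsed into None. For an author with exactly one book, the [0] index above would then fail. The sketch below normalizes the value before indexing; as_list and first_book_title are hypothetical helper names, not part of any library. xmltodict.parse also accepts a force_list argument intended for this situation.

```python
def as_list(value):
    """Normalize a parsed node: None -> [], list -> unchanged, anything else -> [value]."""
    if value is None:
        return []
    if isinstance(value, list):
        return value
    return [value]


def first_book_title(data):
    """Return the first book's title, or None when the author has no books."""
    # An empty <books/> element may parse to None, so fall back to an empty dict.
    books_node = data["GoodreadsResponse"]["author"]["books"] or {}
    books = as_list(books_node.get("book"))
    return books[0]["title"] if books else None


# Placeholder data: one author with a single book (a dict, not a list).
single = {"GoodreadsResponse": {"author": {"books": {"book": {"title": "Only placeholder title"}}}}}
print(first_book_title(single))
```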

For reference, I ran the following code against the XML file you linked to, which is what you would normally get back from the API:

import json
import xmltodict

# Read the sample XML file linked in the question
with open("source.xml", "r") as f:
    xml_file = f.read()

# xmltodict.parse already returns a dict; the JSON round trip mirrors the question's code
json_file = json.dumps(xmltodict.parse(xml_file))
data = json.loads(json_file)

books = data["GoodreadsResponse"]["author"]["books"]["book"]

print(books[0]["title"]) # The Cathedral & the Bazaar: Musings on Linux and Open Source by an Accidental Revolutionary
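As an alternative to the approach above (not what the answer itself uses), the same titles can be read with the standard library's xml.etree.ElementTree, which avoids the dict conversion and always returns a list from findall. This is a minimal sketch: SAMPLE_XML is a made-up document that only imitates the outline shown earlier, and book_titles is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET

# Placeholder document following the outline above; not real Goodreads data.
SAMPLE_XML = """<GoodreadsResponse>
  <Request><key>placeholder</key></Request>
  <author>
    <id>0</id>
    <name>Placeholder Author</name>
    <books>
      <book><id>1</id><title>First placeholder title</title></book>
      <book><id>2</id><title>Second placeholder title</title></book>
    </books>
  </author>
</GoodreadsResponse>"""


def book_titles(xml_text):
    """Return the title of every <book> under <author>/<books>, in document order."""
    root = ET.fromstring(xml_text)  # root is the <GoodreadsResponse> element
    return [book.findtext("title") for book in root.findall("./author/books/book")]


titles = book_titles(SAMPLE_XML)
print(titles[0])
```

With a live response, the bytes in r.content could be passed to book_titles in place of SAMPLE_XML, assuming the response follows the same layout.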

Goodreads API error: list indices must be integers or slices, not str
