Question

我正在尝试从来自PushShift API的JSON数据中编写CSV文件，但我遇到了TypeError。我的代码在

之下

import requests
import csv
import json
from urllib.request import urlopen

url = 'https://api.pushshift.io/reddit/comment/search/?subreddit=science&filter=parent_id,id,author,created_utc,subreddit,body,score,permalink'
page = requests.get(url)
page_json = json.loads(page.text)
print(page.text)
f = csv.writer(open("test.csv",'w+', newline=''))
f.writerow(["id", "parent_id", "author", "created_utc","subreddit", "body", "score"])
for x in page_json:
f.writerow([x["data"]["id"],
            x["data"]["parent_id"],
            x["data"]["author"],
            x["data"]["created_utc"],
            x["data"]["subreddit"],
            x["data"]["body"],
            x["data"]["score"]])

我得到的错误是：

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-3-82784a93576b> in <module>()
 11 f.writerow(["id", "parent_id", "author", "created_utc","subreddit", "body", "score"])
 12 for x in page:
---> 13     f.writerow([x["data"]["id"],
     14                 x["data"]["parent_id"],
     15                 x["data"]["author"],

TypeError: byte indices must be integers or slices, not str

我在这里尝试了解决方案：How can I convert JSON to CSV?

我遇到的实际问题可能是也可能不是。任何建议将不胜感激！

Answer 1

您的“数据”包含csv行的条目数组，而不是每个具有键“data”的对象数组。所以你需要先访问“数据”：

page_json = json.loads(page.text)['data']

然后迭代它：

for x in page_json:
    f.writerow([x["id"],
                x["parent_id"],
                x["author"],
                x["created_utc"],
                x["subreddit"],
                x["body"],
                x["score"]])

请注意，您需要遍历JSON对象而不是请求。

您还可以重构代码以获取此信息：

columns = ["id", "parent_id", "author", "created_utc", "subreddit", "body", "score"]
f.writerow(columns)
for x in page_json:
    f.writerow([x[column] for column in columns])

Python将JSON转换为CSV TypeError

1 个答案: