Question

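I have created a JSON file from my Google Search API results. I am trying to read the file and parse the objects.

Each search result is a JSON array like the one shown below. I have 200 of these arrays in a single JSON file.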

{
  "kind": "customsearch#result",
  "title": "text here",
  "htmlTitle": "text here",
  "link": "link here",
  "displayLink": "text here",
  "snippet": "text here",
  "htmlSnippet": "text here",
  "cacheId": "ID string",
  "formattedUrl": "text here",
  "htmlFormattedUrl": "link here",
  "pagemap": {
  "metatags": [
    {
      "viewport": "width=device-width, initial-scale=1"
    }
  ],
  "Breadcrumb": [
    {
      "title": "text here",
      "url": "link here",
    },
    {
      "title": "text here",
      "url": "link here",
    },
    {
      "title": "text here",
      "url": "link here",
    },
    {
      "title": "text here",
      "url": "link here",
    }
  ]
}

I am having trouble reading the JSON file with json.load.

How can I read this file and start parsing the items?

def ingest_json(input):
    try:
        with open(input, 'r', encoding='UTF-8') as f:
            json_data = json.loads(f)
    except Exception:
        print(traceback.format_exc())
        sys.exit(1)

This raises the following error:

TypeError: the JSON object must be str, bytes or bytearray, not 'TextIOWrapper'

def ingest_json(input):
    try:
        with open(input, 'r', encoding='UTF-8') as f:
            json_data = json.load(f)
    except Exception:
        print(traceback.format_exc())
        sys.exit(1)

This raises the following error:

    raise JSONDecodeError("Extra data", s, end)
json.decoder.JSONDecodeError: Extra data: line 269 column 2 (char 10330)

Answer 1

In json.loads(), the "s" stands for "string", so it only accepts string input (str, bytes or bytearray), not an open file object.
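As a minimal sketch of that difference: the snippet below parses the same text once with json.loads and once with json.load. The io.StringIO object is only a stand-in for an open file, and the sample text is made up for illustration.

```python
import io
import json

# Made-up sample text standing in for the contents of a JSON file.
text = '{"kind": "customsearch#result", "title": "text here"}'

# json.loads parses a string that is already in memory.
from_string = json.loads(text)

# json.load reads from a file-like object; io.StringIO plays the role of
# the handle returned by open() here.
from_file = json.load(io.StringIO(text))

# Handing a file-like object to json.loads raises a TypeError similar to
# the one in the question.
try:
    json.loads(io.StringIO(text))
except TypeError as exc:
    error_message = str(exc)
```

Both calls produce the same dictionary; only the kind of input they accept differs.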

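json.load() is definitely the method you want, although it is very strict about the JSON being well-formed, and per the specification a single JSON file may contain only one top-level JSON value. Anything after the first complete object is what triggers the "Extra data" error.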
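To illustrate, here is a small sketch with two made-up objects. Placed back to back they fail with "Extra data"; wrapped in a single top-level array they parse in one call. Whether the original file can be regenerated as an array is an assumption about how it was produced.

```python
import json

# Two objects one after another: not a single JSON document.
two_objects = '{"title": "first"}\n{"title": "second"}'

try:
    json.loads(two_objects)
except json.JSONDecodeError as exc:
    # The parser stops after the first object and reports what follows.
    extra_data_error = exc.msg

# The same objects inside one top-level array form a valid document.
as_array = '[{"title": "first"}, {"title": "second"}]'
results = json.loads(as_array)
titles = [result["title"] for result in results]
```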

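Try splitting the data into multiple files, each containing a single object, or split the objects apart as strings in Python before parsing. Also see "Can json.loads ignore trailing commas?" for dealing with the trailing commas in your data.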
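One possible way to do the splitting in Python is json.JSONDecoder.raw_decode, which parses one value and reports where it ended. The sketch below is only an illustration under some assumptions: the objects are separated by whitespace alone, and the helper names strip_trailing_commas and iter_json_objects are hypothetical. The regex used for trailing commas is naive and would also alter a string value that happens to contain a comma followed by a closing brace or bracket.

```python
import json
import re


def strip_trailing_commas(text):
    # Hypothetical helper: drop a comma that sits directly before } or ].
    # Naive on purpose; it does not check whether the comma is inside a string.
    return re.sub(r",\s*([}\]])", r"\1", text)


def iter_json_objects(text):
    # Hypothetical helper: yield each top-level JSON value found in text,
    # assuming the values are separated only by whitespace.
    decoder = json.JSONDecoder()
    pos = 0
    length = len(text)
    while pos < length:
        # raw_decode does not skip leading whitespace, so do it here.
        while pos < length and text[pos].isspace():
            pos += 1
        if pos >= length:
            break
        obj, pos = decoder.raw_decode(text, pos)
        yield obj


# Made-up sample with trailing commas, loosely modelled on the question's data.
sample = '''
{"title": "first", "pagemap": {"Breadcrumb": [{"title": "a", "url": "b",}]}}
{"title": "second",}
'''

results = list(iter_json_objects(strip_trailing_commas(sample)))
```

With a real file, the text would come from f.read() inside the with open(...) block, and each yielded dictionary could then be processed in turn.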

Reading a file containing multiple JSON objects (Python)
