Question

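I'm a bit of a novice at programming and Python. I know this has been explained many times in earlier questions, and I read all of them carefully, but I still couldn't find a solution.
I'm trying to read a JSON file that contains about 1 billion records like these: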

334465|{"color":"33ef","age":"55","gender":"m"}
334477|{"color":"3444","age":"56","gender":"f"}
334477|{"color":"3999","age":"70","gender":"m"}

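I tried hard to get past the 6-digit number at the start of each line, but I don't know how to read multiple JSON objects. Here is my code; I can't figure out why it doesn't work.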

import json

T =[]
s = open('simple.json', 'r')
ss = s.read()
for line in ss:
    line = ss[7:]
    T.append(json.loads(line))
s.close()

This is the error I get:

ValueError: Extra Data: line 3 column 1 - line 5 column 48 (char 42 - 138)

Any suggestions would be a great help!

Answer 1

There are a few problems with the logic of your code.

ss = s.read()

reads the entire file s into a single string. The next line

for line in ss:

iterates over that string one character at a time, so on each iteration line is a single character. Then in

    line = ss[7:]

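you take the entire file contents except the first 7 characters (those at positions 0 through 6) and replace the previous contents of line with that. Then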

T.append(json.loads(line))

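tries to parse that as JSON and store the resulting object in the T list. Because that string still contains every remaining record, not just one, json.loads stops after the first object and raises the "Extra data" error.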
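To make the failure concrete, here is a small sketch that reproduces it without a file. The string ss below is a hypothetical two-line stand-in for the contents of simple.json, and the variable names are only for illustration.

```python
import json

# Hypothetical in-memory stand-in for what s.read() returns.
ss = (
    '334465|{"color":"33ef","age":"55","gender":"m"}\n'
    '334477|{"color":"3444","age":"56","gender":"f"}\n'
)

# Iterating over a string yields single characters, not lines.
first_three = [ch for ch in ss][:3]

# Slicing the whole string only drops the first prefix; the second
# record (with its own prefix) is still part of the remainder.
remainder = ss[7:]

try:
    json.loads(remainder)
    error_message = None
except ValueError as exc:  # json.JSONDecodeError is a ValueError subclass
    error_message = str(exc)

print(first_three)
print(error_message)
```

The parser reads the first object successfully and then complains about everything that follows it, which is why the fix is to hand json.loads one line at a time.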

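Here is some code that does what you want. We don't need to read the whole file into a string with .read, or into a list of lines with .readlines; we can simply put the file handle in a for loop, which iterates over the file line by line.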

We open the file with a with statement so that it is closed automatically when we leave the with block, or if an IO error occurs.

import json

table = []
with open('simple.json', 'r') as f:
    for line in f:
        table.append(json.loads(line[7:]))

for row in table:
    print(row)

Output

{'color': '33ef', 'age': '55', 'gender': 'm'}
{'color': '3444', 'age': '56', 'gender': 'f'}
{'color': '3999', 'age': '70', 'gender': 'm'}

We can make this more compact by building the table list in a list comprehension:

import json

with open('simple.json', 'r') as f:
    table = [json.loads(line[7:]) for line in f]

for row in table:
    print(row)
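With around a billion records, a list holding every parsed object may not fit in memory. As a hedged alternative, the sketch below processes records lazily with a generator and also keeps the numeric prefix. The function name iter_records is hypothetical, and io.StringIO stands in for the open file so the example runs on its own.

```python
import io
import json

def iter_records(lines):
    """Yield (record_id, parsed_object) pairs, one per non-blank line."""
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        # Split on the first '|' only, so a '|' inside the JSON is safe.
        record_id, _, payload = line.partition('|')
        yield record_id, json.loads(payload)

# Hypothetical sample data; with a real file, pass the file handle instead.
sample = io.StringIO(
    '334465|{"color":"33ef","age":"55","gender":"m"}\n'
    '334477|{"color":"3444","age":"56","gender":"f"}\n'
)

records = list(iter_records(sample))
for record_id, row in records:
    print(record_id, row)
```

In real use you would loop over iter_records(f) directly rather than calling list on it, so only one record is held at a time.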

Answer 2

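If you use Pandas, you can simply write df = pd.read_json(f, lines=True) (this expects each line to be a bare JSON object, so the numeric prefix and | would have to be stripped first).

According to the documentation, lines=True means:

Read the file as a json object per line.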

Answer 3

You should use readlines() instead of read(), and wrap your JSON parsing in a try / except block. Your lines probably contain a trailing newline character, and that could cause an error.
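A minimal sketch of that suggestion follows. The sample data is hypothetical (one truncated line and one blank line are included on purpose), io.StringIO stands in for the open file, and recording the failing line numbers in bad_lines is an addition for illustration rather than part of the original advice.

```python
import io
import json

# Hypothetical sample: line 2 is truncated JSON, line 3 is blank.
f = io.StringIO(
    '334465|{"color":"33ef","age":"55","gender":"m"}\n'
    '334477|{"color":"3444","age":\n'
    '\n'
    '334478|{"color":"3999","age":"70","gender":"m"}\n'
)

table = []
bad_lines = []
for lineno, line in enumerate(f.readlines(), start=1):
    try:
        # Keep only the part after the last '|' and parse it.
        table.append(json.loads(line.split('|')[-1]))
    except ValueError:
        # Malformed or empty line: remember where it was and move on.
        bad_lines.append(lineno)

print(len(table), bad_lines)
```

Keeping track of the skipped line numbers makes it easier to tell a handful of bad records apart from a systematic parsing mistake.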

Answer 4

Thank you so much! You are lifesavers! This is the code I finally came up with. I think it's a combination of all the answers!

import json

table = []
with open('simple.json', 'r') as f:
    for line in f:
        try:
            j = line.split('|')[-1]
            table.append(json.loads(j))
        except ValueError:
            # You probably have bad JSON
            continue

for row in table:
    print(row)

Reading a JSON file with multiple objects in Python

4 answers: