Question

I have multiple JSON files that store responses from Requests. Each file looks like this, with 5 records per row/list:

[{"Record1": "1", "Record2": "2", "Record3": "3", "Record4": "4", "Record5": "5"}]

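Should I save the response with resp.content, where the returned content is not wrapped in an array, or with resp.json(), which is nested in an array? What is the best practice?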

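What is the best way to combine them (about 10k files) so that I can load them into a pandas DataFrame for further analysis? I tried merging them into a single file and loading it with json.load(), but it raises an error: Extra data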

import json
import codecs
import glob

files = glob.glob('./results/*.json')

with codecs.open('combined_results.json', 'w', encoding='utf-8') as outfile:
    for file in files:
        f = open(file, 'r')
        data = json.load(f)
        json.dump(data, outfile, ensure_ascii=False, indent=None)
        outfile.write("\n")

Output:

[{"Record1": "1", "Record2": "2", "Record3": "3", "Record4": "4", "Record5": "5"}]
[{"Record1": "1", "Record2": "2", "Record3": "3", "Record4": "4", "Record5": "5"}]
[{"Record1": "1", "Record2": "2", "Record3": "3", "Record4": "4", "Record5": "5"}]

Loading the combined file back into an object with json.load() then fails with the error: Extra data

Answer 1

You can change your code so that the files are merged into a single valid JSON object:

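# `files` is the list built with glob.glob() in the question
combined_results = []
for file in files:
    with open(file, 'r', encoding='utf-8') as f:
        # Each file holds a one-element list; keep the record inside it
        combined_results.append(json.load(f)[0])

# Write all records as one JSON array
with open('combined_results.json', 'w', encoding='utf-8') as outfile:
    json.dump(combined_results, outfile)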

To read this file into a DataFrame, try pd.read_json:

import pandas as pd

df = pd.read_json('combined_results.json')
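For context on the Extra data error from the question: json.load() expects exactly one JSON document, while the combined file in the question holds one document per line. The snippet below is a minimal sketch that reproduces this with an in-memory string (the sample text is made up to mirror the question's output) and then parses it line by line instead.

```python
import json

# Two JSON documents separated by newlines, mirroring the question's combined file
text = '[{"Record1": "1"}]\n[{"Record1": "2"}]\n'

# Parsing the whole text as a single document fails after the first array
try:
    json.loads(text)
    error_message = None
except json.JSONDecodeError as exc:
    error_message = exc.msg

# Parsing each non-empty line on its own works; flatten the per-line lists
records = []
for line in text.splitlines():
    if line.strip():
        records.extend(json.loads(line))

print(error_message)
print(records)
```

So a file written one document per line can still be recovered this way, although writing a single JSON array, as above, avoids the problem in the first place.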

Update:

You don't actually need the combined_results.json file at all. Unless you want a single merged file to use later, you can convert the combined_results list directly into a DataFrame.

import pandas as pd

combined_results = []
for file in files:
    with open(file, 'r', encoding='utf-8') as f:
        combined_results.append(json.load(f)[0])

df = pd.DataFrame(combined_results)
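Note that json.load(f)[0] keeps only the first element of each file's list. If some files might hold more than one record, a variation like the following could keep them all. This is only a sketch: load_records is a hypothetical helper name, and the demo files are throwaway data written to a temporary directory.

```python
import glob
import json
import os
import tempfile


def load_records(pattern):
    """Collect every record from all JSON files matching the glob pattern."""
    records = []
    for path in sorted(glob.glob(pattern)):
        with open(path, 'r', encoding='utf-8') as f:
            data = json.load(f)
        # A file may hold a list of records or a single record
        if isinstance(data, list):
            records.extend(data)
        else:
            records.append(data)
    return records


# Demo with three throwaway files, each holding a one-element list
with tempfile.TemporaryDirectory() as tmp:
    for i in range(3):
        with open(os.path.join(tmp, f'{i}.json'), 'w', encoding='utf-8') as f:
            json.dump([{"Record1": str(i), "Record2": "2"}], f)
    records = load_records(os.path.join(tmp, '*.json'))

print(len(records))
```

The resulting list can then be passed to pd.DataFrame(records) in the same way as combined_results.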

Answer 2

Try:

function truncateString(yourString, maxLength) {
  while (maxLength < yourString.length && yourString[maxLength] != ' ') {
    maxLength++;
  }
  return yourString.substr(0, maxLength);
}

console.log(truncateString('The quick brown fox jumps over the lazy dog', 6));

Best way to combine JSON data into a pandas DataFrame

2 answers: