While making long-running API calls, I loop over a large list to reorganize my data and save it to a file, like this:
import json
from collections import defaultdict

for item in music:
    # initialize data container
    data = defaultdict(list)
    genre = item[0]
    artist = item[1]
    track = item[2]
    # in actual code, API calls happen here, processing genre, artist and track
    data['genre'] = genre
    data['artist'] = artist
    data['track'] = track
    # use 'a' (append) mode
    with open('data.json', mode='a') as f:
        f.write(json.dumps([data], indent=4))
Note: I only have a one-hour window for making API calls before the token expires, so I have to save the data to disk quickly inside the for loop.
The approach above appends the data to the data.json file, but the dumped lists are not comma-separated, and the file ends up populated like this:
[
    {
        "genre": "Alternative",
        "artist": "Radiohead",
        "track": "Ok computer"
    }
]
[
    {
        "genre": "Eletronic",
        "artist": "Kraftwerk",
        "track": "Computer World"
    }
]
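To make the problem concrete, here is a minimal sketch showing that two JSON arrays written back to back do not parse as a single document. The two strings are hypothetical stand-ins for what append mode leaves in data.json, not the real file.

```python
import json

# Two arrays written back to back, as append mode produces them
# (hypothetical stand-in for the contents of data.json).
first = json.dumps(
    [{"genre": "Alternative", "artist": "Radiohead", "track": "Ok computer"}],
    indent=4,
)
second = json.dumps(
    [{"genre": "Eletronic", "artist": "Kraftwerk", "track": "Computer World"}],
    indent=4,
)
appended = first + second

try:
    json.loads(appended)
    parsed_ok = True
    error_message = ""
except json.JSONDecodeError as exc:
    # The parser reads the first array, then rejects everything after it.
    parsed_ok = False
    error_message = exc.msg

print(parsed_ok, error_message)
```

Each array is valid on its own, but the file as a whole is not, which is why json.load fails on it later.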
So, how can I dump my data into a single comma-separated list?
Answer 0 (score: 0)
One approach is to read the JSON file before writing to it.
Example:
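import json
import os

for item in music:
    # build one record per item (a plain dict is enough here)
    data = {}
    genre = item[0]
    artist = item[1]
    track = item[2]
    data['genre'] = genre
    data['artist'] = artist
    data['track'] = track

    # Read the existing JSON list, or start a new one on the first pass
    if os.path.exists('data.json') and os.path.getsize('data.json') > 0:
        with open('data.json', mode='r') as f:
            fileData = json.load(f)
    else:
        fileData = []

    fileData.append(data)

    # Rewrite the whole file so it stays a single valid JSON list
    with open('data.json', mode='w') as f:
        f.write(json.dumps(fileData, indent=4))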
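Since the file is rewritten on every iteration, an interruption in the middle of a write could leave it truncated. One possible hardening, sketched here as an assumption rather than as part of the answer above, is to write to a temporary file and swap it into place. The helper name append_record is hypothetical.

```python
import json
import os
import tempfile


def append_record(path, record):
    """Append one record to the JSON list stored at path (hypothetical helper)."""
    # Load the existing list, or start fresh if the file is missing or empty.
    if os.path.exists(path) and os.path.getsize(path) > 0:
        with open(path, mode='r') as f:
            records = json.load(f)
    else:
        records = []
    records.append(record)

    # Write to a temporary file in the same directory, then swap it in,
    # so a crash mid-write does not leave a half-written file at path.
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix='.tmp')
    with os.fdopen(fd, mode='w') as f:
        json.dump(records, f, indent=4)
    os.replace(tmp_path, path)
    return records
```

Inside the loop this would be called as append_record('data.json', data). The cost still grows with the file, because every call rereads and rewrites the whole list.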
Answer 1 (score: 0)
Something like this would work:
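import json

music = [['Alternative', 'Radiohead', 'Ok computer'], ['Eletronic', 'Kraftwerk', 'Computer World']]
output = list()
for item in music:
    # build one record per item
    data = dict()
    genre = item[0]
    artist = item[1]
    track = item[2]
    data['genre'] = genre
    data['artist'] = artist
    data['track'] = track
    # collect every record in a single list
    output.append(data)

# write the complete list once, after the loop; 'w' replaces any earlier
# contents so the file stays a single valid JSON list
with open('data.json', mode='w') as f:
    f.write(json.dumps(output, indent=4))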
My data.json contains:
[
    {
        "genre": "Alternative",
        "track": "Ok computer",
        "artist": "Radiohead"
    },
    {
        "genre": "Eletronic",
        "track": "Computer World",
        "artist": "Kraftwerk"
    }
]
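As a quick check, here is a small sketch that writes the same two records as one list and reads them back with a single json.load call. The temporary path is an assumption standing in for data.json so the sketch leaves nothing behind in the working directory.

```python
import json
import os
import tempfile

music = [['Alternative', 'Radiohead', 'Ok computer'],
         ['Eletronic', 'Kraftwerk', 'Computer World']]

# Build the full list first, then write it once.
output = [{'genre': g, 'artist': a, 'track': t} for g, a, t in music]

# A temporary path stands in for data.json.
path = os.path.join(tempfile.mkdtemp(), 'data.json')
with open(path, mode='w') as f:
    json.dump(output, f, indent=4)

# Reading it back yields one list containing every record.
with open(path, mode='r') as f:
    loaded = json.load(f)

print(len(loaded))
```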
Answer 2 (score: 0)
For large datasets, pandas (for serialization) and pickle (for saving) work together like a charm.
import pandas as pd

rows = []
for item in music:
    # build one record per item
    data = {}
    genre = item[0]
    artist = item[1]
    track = item[2]
    # in actual code, API calls happen here, processing genre, artist and track
    data['genre'] = genre
    data['artist'] = artist
    data['track'] = track
    # DataFrame.append was removed in pandas 2.0, so collect rows in a list
    rows.append(data)

# build the frame once from the collected rows, then pickle it
df = pd.DataFrame(rows)
df.to_pickle('data.pkl')