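I am building an ML model that will use a JSON file to learn patterns and response formats. Since my data is in Excel, I am converting it to JSON in Python.
The code is as follows: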
import xlrd
from collections import OrderedDict
import simplejson as json
# Open the workbook and select the first worksheet
wb = xlrd.open_workbook('D:\\android\\testdata2.xlsx')
sh = wb.sheet_by_index(0)
# List to hold dictionaries
data_list = []
# Iterate through each row in worksheet and fetch values into dict
for rownum in range(1, sh.nrows):
    data = OrderedDict()
    row_values = sh.row_values(rownum)
    data['pattern'] = row_values[0]
    data['response'] = row_values[1]
    data_list.append(data)
# Serialize the list of dicts to JSON
j = json.dumps(data_list)
# Write to file
with open('data1.json', 'w') as f:
    f.write(j)
The output I get is:
[{
"pattern": "WALLSTENT NON COUVERTE ",
"response": "ENDOPROTHESE STENT VASCULAIRE "
}, {
"pattern": "PRIMEADVANCED SURSCAN MRI ",
"response": "NEUROSTIMULATEUR NERF VAGUE GAUCHE "
}, {
"pattern": "AVASTIN FLACON DE",
"response": "BEVACIZUMAB"
}, {
"pattern": "PERJETA SOLUTION A DILUER POUR PERFUSION",
"response": "BRENTUXIMAB VEDOTIN"
}]
The output I want looks like this:
{
"intents": [{
"pattern": ["WALLSTENT, NON, COUVERTE "],
"response": ["ENDOPROTHESE STENT VASCULAIRE "]
}, {
"pattern": ["PRIMEADVANCED ,SURSCAN ,MRI"] ,
"response": ["NEUROSTIMULATEUR NERF VAGUE GAUCHE "]
}, {
"pattern": ["AVASTIN , FLACON ,DE"],
"response": ["BEVACIZUMAB"]
}, {
"pattern": ["PERJETA, SOLUTION, A, DILUER, POUR ,PERFUSION"],
"response": ["BRENTUXIMAB VEDOTIN"]
}]
}
What changes can I make to my code to get the desired output?
Answer 0 (score: 2)
This should do it:
import xlrd
from collections import OrderedDict
import simplejson as json
# Open the workbook and select the first worksheet
wb = xlrd.open_workbook('D:\\android\\testdata2.xlsx')
sh = wb.sheet_by_index(0)
# List to hold dictionaries
data_list = []
# Iterate through each row in worksheet and fetch values into dict
for rownum in range(1, sh.nrows):
    data = OrderedDict()
    row_values = sh.row_values(rownum)
    data['pattern'] = row_values[0]
    data['response'] = row_values[1]
    data_list.append(data)
data_list = {'intents': data_list} # Added line
# Serialize the list of dicts to JSON
j = json.dumps(data_list)
# Write to file
with open('data1.json', 'w') as f:
    f.write(j)
Note the added line data_list = {'intents': data_list}.
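Note that the desired output in the question also wraps each pattern and response in a list, which wrapping the result in 'intents' does not do by itself. Here is a minimal sketch of one way to handle that, assuming each pattern should become a list of whitespace-separated words and each response a one-element list (the question's sample is ambiguous on this point). The build_intents helper and the sample rows are hypothetical stand-ins for the values read from the sheet:

```python
import json


def build_intents(rows):
    """Wrap (pattern, response) pairs in the {"intents": [...]} layout.

    Assumption: each pattern becomes a list of whitespace-separated
    words and each response becomes a one-element list.
    """
    intents = []
    for pattern, response in rows:
        intents.append({
            "pattern": str(pattern).split(),
            "response": [str(response).strip()],
        })
    return {"intents": intents}


# Hypothetical rows standing in for sh.row_values(rownum)[:2]
rows = [
    ("WALLSTENT NON COUVERTE ", "ENDOPROTHESE STENT VASCULAIRE "),
    ("AVASTIN FLACON DE", "BEVACIZUMAB"),
]
result = build_intents(rows)
print(json.dumps(result, indent=2))
```

If you would rather keep each pattern as a single string, replace str(pattern).split() with [str(pattern).strip()].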
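Answer 1 (score: 0)
Give the pyexcel_xlsx library a try in Python. I have used it to convert xlsx to JSON. It is simple to use and, in my experience, faster than other Python libraries.
Sample code: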
from pyexcel_xlsx import get_data
import json
data = get_data("D:\\android\\testdata2.xlsx")
sheetName = "Table A"
data_list = []
# Iterate through each row and append to the list above
for i in range(0, len(data[sheetName])):
    data_list.append({
        'pattern': data[sheetName][i][0],
        'response': data[sheetName][i][1]
    })
data_list = {'intents': data_list}  # Converting to required object
j = json.dumps(data_list)
# Write to file
with open('data1.json', 'w') as f:
    f.write(j)
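One optional tweak to the writing step: if the sheet contains accented characters (the sample data is French), the default json.dumps output escapes them as \uXXXX sequences. A small sketch using only the standard json module that writes readable UTF-8 instead; the sample record and the temporary output path are made up for illustration:

```python
import json
import os
import tempfile

# Hypothetical record with an accented character, for illustration only
data = {"intents": [{"pattern": "PROTHÈSE TOTALE", "response": "ENDOPROTHESE"}]}

# Hypothetical output path in a temporary directory
out_path = os.path.join(tempfile.mkdtemp(), "data1.json")

# ensure_ascii=False keeps accented characters as-is;
# indent=2 pretty-prints the file so it is easier to inspect
with open(out_path, "w", encoding="utf-8") as f:
    json.dump(data, f, ensure_ascii=False, indent=2)

# Read the file back to confirm it round-trips
with open(out_path, encoding="utf-8") as f:
    loaded = json.load(f)
```

Passing encoding="utf-8" explicitly avoids depending on the platform default, which can differ on Windows.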