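I have received a JSON file containing some data that I am supposed to analyze. The data comes from an SQL database, so it would normally be structured in tables. However, when I received it, it looked like this: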
{'TimeStamp1': '2018-06-03 00:21:04', 'Owner1': 'Some owner', 'Description1': 'A description', 'TimeStamp2': '2018-06-03 00:22:15', 'Owner2': 'A new Owner', 'Description2': 'A new description'}
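...and so on. So there is only a single line/object holding all the data, with multiple keys that have almost identical names. How can I convert this in Python into something resembling the SQL setup, or: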
{'records': [
    {'TimeStamp': '2018-06-03 00:21:04', 'Owner': 'Some owner', 'Description': 'A description'},
    {'TimeStamp': '2018-06-03 00:22:15', 'Owner': 'A new Owner', 'Description': 'A new description'}
]}
while still guaranteeing that the correct owner ends up in the same row as its timestamp and description? :)
Answer 0 (score: 0)
Here is a simple approach. It could probably be optimized, but it should do what you want and is very straightforward:
import re

def sanitize(d, keys):
    records = []
    # Find the highest numeric suffix among the keys
    highest = 0
    for key in d:
        match = re.search(r"\d+$", key)
        if match:
            highest = max(highest, int(match.group()))
    # Walk through the suffixes one at a time
    for i in range(1, highest + 1):
        # Build one record (dictionary) per suffix
        rec = {}
        for key in keys:
            rec[key] = d[key + str(i)]
        records.append(rec)
    return records
The function is used like this:
data = {'TimeStamp1': '2018-06-03 00:21:04', 'Owner1': 'Some owner', 'Description1': 'A description', 'TimeStamp2': '2018-06-03 00:22:15', 'Owner2': 'A new Owner', 'Description2': 'A new description'}
k = ['TimeStamp', 'Owner', 'Description']
r = sanitize(data, k)
and returns:
[{'Owner': 'Some owner', 'TimeStamp': '2018-06-03 00:21:04', 'Description': 'A description'}, {'Owner': 'A new Owner', 'TimeStamp': '2018-06-03 00:22:15', 'Description': 'A new description'}]
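If the field names are not known up front, or some numbers are missing from the sequence, the keys can also be grouped directly by their numeric suffix. The following is only a minimal sketch under those assumptions: the name `group_by_suffix` is made up for illustration, keys without a trailing number are simply skipped, and the result is wrapped in a `'records'` list as in the question.

```python
import re
from collections import defaultdict

def group_by_suffix(d):
    """Group keys such as 'Owner2' into one record per numeric suffix."""
    grouped = defaultdict(dict)
    for key, value in d.items():
        # Split the key into a field name and its trailing number
        match = re.fullmatch(r"(.*?)(\d+)", key)
        if match is None:
            continue  # assumption: keys without a trailing number are ignored
        field, number = match.group(1), int(match.group(2))
        # Values sharing a suffix land in the same record
        grouped[number][field] = value
    # Sort by suffix so the records come out in numeric order
    return {"records": [grouped[n] for n in sorted(grouped)]}
```

Calling `group_by_suffix(data)` on the dictionary from the question gives one dictionary per number, so each owner stays with the timestamp and description that carry the same suffix.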