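I am working with a CSV file in which some of the data is stored as nested JSON.
Sample dataset (as it looks when I load the CSV file into Pandas). I need to convert the values in the data (M) column into separate columns. At the moment they are stored as a list: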
deviceId (S) timestamp (S) data (M)
0 3377 1523128290722 { "deviceId" : { "N" : "3377" }, "device...
1 736 1523128294737 { "deviceId" : { "N" : "736" }, "deviceI...
2 4963 1523128290731 { "deviceId" : { "N" : "4963" }, "device...
The data in the data (M) column is a nested list:
user_dict = df['data (M)']
[In] type([user_dict]) is list
[Out] True
Here are some of the code samples I have tried:
user_dict = df['data (M)']
pd.DataFrame.from_dict({(i,j): user_dict[i][j]
for i in user_dict.keys()
for j in user_dict[i].keys()},
orient='index')
AttributeError: 'str' object has no attribute 'keys'
-
user_dict = df['data (M)']
df2 = pd.DataFrame(user_dict[1:],columns=data[0])
TypeError: Index(...) must be called with a collection of some kind, '{' was passed
Ultimately, I want a Pandas DataFrame that looks like this:
deviceId (S) timestamp (S) Temperature (N) Humidity (N)
0 3377 1523128290722 { "Temperature" : { "N" : "24.2424" }, { "Humdity" : { "N" : "24.2424" }
1 736 1523128294737 { "Temperature" : { "N" : "24.466" }, { "Humdity" : { "N" : "24.2424" }
2 4963 1523128290731 { "Temperature" : { "N" : "26.4534" }, { "Humdity" : { "N" : "24.2424" }
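One possible approach, given as a sketch rather than a definitive answer: the AttributeError above suggests that each cell in data (M) is a JSON string, not a dict or list, so it may need to be parsed with json.loads before it can be expanded into columns. The sample rows below are hypothetical, because the real strings are truncated in the output above. The "Temperature" and "Humidity" keys, the humidity values, and the helper name flatten_item are all assumptions made for illustration.

```python
import json

import pandas as pd

# Hypothetical sample rows: the real "data (M)" strings are truncated in the
# question, so the Temperature/Humidity keys and values here are assumed.
df = pd.DataFrame({
    "deviceId (S)": ["3377", "736"],
    "timestamp (S)": ["1523128290722", "1523128294737"],
    "data (M)": [
        '{ "deviceId" : { "N" : "3377" }, "Temperature" : { "N" : "24.2424" },'
        ' "Humidity" : { "N" : "41.5" } }',
        '{ "deviceId" : { "N" : "736" }, "Temperature" : { "N" : "24.466" },'
        ' "Humidity" : { "N" : "40.1" } }',
    ],
})


def flatten_item(raw):
    """Parse one JSON string and unwrap its {"N": "..."} / {"S": "..."} values."""
    item = json.loads(raw)
    flat = {}
    for key, wrapped in item.items():
        # Each value is assumed to be a one-entry dict such as {"N": "24.2424"}.
        type_tag, value = next(iter(wrapped.items()))
        # "N" is treated as a number; anything else is kept as a string.
        flat[f"{key} ({type_tag})"] = float(value) if type_tag == "N" else value
    return flat


# Build one row of flat columns per JSON string, aligned on the original index.
parsed = pd.DataFrame([flatten_item(s) for s in df["data (M)"]], index=df.index)

# Drop the raw JSON column and the duplicated device id, then join the new columns.
result = df.drop(columns=["data (M)"]).join(parsed.drop(columns=["deviceId (N)"]))
print(result)
```

If the real strings contain deeper nesting (for example "M" maps inside "M" maps), this flat unwrapping would not be enough and the helper would need to recurse.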