Based on this post,
import pandas as pd
inp = [{'c1':10,'cols':{'c2':20,'c3':'str1'}, 'c4':'41'}, {'c1':11,'cols':{'c2':20,'c3':'str2'},'c4':'42'}, {'c1':12,'cols':{'c2':20,'c3':'str3'},'c4':'43'}]
df = pd.DataFrame(inp)
pd.io.json.json_normalize(df.to_dict('records'))
The script above works fine.
After a small change to inp:
inp=[{'c1':10,'cols':{'c2':5,'c3':NaT}, 'c4':'41'}, {'c1':11,'cols':{'c2':Timestamp('2014-06-03 21:19:26'),'c3':'str2'},'c4':'42'}, {'c1':12,'cols':{'c2':20,'c3':'str3'},'c4':'43'}]
df = pd.DataFrame(inp)
pd.io.json.json_normalize(df.to_dict('records'))
I only changed str1 to NaT and 20 to Timestamp('2014-06-03 21:19:26'), and the script no longer runs. It fails with the following errors:
NameError: name 'NaT' is not defined
NameError: name 'Timestamp' is not defined
Since NaT is very common in real data, what is causing the error?
Answer 0 (score: 2)
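You should refer to them as pd.NaT and pd.Timestamp. With import pandas as pd, only the name pd is bound, so the bare names NaT and Timestamp are undefined: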
inp=[{'c1':10,'cols':{'c2':5,'c3':pd.NaT}, 'c4':'41'}, {'c1':11,'cols':{'c2':pd.Timestamp('2014-06-03 21:19:26'),'c3':'str2'},'c4':'42'}, {'c1':12,'cols':{'c2':20,'c3':'str3'},'c4':'43'}]
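For completeness, here is a minimal end-to-end sketch of the corrected input run through the same flattening step. It assumes pandas 1.0 or later, where the function is available as the top-level pd.json_normalize; the question's pd.io.json.json_normalize spelling is the older location of the same function. The variable name flat is only illustrative.

```python
import pandas as pd

# Qualify NaT and Timestamp with the pd. prefix, because `import pandas as pd`
# binds only the name `pd`. An alternative would be
# `from pandas import NaT, Timestamp`, which makes the bare names available.
inp = [
    {'c1': 10, 'cols': {'c2': 5, 'c3': pd.NaT}, 'c4': '41'},
    {'c1': 11, 'cols': {'c2': pd.Timestamp('2014-06-03 21:19:26'), 'c3': 'str2'}, 'c4': '42'},
    {'c1': 12, 'cols': {'c2': 20, 'c3': 'str3'}, 'c4': '43'},
]

df = pd.DataFrame(inp)

# Assumption: pandas >= 1.0, so json_normalize is reachable at the top level.
flat = pd.json_normalize(df.to_dict('records'))

# The nested 'cols' dict is expanded into dotted column names.
print(sorted(flat.columns))
print(flat)
```

The nested keys end up as the columns cols.c2 and cols.c3 next to c1 and c4, and the NaT value survives as a missing value that pd.isna can detect.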