How to parse JSON columns embedded in a pandas DataFrame in Python

Asked: 2015-03-12 14:17:17

Tags: python json pandas dataframe

I have a pandas DataFrame (raw csv file here) in which some columns (d1 & d2) are stored as JSON. How can I parse these columns to get the desired output:

2015-02-12,user1,05:15 | 20,16:30 | 20.0,22:00 | 10.0

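I realize I will have to transpose the output once it has been parsed successfully, but I am having trouble reading the JSON data contained in the DataFrame columns. Any help is appreciated! Thanks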

>>> test = pd.read_csv('schedsample.csv',sep=',', header=0)
>>> test.head()
         date username                                                 d1  \
0  2015-02-12    user1  {"d1":[{"tm":"05:15","t":"20.0"},{"tm":"16:30"...   
1  2015-02-12    user1  {"d2":[{"tm":"06:15","t":"20.0"},{"tm":"08:00"...   
2  2015-02-12    user1  {"d3":[{"tm":"07:15","t":"20.0"},{"tm":"09:00"...   
3  2015-02-12    user1  {"d4":[{"tm":"08:15","t":"20.0"},{"tm":"07:00"...   

                                                  d2  
0  {"d1":[{"tm":"05:15","t":"20.0"},{"tm":"16:30"...  
1  {"d1":[{"tm":"05:15","t":"20.0"},{"tm":"16:30"...  
2  {"d1":[{"tm":"05:15","t":"20.0"},{"tm":"16:30"...  
3  {"d1":[{"tm":"05:15","t":"20.0"},{"tm":"16:30"...  
>>> import json as js
>>> js.loads(test['d1'])
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/Users/khurampervez/anaconda/lib/python2.7/json/__init__.py", line 338, in loads
    return _default_decoder.decode(s)
  File "/Users/khurampervez/anaconda/lib/python2.7/json/decoder.py", line 366, in decode
    obj, end = self.raw_decode(s, idx=_w(s, 0).end())
TypeError: expected string or buffer

1 Answer:

Answer 0 (score: 0)

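Your test.d1 column contains all of the d1 through d4 objects, one per row. json.loads expects a single string, so calling json.loads(test['d1']) on the whole column raises the error, whereas json_normalize(json.loads(test['d1'][0])['d1']) gives you the desired d1 data frame. So I guess that instead of reading in only the d1 and d2 columns, you would also need d3 and d4 columns, which will produce some empty cells.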
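
As a rough illustration of the approach above, here is a minimal sketch that parses the column one cell at a time instead of passing the whole Series to json.loads. The sample rows are made up for illustration, because the real cell contents are truncated in the question. The helper name format_row is hypothetical. The sketch also assumes pandas 1.0 or later, where json_normalize is available as pd.json_normalize; older versions expose it as pandas.io.json.json_normalize.

```python
import json

import pandas as pd

# Hypothetical sample data: the real CSV cells are truncated in the question,
# so these values are invented purely for illustration.
test = pd.DataFrame({
    "date": ["2015-02-12", "2015-02-12"],
    "username": ["user1", "user1"],
    "d1": [
        '{"d1":[{"tm":"05:15","t":"20.0"},{"tm":"16:30","t":"20.0"}]}',
        '{"d2":[{"tm":"06:15","t":"20.0"},{"tm":"08:00","t":"18.0"}]}',
    ],
})

# json.loads takes one string at a time, so apply it to each cell.
parsed = test["d1"].apply(json.loads)

# The first cell holds a dict with the key "d1" mapping to a list of records;
# json_normalize turns that list into a small DataFrame with "tm" and "t" columns.
d1_frame = pd.json_normalize(parsed[0]["d1"])


def format_row(row):
    """Build one 'date,username,tm | t,...' line from a row (hypothetical helper)."""
    schedule = json.loads(row["d1"])
    entries = [
        "{} | {}".format(rec["tm"], rec["t"])
        for records in schedule.values()
        for rec in records
    ]
    return ",".join([row["date"], row["username"]] + entries)


# One formatted line per row of the original DataFrame.
lines = test.apply(format_row, axis=1).tolist()
print(d1_frame)
print(lines[0])
```

Because format_row iterates over whatever key each cell contains, it does not depend on the row holding "d1" specifically, which may help given that the rows in the question hold d1 through d4 objects.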