How can I parse the following JSON with pandas?

Time: 2017-03-21 23:35:04

Tags: json pandas

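Hello, I am working with a JSON file that contains several conversations. Each complete conversation is enclosed in a pair of brackets, in the following format: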

[
    {
        "created": "2017-02-02T11:57:41+0000",
        "from": "Bank",
        "message": "Hi Alex, if you have not perform the modification to the data, please verify your DNI, celphone and the operator to verify it. Thanks."
    },
    {
        "created": "2017-02-01T22:19:58+0000"   ,
        "from": "Alex ",
        "message": "Could someone please help me?, I am callig to CC and they don't answer"
    },
    {
        "created": "2017-02-01T22:19:42+0000",
        "from": "Alex ",
        "message": "the sms with the corresponding key and token has not arrived"
    },
    {
        "created": "2017-02-01T22:19:28+0000",
        "from": "Alex ",
        "message": "I have issues to make payments from the app"
    },
    {
        "created": "2017-02-01T22:19:18+0000",
        "from": "Alex ",
        "message": "Good afternoon"
    }
],

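I would like to parse this JSON so that the user's questions go in one column and the answers provided by the bank go in a second column, matched to them. For the first interaction, that would be:

All user comments:

"Good afternoon, I have issues to make payments from the app, the sms with the corresponding key and token has not arrived, Could someone please help me?, I am callig to CC and they don't answer"

All answers:

"Hi Alex, if you have not perform the modification to the data, please verify your DNI, celphone and the operator to verify it. Thanks."

The output I want is the whole JSON parsed into these two columns. Note that the messages can be ordered by their date and time. To get this, I tried: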

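import json
import numpy as np
import pandas as pd

with open('/home/adolfo/Desktop/CONVERSATIONS/test2.json') as json_data:
    d = json.load(json_data)
    # Flatten the list of conversations into one sequence of message records
    df = pd.DataFrame.from_records(np.concatenate(d))

print(df)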

However, I got:

                     created   from  \
0   2017-02-02T11:57:41+0000   Bank   
1   2017-02-01T22:19:58+0000  Alex    
2   2017-02-01T22:19:42+0000  Alex    
3   2017-02-01T22:19:28+0000  Alex    
4   2017-02-01T22:19:18+0000  Alex    
5   2017-02-02T11:57:41+0000   Bank   
6   2017-02-01T22:19:58+0000  Alex    
7   2017-02-01T22:19:42+0000  Alex    
8   2017-02-01T22:19:28+0000  Alex    
9   2017-02-01T22:19:18+0000  Alex    
10  2017-02-01T22:19:12+0000   Bank   
11  2017-02-01T16:22:30+0000   Alex   

                                              message  
0   Hi Alex, if you have not perform the modificat...  
1   Could someone please help me?, I am callig to ...  
2   the sms with the corresponding key and token h...  
3         I have issues to make payments from the app  
4                                      Good afternoon  
5   Hi Alex, if you have not perform the modificat...  
6   Could someone please help me?, I am callig to ...  
7   the sms with the corresponding key and token h...  
8         I have issues to make payments from the app  
9                                      Good afternoon  
10   Hello Alexander, the money is available to be...  
11  hello they have deposited the money into my ac...  

So I would really appreciate help with this task. This is an example of the JSON:

[
    [
        {
            "created": "2017-02-02T11:57:41+0000",
            "from": "Bank",
            "message": "Hi Alex, if you have not perform the modification to the data, please verify your DNI, celphone and the operator to verify it. Thanks."
        },
        {
            "created": "2017-02-01T22:19:58+0000"   ,
            "from": "Alex ",
            "message": "Could someone please help me?, I am callig to CC and they don't answer"
        },
        {
            "created": "2017-02-01T22:19:42+0000",
            "from": "Alex ",
            "message": "the sms with the corresponding key and token has not arrived"
        },
        {
            "created": "2017-02-01T22:19:28+0000",
            "from": "Alex ",
            "message": "I have issues to make payments from the app"
        },
        {
            "created": "2017-02-01T22:19:18+0000",
            "from": "Alex ",
            "message": "Good afternoon"
        }
    ],
    [
        {
            "created": "2017-02-01T22:19:12+0000",
            "from": "Bank",
            "message": " Hello Alexander, the money is available to be  withdrawn, you could go to any store the number is 70307002459"
        }, 
        {            
            "created": "2017-02-01T16:22:30+0000",
            "from": "Alex",
            "message": "hello they have deposited the money into my account, I don't have account from this bank, Could I know if I can withdraw the money? DNI 427 thanks a lot"
        }

    ]


]

After getting useful feedback here, I tried:

df = pd.read_json('/home/adolfo/Desktop/CONVERSATIONS/test2.json')

df.created = pd.to_datetime(df.created)

df.assign(qna=np.where(df['from'] == 'Bank', 'Answer', 'Question')).set_index(['created', 'qna']).message.unstack(fill_value='')

But I got:

---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
<ipython-input-44-8881c5d91cd0> in <module>()
     63 df = pd.read_json('/home/adolfo/Desktop/CONVERSATIONS/test2.json')
     64 
---> 65 df.created = pd.to_datetime(df.created)
     66 
     67 df.assign(qna=np.where(df['from'] == 'Bank', 'Answer', 'Question')).set_index(['created', 'qna']).message.unstack(fill_value='')

/usr/local/lib/python3.5/dist-packages/pandas/core/generic.py in __getattr__(self, name)
   2742             if name in self._info_axis:
   2743                 return self[name]
-> 2744             return object.__getattribute__(self, name)
   2745 
   2746     def __setattr__(self, name, value):

AttributeError: 'DataFrame' object has no attribute 'created'
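
The error is consistent with the top level of the file being a list of conversations (a list of lists) rather than a flat list of records, in which case reading it directly most likely does not yield a `created` column. A minimal sketch, assuming that structure, flattens one level before building the DataFrame. The `raw` string is a shortened, hypothetical stand-in for `test2.json`, and the names `conversations` and `records` are illustrative.

```python
import json
from itertools import chain

import pandas as pd

# Shortened, hypothetical stand-in for test2.json: a list of conversations,
# each of which is itself a list of message records.
raw = """[
  [
    {"created": "2017-02-02T11:57:41+0000", "from": "Bank", "message": "Hi Alex"},
    {"created": "2017-02-01T22:19:18+0000", "from": "Alex ", "message": "Good afternoon"}
  ],
  [
    {"created": "2017-02-01T16:22:30+0000", "from": "Alex", "message": "hello"}
  ]
]"""

conversations = json.loads(raw)

# Flatten one level so that every element is a single message record.
records = list(chain.from_iterable(conversations))
df = pd.DataFrame.from_records(records)

# The column exists after flattening, so the conversion can be applied.
df["created"] = pd.to_datetime(df["created"])
print(df.columns.tolist())
```

Flattening this way discards which conversation each message came from, so it only suits cases where that grouping is not needed.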

1 Answer:

Answer 0: (score: 1)

import json
import numpy as np
import pandas as pd

j = """[
    [
        {
            "created": "2017-02-02T11:57:41+0000",
            "from": "Bank",
            "message": "Hi Alex, if you have not perform the modification to the data, please verify your DNI, celphone and the operator to verify it. Thanks."
        },
        {
            "created": "2017-02-01T22:19:58+0000"   ,
            "from": "Alex ",
            "message": "Could someone please help me?, I am callig to CC and they don't answer"
        },
        {
            "created": "2017-02-01T22:19:42+0000",
            "from": "Alex ",
            "message": "the sms with the corresponding key and token has not arrived"
        },
        {
            "created": "2017-02-01T22:19:28+0000",
            "from": "Alex ",
            "message": "I have issues to make payments from the app"
        },
        {
            "created": "2017-02-01T22:19:18+0000",
            "from": "Alex ",
            "message": "Good afternoon"
        }
    ],
    [
        {
            "created": "2017-02-01T22:19:12+0000",
            "from": "Bank",
            "message": " Hello Alexander, the money is available to be  withdrawn, you could go to any store the number is 70307002459"
        }, 
        {            
            "created": "2017-02-01T16:22:30+0000",
            "from": "Alex",
            "message": "hello they have deposited the money into my account, I don't have account from this bank, Could I know if I can withdraw the money? DNI 427 thanks a lot"
        }

    ]


]"""

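# Parse the JSON string into a list of conversations (each a list of records)
js = json.loads(j)
# One DataFrame per conversation, concatenated with the conversation number as the outer index level
df = pd.concat({i: pd.DataFrame(conv) for i, conv in enumerate(js)})

df['created'] = pd.to_datetime(df['created'])

# Label each message as a bank answer or a customer question, then pivot so that
# questions and answers become separate columns indexed by timestamp
df.assign(qna=np.where(df['from'] == 'Bank', 'Answer', 'Question')).set_index(['created', 'qna']).message.unstack(fill_value='')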
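
The question asks for one row per conversation, with all user messages joined into one column and all bank answers joined into another. The following is a minimal sketch of that layout, not part of the original answer. It assumes that any sender other than "Bank" is the customer, uses a shortened, hypothetical sample in place of the full JSON, and introduces the illustrative names `conversation`, `role`, and `result`.

```python
import json

import pandas as pd

# Shortened, hypothetical sample with the same nesting as the JSON above.
raw = """[
  [
    {"created": "2017-02-02T11:57:41+0000", "from": "Bank", "message": "Hi Alex"},
    {"created": "2017-02-01T22:19:28+0000", "from": "Alex ",
     "message": "I have issues to make payments from the app"},
    {"created": "2017-02-01T22:19:18+0000", "from": "Alex ", "message": "Good afternoon"}
  ],
  [
    {"created": "2017-02-01T22:19:12+0000", "from": "Bank", "message": " Hello Alexander"},
    {"created": "2017-02-01T16:22:30+0000", "from": "Alex", "message": "hello"}
  ]
]"""

conversations = json.loads(raw)

# Build one frame per conversation and remember which conversation it was.
frames = []
for conv_id, messages in enumerate(conversations):
    frame = pd.DataFrame(messages)
    frame["conversation"] = conv_id
    frames.append(frame)
df = pd.concat(frames, ignore_index=True)

df["created"] = pd.to_datetime(df["created"])
# Sender names and messages carry stray spaces in the data, so strip them.
df["from"] = df["from"].str.strip()
df["message"] = df["message"].str.strip()
# Assumption: every sender other than "Bank" is the customer.
df["role"] = df["from"].map(lambda s: "answers" if s == "Bank" else "questions")

# Sort chronologically so the joined text reads in the order it was written.
df = df.sort_values("created")
result = (
    df.groupby(["conversation", "role"])["message"]
      .agg(", ".join)
      .unstack(fill_value="")
)
print(result)
```

Sorting before grouping matters here because the join keeps the row order within each group, which is what puts "Good afternoon" first in the combined question text.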
