我正在尝试从JSON生成数据框。我所拥有的json格式为
{
eventId: "9668383e-ec96-4d6a-b873-2312dd008e7b",
eventType: "PlannedCustomerChoiceWasUpdated",
publishedDate: "2016-05-31T18:52:29.219Z",
payload: {
plannedCustomerChoiceId: "e9301a6e-7ccf-4c89-bd05-19c1b9067a61"
},
_links: {
self: {
href: "http://gbp-router.gapinc.dev:8080/planning-service/_feeds/planning.planning-service.planned-customer-choice-events/entries/d2de62a6-1e0f-430a-bf3f-2df711f64beb"
},
source: {
href: "http://gbp-router.gapinc.dev:8080/planning-service/planning/buy-plan/planned-customer-choices/e9301a6e-7ccf-4c89-bd05-19c1b9067a61"
}
}
}
我需要将列作为单个记录。
这是我迄今为止所做的。
from pandas.io.json import json_normalize
a = { "eventId": "9668383e-ec96-4d6a-b873-2312dd008e7b", "eventType": "PlannedCustomerChoiceWasUpdated", "publishedDate": "2016-05-31T18:52:29.219Z", "payload": { "plannedCustomerChoiceId": "e9301a6e-7ccf-4c89-bd05-19c1b9067a61" }, "_links": { "self": { "href": "http://gbp-router.gapinc.dev:8080/planning-service/_feeds/planning.planning-service.planned-customer-choice-events/entries/d2de62a6-1e0f-430a-bf3f-2df711f64beb" }, "source": { "href": "http://gbp-router.gapinc.dev:8080/planning-service/planning/buy-plan/planned-customer-choices/e9301a6e-7ccf-4c89-bd05-19c1b9067a61" } } }
b = json_normalize(a)
print b
我无法达到理想的格式。任何人都可以帮助我。
使用b = pd.DataFrame(a)数据框采用以下格式。
_
links \
plannedCustomerChoiceId NaN
self {u'href': u'http://gbp-router.gapinc.dev:8080/...
source {u'href': u'http://gbp-router.gapinc.dev:8080/...
eventId \
plannedCustomerChoiceId 9668383e-ec96-4d6a-b873-2312dd008e7b
self 9668383e-ec96-4d6a-b873-2312dd008e7b
source 9668383e-ec96-4d6a-b873-2312dd008e7b
eventType \
plannedCustomerChoiceId PlannedCustomerChoiceWasUpdated
self PlannedCustomerChoiceWasUpdated
source PlannedCustomerChoiceWasUpdated
payload \
plannedCustomerChoiceId e9301a6e-7ccf-4c89-bd05-19c1b9067a61
self NaN
source NaN
publishedDate
plannedCustomerChoiceId 2016-05-31T18:52:29.219Z
self 2016-05-31T18:52:29.219Z
source 2016-05-31T18:52:29.219Z
我真正想要的是
9668383e-ec96-4d6a-b873-2312dd008e7b,PlannedCustomerChoiceWasUpdated,2016-05-31T18:52:29.219Z,e9301a6e-7ccf-4c89-bd05-19c1b9067a61,http://gbp-router.gapinc.dev:8080/planning-service/planning/buy-plan/planned-customer-choices/e9301a6e-7ccf-4c89-bd05-19c1b9067a61
答案 0 :(得分:1)
我认为您可以先更改列的排序,然后to_csv
:
import pandas as pd
from pandas.io.json import json_normalize
a = { "eventId": "9668383e-ec96-4d6a-b873-2312dd008e7b", "eventType": "PlannedCustomerChoiceWasUpdated", "publishedDate": "2016-05-31T18:52:29.219Z", "payload": { "plannedCustomerChoiceId": "e9301a6e-7ccf-4c89-bd05-19c1b9067a61" }, "_links": { "self": { "href": "http://gbp-router.gapinc.dev:8080/planning-service/_feeds/planning.planning-service.planned-customer-choice-events/entries/d2de62a6-1e0f-430a-bf3f-2df711f64beb" }, "source": { "href": "http://gbp-router.gapinc.dev:8080/planning-service/planning/buy-plan/planned-customer-choices/e9301a6e-7ccf-4c89-bd05-19c1b9067a61" } } }
b = json_normalize(a)
b = b[['eventId','eventType','publishedDate','payload.plannedCustomerChoiceId','_links.source.href']]
#print (b)
print (b.to_csv(index=False, header=False))
9668383e-ec96-4d6a-b873-2312dd008e7b,PlannedCustomerChoiceWasUpdated,2016-05-31T18:52:29.219Z,e9301a6e-7ccf-4c89-bd05-19c1b9067a61,http://gbp-router.gapinc.dev:8080/planning-service/planning/buy-plan/planned-customer-choices/e9301a6e-7ccf-4c89-bd05-19c1b9067a61
如果需要更改列名:
import pandas as pd
from pandas.io.json import json_normalize
a = { "eventId": "9668383e-ec96-4d6a-b873-2312dd008e7b", "eventType": "PlannedCustomerChoiceWasUpdated", "publishedDate": "2016-05-31T18:52:29.219Z", "payload": { "plannedCustomerChoiceId": "e9301a6e-7ccf-4c89-bd05-19c1b9067a61" }, "_links": { "self": { "href": "http://gbp-router.gapinc.dev:8080/planning-service/_feeds/planning.planning-service.planned-customer-choice-events/entries/d2de62a6-1e0f-430a-bf3f-2df711f64beb" }, "source": { "href": "http://gbp-router.gapinc.dev:8080/planning-service/planning/buy-plan/planned-customer-choices/e9301a6e-7ccf-4c89-bd05-19c1b9067a61" } } }
b = json_normalize(a)
b.columns = ['self','source','eventId','eventType','plannedCustomerChoiceId','publishedDate']
print (b)
self \
0 http://gbp-router.gapinc.dev:8080/planning-ser...
source \
0 http://gbp-router.gapinc.dev:8080/planning-ser...
eventId eventType \
0 9668383e-ec96-4d6a-b873-2312dd008e7b PlannedCustomerChoiceWasUpdated
plannedCustomerChoiceId publishedDate
0 e9301a6e-7ccf-4c89-bd05-19c1b9067a61 2016-05-31T18:52:29.219Z
b = b[['eventId','eventType','publishedDate','plannedCustomerChoiceId','source']]
print (b)
eventId eventType \
0 9668383e-ec96-4d6a-b873-2312dd008e7b PlannedCustomerChoiceWasUpdated
publishedDate plannedCustomerChoiceId \
0 2016-05-31T18:52:29.219Z e9301a6e-7ccf-4c89-bd05-19c1b9067a61
source
0 http://gbp-router.gapinc.dev:8080/planning-ser...
print (b.to_csv(index=False, header=False))
b873-2312dd008e7b,PlannedCustomerChoiceWasUpdated,2016-05-31T18:52:29.219Z,e9301a6e-7ccf-4c89-bd05-19c1b9067a61,http://gbp-router.gapinc.dev:8080/planning-service/planning/buy-plan/planned-customer-choices/e9301a6e-7ccf-4c89-bd05-19c1b9067a61