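This statement reads the JSON file, but it does not split the data into columns correctly.
df = pd.read_json('https://s3.amazonaws.com/todel162/config1.json', orient='index')
Is there a way to read this JSON into a pandas DataFrame with proper columns?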
Answer 0: (score: 1)
You can use json_normalize from pandas:
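import json
import pandas as pd

with open('config1.json') as f:
    data = json.load(f)

# Expand each entry of 'configurationItems' into a row and copy 'fileVersion' onto every row.
# pandas 1.0+ exposes this as pd.json_normalize; older versions import it from pandas.io.json.
df = pd.json_normalize(data, 'configurationItems', ['fileVersion'])
print(df)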
                                                 ARN  awsAccountId  awsRegion  \
0  arn:aws:cloudtrail:us-east-1:513469704633:trai...  513469704633  us-east-1
1  arn:aws:cloudtrail:us-east-1:513469704633:trai...  513469704633  us-east-1

   configurationItemCaptureTime  configurationItemStatus  \
0      2018-07-27T11:52:53.795Z          ResourceDeleted
1      2018-07-27T11:52:53.791Z          ResourceDeleted

   configurationItemVersion  configurationStateId  configurationStateMd5Hash  \
0                       1.3         1532692373795
1                       1.3         1532692373791

   relatedEvents  relationships                 resourceId  \
0             []             []  AWSMacieTrail-DO-NOT-EDIT
1             []             []                     test01

             resourceType  supplementaryConfiguration  tags  fileVersion
0  AWS::CloudTrail::Trail                          {}    {}          1.0
1  AWS::CloudTrail::Trail                          {}    {}          1.0
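To see what the two extra arguments do without downloading the file, here is a minimal sketch on a made-up dictionary. The `sample` data and its values are assumptions for illustration only; they just mirror the top-level keys used above (`fileVersion` and `configurationItems`). It assumes pandas 1.0 or later, where the function is available as `pd.json_normalize`.

```python
import pandas as pd

# Made-up sample (not the real file): one top-level scalar plus a list of records.
sample = {
    "fileVersion": "1.0",
    "configurationItems": [
        {"resourceId": "trail-a", "awsRegion": "us-east-1"},
        {"resourceId": "trail-b", "awsRegion": "us-east-1"},
    ],
}

# record_path names the list whose entries become rows;
# meta lists top-level fields to repeat on every row.
flat = pd.json_normalize(
    sample,
    record_path="configurationItems",
    meta=["fileVersion"],
)
print(flat)
```

Each item in the list becomes one row, and `fileVersion` is repeated alongside it, which is the shape the question was after.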
Answer 1: (score: 1)
You can try this.
import json
import urllib.request as req
import pandas as pd

# Download and parse the JSON document.
with req.urlopen("https://s3.amazonaws.com/todel162/config1.json") as j:
    raw = json.loads(j.read().decode())

# One row per configuration item; the scalar fileVersion is broadcast to every row.
df = pd.DataFrame(raw["configurationItems"])
df["fileVersion"] = raw["fileVersion"]
print(df)
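The same idea can be tried offline. The sketch below wraps the two DataFrame lines in a helper and runs it on a made-up dictionary; `items_to_frame` and `sample` are hypothetical names and data introduced here for illustration, not part of the original file.

```python
import pandas as pd


def items_to_frame(raw: dict) -> pd.DataFrame:
    """Hypothetical helper: one row per entry in raw["configurationItems"]."""
    frame = pd.DataFrame(raw["configurationItems"])
    # Assigning a scalar to a column repeats it on every row.
    frame["fileVersion"] = raw["fileVersion"]
    return frame


# Made-up sample with the same top-level keys as the file in the question.
sample = {
    "fileVersion": "1.0",
    "configurationItems": [
        {"resourceId": "trail-a", "tags": {"env": "dev"}},
        {"resourceId": "trail-b", "tags": {"env": "prod"}},
    ],
}

frame = items_to_frame(sample)
print(frame)
```

One difference worth knowing: the plain DataFrame constructor leaves nested dictionaries such as `tags` as dict objects inside a single column, while recent pandas versions of json_normalize flatten them into dotted column names. Which one is preferable depends on whether you want those nested fields as separate columns.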