I have a dataset that returns multiple rows in a single column, like this:
{'billingaddresscountry':'Brasil','ip':'187.78.30.72','billingcpf':'001.022.614-61','billingphoneddi':'55','billingphoneddd':'81','billingphonenumber':'8815-2111','date':'2012-07-10T01:05:59.6177731-03:00'}
and like this:
{'billingcpf':'324.625.318-43','billingphoneddi':'55','billingphoneddd':'11','billingphonenumber':'989523447','billingaddresscountry':'Brasil','paymentaddresscountry':'Brasil'}
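Edit: My dataset has thousands of rows that look like these. The dictionaries all have different numbers of keys. How can I put them into a DataFrame, with the keys as column names and the values as rows?
Can anyone help?
Answer 0 (score: 0)
I think you may have missed this very simple solution?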
import pandas as pd
data = [
{'billingaddresscountry': 'Brasil', 'ip': '187.78.30.72', 'billingcpf': '001.022.614-61', 'billingphoneddi': '55',
'billingphoneddd': '81', 'billingphonenumber': '8815-2111', 'date': '2012-07-10T01:05:59.6177731-03:00'},
{'billingcpf': '324.625.318-43', 'billingphoneddi': '55', 'billingphoneddd': '11',
'billingphonenumber': '989523447', 'billingaddresscountry': 'Brasil', 'paymentaddresscountry': 'Brasil'}
]
df = pd.DataFrame(data)
print(df)
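The question does not say whether each row arrives as a real dict or as the text of one. If it is text (an assumption on my part), a minimal sketch would parse each string with `ast.literal_eval` before building the DataFrame. The `raw_rows` list below is a made-up, shortened version of the sample rows:

```python
import ast
import pandas as pd

# Hypothetical input: each row is the *text* of a dict literal, not a dict.
raw_rows = [
    "{'billingcpf': '001.022.614-61', 'billingphoneddi': '55', 'ip': '187.78.30.72'}",
    "{'billingcpf': '324.625.318-43', 'billingphoneddi': '55', 'paymentaddresscountry': 'Brasil'}",
]

# ast.literal_eval only evaluates Python literals, so it is safer than eval().
parsed = [ast.literal_eval(row) for row in raw_rows]

# Keys missing from a row become NaN in that row.
df_parsed = pd.DataFrame(parsed)
print(df_parsed)
```

If the strings used double quotes throughout, `json.loads` would probably be the more natural choice; the single quotes in the sample are why this sketch uses `ast.literal_eval`.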
To give you a better idea of the result:
import pandas as pd
data = [
{'a': 1, 'b': 2, 'c': 3},
# different order
{'b': 4, 'a': 5, 'c': 6},
# missing values
{'b': 7, 'c': 8},
# new columns introduced later
{'a': 9, 'd': 10},
]
df = pd.DataFrame(data)
print(df)
Result:
     a    b    c     d
0  1.0  2.0  3.0   NaN
1  5.0  4.0  6.0   NaN
2  NaN  7.0  8.0   NaN
3  9.0  NaN  NaN  10.0
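Note that the integers come back as floats: missing entries are filled with NaN, which is a float, so each column with a gap is upcast. If whole numbers matter, one option (a sketch, not the only way) is to cast to pandas' nullable `Int64` dtype. The name `df_int` is just for illustration:

```python
import pandas as pd

data = [
    {'a': 1, 'b': 2, 'c': 3},
    {'b': 4, 'a': 5, 'c': 6},
    {'b': 7, 'c': 8},
    {'a': 9, 'd': 10},
]

# Cast every column to the nullable integer dtype; gaps become <NA> instead of NaN.
df_int = pd.DataFrame(data).astype('Int64')
print(df_int.dtypes)
```

This only works here because every column holds whole numbers; columns with strings or fractional values would need to be cast individually.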
You mentioned that, because of the volume, you don't want to add all the rows at once:
# DataFrame.append was removed in pandas 2.0; pd.concat is the replacement
df = pd.concat([df, pd.DataFrame([{'a': 11, 'e': 12}])])
print(df)
Result:
      a    b    c     d     e
0   1.0  2.0  3.0   NaN   NaN
1   5.0  4.0  6.0   NaN   NaN
2   NaN  7.0  8.0   NaN   NaN
3   9.0  NaN  NaN  10.0   NaN
0  11.0  NaN  NaN   NaN  12.0
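Two things to watch with thousands of rows: the new row keeps its own index (the repeated 0 above), and growing a DataFrame one step at a time copies the existing data on every step. A rough sketch of an alternative is to build one small DataFrame per batch and combine them once with `pd.concat`. The `iter_batches` generator is a hypothetical stand-in for however your data actually arrives:

```python
import pandas as pd

def iter_batches():
    """Hypothetical stand-in for a source that yields rows in batches."""
    yield [{'a': 1, 'b': 2, 'c': 3}, {'b': 4, 'a': 5, 'c': 6}]
    yield [{'b': 7, 'c': 8}, {'a': 9, 'd': 10}]
    yield [{'a': 11, 'e': 12}]

# One DataFrame per batch, combined in a single concat call at the end.
frames = [pd.DataFrame(batch) for batch in iter_batches()]

# ignore_index=True renumbers the rows 0..n-1 instead of repeating indices.
df_all = pd.concat(frames, ignore_index=True)
print(df_all)
```

Columns that only show up in later batches (like `e` here) are still added, with NaN filled in for the earlier rows.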