How to parse a DataFrame from dictionary rows in a column

Date: 2020-07-04 20:51:02

Tags: python python-3.x dataframe dictionary

I have a dataset that returns its rows as dictionaries in a single column, like this:

{'billingaddresscountry':'Brasil','ip':'187.78.30.72','billingcpf':'001.022.614-61','billingphoneddi':'55','billingphoneddd':'81','billingphonenumber':'8815-2111','date':'2012-07-10T01:05:59.6177731-03:00'}

and like this:

{'billingcpf':'324.625.318-43','billingphoneddi':'55','billingphoneddd':'11','billingphonenumber':'989523447','billingaddresscountry':'Brasil','paymentaddresscountry':'Brasil'}

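Edit: My dataset has thousands of rows that look similar to these, and the dictionaries all have different numbers of keys. How can I put this into a DataFrame with the keys as column names and the values as rows?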

Can anyone help?

1 answer:

Answer 0 (score: 0)

I think you may have missed this very simple solution: pass the list of dictionaries straight to pd.DataFrame.

import pandas as pd

data = [
    {'billingaddresscountry': 'Brasil', 'ip': '187.78.30.72', 'billingcpf': '001.022.614-61', 'billingphoneddi': '55',
     'billingphoneddd': '81', 'billingphonenumber': '8815-2111', 'date': '2012-07-10T01:05:59.6177731-03:00'},
    {'billingcpf': '324.625.318-43', 'billingphoneddi': '55', 'billingphoneddd': '11',
     'billingphonenumber': '989523447', 'billingaddresscountry': 'Brasil', 'paymentaddresscountry': 'Brasil'}
]

df = pd.DataFrame(data)
print(df)
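As a side note, if the dictionaries are stored as text in a column of an existing DataFrame rather than as Python dicts, they need to be parsed first. The following is a minimal sketch under that assumption; the column name `raw` and the shortened sample strings are hypothetical, not taken from the question's real dataset.

```python
import ast

import pandas as pd

# Hypothetical input: one column, named "raw" here, holding dict literals as text.
source = pd.DataFrame({
    "raw": [
        "{'billingcpf': '001.022.614-61', 'ip': '187.78.30.72'}",
        "{'billingcpf': '324.625.318-43', 'paymentaddresscountry': 'Brasil'}",
    ]
})

# ast.literal_eval parses Python literals only; unlike eval, it does not run arbitrary code.
records = source["raw"].apply(ast.literal_eval).tolist()

# Each dict becomes one row; keys missing from a dict become NaN.
parsed = pd.DataFrame(records)
print(parsed)
```

If the column already holds real dict objects, the `ast.literal_eval` step can be skipped and `source["raw"].tolist()` passed to `pd.DataFrame` directly.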

To give you a better idea of the result, here is a smaller example:

import pandas as pd

data = [
    {'a': 1, 'b': 2, 'c': 3},
    # different order
    {'b': 4, 'a': 5, 'c': 6},
    # missing values
    {'b': 7, 'c': 8},
    # new columns introduced later
    {'a': 9, 'd': 10},
]

df = pd.DataFrame(data)
print(df)

Result:

     a    b    c     d
0  1.0  2.0  3.0   NaN
1  5.0  4.0  6.0   NaN
2  NaN  7.0  8.0   NaN
3  9.0  NaN  NaN  10.0
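The values print as floats because NaN is a floating-point value, so any integer column with a missing entry is stored as float64. If whole numbers matter, one option is pandas' nullable integer dtype. This is a sketch on a shortened copy of the data above, and it assumes every present value is a whole number:

```python
import pandas as pd

data = [
    {'a': 1, 'b': 2, 'c': 3},
    {'b': 7, 'c': 8},
    {'a': 9, 'd': 10},
]

df = pd.DataFrame(data)
# Missing keys become NaN, which forces the integer columns to float64.
print(df.dtypes)

# Optional: the nullable "Int64" dtype keeps whole numbers and marks gaps as <NA>.
df_int = df.astype("Int64")
print(df_int)
```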

You indicated that, because of the volume, you don't want to add all rows at once. More rows can be added afterwards:

# DataFrame.append was removed in pandas 2.0; pd.concat is the replacement
df = pd.concat([df, pd.DataFrame([{'a': 11, 'e': 12}])])
print(df)

Result:

      a    b    c     d     e
0   1.0  2.0  3.0   NaN   NaN
1   5.0  4.0  6.0   NaN   NaN
2   NaN  7.0  8.0   NaN   NaN
3   9.0  NaN  NaN  10.0   NaN
0  11.0  NaN  NaN   NaN  12.0
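Note that the added row keeps its own index label 0, so the index is no longer unique. For thousands of rows arriving in pieces, growing the DataFrame one step at a time also copies the data on every step. A possible alternative, sketched below with hypothetical batches, is to build one small frame per batch and concatenate once at the end with `ignore_index=True`:

```python
import pandas as pd

# Hypothetical batches standing in for rows that arrive a few at a time.
batches = [
    [{'a': 1, 'b': 2}, {'b': 4, 'c': 6}],
    [{'a': 11, 'e': 12}],
]

# Build one small frame per batch, then concatenate a single time.
frames = [pd.DataFrame(batch) for batch in batches]

# ignore_index=True renumbers the rows 0..n-1 instead of repeating labels.
combined = pd.concat(frames, ignore_index=True)
print(combined)
```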