Turn my dictionary into a pandas DataFrame

Time: 2020-01-25 15:30:37

Tags: python pandas dictionary

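I have a function that creates several nested dictionaries based on some conditions.

However, I would really like to turn the result into a DataFrame after collecting the dictionaries, and I can't find an easy way to do it. Right now I'm thinking the solution is to repeat each outer key once for every key in the innermost dictionary, but I hope there is a better way.

Since my function creates the dictionary, I can change it if there is a better structure.

Here is my dictionary: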

{'TSLA': {2011: {'negative': {'lowPrice': 185.16,
    'lowDate': '05/27/19',
    'highPrice': 365.71,
    'highDate': '12/10/18',
    'change': -0.49}},
  2012: {'negative': {'lowPrice': 185.16,
    'lowDate': '05/27/19',
    'highPrice': 365.71,
    'highDate': '12/10/18',
    'change': -0.49}},
  2013: {'negative': {'lowPrice': 32.91,
    'lowDate': '01/07/13',
    'highPrice': 37.24,
    'highDate': '03/26/12',
    'change': -0.12},
   'positive': {'lowPrice': 32.91,
    'lowDate': '01/07/13',
    'highPrice': 190.9,
    'highDate': '09/23/13',
    'change': 4.8}}}}

My desired output would look something like this, with actual values of course:

                    lowPrice lowDate highPrice highDate change
ATVI  2012 Negative      NaN     NaN       NaN      NaN  NaN
           Positive      NaN     NaN       NaN      NaN  NaN
      2013 Negative      NaN     NaN       NaN      NaN  NaN
TSLA  2014 Positive      NaN     NaN       NaN      NaN  NaN
      2012 Negative      NaN     NaN       NaN      NaN  NaN
      2013 Positive      NaN     NaN       NaN      NaN  NaN
      2014 Positive      NaN     NaN       NaN      NaN  NaN

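4 answers:

Answer 0 (score: 9)

You can flatten the nested dictionary with a dict comprehension so that each key becomes a tuple, then pass the result to DataFrame.from_dict: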

d1 = {(k1, k2, k3): v3 
      for k1, v1 in d.items() 
      for k2, v2 in v1.items()
      for k3, v3 in v2.items()}

df = pd.DataFrame.from_dict(d1, orient='index')
#alternative
#df = pd.DataFrame(d1).T

print (df)
                   lowPrice   lowDate highPrice  highDate change
TSLA 2011 negative   185.16  05/27/19    365.71  12/10/18  -0.49
     2012 negative   185.16  05/27/19    365.71  12/10/18  -0.49
     2013 negative    32.91  01/07/13     37.24  03/26/12  -0.12
          positive    32.91  01/07/13     190.9  09/23/13    4.8
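
A hedged aside on the two constructors in this answer: they build the same table but can differ in column dtypes. The sketch below uses a trimmed copy of the question's dictionary (one year only, an assumption made to keep it short) and compares `from_dict(..., orient='index')` with the transposed variant. The names `df_a`, `df_b` and `df_b_fixed` are illustrative only.

```python
import pandas as pd

# Trimmed copy of the question's dictionary (only 2013, for brevity).
d = {'TSLA': {2013: {'negative': {'lowPrice': 32.91, 'lowDate': '01/07/13',
                                  'highPrice': 37.24, 'highDate': '03/26/12',
                                  'change': -0.12},
                     'positive': {'lowPrice': 32.91, 'lowDate': '01/07/13',
                                  'highPrice': 190.9, 'highDate': '09/23/13',
                                  'change': 4.8}}}}

# Flatten three levels of keys into tuples, as in the answer.
d1 = {(k1, k2, k3): v3
      for k1, v1 in d.items()
      for k2, v2 in v1.items()
      for k3, v3 in v2.items()}

# Each column is built on its own, so numeric columns stay numeric.
df_a = pd.DataFrame.from_dict(d1, orient='index')

# Here every pre-transpose column mixes floats and strings,
# so the transposed frame ends up with object columns.
df_b = pd.DataFrame(d1).T

# infer_objects() is one way to recover numeric dtypes afterwards.
df_b_fixed = df_b.infer_objects()

print(df_a.dtypes)
print(df_b.dtypes)
print(df_b_fixed.dtypes)
```

If numeric columns are needed later (for example to sort by `change`), the `from_dict` form saves the extra conversion step.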

Answer 1 (score: 5)

Similar, but you can also build the tuple-keyed dictionary directly inside the from_dict call:

df=pd.DataFrame.from_dict({(i, j, x) : y
                           for i in d.keys()
                           for j in d[i].keys()
                           for x, y in d[i][j].items()},
                           orient='index')

print (df)

                    lowPrice   lowDate  highPrice  highDate  change
TSLA 2011 negative    185.16  05/27/19     365.71  12/10/18   -0.49
     2012 negative    185.16  05/27/19     365.71  12/10/18   -0.49
     2013 negative     32.91  01/07/13      37.24  03/26/12   -0.12
          positive     32.91  01/07/13     190.90  09/23/13    4.80
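
The comprehensions above hard-code three levels of nesting. If the depth might vary, one possible generalization is a small recursive helper. This is only a sketch: `flatten` and the level names `ticker`, `year` and `direction` are hypothetical names introduced here, and the sample data is a trimmed copy of the question's dictionary.

```python
import pandas as pd

def flatten(nested, depth, prefix=()):
    """Yield (key_tuple, leaf_dict) pairs from a dict nested `depth` levels deep."""
    for key, value in nested.items():
        path = prefix + (key,)
        if depth == 1:
            # `value` is the innermost dict holding the column values.
            yield path, value
        else:
            yield from flatten(value, depth - 1, path)

# Trimmed copy of the question's dictionary (only 2013, for brevity).
d = {'TSLA': {2013: {'negative': {'lowPrice': 32.91, 'lowDate': '01/07/13',
                                  'highPrice': 37.24, 'highDate': '03/26/12',
                                  'change': -0.12},
                     'positive': {'lowPrice': 32.91, 'lowDate': '01/07/13',
                                  'highPrice': 190.9, 'highDate': '09/23/13',
                                  'change': 4.8}}}}

rows = dict(flatten(d, depth=3))

# Build the index explicitly so the levels can be named.
index = pd.MultiIndex.from_tuples(list(rows), names=['ticker', 'year', 'direction'])
df = pd.DataFrame(list(rows.values()), index=index)

print(df)
```

Passing `depth` explicitly avoids guessing where the key levels end and the column dictionary begins.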

Answer 2 (score: 2)

Reference: Construct pandas DataFrame from items in nested dictionary

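# Include z in the key: with only (i, j), 'positive' would overwrite
# 'negative' for 2013 and that row would be lost.
df = pd.DataFrame.from_dict({(i, j, z): dict_[i][j][z]
                             for i in dict_.keys()
                             for j in dict_[i].keys()
                             for z in dict_[i][j].keys()},
                            orient='index')
df


                    lowPrice   lowDate  highPrice  highDate  change
TSLA 2011 negative    185.16  05/27/19     365.71  12/10/18   -0.49
     2012 negative    185.16  05/27/19     365.71  12/10/18   -0.49
     2013 negative     32.91  01/07/13      37.24  03/26/12   -0.12
          positive     32.91  01/07/13     190.90  09/23/13    4.80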

Answer 3 (score: 0)

x = {'TSLA': {2011: {'negative': {'lowPrice': 185.16,
    'lowDate': '05/27/19',
    'highPrice': 365.71,
    'highDate': '12/10/18',
    'change': -0.49}},
  2012: {'negative': {'lowPrice': 185.16,
    'lowDate': '05/27/19',
    'highPrice': 365.71,
    'highDate': '12/10/18',
    'change': -0.49}},
  2013: {'negative': {'lowPrice': 32.91,
    'lowDate': '01/07/13',
    'highPrice': 37.24,
    'highDate': '03/26/12',
    'change': -0.12},
   'positive': {'lowPrice': 32.91,
    'lowDate': '01/07/13',
    'highPrice': 190.9,
    'highDate': '09/23/13',
    'change': 4.8}}}}

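import numpy as np
import pandas as pd

y = []  # index tuples: (ticker, year, negative/positive)
z = []  # cell values, collected row by row
for k0 in x:
    for k1 in x[k0]:
        for k2 in x[k0][k1]:
            y.append((k0, k1, k2))
            col = list(x[k0][k1][k2].keys())
            for c in col:
                z.append(x[k0][k1][k2][c])

index = pd.MultiIndex.from_tuples(y)
# dtype=object keeps the prices as floats; a plain np.array(z) would
# cast the mixed floats and date strings to strings.
z = np.array(z, dtype=object).reshape(len(index), len(col))
df = pd.DataFrame(columns=col, index=index, data=z)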

print(df)

                   lowPrice   lowDate highPrice  highDate change
TSLA 2011 negative   185.16  05/27/19    365.71  12/10/18  -0.49
     2012 negative   185.16  05/27/19    365.71  12/10/18  -0.49
     2013 negative    32.91  01/07/13     37.24  03/26/12  -0.12
          positive    32.91  01/07/13     190.9  09/23/13    4.8
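
The same row-by-row idea can also be written without NumPy by collecting one flat record per row and letting pandas build the index. This is a sketch under assumptions: the column names `ticker`, `year` and `direction` are hypothetical labels chosen here, and the data is a trimmed copy of the dictionary above.

```python
import pandas as pd

# Trimmed copy of the dictionary above (2012 and 2013 only).
x = {'TSLA': {2012: {'negative': {'lowPrice': 185.16, 'lowDate': '05/27/19',
                                  'highPrice': 365.71, 'highDate': '12/10/18',
                                  'change': -0.49}},
              2013: {'negative': {'lowPrice': 32.91, 'lowDate': '01/07/13',
                                  'highPrice': 37.24, 'highDate': '03/26/12',
                                  'change': -0.12},
                     'positive': {'lowPrice': 32.91, 'lowDate': '01/07/13',
                                  'highPrice': 190.9, 'highDate': '09/23/13',
                                  'change': 4.8}}}}

# One flat record per innermost dict: the three keys first, then its values.
records = [
    {'ticker': ticker, 'year': year, 'direction': direction, **values}
    for ticker, years in x.items()
    for year, directions in years.items()
    for direction, values in directions.items()
]

# Move the three key columns into a MultiIndex.
df = pd.DataFrame(records).set_index(['ticker', 'year', 'direction'])

print(df)
```

Since the asker controls the function that produces the data, having it append flat records like these in the first place would make the flattening step unnecessary.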