当每个值都是dict时,将数据帧保存为excel

时间:2016-05-20 07:20:58

标签: python excel pandas dataframe

目前我正在开发一个项目来读取多个Excel工作表,解析数据,并将组合的数据帧保存回excel文件。

我将多个excel表读取到数据帧,将每个转换为嵌套的dict,然后将all组合成一个大的嵌套dict(三维数据),最后转换回dataframe。

我已经完成了将数据组合到一个大数据帧。但是,数据帧的每个值都是一个dict,因此在使用to_excel保存到excel之后,格式不好(如下所示)。我想要的是重新组织数据,因此每个excel单元格只包含一个键,一个索引项或一个值。

以下是我用于将数据保存到Excel的代码:

with ExcelWriter(dump_excel) as writer: DataFrame(DataFrame(new_dict)).to_excel(writer, sheet_name='yield_all')

enter image description here

我想要的是下面的内容: enter image description here

2 个答案:

答案 0 :(得分:3)

IIUC您可以使用dict构造函数和DataFrame理解删除list

print (df)
                                             BIN2_R1
0  {0.0: 23, 1.0: 31, 'yield': '13.01%', 'total':...
1  {0.0: 81, 1.0: 70, 'yield': '36.01%', 'total':...

print (pd.DataFrame([x for x in df.BIN2_R1]))
   0.0  1.0   yield  total
0   23   31  13.01%     54
1   81   70  36.01%    151

编辑:

您可以使用concat

df1 = pd.concat([(pd.DataFrame([x for x in df.BIN2_R1])),
                 (pd.DataFrame([x for x in df.FT]))], axis=1, keys=['BIN2_R1','FT'])

print (df1)
  BIN2_R1                    FT                    
      0.0 1.0   yield total 0.0   1.0   yield total
0      23  31  13.01%    54  82  6517  92.70%  6599
1      81  70  36.01%   151  51   173   0.53%    13

df1.to_excel('test.xlsx')

更通用的解决方案,如果所有列都包含dictionaries

dfs = [pd.DataFrame([x for x in df[col]]) for col in df.columns]
df1 = pd.concat(dfs, axis=1, keys=df.columns)
print (df1)

  BIN2_R1                    FT                    
      0.0 1.0   yield total 0.0   1.0   yield total
0      23  31  13.01%    54  82  6517  92.70%  6599
1      81  70  36.01%   151  51   173   0.53%    13

df1.to_excel('test.xlsx')

编辑:

主要问题是内部dict不是dict,而是string。 所以我必须转换它们。使用NaN无法转换,因此我{} import pandas as pd import ast d = {1: {u'FT1': u"{0.0: 19732, 1.0: 20495, 'total': 40227, 'yield': '93.34%'}", u'FT3': u"{0.0: 9285, 1.0: 9629, 'total': 18914, 'yield': '92.93%'}", u'FT2': u"{0.0: 1412, 1.0: 1480, 'total': 2892, 'yield': '93.87%'}", u'FT': u"{0.0: 82, 1.0: 6517, 'total': 6599, 'yield': '92.70%'}", u'FT_R1': u"{0.0: 1262, 1.0: 1418, 'total': 2680, 'yield': '53.73%'}", u'QA_R2': u"{0.0: 2, 'total': 2, 'yield': '100.00%'}", u'QA_R1': u"{0.0: 6, 'total': 6, 'yield': '75.00%'}", u'QA': u"{1.0: 750, 'total': 750, 'yield': '98.94%'}", u'BIN2_R1': u"{0.0: 23, 1.0: 31, 'total': 54, 'yield': '13.01%'}"}, 2: {u'FT1': u"{0.0: 246, 1.0: 110, 'total': 356, 'yield': '0.83%'}", u'FT3': u"{0.0: 81, 1.0: 54, 'total': 135, 'yield': '0.66%'}", u'FT2': u"{0.0: 9, 1.0: 3, 'total': 12, 'yield': '0.39%'}", u'FT': u"{0.0: 51, 1.0: 173, 'total': 224, 'yield': '3.15%'}", u'FT_R1': u"{0.0: 138, 1.0: 86, 'total': 224, 'yield': '4.49%'}", u'QA_R1': u"{0.0: 1, 'total': 1, 'yield': '12.50%'}", u'QA': u"{1.0: 5, 'total': 5, 'yield': '0.66%'}", u'BIN2_R1': u"{0.0: 81, 1.0: 70, 'total': 151, 'yield': '36.39%'}"}, 3: {u'FT1': u"{0.0: 72, 1.0: 47, 'total': 119, 'yield': '0.28%'}", u'FT3': u"{0.0: 35, 1.0: 25, 'total': 60, 'yield': '0.29%'}", u'FT2': u"{0.0: 1, 1.0: 1, 'total': 2, 'yield': '0.06%'}", u'FT': u"{0.0: 0, 1.0: 13, 'total': 13, 'yield': '0.18%'}", u'FT_R1': u"{0.0: 93, 1.0: 98, 'total': 191, 'yield': '3.83%'}", u'BIN2_R1': u"{0.0: 92, 1.0: 97, 'total': 189, 'yield': '45.54%'}"}, 4: {u'FT1': u"{0.0: 132, 1.0: 174, 'total': 306, 'yield': '0.71%'}", u'FT3': u"{0.0: 35, 1.0: 36, 'total': 71, 'yield': '0.35%'}", u'FT2': u"{0.0: 8, 1.0: 11, 'total': 19, 'yield': '0.62%'}", u'FT': u"{0.0: 1, 1.0: 37, 'total': 38, 'yield': '0.53%'}", u'FT_R1': u"{0.0: 179, 1.0: 167, 'total': 346, 'yield': '6.94%'}", u'BIN2_R1': u"{0.0: 1, 1.0: 1, 'total': 2, 'yield': '0.48%'}"}, 5: {u'FT1': u"{0.0: 27, 1.0: 28, 'total': 55, 'yield': '0.13%'}", u'FT_R1': u"{0.0: 41, 1.0: 34, 'total': 75, 'yield': '1.50%'}", u'FT3': u"{0.0: 11, 1.0: 9, 'total': 20, 'yield': '0.10%'}", u'FT2': u"{0.0: 2, 1.0: 0, 'total': 2, 'yield': '0.06%'}", u'FT': u"{0.0: 0, 1.0: 4, 'total': 4, 'yield': '0.06%'}"}, 8: {u'FT1': u"{0.0: 76, 1.0: 77, 'total': 153, 'yield': '0.35%'}", u'FT3': u"{0.0: 40, 1.0: 42, 'total': 82, 'yield': '0.40%'}", u'FT2': u"{0.0: 5, 1.0: 8, 'total': 13, 'yield': '0.42%'}", u'FT': u"{0.0: 0, 1.0: 20, 'total': 20, 'yield': '0.28%'}", u'FT_R1': u"{0.0: 131, 1.0: 133, 'total': 264, 'yield': '5.29%'}", u'BIN2_R1': u"{0.0: 1, 1.0: 2, 'total': 3, 'yield': '0.72%'}"}, 9: {u'FT1': u"{0.0: 199, 1.0: 158, 'total': 357, 'yield': '0.83%'}", u'FT3': u"{0.0: 90, 1.0: 62, 'total': 152, 'yield': '0.75%'}", u'FT2': u"{0.0: 8, 1.0: 8, 'total': 16, 'yield': '0.52%'}", u'FT': u"{0.0: 0, 1.0: 36, 'total': 36, 'yield': '0.51%'}", u'FT_R1': u"{0.0: 238, 1.0: 238, 'total': 476, 'yield': '9.54%'}", u'BIN2_R1': u"{0.0: 2, 1.0: 2, 'total': 4, 'yield': '0.96%'}"}, 10: {u'FT1': u"{0.0: 56, 1.0: 38, 'total': 94, 'yield': '0.22%'}", u'FT3': u"{0.0: 25, 1.0: 33, 'total': 58, 'yield': '0.28%'}", u'FT2': u"{0.0: 5, 1.0: 1, 'total': 6, 'yield': '0.19%'}", u'FT': u"{0.0: 0, 1.0: 11, 'total': 11, 'yield': '0.15%'}", u'FT_R1': u"{0.0: 77, 1.0: 66, 'total': 143, 'yield': '2.87%'}", u'BIN2_R1': u"{0.0: 2, 1.0: 0, 'total': 2, 'yield': '0.48%'}"}, 11: {u'FT1': u"{0.0: 2, 1.0: 0, 'total': 2, 'yield': '0.00%'}", u'FT3': u"{0.0: 1, 1.0: 0, 'total': 1, 'yield': '0.00%'}", u'BIN2_R1': u"{0.0: 1, 1.0: 0, 'total': 1, 'yield': '0.24%'}"}, 12: {u'FT1': u"{0.0: 6, 1.0: 0, 'total': 6, 'yield': '0.01%'}", u'FT3': u"{0.0: 2, 1.0: 0, 'total': 2, 'yield': '0.01%'}", u'FT_R1': u"{0.0: 1, 1.0: 0, 'total': 1, 'yield': '0.02%'}"}, 13: {u'FT1': u"{0.0: 953, 1.0: 422, 'total': 1375, 'yield': '3.19%'}", u'FT3': u"{0.0: 544, 1.0: 292, 'total': 836, 'yield': '4.11%'}", u'FT2': u"{0.0: 88, 1.0: 28, 'total': 116, 'yield': '3.77%'}", u'FT': u"{0.0: 21, 1.0: 147, 'total': 168, 'yield': '2.36%'}", u'FT_R1': u"{0.0: 289, 1.0: 225, 'total': 514, 'yield': '10.30%'}", u'QA_R1': u"{0.0: 1, 'total': 1, 'yield': '12.50%'}", u'QA': u"{1.0: 3, 'total': 3, 'yield': '0.40%'}", u'BIN2_R1': u"{0.0: 4, 1.0: 5, 'total': 9, 'yield': '2.17%'}"}, 14: {u'FT1': u"{0.0: 31, 1.0: 18, 'total': 49, 'yield': '0.11%'}", u'FT_R1': u"{0.0: 35, 1.0: 39, 'total': 74, 'yield': '1.48%'}", u'FT3': u"{0.0: 16, 1.0: 7, 'total': 23, 'yield': '0.11%'}", u'FT2': u"{0.0: 2, 1.0: 1, 'total': 3, 'yield': '0.10%'}", u'FT': u"{0.0: 0, 1.0: 6, 'total': 6, 'yield': '0.08%'}"}} df = pd.DataFrame.from_dict(d, orient='index') #print (df) df = df.fillna('{}') for col in df.columns: df[col] = df[col].map(lambda d : ast.literal_eval(d)) #print (df) dfs = [pd.DataFrame([x for x in df[col]], index=df.index) for col in df.columns] df1 = pd.concat(dfs, axis=1, keys=df.columns)

print (df1)

      FT                           FT_R1                             QA  \
     0.0     1.0   total   yield     0.0     1.0   total   yield    1.0   
1   82.0  6517.0  6599.0  92.70%  1262.0  1418.0  2680.0  53.73%  750.0   
2   51.0   173.0   224.0   3.15%   138.0    86.0   224.0   4.49%    5.0   
3    0.0    13.0    13.0   0.18%    93.0    98.0   191.0   3.83%    NaN   
4    1.0    37.0    38.0   0.53%   179.0   167.0   346.0   6.94%    NaN   
5    0.0     4.0     4.0   0.06%    41.0    34.0    75.0   1.50%    NaN   
8    0.0    20.0    20.0   0.28%   131.0   133.0   264.0   5.29%    NaN   
9    0.0    36.0    36.0   0.51%   238.0   238.0   476.0   9.54%    NaN   
10   0.0    11.0    11.0   0.15%    77.0    66.0   143.0   2.87%    NaN   
11   NaN     NaN     NaN     NaN     NaN     NaN     NaN     NaN    NaN   
12   NaN     NaN     NaN     NaN     1.0     0.0     1.0   0.02%    NaN   
13  21.0   147.0   168.0   2.36%   289.0   225.0   514.0  10.30%    3.0   
14   0.0     6.0     6.0   0.08%    35.0    39.0    74.0   1.48%    NaN   

            ...   BIN2_R1           FT3                         FT1         \
    total   ...     total   yield   0.0   1.0  total   yield    0.0    1.0   
1   750.0   ...      54.0  13.01%  9285  9629  18914  92.93%  19732  20495   
2     5.0   ...     151.0  36.39%    81    54    135   0.66%    246    110   
3     NaN   ...     189.0  45.54%    35    25     60   0.29%     72     47   
4     NaN   ...       2.0   0.48%    35    36     71   0.35%    132    174   
5     NaN   ...       NaN     NaN    11     9     20   0.10%     27     28   
8     NaN   ...       3.0   0.72%    40    42     82   0.40%     76     77   
9     NaN   ...       4.0   0.96%    90    62    152   0.75%    199    158   
10    NaN   ...       2.0   0.48%    25    33     58   0.28%     56     38   
11    NaN   ...       1.0   0.24%     1     0      1   0.00%      2      0   
12    NaN   ...       NaN     NaN     2     0      2   0.01%      6      0   
13    3.0   ...       9.0   2.17%   544   292    836   4.11%    953    422   
14    NaN   ...       NaN     NaN    16     7     23   0.11%     31     18   


    total   yield  
1   40227  93.34%  
2     356   0.83%  
3     119   0.28%  
4     306   0.71%  
5      55   0.13%  
8     153   0.35%  
9     357   0.83%  
10     94   0.22%  
11      2   0.00%  
12      6   0.01%  
13   1375   3.19%  
14     49   0.11%  
    private void SaveControlImage(Control ctr)
    {
        try
        {
            var imagePath = @"C:\Image.png";

            Image bmp = new Bitmap(ctr.Width, ctr.Height);
            var gg = Graphics.FromImage(bmp);
            var rect = ctr.RectangleToScreen(ctr.ClientRectangle);
            gg.CopyFromScreen(rect.Location, Point.Empty, ctr.Size);

            bmp.Save(imagePath);
            Process.Start(imagePath);

        }
        catch (Exception)
        {
            //
        }
    }

答案 1 :(得分:0)

我找到了一个直接读取excel文件数据作为数据帧并连接成单个数据帧的解决方案,而不是将数据帧转换为嵌套的dict中间。

list_df=list(map(lambda s:pd.read_excel(pd.ExcelFile(s), 'yield', index_col=[0]),std_excel_path))
keys_list=list(map(lambda s:get_name(s),std_excel_path))
combined=pd.concat(list_df,axis=1,keys=keys_list)
combined.fillna(0,inplace=True)
combined.columns.names = ['test', 'info']
combined.index.names = ['soft_bin']
print combined

,结果是具有多索引的组合数据框:

test              FT-20160702124027                    FT1-20160702134747  \
info                              0    1 total   yield                  0   
soft_bin                                                                    
01:pass                         957  954  1911  97.01%               4334   
02:os_open_fail                   5    5    10   0.51%                  8   
03:os_short_fail                  1    0     1   0.05%                  2   
04:io_fail                        1    2     3   0.15%                  8   
05:clk_fail                       0    0     0       0                  3   
06:reset_fir_fail                 0    0     0       0                  0   
08:mbist_fail                     1    0     1   0.05%                 10   
09:dc_scan_fail                  11   14    25   1.27%                 67   
10:ac_scan_fail                   3    2     5   0.25%                 21   
11:func_dig_fail                  0    0     0       0                  0   
12:efuse_fail                     0    0     0       0                  0   
13:func_ana_fail                  6    6    12   0.61%                 32   
14:func_idd_fail                  0    2     2   0.10%                  3   

test                                  FT2-20160702183026         ...    \
info                  1 total   yield                  0     1   ...     
soft_bin                                                         ...     
01:pass            4345  8679  96.68%               1671  1688   ...     
02:os_open_fail      10    18   0.20%                  3     3   ...     
03:os_short_fail      1     3   0.03%                  1     0   ...     
04:io_fail           10    18   0.20%                  2     2   ...     
05:clk_fail           2     5   0.06%                  2     1   ...     
06:reset_fir_fail     0     0       0                  0     0   ...     
08:mbist_fail         6    16   0.18%                  3     1   ...     
09:dc_scan_fail      58   125   1.39%                 18    11   ...     
10:ac_scan_fail      20    41   0.46%                  9     4   ...     
11:func_dig_fail      0     0       0                  0     0   ...     
12:efuse_fail         0     0       0                  0     0   ...     
13:func_ana_fail     33    65   0.72%                 16    14   ...     
14:func_idd_fail      4     7   0.08%                  0     1   ...