Loop through a DataFrame and add rows to a column (pandas, Python)

Time: 2015-11-11 02:12:21

Tags: python pandas

I have a dataset that I read in:

import pandas as pd
data = pd.read_excel('.../data.xlsx')

Its contents look like this:

Out[57]: 
        Block    Concentration          Name      value
          1            100           GlcNAc2      321
          1            100           GlcNAc2      139
          1            100           GlcNAc2      202
          1            33            GlcNAc2      86
          1            33            GlcNAc2      194
          1            33            GlcNAc2      452
          1            100            BCC         345
          1            100            BCC         6
          1            100            BCC         34
          1            33             BCC         11
          1            33             BCC         53
          1            33             BCC         87
          1            0       Print buffer       127
          1            0       Print buffer       55
          1            0       Print buffer       67


 ...     ...            ...               ...        ...               ...

         24             0       Print buffer      -9968
         24             0       Print buffer      -4526
         24             0       Print buffer      14246
  1. For each Block and Name, I want to add three rows with concentration '0' and fill them with the three 'Print buffer' values from that block.

        Out[57]: 
        Block    Concentration          Name      value
          1            0             GlcNAc2       127
          1            0             GlcNAc2       55
          1            0             GlcNAc2       67
          1            100           GlcNAc2      321
          1            100           GlcNAc2      139
          1            100           GlcNAc2      202
          1            33            GlcNAc2      86
          1            33            GlcNAc2      194
          1            33            GlcNAc2      452
          1            0              BCC         127
          1            0              BCC         55
          1            0              BCC         67
          1            100            BCC         345
          1            100            BCC         6
          1            100            BCC         34
          1            33             BCC         11
          1            33             BCC         53
          1            33             BCC         87
          1            0       Print buffer       127
          1            0       Print buffer       55
          1            0       Print buffer       67
    

    ...... ...... ...... ......

         24             0       Print buffer      -9968
         24             0       Print buffer      -4526
         24             0       Print buffer      14246
    
  2. Compute the mean of the three 'Print buffer' values and subtract it from every value in the same block.

    Desired output:

       Out[57]: 
        Block    Concentration          Name      value         newvalue
          1            0             GlcNAc2      127            127-mean(127, 55, 67)
          1            0             GlcNAc2      55             55-mean(127, 55, 67)
          1            0             GlcNAc2      67             67-mean(127, 55, 67)
          1            100           GlcNAc2      321            321-mean(127, 55, 67)
          1            100           GlcNAc2      139            139-mean(127, 55, 67)
          1            100           GlcNAc2      202            ....
          1            33            GlcNAc2      86
          1            33            GlcNAc2      194
          1            33            GlcNAc2      452
          1            0             BCC          127
          1            0             BCC          55
          1            0             BCC          67
          1            100           BCC          345
          1            100           BCC          6
          1            100           BCC          34
          1            33            BCC          11
          1            33            BCC          53
          1            33            BCC          87
          1            0        Print buffer      127
          1            0        Print buffer      55
          1            0        Print buffer      67
    
    ...     ...            ...               ...        ...               ...
    
         24             0       Print buffer      -9968
         24             0       Print buffer      -4526
         24             0       Print buffer      14246
    

    Pseudocode:

    for each block
       for each Name
        add concentration '0' three times
        append the three values of 'print buffer' to the three '0' concentrations 
        newvalue = value - average(three print buffer) 
    

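1 answer:

Answer 0 (score: 1)

Consider using groupby apply functions on your dataset. The first function computes the mean() of the 'Print buffer' values only, leaving meanvalue as NaN for every other group. The second function then takes the maximum meanvalue within each block, which copies the 'Print buffer' mean to all rows of that block. Finally, create newvalue as the arithmetic difference: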

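# Store each group's 'Print buffer' mean; groups without such rows get NaN.
def add_mean_value(mgrp):
    mgrp = mgrp.copy()
    mgrp['meanvalue'] = mgrp.loc[mgrp['Name'] == 'Print buffer', 'value'].mean()
    return mgrp

# group_keys=False stops pandas 2.x from adding the group keys to the index,
# which would make 'Block' ambiguous in the second groupby. Note: this relies on
# apply() passing the grouping columns to the function, which newer pandas
# versions deprecate.
data = data.groupby(['Block', 'Concentration', 'Name'], group_keys=False).apply(add_mean_value)

# max() skips NaN, so it spreads the block's 'Print buffer' mean to every row.
def max_sum_value(mgrp):
    mgrp = mgrp.copy()
    mgrp['meanvalue'] = mgrp['meanvalue'].max()
    return mgrp

data = data.groupby(['Block'], group_keys=False).apply(max_sum_value)

data['newvalue'] = data['value'] - data['meanvalue']
print(data)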

Output:

    Block  Concentration          Name  value  meanvalue  newvalue
0       1            100       GlcNAc2    321         83       238
1       1            100       GlcNAc2    139         83        56
2       1            100       GlcNAc2    202         83       119
3       1             33       GlcNAc2     86         83         3
4       1             33       GlcNAc2    194         83       111
5       1             33       GlcNAc2    452         83       369
6       1            100           BCC    345         83       262
7       1            100           BCC      6         83       -77
8       1            100           BCC     34         83       -49
9       1             33           BCC     11         83       -72
10      1             33           BCC     53         83       -30
11      1             33           BCC     87         83         4
12      1              0  Print buffer    127         83        44
13      1              0  Print buffer     55         83       -28
14      1              0  Print buffer     67         83       -16
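
The same per-block subtraction can probably be done without `apply`, by computing one 'Print buffer' mean per block and mapping it back onto the rows. The following is a sketch under the assumption that every block contains 'Print buffer' rows; the sample frame is retyped from block 1 of the question, and `blank_mean` is just an illustrative name.

```python
import pandas as pd

# Stand-in for block 1 of the spreadsheet (values retyped from the question).
data = pd.DataFrame({
    "Block": [1] * 15,
    "Concentration": [100] * 3 + [33] * 3 + [100] * 3 + [33] * 3 + [0] * 3,
    "Name": ["GlcNAc2"] * 6 + ["BCC"] * 6 + ["Print buffer"] * 3,
    "value": [321, 139, 202, 86, 194, 452, 345, 6, 34, 11, 53, 87, 127, 55, 67],
})

# Mean of the 'Print buffer' rows per block, as a Series indexed by Block.
blank_mean = (
    data.loc[data["Name"] == "Print buffer"]
    .groupby("Block")["value"]
    .mean()
)

# Look up each row's block mean and subtract it from the measured value.
data["meanvalue"] = data["Block"].map(blank_mean)
data["newvalue"] = data["value"] - data["meanvalue"]
print(data)
```

A block without any 'Print buffer' rows would get NaN in both new columns, which makes missing blanks easy to spot.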