Pandas groupby to "other"

Time: 2018-01-16 20:50:03

Tags: python pandas pandas-groupby

I have a question about the syntax for grouping rows into an "other" category. For example,

df

Type  Start  End Count Total
A     x      a   1     3
A     x      b   1     3
A     x      c   1     3
A     y      A   2     4
A     y      b   1     4
A     y      c   1     4
B     x      A   1     6
B     x      b   2     6
B     x      c   3     6
B     y      a   3     6
B     y      b   2     6
B     y      c   1     6
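
For anyone who wants to reproduce this, here is a minimal sketch that rebuilds the sample frame above. The column values are copied from the table; the dict-of-lists constructor is just one convenient way to do it.

```python
import pandas as pd

# Rebuild the sample frame shown in the question
df = pd.DataFrame({
    "Type":  list("AAAAAABBBBBB"),
    "Start": list("xxxyyyxxxyyy"),
    "End":   ["a", "b", "c", "A", "b", "c", "A", "b", "c", "a", "b", "c"],
    "Count": [1, 1, 1, 2, 1, 1, 1, 2, 3, 3, 2, 1],
    "Total": [3, 3, 3, 4, 4, 4, 6, 6, 6, 6, 6, 6],
})
print(df)
```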

Group by the Type/Start/End columns, and if End is not "a" or "A", label it "other":

Type  Start  End   Count Total
A     x      a     1     3
A     x      other 2     3
A     y      A     2     4
A     y      other 2     4
B     x      A     1     6
B     x      other 5     6
B     y      a     3     6
B     y      other 3     6

3 answers:

Answer 0: (score: 2)

You're almost there. The first two arguments to groupby are fine, but the last one needs to be modified.

f = {'Count': 'sum', 'Total' : 'mean'}   
v = df.End.where(df.End.isin(['a', 'A']), 'other')

df.groupby(['Type', 'Start', v]).agg(f).reset_index()

  Type Start    End  Total  Count
0    A     x      a      3      1
1    A     x  other      3      2
2    A     y      A      4      2
3    A     y  other      4      2
4    B     x      A      6      1
5    B     x  other      6      5
6    B     y      a      6      3
7    B     y  other      6      3

Details

Use where / mask to change the values of df.End accordingly:

v = df.End.where(df.End.isin(['a', 'A']), 'other')

Or,

v = df.End.mask(~df.End.isin(['a', 'A']), 'other')

v

0         a
1     other
2     other
3         A
4     other
5     other
6         A
7     other
8     other
9         a
10    other
11    other
Name: End, dtype: object

Alternatively, lowercase the column and compare.

v = df.End.where(df.End.str.lower().eq('a'), 'other')
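
As a quick sanity check, the three variants above should produce the same Series. A minimal sketch on a shortened End column (the six values here are an illustrative subset of the question's data, and the names keep, v_where, v_mask and v_lower are made up for this example):

```python
import pandas as pd

end = pd.Series(["a", "b", "c", "A", "b", "c"], name="End")

keep = end.isin(["a", "A"])
v_where = end.where(keep, "other")   # keep values where the condition is True
v_mask = end.mask(~keep, "other")    # replace values where the condition is True
v_lower = end.where(end.str.lower().eq("a"), "other")  # case-insensitive compare

print(v_where.tolist())
# ['a', 'other', 'other', 'A', 'other', 'other']
print(v_where.equals(v_mask), v_where.equals(v_lower))
# True True
```

where and mask are mirror images of each other: where replaces the rows that fail the condition, mask replaces the rows that satisfy it.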

The rest, as they say, is history. If you're interested in preserving the column order, slap on a reindex call at the end.

df.groupby(['Type', 'Start', v])\
  .agg(f)\
  .reset_index()\
  .reindex(columns=df.columns.tolist())

  Type Start    End  Count  Total
0    A     x      a      1      3
1    A     x  other      2      3
2    A     y      A      2      4
3    A     y  other      2      4
4    B     x      A      1      6
5    B     x  other      5      6
6    B     y      a      3      6
7    B     y  other      3      6

Answer 1: (score: 2)

I think you need to replace all values other than a or A with other, using where with an isin condition, and then groupby on the Type and Start columns plus the Series s:

s = df['End'].where(df['End'].isin(['a','A']), 'other')
print (s)
0         a
1     other
2     other
3         A
4     other
5     other
6         A
7     other
8     other
9         a
10    other
11    other
Name: End, dtype: object

df = (df.groupby(['Type', 'Start', s])
        .agg({'Count':'sum', 'Total':'mean'})
        .reset_index())

Another similar solution is to replace the End column and use the original groupby + agg solution:

import numpy as np

df['End'] = np.where(df['End'].isin(['a','A']), df['End'], 'other')
#alternatively
#df['End'] = df['End'].where(df['End'].isin(['a','A']), 'other')
df = (df.groupby(['Type', 'Start', 'End'], as_index=False)
        .agg({'Count':'sum', 'Total':'mean'}))
print (df)
  Type Start    End  Count  Total
0    A     x      a      1      3
1    A     x  other      2      3
2    A     y      A      2      4
3    A     y  other      2      4
4    B     x      A      1      6
5    B     x  other      5      6
6    B     y      a      3      6
7    B     y  other      3      6

Answer 2: (score: 2)

You can change the values of the entries in End to reflect the desired change, and then use the groupby you already described:

# Overwrite every End value that is not 'A' or 'a' (this modifies df in place)
df.loc[~df.End.isin(['A', 'a']), 'End'] = 'other'
df.groupby(['Type','Start','End']).agg({'Count':'sum','Total':'mean'})
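
Since the .loc assignment overwrites End in the original frame, here is a sketch of a non-mutating variant, assuming you want to keep df intact. It uses assign to relabel End on a copy; the small frame and the name out are illustrative only.

```python
import pandas as pd

df = pd.DataFrame({
    "Type":  ["A", "A", "A", "B", "B", "B"],
    "Start": ["x", "x", "x", "x", "x", "x"],
    "End":   ["a", "b", "c", "A", "b", "c"],
    "Count": [1, 1, 1, 1, 2, 3],
    "Total": [3, 3, 3, 6, 6, 6],
})

# assign returns a new frame, so the original End column is left untouched
out = (
    df.assign(End=df["End"].where(df["End"].isin(["A", "a"]), "other"))
      .groupby(["Type", "Start", "End"], as_index=False)
      .agg({"Count": "sum", "Total": "mean"})
)
print(out)
```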