Python Pandas Groupby: how to condense a loop into a one-liner

Time: 2017-12-14 10:12:14

Tags: python pandas pandas-groupby

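For Python Pandas:

I would like to simplify my code so that it ends up as a one-liner (reason: performance optimization).

How can I write it so that I have only a single line containing the groupby statement?

Something like: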

dfResult = df2.groupby("a").something().I()Do()Not()Understand()Yet()

This is my code (I want to filter out those groups of "a" in which the standard deviation of the row-to-row differences in "b" is too large):

import pandas as pd

dfResult = pd.DataFrame()

df2 = pd.DataFrame({'a': ("w", "w", "w", "w", "x", "x", "x"),
                    'b': (30, 42, 54, 68, 7, 8, 65)})

print('input data:')
print(df2)

dfGroupBy = df2.groupby("a")

for key, item in dfGroupBy:
    innerDf = dfGroupBy.get_group(key)

    # calculate delta between two rows for column 'b'
    innerDf['delta'] = innerDf['b'] - innerDf['b'].shift(1)

    # calculate standard deviation (without the first row)
    standardDeviation = pd.np.std(innerDf['delta'][1:])

    if standardDeviation < 15:
        print("so my standard deviation is small enough!")
        print(innerDf['delta'][1:])
        print("standard deviation:", standardDeviation)

        # remove column 'delta', as I needed it only in between
        innerDf = innerDf.drop('delta', axis=1)
        dfResult = dfResult.append(innerDf)

print("result:")
print(dfResult)

This is the console output:

input data:
   a   b
0  w  30
1  w  42
2  w  54
3  w  68
4  x   7
5  x   8
6  x  65
so my standard deviation is small enough!
1    12.0
2    12.0
3    14.0
Name: delta, dtype: float64
standard deviation: 0.942809041582
result:
   a   b
0  w  30
1  w  42
2  w  54
3  w  68
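
For reference, the loop in the question can probably be collapsed into a single groupby chain with `GroupBy.filter`. The following is only a minimal sketch, not a confirmed answer: it assumes that `Series.diff()` may stand in for `b - b.shift(1)` and that `std(ddof=0)` is the intended measure, since it matches NumPy's default population standard deviation and skips the leading NaN.

```python
import pandas as pd

df2 = pd.DataFrame({
    "a": ("w", "w", "w", "w", "x", "x", "x"),
    "b": (30, 42, 54, 68, 7, 8, 65),
})

# Keep only the groups whose row-to-row deltas in 'b' have a small spread.
# diff() gives b - b.shift(1); std(ddof=0) ignores the leading NaN and uses
# the population formula, like numpy.std does by default.
dfResult = df2.groupby("a").filter(lambda g: g["b"].diff().std(ddof=0) < 15)

print(dfResult)
```

With the sample data this keeps the four "w" rows and drops the "x" group. The lambda is still called once per group, so this is mainly a gain in readability. Whether it is actually faster than the loop has not been measured here.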

0 answers:

No answers