Question

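For Python Pandas:

I would like to simplify my code so that it ends up as a one-liner (reason: performance optimization).

How can I write it so that I have only a single line containing the groupby statement?

Something like: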

dfResult = df2.groupby("a").something().I()Do()Not()Understand()Yet()

Here is my code (I want to filter out those groups of column a for which the standard deviation of the differences between consecutive values of b is too large):

import pandas as pd

dfResult = pd.DataFrame()
df2 = pd.DataFrame({'a': ("w", "w", "w", "w", "x", "x", "x"),
                    'b': (30, 42, 54, 68, 7, 8, 65)})
print('input data:')
print(df2)

dfGroupBy = df2.groupby("a")
for key, item in dfGroupBy:
    # work on a copy so that adding a column does not touch df2
    innerDf = dfGroupBy.get_group(key).copy()
    # calculate delta between two rows for column 'b'
    innerDf['delta'] = innerDf['b'] - innerDf['b'].shift(1)
    # calculate standard deviation (without the first row);
    # ddof=0 matches what np.std computes (pd.np is gone in recent pandas)
    standardDeviation = innerDf['delta'][1:].std(ddof=0)
    if standardDeviation < 15:
        print("so my standard deviation is small enough!")
        print(innerDf['delta'][1:])
        print("standard deviation:", standardDeviation)
        # remove column 'delta', as I needed it only in between
        innerDf = innerDf.drop('delta', axis=1)
        # DataFrame.append was removed in pandas 2.0; use pd.concat
        dfResult = pd.concat([dfResult, innerDf])
print("result:")
print(dfResult)

This is the console output:

input data:
   a   b
0  w  30
1  w  42
2  w  54
3  w  68
4  x   7
5  x   8
6  x  65
so my standard deviation is small enough!
1    12.0
2    12.0
3    14.0
Name: delta, dtype: float64
standard deviation: 0.942809041582
result:
   a   b
0  w  30
1  w  42
2  w  54
3  w  68
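
For reference, the loop might be condensed roughly as follows. This is only a sketch, assuming that `GroupBy.filter` with a lambda is an acceptable replacement for the loop and that the threshold of 15 and the population standard deviation (`ddof=0`, which is what `np.std` computes) should be kept. It drops the diagnostic prints.

```python
import pandas as pd

df2 = pd.DataFrame({'a': ("w", "w", "w", "w", "x", "x", "x"),
                    'b': (30, 42, 54, 68, 7, 8, 65)})

# Keep only the groups of 'a' whose row-to-row deltas in 'b' have a
# population standard deviation (ddof=0) below 15. diff() yields NaN
# for the first row of each group, and std() skips NaN by default,
# which mirrors the [1:] slice in the loop.
dfResult = df2.groupby("a").filter(lambda g: g['b'].diff().std(ddof=0) < 15)

print(dfResult)
```

A group with a single row has no deltas, so its standard deviation is NaN and the group is dropped, just as in the loop. Note that the lambda is still called once per group, so putting everything on one line does not by itself make the code faster.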

Python Pandas groupby: how to condense a loop into a one-liner

0 answers: