Groupby Pandas levels

Time: 2016-01-10 20:10:07

Tags: python pandas

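Similar to my previous question, I want to split a DataFrame with groupby and apply a calculation.

Now I want to introduce a new column, year, to partition the calculation on the DataFrame. Here is the code: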

[code block not preserved]

Running the above gives the expanding mean across the whole DataFrame. But how do I restart the calculation for each new year?

I tried adding year to columns= in the df declaration, but it ends up included in role, which is not what I want. My gap in understanding is about the levels, so any enlightenment would be appreciated.

Edit:

Desired result below:

    game team  role   hw   aw  wins  expanding_mean  year
0      1    A  home    0    1     0             NaN  2000
1      1    B  away    0    1     1             NaN  2000
2      2    B  home    1    0     1        1.000000  2000
3      2    A  away    1    0     0        0.000000  2000
4      3    B  home    0    0     0        1.000000  2000
5      3    A  away    0    0     0        0.000000  2000
6      4    A  home    1    0     1        0.000000  2000
7      4    B  away    1    0     0        0.666667  2000
8      5    B  home    0    1     0             NaN  2001
9      5    A  away    0    1     1             NaN  2001
10     6    A  home    1    0     1        0.000000  2001
11     6    B  away    1    0     0        1.000000  2001
12     7    A  home  NaN  NaN   NaN        0.500000  2001
13     7    B  away  NaN  NaN   NaN        0.500000  2001

2 answers:

Answer 0: (score: 2)

You can add year to set_index, add the year column to the column list passed to .loc in the code above, and change level_3 to level_4 in rename, because the year column has been added to the index. Then change the groupby to df.groupby(['team', 'year']):

import pandas as pd
import numpy as np

d = {'year' : [2000, 2000, 2000, 2000, 2001, 2001, 2001],
 'home': ['A', 'B', 'B', 'A', 'B', 'A', 'A'],
 'away': ['B', 'A', 'A', 'B', 'A', 'B', 'B'],
 'aw': [1, 0, 0, 0, 1, 0, np.nan],
 'hw': [0, 1, 0, 1, 0, 1, np.nan]}

df = pd.DataFrame(d, columns=['home', 'away', 'hw', 'aw', 'year'])
df.index = range(1, len(df) + 1)
df.index.name = 'game'

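# Move hw, aw and year into the index so that stack() only melts home/away
# into (role, team) pairs; the unnamed stacked level comes back as 'level_4'.
df = (df.set_index(['hw', 'aw', 'year'], append=True)
        .stack()
        .reset_index()
        .rename(columns={'level_4': 'role', 0: 'team'})
        .loc[:, ['game', 'team', 'role', 'hw', 'aw', 'year']])

def wins(row):
    # A home side is credited with hw, an away side with aw.
    if row['role'] == 'home':
        return row['hw']
    else:
        return row['aw']
df['wins'] = df.apply(wins, axis=1)

# Expanding mean of earlier games, restarted for each (team, year) group.
# x.expanding().mean() is the current spelling of the older pd.expanding_mean(x);
# transform keeps the original row index, and shift() excludes the current game.
df['expanding_mean'] = df.groupby(['team', 'year'])['wins'].transform(
    lambda x: x.expanding().mean().shift())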
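
Since the question was about levels, the level_3 to level_4 change described in this answer can be checked in isolation. The following is a minimal sketch on a hypothetical two-game frame (the data is made up for illustration, not taken from the question). It assumes only that reset_index names an unnamed index level after its position, which is why adding year as a fourth index level shifts the stacked home/away level from position 3 to position 4.

```python
import pandas as pd

# Hypothetical two-game frame, used only to show how the stacked level is named.
df = pd.DataFrame(
    {'home': ['A', 'B'], 'away': ['B', 'A'],
     'hw': [0, 1], 'aw': [1, 0], 'year': [2000, 2000]},
    index=pd.Index([1, 2], name='game'),
)

# Index levels game, hw, aw (positions 0-2): the stacked level lands at position 3.
three = (df.set_index(['hw', 'aw'], append=True)[['home', 'away']]
           .stack()
           .reset_index())

# Index levels game, hw, aw, year (positions 0-3): the stacked level moves to position 4.
four = (df.set_index(['hw', 'aw', 'year'], append=True)
          .stack()
          .reset_index())

print(list(three.columns))  # ['game', 'hw', 'aw', 'level_3', 0]
print(list(four.columns))   # ['game', 'hw', 'aw', 'year', 'level_4', 0]
```

Renaming by position-derived names is fragile for exactly this reason; one alternative, if preferred, is to give the column index a name before stacking so the level does not depend on how many index levels precede it.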

Answer 1: (score: 2)

Group by year and team and use transform:

import pandas as pd
import numpy as np


d = {
    'year': [2000, 2000, 2000, 2000, 2001, 2001, 2001],
    'team': ['A', 'B', 'B', 'A', 'B', 'A', 'A'],
    'value': [1, 0, 0, 1, 2, 3, 3],
}

df = pd.DataFrame(d)

# Select the value column explicitly so transform returns a single Series.
df['mean_per_team_and_year'] = df.groupby(['team', 'year'])['value'].transform('mean')
print(df)
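
Note that transform('mean') yields one mean per (team, year) group, repeated on every row of that group, whereas the question asks for a running mean that restarts each year. The sketch below reuses this answer's toy data to put the two side by side; the column names group_mean and prior_expanding_mean are illustrative choices, not from the original posts.

```python
import pandas as pd

# Same toy data as in the answer above.
df = pd.DataFrame({
    'year': [2000, 2000, 2000, 2000, 2001, 2001, 2001],
    'team': ['A', 'B', 'B', 'A', 'B', 'A', 'A'],
    'value': [1, 0, 0, 1, 2, 3, 3],
})

grouped = df.groupby(['team', 'year'])['value']

# One mean per (team, year) group, broadcast to every row of that group.
df['group_mean'] = grouped.transform('mean')

# Running mean of the earlier rows in the same group; the first row of
# each group has no earlier rows, so it is NaN.
df['prior_expanding_mean'] = grouped.transform(
    lambda s: s.expanding().mean().shift())

print(df)
```

The first row of every (team, year) group is NaN in prior_expanding_mean, which is the "restart" behaviour the question describes.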