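Similar to my previous question, I want to split a DataFrame with groupby and apply a calculation.
Now I want to bring a new column, year, into the calculation on the DataFrame.
Running my current code gives the expanding mean across the whole DataFrame. How can I restart the calculation for each new year?
I tried adding year to columns= in the df declaration, but it ends up included in role, which is not wanted. The gap in my understanding is around the index levels, so any enlightenment would be appreciated.
Edit:
The desired result is below:
    game  team  role   hw   aw  wins  expanding_mean  year
 0     1     A  home    0    1     0             NaN  2000
 1     1     B  away    0    1     1             NaN  2000
 2     2     B  home    1    0     1        1.000000  2000
 3     2     A  away    1    0     0        0.000000  2000
 4     3     B  home    0    0     0        1.000000  2000
 5     3     A  away    0    0     0        0.000000  2000
 6     4     A  home    1    0     1        0.000000  2000
 7     4     B  away    1    0     0        0.666667  2000
 8     5     B  home    0    1     0             NaN  2001
 9     5     A  away    0    1     1             NaN  2001
10     6     A  home    1    0     1        0.000000  2001
11     6     B  away    1    0     0        1.000000  2001
12     7     A  home  NaN  NaN   NaN        0.500000  2001
13     7     B  away  NaN  NaN   NaN        0.500000  2001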
Answer 0: (score: 2)
You can add year to the groupby, add the year column to set_index in the code above, and change level_3 to level_4 in rename, because the year column has been added to the index:
import pandas as pd
import numpy as np

d = {'year': [2000, 2000, 2000, 2000, 2001, 2001, 2001],
     'home': ['A', 'B', 'B', 'A', 'B', 'A', 'A'],
     'away': ['B', 'A', 'A', 'B', 'A', 'B', 'B'],
     'aw': [1, 0, 0, 0, 1, 0, np.nan],
     'hw': [0, 1, 0, 1, 0, 1, np.nan]}
df = pd.DataFrame(d, columns=['home', 'away', 'hw', 'aw', 'year'])
df.index = range(1, len(df) + 1)
df.index.name = 'game'

# Keep hw, aw and year in the index so that only home/away are stacked;
# the stacked level is then the fifth index level, hence 'level_4'.
df = (df.set_index(['hw', 'aw', 'year'], append=True)
        .stack()
        .reset_index()
        .rename(columns={'level_4': 'role', 0: 'team'})
        .loc[:, ['game', 'team', 'role', 'hw', 'aw', 'year']])

def wins(row):
    # The home side's result is in hw, the away side's in aw.
    if row['role'] == 'home':
        return row['hw']
    else:
        return row['aw']

df['wins'] = df.apply(wins, axis=1)
# Expanding mean of earlier games only (shift drops the current game);
# grouping by team and year restarts it for each new year.
df['expanding_mean'] = (df.groupby(['team', 'year'])['wins']
                          .transform(lambda x: x.expanding().mean().shift()))
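In case the level_3 / level_4 part is still unclear: reset_index labels an unnamed index level by its position, so adding one more column to set_index moves the stacked level one position further along. Here is a minimal sketch of that, using a hypothetical toy frame (toy, two, three and leaked are illustrative names, not part of the code above). It also shows what happens when year is left out of set_index: it gets stacked together with home and away.

```python
import pandas as pd

# Hypothetical toy frame, not the data from the question.
toy = pd.DataFrame({'home': ['A', 'B'], 'away': ['B', 'A'],
                    'hw': [0, 1], 'year': [2000, 2001]})
toy.index.name = 'game'

# Two index levels (game, hw): the stacked level is the third one -> 'level_2'.
two = (toy.drop(columns='year')
          .set_index('hw', append=True)
          .stack()
          .reset_index())

# Three index levels (game, hw, year): the stacked level moves to 'level_3'.
three = (toy.set_index(['hw', 'year'], append=True)
            .stack()
            .reset_index())

# year left as an ordinary column: it is stacked alongside home and away.
leaked = (toy.set_index('hw', append=True)
             .stack()
             .reset_index())

print(list(two.columns))
print(list(three.columns))
print(sorted(set(leaked['level_2'])))
```

With hw, aw and year all moved into the index next to game, the stacked level is the fifth one, which is why the rename above targets level_4.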
Answer 1: (score: 2)
Group by year and team and use transform:
import pandas as pd
import numpy as np

d = {
    'year': [2000, 2000, 2000, 2000, 2001, 2001, 2001],
    'team': ['A', 'B', 'B', 'A', 'B', 'A', 'A'],
    'value': [1, 0, 0, 1, 2, 3, 3],
}
df = pd.DataFrame(d)
# Select the value column explicitly so transform returns a Series
# aligned with df, holding each row's (team, year) group mean.
df['mean_per_team_and_year'] = df.groupby(['team', 'year'])['value'].transform('mean')
print(df)
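Note that transform('mean') gives every row the mean of its whole (team, year) group, which is not the same thing as an expanding mean. The sketch below puts the two side by side. The values are hypothetical, chosen only so the columns differ, and the column names group_mean, running_mean and prior_mean are illustrative.

```python
import pandas as pd

# Hypothetical values, chosen so the three columns are distinguishable.
df = pd.DataFrame({
    'year': [2000, 2000, 2000, 2000, 2001, 2001, 2001],
    'team': ['A', 'B', 'B', 'A', 'B', 'A', 'A'],
    'value': [1, 0, 2, 3, 2, 3, 5],
})
grouped = df.groupby(['team', 'year'])['value']

# One number per group, repeated on every row of that group.
df['group_mean'] = grouped.transform('mean')

# Mean of the rows seen so far within the group, current row included.
df['running_mean'] = grouped.transform(lambda s: s.expanding().mean())

# Same, but excluding the current row: NaN on the first row of each group.
df['prior_mean'] = grouped.transform(lambda s: s.expanding().mean().shift())

print(df)
```

The prior_mean column is the one that restarts with NaN at the first row of each (team, year) group, which is the behaviour the question asks for.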