Question

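I have a Pandas df (see below) and I want to sum values based on the index column. My index column contains string values. In the example below, I am trying to combine Moving, Playing and Using Phone into a single "Active Time" entry and sum their corresponding values, while keeping the other index values as they already are. Any suggestions on how to handle this kind of scenario?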

**Activity  AverageTime**
Moving      0.000804367 
Playing     0.001191772 
Stationary  0.320701558 
Using Phone 0.594305473 
Unknown     0.060697612 
Idle        0.022299218

Answer 1

I'm sure there must be a simpler way, but here is one possible solution.

import pandas as pd

# df is the DataFrame from the question, with Activity as its index.
# Boolean filters for active and inactive rows
active_row_names = ['Moving', 'Playing', 'Using Phone']
active_filter = [row in active_row_names for row in df.index]
inactive_filter = [not row for row in active_filter]

active = df.loc[active_filter].sum()       # Sum of 'active' rows as a Series
active = pd.DataFrame(active).transpose()  # as a dataframe, and fix orientation
active.index = ["active"]                  # Assign new index name

# Keep the inactive rows as they are, and replace the active rows with the
# newly defined row that is the sum of the previous active rows.
# (DataFrame.append was removed in pandas 2.0, so pd.concat is used instead.)
df = pd.concat([df.loc[inactive_filter], active])

**Output**

Activity       AverageTime
Stationary     0.320702
Unknown        0.060698
Idle           0.022299
active         0.596302

This works even if only a subset of the active row names is present in the DataFrame.
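A possibly simpler variant, offered only as a sketch and not part of the original answer: relabel the active rows to a common index label and let `groupby` sum the rows that share it. The sample data is copied from the question; the names `label_map`, `relabeled` and `result` are made up for illustration, and `Activity` is assumed to be the index.

```python
import pandas as pd

# Sample data from the question, with Activity assumed to be the index.
df = pd.DataFrame(
    {"AverageTime": [0.000804367, 0.001191772, 0.320701558,
                     0.594305473, 0.060697612, 0.022299218]},
    index=pd.Index(["Moving", "Playing", "Stationary",
                    "Using Phone", "Unknown", "Idle"], name="Activity"),
)

active_row_names = ["Moving", "Playing", "Using Phone"]

# Map every active label to "active"; labels missing from the mapping are left unchanged.
label_map = {name: "active" for name in active_row_names}
relabeled = df.rename(index=label_map)

# Rows that now share a label are summed; sort=False keeps first-appearance order.
result = relabeled.groupby(level=0, sort=False).sum()
print(result)
```

Like the answer above, this should also cope with a DataFrame that contains only some of the active labels, since `rename` simply ignores mapping keys that are not present.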

Answer 2

I would add a new boolean column named "active" and then group by that column:

df['active'] = False
# Use .loc rather than chained indexing, which may fail to modify df
df.loc[['Moving', 'Playing', 'Using Phone'], 'active'] = True
df.groupby('active')['AverageTime'].sum()
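For anyone who wants to try this end to end, here is a self-contained sketch of the same idea. It rebuilds the question's data (assuming `Activity` is the index) and sets the flag with `Index.isin`, which is a substitution on my part rather than what the answer wrote; `totals` is just an illustrative name.

```python
import pandas as pd

# Sample data from the question, with Activity assumed to be the index.
df = pd.DataFrame(
    {"AverageTime": [0.000804367, 0.001191772, 0.320701558,
                     0.594305473, 0.060697612, 0.022299218]},
    index=pd.Index(["Moving", "Playing", "Stationary",
                    "Using Phone", "Unknown", "Idle"], name="Activity"),
)

active_row_names = ["Moving", "Playing", "Using Phone"]

# Flag each row by whether its index label is one of the active names.
df["active"] = df.index.isin(active_row_names)

# One total per flag value: False = inactive rows, True = active rows.
totals = df.groupby("active")["AverageTime"].sum()
print(totals)
```

Note that grouping on the flag also collapses Stationary, Unknown and Idle into a single inactive total, so this gives an active/inactive split rather than keeping the other rows separate as the question asks.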

Pandas df: summing rows based on the index column