Question

If I have a dataframe similar to this one

Apples   Bananas   Grapes   Kiwis
2        3         nan      1
1        3         7        nan
nan      nan       2        3

and I want to add a column like this

Apples   Bananas   Grapes   Kiwis   Fruit Total
2        3         nan      1        6
1        3         7        nan      11
nan      nan       2        3        5

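I guess you could use df['Apples'] + df['Bananas'] and so on, but my actual dataframe is much larger than this. I was hoping a formula like df['Fruit Total']=df[-4:-1].sum could do the trick in one line of code. However, that didn't work. Is there a way to do this without explicitly summing up all the columns?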

Answer 1

You can first select the columns with iloc and then call sum:

df['Fruit Total']= df.iloc[:, -4:-1].sum(axis=1)
print (df)
   Apples  Bananas  Grapes  Kiwis  Fruit Total
0     2.0      3.0     NaN    1.0          5.0
1     1.0      3.0     7.0    NaN         11.0
2     NaN      NaN     2.0    3.0          2.0
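
As for why the attempt in the question did not work, here is a minimal sketch. The DataFrame is rebuilt from the question's sample data, and the variable names row_slice and col_slice are only illustrative. A bare slice inside df[...] selects rows rather than columns, and .sum without parentheses refers to the method instead of calling it.

```python
import numpy as np
import pandas as pd

# Sample data rebuilt from the question (an assumption about how df was created).
df = pd.DataFrame({
    "Apples": [2, 1, np.nan],
    "Bananas": [3, 3, np.nan],
    "Grapes": [np.nan, 7, 2],
    "Kiwis": [1, np.nan, 3],
})

# df[-4:-1] slices ROWS by position: here it keeps rows 0 and 1, all four columns.
row_slice = df[-4:-1]

# iloc[:, -4:-1] slices COLUMNS by position; the stop index -1 is exclusive,
# so the last column ('Kiwis') is left out.
col_slice = df.iloc[:, -4:-1]

# .sum is a method and has to be called; axis=1 adds across each row.
row_totals = col_slice.sum(axis=1)
print(row_totals)
```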

Answer 2

You can do this even without knowing the number of columns, and without iloc:

print(df)
   Apples  Bananas  Grapes  Kiwis
0     2.0      3.0     NaN    1.0
1     1.0      3.0     7.0    NaN
2     NaN      NaN     2.0    3.0

cols_to_sum = df.columns[ : df.shape[1]-1]

df['Fruit Total'] = df[cols_to_sum].sum(axis=1)

print(df)
   Apples   Bananas Grapes  Kiwis   Fruit Total
0  2.0      3.0     NaN     1.0     5.0
1  1.0      3.0     7.0     NaN     11.0
2  NaN      NaN     2.0     3.0     2.0

Answer 3

Using df['Fruit Total']= df.iloc[:, -4:-1].sum(axis=1) on the original df does not include the last column ('Kiwis') in the total. You should use df.iloc[:, -4:] to select all of the columns:

print(df)
   Apples  Bananas  Grapes  Kiwis
0     2.0      3.0     NaN    1.0
1     1.0      3.0     7.0    NaN
2     NaN      NaN     2.0    3.0

df['Fruit Total']=df.iloc[:,-4:].sum(axis=1)

print(df)
   Apples  Bananas  Grapes  Kiwis  Fruit Total
0     2.0      3.0     NaN    1.0          6.0
1     1.0      3.0     7.0    NaN         11.0
2     NaN      NaN     2.0    3.0          5.0
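
A note on the NaN values, as a small sketch: sum(axis=1) skips NaN by default (skipna=True), which is why the totals above come out as plain numbers. The second DataFrame, all_nan, is a hypothetical one that is not part of the question; it shows that a row made up entirely of NaN sums to 0.0 unless min_count is set.

```python
import numpy as np
import pandas as pd

# Sample data rebuilt from the question.
df = pd.DataFrame({
    "Apples": [2, 1, np.nan],
    "Bananas": [3, 3, np.nan],
    "Grapes": [np.nan, 7, 2],
    "Kiwis": [1, np.nan, 3],
})

# NaN entries are skipped, so each row total covers only the present values.
totals = df.iloc[:, -4:].sum(axis=1)
print(totals.tolist())  # [6.0, 11.0, 5.0]

# Hypothetical row with no values at all (not in the original data).
all_nan = pd.DataFrame({"Apples": [np.nan], "Bananas": [np.nan]})

default_total = all_nan.sum(axis=1)              # 0.0: nothing to add up
strict_total = all_nan.sum(axis=1, min_count=1)  # NaN: fewer than 1 valid value
```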

Answer 4

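This may be helpful for beginners, so for the sake of completeness: if you know the column names (for example, if they are in a list), you can use: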

column_names = ['Apples', 'Bananas', 'Grapes', 'Kiwis']
df['Fruit Total']= df[column_names].sum(axis=1)

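This gives you flexibility over which columns are used, because you only have to manipulate the list column_names, and you can do things like pick only the columns with the letter 'a' in their name. Another benefit is that the column names make it easier for people to see what the code is doing. Combine this with list(df.columns) to get the column names as a list. So, if you want to drop the last column, all you have to do is: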

column_names = list(df.columns)
df['Fruit Total']= df[column_names[:-1]].sum(axis=1)
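
The "letter 'a' in the name" idea can be sketched like this, again assuming the question's sample data. Note that the check is case-sensitive, so 'Apples' (capital A only) is not selected; compare against name.lower() if that is not what you want.

```python
import numpy as np
import pandas as pd

# Sample data rebuilt from the question.
df = pd.DataFrame({
    "Apples": [2, 1, np.nan],
    "Bananas": [3, 3, np.nan],
    "Grapes": [np.nan, 7, 2],
    "Kiwis": [1, np.nan, 3],
})

# Keep only the columns whose name contains a lowercase 'a'.
column_names = [name for name in df.columns if "a" in name]
print(column_names)  # ['Bananas', 'Grapes']

# Sum just those columns across each row.
df["Fruit Total"] = df[column_names].sum(axis=1)
print(df)
```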

Answer 5

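I'd like to build on Ramon's answer for the case where you want the total without knowing the shape/size of the dataframe. I'll use his answer below, but fix one issue: the last column was not included in the total. I removed the -1 from the shape, changing this: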

cols_to_sum = df.columns[ : df.shape[1]-1]

to this:

cols_to_sum = df.columns[ : df.shape[1]]

print(df)
   Apples  Bananas  Grapes  Kiwis
0     2.0      3.0     NaN    1.0
1     1.0      3.0     7.0    NaN
2     NaN      NaN     2.0    3.0

cols_to_sum = df.columns[ : df.shape[1]]

df['Fruit Total'] = df[cols_to_sum].sum(axis=1)

print(df)
   Apples   Bananas Grapes  Kiwis   Fruit Total
0  2.0      3.0     NaN     1.0     6.0
1  1.0      3.0     7.0     NaN     11.0
2  NaN      NaN     2.0     3.0     5.0

This then gives you the correct total without skipping the last column.
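
One thing worth noting, shown as a sketch under the assumption that every column in df should be summed: df.columns[ : df.shape[1]] is simply all of the columns, so the result matches df.sum(axis=1). Once 'Fruit Total' has been added, though, running the same line again would count the total itself. Saving the column list first (fruit_cols is an illustrative name) avoids that.

```python
import numpy as np
import pandas as pd

# Sample data rebuilt from the question.
df = pd.DataFrame({
    "Apples": [2, 1, np.nan],
    "Bananas": [3, 3, np.nan],
    "Grapes": [np.nan, 7, 2],
    "Kiwis": [1, np.nan, 3],
})

# Slicing up to df.shape[1] keeps every column.
cols_to_sum = df.columns[: df.shape[1]]

# Remember the fruit columns BEFORE the total column exists.
fruit_cols = list(df.columns)
df["Fruit Total"] = df[fruit_cols].sum(axis=1)

# Summing every column now would also add 'Fruit Total' and double the result.
naive_rerun = df.sum(axis=1)

# Summing the saved list gives the same totals no matter how often it is rerun.
safe_rerun = df[fruit_cols].sum(axis=1)
```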

Pandas: sum up multiple columns into one column
