Question

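I have a DataFrame A, and I would like to sum the rows whose row index value is greater than or equal to 10. If that is not possible, I can also live with code that sums rows 2-3.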

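import pandas as pd
import numpy as np

# A holds this table:
#        Tier  Oct  Nov  Dec
# 0  up to 2M    4    5   10
# 1        5M    3    2    7
# 2       10M    6    0    2
# 3       15M    1    3    5
A = pd.DataFrame({
    "Tier": ["up to 2M", "5M", "10M", "15M"],
    "Oct": [4, 3, 6, 1],
    "Nov": [5, 2, 0, 3],
    "Dec": [10, 7, 2, 5],
})
tenplus = pd.Series(A.sum(axis=0), index=A.columns[1:])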

But this sums the whole table. One thing I could do is build another DataFrame from rows 2-3 and sum those, but I would rather learn the best practice!

Thanks!

Answer 1

You can use ordinary slice indexing to select the rows you want to sum:

print(df)
#        Tier  Oct  Nov  Dec
# 0  up to 2M    4    5   10
# 1        5M    3    2    7
# 2       10M    6    0    2
# 3       15M    1    3    5

# select the last two rows
print(df[2:4])
#   Tier  Oct  Nov  Dec
# 2  10M    6    0    2
# 3  15M    1    3    5

# sum over them
print(df[2:4].sum())
# Tier    10M15M
# Oct          7
# Nov          3
# Dec          7
# dtype: object

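As you can see, summing the Tier column gives a meaningless result, because "summing" strings just concatenates them. It makes more sense to sum only the last three columns: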

# select the last two rows and the last 3 columns
print(df.loc[2:4, ['Oct', 'Nov', 'Dec']])
#    Oct  Nov  Dec
# 2    6    0    2
# 3    1    3    5

# sum over them
print(df.loc[2:4, ['Oct', 'Nov', 'Dec']].sum())
# Oct    7
# Nov    3
# Dec    7
# dtype: int64

# alternatively, use df.iloc[2:4, 1:] to select by column index rather than name
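
One detail about the slices above is easy to miss: label-based .loc includes the end label, while positional .iloc (like plain df[2:4]) excludes it. The following is a minimal sketch of that difference, assuming df is rebuilt from the printed output; the names cols, by_label and by_position are only illustrative.

```python
import pandas as pd

# Rebuild the example frame from the printed output above.
df = pd.DataFrame({
    "Tier": ["up to 2M", "5M", "10M", "15M"],
    "Oct": [4, 3, 6, 1],
    "Nov": [5, 2, 0, 3],
    "Dec": [10, 7, 2, 5],
})

cols = ["Oct", "Nov", "Dec"]

# .loc slices by label and includes the end label, so 2:3 means rows 2 and 3.
by_label = df.loc[2:3, cols].sum()

# .iloc slices by position and excludes the end, so 2:4 also means rows 2 and 3.
by_position = df.iloc[2:4, 1:].sum()

# Both selections cover the same rows and columns.
print(by_label.equals(by_position))
```

Note that df.loc[2:4, ...] returns the same two rows here only because this frame has no row labelled 4; on a longer frame it would include that row as well.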

You can read more about how indexing works in pandas in the documentation.
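
If the goal is the "10 or greater" condition from the question rather than fixed row positions, a boolean mask is one possible approach. The sketch below assumes the threshold refers to the number in the Tier column (10M, 15M), which the question does not state explicitly, and that every Tier value ends in a number followed by "M". The names tier_millions and mask are hypothetical.

```python
import pandas as pd

# Rebuild the example frame from the printed output above.
df = pd.DataFrame({
    "Tier": ["up to 2M", "5M", "10M", "15M"],
    "Oct": [4, 3, 6, 1],
    "Nov": [5, 2, 0, 3],
    "Dec": [10, 7, 2, 5],
})

# Assumption: each Tier ends in "<number>M"; pull that number out as an integer.
tier_millions = df["Tier"].str.extract(r"(\d+)M$", expand=False).astype(int)

# True for rows whose tier is 10M or more.
mask = tier_millions >= 10

# Sum only the numeric month columns over the selected rows.
tenplus = df.loc[mask, ["Oct", "Nov", "Dec"]].sum()
print(tenplus)
```

Unlike a fixed slice, the mask keeps working if rows are added or reordered, at the cost of depending on the format of the Tier strings.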

Answer 2

sum has an axis parameter; pass axis=1 to sum across each row:

In [11]: df
Out[11]:
       Tier  Oct  Nov  Dec
0  up to 2M    4    5   10
1        5M    3    2    7
2       10M    6    0    2
3       15M    1    3    5

In [12]: df.sum(axis=1)
Out[12]:
0    19
1    12
2     8
3     9
dtype: int64

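Note: this relies on the non-numeric columns being dropped. Older pandas versions did that silently, while recent versions may raise a TypeError here unless you pass numeric_only=True. You can also filter those columns out explicitly before summing: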

In [13]: df[['Oct', 'Nov', 'Dec']].sum(axis=1)
Out[13]:
0    19
1    12
2     8
3     9
dtype: int64
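
If you would rather not list the column names by hand, selecting columns by dtype is another option. This is a minimal sketch, assuming df is rebuilt from the output above; row_totals and last_two are illustrative names.

```python
import pandas as pd

# Rebuild the example frame from the printed output above.
df = pd.DataFrame({
    "Tier": ["up to 2M", "5M", "10M", "15M"],
    "Oct": [4, 3, 6, 1],
    "Nov": [5, 2, 0, 3],
    "Dec": [10, 7, 2, 5],
})

# Keep only the numeric columns, whatever they are called, then sum across each row.
row_totals = df.select_dtypes(include="number").sum(axis=1)
print(row_totals)

# Combined with a positional row slice, this gives per-column totals for rows 2 and 3.
last_two = df.iloc[2:4].select_dtypes(include="number").sum()
print(last_two)
```

Filtering by dtype makes the intent explicit and does not depend on how a given pandas version treats string columns in sum.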

How to sum certain rows of a DataFrame in Python
