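Different sum() results for the same column in Python

Time: 2019-01-15 02:37:00

Tags: python dataframe sum nan col

Here is a summary of what I did with a DataFrame.

  1. I computed the sum of the "damageDealt" column, which is 581294667.8516002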

train.damageDealt.sum()
# 581294667.8516002

train.damageDealt.shape
# (4446966,)
  2. I found that there is only one NaN value, in the "winPlacePerc" column

train.isnull().sum()
    Id                 0
    groupId            0
    matchId            0
    assists            0
    boosts             0
    damageDealt        0
    DBNOs              0
    headshotKills      0
    heals              0
    killPlace          0
    killPoints         0
    kills              0
    killStreaks        0
    longestKill        0
    matchDuration      0
    matchType          0
    maxPlace           0
    numGroups          0
    rankPoints         0
    revives            0
    rideDistance       0
    roadKills          0
    swimDistance       0
    teamKills          0
    vehicleDestroys    0
    walkDistance       0
    weaponsAcquired    0
    winPoints          0
    winPlacePerc       1
    dtype: int64

  3. The value of the "damageDealt" column in the row that has the NaN is 0.0

train[train.winPlacePerc.isnull()].damageDealt
# 2744604    0.0
# Name: damageDealt, dtype: float64

  4. I removed that row with dropna()

train2 = train.copy()
train2.dropna(inplace=True)
train2[train2.winPlacePerc.isnull()].damageDealt
# Series([], Name: damageDealt, dtype: float64)

  5. The sum of the column changed to 581294667.8516004...! It even went up...!

train2.damageDealt.sum()
# 581294667.8516004
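
For scale, the two printed totals can be compared directly. The sketch below is not part of the original session; it uses only the two numbers shown above and the standard library (`math.ulp` needs Python 3.9 or later) to express the gap in units in the last place (ULPs), that is, in multiples of the spacing between adjacent float64 values at this magnitude. The names `before` and `after` are chosen here for illustration.

```python
import math

# The two totals exactly as printed above.
before = 581294667.8516002  # train.damageDealt.sum()
after = 581294667.8516004   # train2.damageDealt.sum()

# Spacing between adjacent float64 values at this magnitude.
ulp = math.ulp(before)

gap_in_ulps = (after - before) / ulp
relative_gap = (after - before) / before

print(f"absolute gap: {after - before:.3e}")
print(f"gap in ULPs:  {gap_in_ulps}")
print(f"relative gap: {relative_gap:.3e}")
print("close at default tolerance:", math.isclose(before, after))
```

A gap of only a few ULPs is the size expected from rounding in the last bits of a float64, which is far smaller than any single nonzero damageDealt value would contribute.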

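So I don't understand how this result can come about when the only row removed has a damageDealt of 0.0. It would be very helpful if someone could explain this. Thanks in advance!!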
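
One plausible explanation, offered here as an assumption rather than something confirmed in this thread: floating-point addition is not associative, so a total depends on how the additions are grouped. NumPy, which pandas builds on, sums large float arrays in blocks (pairwise summation), so dropping one row, even one holding 0.0, can shift the values after it into different groups and change the final rounding. The sketch below is a toy model of that effect, not pandas' actual code path; `pairwise_sum` is a hypothetical helper written only for illustration.

```python
import math

def pairwise_sum(values):
    """Toy pairwise summation: split in half, sum each half, add the results."""
    if not values:
        return 0.0
    if len(values) == 1:
        return values[0]
    mid = len(values) // 2
    return pairwise_sum(values[:mid]) + pairwise_sum(values[mid:])

with_zero = [0.1, 0.2, 0.0, 0.3]                   # contains one 0.0 "row"
without_zero = [v for v in with_zero if v != 0.0]  # same data, 0.0 removed

# Grouping with the zero:    (0.1 + 0.2) + (0.0 + 0.3)
# Grouping without the zero:  0.1 + (0.2 + 0.3)
total_with = pairwise_sum(with_zero)
total_without = pairwise_sum(without_zero)
print(total_with, total_without, total_with == total_without)

# math.fsum tracks the rounding error exactly, so it does not depend on grouping.
exact_with = math.fsum(with_zero)
exact_without = math.fsum(without_zero)
print(exact_with, exact_without, exact_with == exact_without)
```

On the real data, comparing `math.fsum(train.damageDealt)` with `math.fsum(train2.damageDealt)` would be one way to test this assumption: if the two agree, the remaining values are unchanged and only the rounding of the running total differs.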

0 answers:

No answers