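Why does numpy return a different result when there are missing values if it is applied to a pandas Series, compared with applying it to the Series' underlying values, as shown below?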
import pandas as pd
import numpy as np
data = pd.DataFrame(dict(a=[1, 2, 3, np.nan, np.nan, 6]))
np.sum(data['a'])
#12.0
np.sum(data['a'].values)
#nan
Answer 0 (score: 5)
Calling np.sum on a pandas Series delegates to Series.sum, which (by default) ignores NaN when computing the sum:
data['a'].sum()
# 12.0
np.sum(data['a'])
# 12.0
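As a small sketch of the default mentioned above, the snippet below compares both code paths on the same data. The variable names are illustrative only; skipna is the Series.sum parameter that controls whether NaN is ignored, and np.nansum is NumPy's NaN-skipping counterpart for plain arrays.

```python
import numpy as np
import pandas as pd

s = pd.Series([1, 2, 3, np.nan, np.nan, 6])

# Series.sum skips NaN unless told otherwise (skipna=True is the default).
skipped = s.sum()                    # 12.0
propagated = s.sum(skipna=False)     # nan

# On the raw ndarray, np.sum propagates NaN; np.nansum skips it.
raw = np.sum(s.to_numpy())           # nan
nan_aware = np.nansum(s.to_numpy())  # 12.0

print(skipped, propagated, raw, nan_aware)
```

If the goal is for both calls to agree, either pass skipna=False on the pandas side or use np.nansum on the array side.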
You can see this in the source code of np.sum (np.sum?? in IPython):
def sum(a, axis=None, dtype=None, out=None, keepdims=np._NoValue, initial=np._NoValue):
    ...
    return _wrapreduction(a, np.add, 'sum', axis, dtype, out, keepdims=keepdims,
Looking at the source code of _wrapreduction (np.core.fromnumeric._wrapreduction??), we see:
def _wrapreduction(obj, ufunc, method, axis, dtype, out, **kwargs):
    ...
    if type(obj) is not mu.ndarray:
        try:
            reduction = getattr(obj, method)  # get reference to Series.sum
reduction is then finally called at the end of the function:
return reduction(axis=axis, out=out, **passkwargs)
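The delegation described above can be reproduced without pandas. The sketch below uses a hypothetical container class, SkipNaNBox, which is not part of NumPy or pandas, and assumes np.sum still looks up a sum method on non-ndarray inputs as in the source excerpt quoted in this answer.

```python
import numpy as np


class SkipNaNBox:
    """Hypothetical container used only to illustrate the dispatch."""

    def __init__(self, values):
        self.values = np.asarray(values, dtype=float)

    def sum(self, axis=None, out=None, **kwargs):
        # np.sum reaches this method through getattr(obj, 'sum')
        # because the object is not a plain ndarray.
        return float(np.nansum(self.values))


box = SkipNaNBox([1, 2, 3, np.nan, np.nan, 6])

delegated = np.sum(box)    # goes through SkipNaNBox.sum
raw = np.sum(box.values)   # plain ndarray, reduced with np.add

print(delegated, raw)
```

A pandas Series is handled the same way: it is not an ndarray, so its own sum method decides how NaN is treated.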