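When I do arithmetic across two or more columns, I run into a problem with null values.
One more thing I should mention: I do not want to fill in the missing/null values.
What I actually want is for 1 + np.nan to give 1, but it gives np.nan. I tried to solve this with np.nansum, but it did not work.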
Number_of_Elements = int(input("Enter number of integers to be stored in the list: "))
print("Input", Number_of_Elements, "elements in the list: ")
Elements_List = []
for i in range(Number_of_Elements):
    data = int(input("Element -" + str(i) + " : "))
    Elements_List.append(data)
# Count how many times each value occurs
all_freq = {}
for i in Elements_List:
    if i in all_freq:
        all_freq[i] += 1
    else:
        all_freq[i] = 1
for key in all_freq:
    print(str(key) + " occurs " + str(all_freq[key]) + " times")
Then:
df = pd.DataFrame({"a":[1,2,3,4],"b":[1,2,np.nan,np.nan]})
df["c"] = df.a + df.b
df
Out[6]:
   a    b    c
0  1  1.0  2.0
1  2  2.0  4.0
2  3  NaN  NaN
3  4  NaN  NaN
But that is not what I actually want. My attempt with np.nansum:
df["d"] = np.nansum([df.a + df.b])
df
Out[13]:
   a    b    d
0  1  1.0  6.0
1  2  2.0  6.0
2  3  NaN  6.0
3  4  NaN  6.0
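Answer 0 (score: 1)
Here np.nansum computed the sum over the whole column: df.a + df.b is evaluated first (so the NaN values are already in place), and np.nansum then reduces that result to the single scalar 6.0, which is assigned to every row. That is not what you want. You probably want to call np.nansum on the two columns along axis 0, for example: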
df['d'] = np.nansum((df.a, df.b), axis=0)
This then produces the expected result:
>>> df
   a    b    d
0  1  1.0  2.0
1  2  2.0  4.0
2  3  NaN  3.0
3  4  NaN  4.0
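To make the difference between the two calls concrete, here is a minimal sketch that runs the attempt from the question next to the axis=0 version. The variable names scalar_total and row_totals are only illustrative and do not come from the original post.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"a": [1, 2, 3, 4], "b": [1, 2, np.nan, np.nan]})

# Attempt from the question: a + b is computed first, so rows 2 and 3
# are already NaN; nansum then collapses the rest to a single scalar.
scalar_total = np.nansum([df.a + df.b])

# Stack the two columns into a 2 x 4 array and sum down axis 0,
# ignoring NaN, which gives one value per row of the DataFrame.
row_totals = np.nansum((df.a, df.b), axis=0)
df["d"] = row_totals

print(scalar_total)
print(df)
```

The key point is that the columns are passed to np.nansum separately, so the NaN handling happens inside the sum rather than after the `+` has already propagated NaN.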
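Answer 1 (score: 1)
Just use DataFrame.sum with axis=1. It skips NaN by default (skipna=True) and adds up every column in each row, so this assumes df contains only a and b: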
df['c'] = df.sum(axis=1)
Output:
   a    b    c
0  1  1.0  2.0
1  2  2.0  4.0
2  3  NaN  3.0
3  4  NaN  4.0
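One edge case is worth checking: a row where both values are missing. The sketch below uses a small made-up frame (not the data from the question) and selects the columns explicitly, so that other columns in the frame would not be included in the sum. It also shows Series.add with fill_value=0 as an alternative. The names default_sum, strict_sum and filled_add are illustrative only.

```python
import numpy as np
import pandas as pd

# Hypothetical data: the last row is missing in both columns.
df = pd.DataFrame({"a": [1.0, 2.0, np.nan], "b": [1.0, np.nan, np.nan]})

# Default behaviour: NaN is skipped, so an all-NaN row sums to 0.0.
default_sum = df[["a", "b"]].sum(axis=1)

# min_count=1 requires at least one non-NaN value per row,
# so the all-NaN row stays NaN instead of becoming 0.0.
strict_sum = df[["a", "b"]].sum(axis=1, min_count=1)

# Series.add with fill_value=0 treats a single missing operand as 0,
# but leaves the result missing when both operands are missing.
filled_add = df["a"].add(df["b"], fill_value=0)

print(default_sum.tolist())
print(strict_sum.tolist())
print(filled_add.tolist())
```

Which variant fits depends on whether a row with no data at all should count as 0 or remain NaN.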