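I want to check whether a DataFrame column contains more than one distinct value, so I convert the column to a set and check its length. But I ran into a problem with NaNs: I expected a column of all NaNs to give a set of length zero, but it does not. Why?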
import numpy as np
import pandas as pd
from numpy import nan

set([nan, nan, nan])             # set has one element
set(pd.Series([nan, nan, nan]))  # set has three elements
The same thing happens with NumPy arrays:
set(np.array([nan, nan, nan]))   # set has three elements
It does not happen with other values:
set(np.array([1, 1, 1]))         # set has one element
Answer 0 (score: 2)
>>> L = [nan, nan, nan]
>>> L[0] is L[1]
True
>>> s = pd.Series([nan, nan, nan])
>>> s[0] is s[1]
False
>>> s[0] == s[1]
False
>>> L[0] == L[1]
False
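The session above points at identity as the cause: NaN never compares equal to itself, but Python's built-in containers check identity before equality, so three references to the same `nan` object collapse into one set element, while the separate float objects produced by iterating a Series or array do not. The following is a minimal sketch of that behavior plus two NaN-aware ways to count distinct values. The variable names are illustrative only, and the use of `Series.nunique` and `pd.unique` is a suggestion that is not part of the original answer.

```python
import pandas as pd

nan = float("nan")

# The list holds three references to the same float object, so the set's
# identity check collapses them into a single element.
same_object = [nan, nan, nan]
list_set_size = len(set(same_object))

# Iterating a Series yields a separate float object per element. NaN != NaN
# and the objects are distinct, so nothing is deduplicated.
s = pd.Series([nan, nan, nan])
series_set_size = len(set(s))

# NaN-aware alternatives provided by pandas.
distinct_ignoring_nan = s.nunique()              # NaN is excluded by default
distinct_counting_nan = s.nunique(dropna=False)  # NaN is counted once
unique_values = pd.unique(s)                     # array holding a single NaN
```

For the original goal, a check such as `s.nunique(dropna=False) > 1` would report whether the column holds more than one distinct value while treating all NaNs as a single value.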
Answer 1 (score: 1)