Check whether a string from a DataFrame column is NaN

Time: 2014-08-12 07:25:35

Tags: python pandas

When I print data['words'].values from my DataFrame, I get:

['from' 'fairest' 'creatures' 'we' 'desire' 'increase' nan 'that' 'thereby']

How can I tell whether a value is nan when I loop over it like this?

for w in data['words'].values:
    ...  # how do I check whether w is nan here?

3 answers:

Answer 0: (score: 6)

Use the pandas function isnull to test for it:

In [45]:

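import numpy as np
import pandas as pd

# np.nan marks the missing entry (a bare `nan` is not defined without an import)
df = pd.DataFrame({'words': ['from', 'fairest', 'creatures', 'we', 'desire', 'increase', np.nan, 'that', 'thereby']})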
df
Out[45]:
       words
0       from
1    fairest
2  creatures
3         we
4     desire
5   increase
6        NaN
7       that
8    thereby
In [46]:

pd.isnull(df['words'])
Out[46]:
0    False
1    False
2    False
3    False
4    False
5    False
6     True
7    False
8    False
Name: words, dtype: bool
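
To apply the same test inside the loop from the question, something like the following sketch may work. The `data` frame is rebuilt here from the values shown in the question, and `clean_words` and `nan_count` are names made up for illustration:

```python
import numpy as np
import pandas as pd

# Rebuild the column from the question; np.nan stands in for the missing value.
data = pd.DataFrame({'words': ['from', 'fairest', 'creatures', 'we', 'desire',
                               'increase', np.nan, 'that', 'thereby']})

clean_words = []  # hypothetical name: collects the non-missing words
nan_count = 0     # hypothetical name: counts the missing entries
for w in data['words'].values:
    if pd.isnull(w):          # works on scalars as well as on whole Series
        nan_count += 1
    else:
        clean_words.append(w)

print(nan_count)
print(clean_words)
```

pd.isnull accepts a single scalar, so it can be called on each element even though the column mixes strings with a float NaN.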

Answer 1: (score: 0)

The question is how to check whether an individual value is null. For a range of rows, that looks like this:

# Assumes df has a 'Text' column and index labels 150 through 159
for i in range(150, 160):
    if pd.isnull(df['Text'][i]):
        print('is null')

Or, to count all the NaNs in the column instead, sum the boolean result of pd.isnull(df['Text']).
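
For counting every NaN in a column rather than testing rows one at a time, a sketch along these lines might help. The small `df` and its `Text` values are invented for illustration, as are the names `mask`, `n_missing` and `missing_positions`:

```python
import numpy as np
import pandas as pd

# Hypothetical data: a 'Text' column with two missing entries.
df = pd.DataFrame({'Text': ['a', np.nan, 'b', None, 'c']})

mask = pd.isnull(df['Text'])                   # boolean Series, True where missing
n_missing = int(mask.sum())                    # each True counts as 1
missing_positions = mask[mask].index.tolist()  # index labels of the missing rows

print(n_missing, missing_positions)
```

Working on the whole column at once avoids the Python-level loop, which usually matters once the frame is large.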

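Answer 2: (score: 0)

I found this to be faster than @Max's approach (the last answer). Suppose you have a column in which some values may be null. Another, faster way I found is: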

nan_COLUMN_indexes = list(df.loc[pd.isna(df["COLUMN"]), :].index.values)
len(nan_COLUMN_indexes)

This not only gives you a list of every index where the column is null, it also lets you do this:

iterator = iter(nan_COLUMN_indexes)
next(iterator, None)  # next null index, or None once the list is exhausted

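This lets you fill the nan/null values one at a time while keeping track of which one you are on; otherwise pandas would keep returning the first occurrence of a null/nan :D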

Or replace pd.isna() with pd.isnull() if you prefer; the two names refer to the same function.
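
As a rough illustration of walking through the null rows one at a time, here is a minimal sketch. The frame, the "filled_N" placeholder values and the `filled` list are all invented for the example, and `None` is used as the sentinel for an exhausted iterator:

```python
import numpy as np
import pandas as pd

# Hypothetical frame; "COLUMN" is the placeholder column name used in this answer.
df = pd.DataFrame({"COLUMN": ["a", np.nan, "b", np.nan]})

nan_COLUMN_indexes = list(df.loc[pd.isna(df["COLUMN"]), :].index.values)
iterator = iter(nan_COLUMN_indexes)

filled = []  # hypothetical name: records which rows were filled, in order
while True:
    idx = next(iterator, None)  # None once every null row has been visited
    if idx is None:
        break
    df.loc[idx, "COLUMN"] = "filled_%d" % idx  # made-up fill value
    filled.append(int(idx))

print(filled)
print(df["COLUMN"].tolist())
```

Comparing against `None` with `is` rather than testing truthiness matters here, because a null at index 0 would otherwise end the loop early.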