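I have a DataFrame with roughly 200 columns and 7000 rows. Column B consists entirely of NaN values, except for about 400 rows in the middle.
In short, column B looks like this (abbreviated for brevity):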
B
1 NaN
2 NaN
3 75
4 83
5 NaN
6 NaN
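However, when I run the code below, the hasnans attribute seems to have the wrong value. Am I using the attribute incorrectly, or is something else going on?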
df['B'].hasnans
returns
False
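Edit:
Below is a small sample of the CSV file I am importing into pandas. The NaN values in that column are still not detected. Astute observers will notice the spaces around B in the column header. That is expected and is not the problem.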
" DATE TIME "," A "," C "," B "
12/11/2018 15:44:36, 5448, 0.00, NaN
12/11/2018 15:44:36, 5448, 0.00, NaN
12/11/2018 15:44:36, 5448, 0.00, NaN
12/11/2018 15:44:36, 5448, 0.00, NaN
12/11/2018 15:45:07, 5448, 0.00, NaN
12/11/2018 15:45:08, 5448, 0.00, NaN
12/11/2018 15:45:08, 5448, 0.00, NaN
12/11/2018 15:45:09, 5448, 0.00, NaN
12/11/2018 15:45:09, 5448, 0.00, NaN
Answer 0 (score: 1)
Taking the following as the .csv file to import as a pandas DataFrame, you have to pay attention to the actual values you are looking for:
" DATE TIME "," A "," C "," B "
12/11/2018 15:44:36, 5448, 0.00, NaN
12/11/2018 15:44:36, 5448, 0.00, NaN
12/11/2018 15:44:36, 5448, 0.00, NaN
12/11/2018 15:44:36, 5448, 0.00, NaN
12/11/2018 15:45:07, 5448, 0.00, NaN
12/11/2018 15:45:08, 5448, 0.00, NaN
12/11/2018 15:45:08, 5448, 0.00, NaN
12/11/2018 15:45:09, 5448, 0.00, NaN
12/11/2018 15:45:09, 5448, 0.00, NaN
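The cells in column ' B ' hold the string ' NaN' (with a leading space) rather than a missing value, so replace that string with np.nan before checking hasnans:
import pandas as pd
import numpy as np
df = pd.read_csv('filename.csv', header=0)
# Assign the result back: inplace=True on a column selection may not update df
df[' B '] = df[' B '].replace(' NaN', np.nan)
df[' B '].hasnans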
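To see why the replacement is needed, it can help to look at the raw cell values before touching them. The snippet below is only a sketch: it uses a two-row stand-in for the sample file (the second row, with 75, is an assumption added for illustration) and reads it with the default settings, as in the question.

```python
from io import StringIO
import pandas as pd

# Two-row stand-in for the CSV sample (same quoted header, same spacing).
# The row containing 75 is made up to mimic the non-NaN rows.
sample = StringIO(
    '" DATE TIME "," A "," C "," B "\n'
    '12/11/2018 15:44:36, 5448, 0.00, NaN\n'
    '12/11/2018 15:45:07, 5448, 0.00, 75\n'
)
df = pd.read_csv(sample, header=0)

values = df[' B '].tolist()
print([repr(v) for v in values])  # repr makes the leading space visible
print(df[' B '].hasnans)          # the cells are text, so nothing counts as missing
```

Printing the repr of each cell is a quick way to spot stray whitespace that the normal DataFrame display hides.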
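Answer 1 (score: 1)
When you read in the csv, you should use the skipinitialspace option to strip the leading whitespace from the data. Note that because the column names are quoted, the spaces around them are preserved.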
# make fake csv
import pandas as pd
from io import StringIO
mock_csv = StringIO()
mock_csv.write("""\
" DATE TIME "," A "," C "," B "
12/11/2018 15:44:36, 5448, 0.00, NaN
12/11/2018 15:44:36, 5448, 0.00, NaN
12/11/2018 15:44:36, 5448, 0.00, NaN
12/11/2018 15:44:36, 5448, 0.00, NaN
12/11/2018 15:45:07, 5448, 0.00, NaN
12/11/2018 15:45:08, 5448, 0.00, NaN
12/11/2018 15:45:08, 5448, 0.00, NaN
12/11/2018 15:45:09, 5448, 0.00, NaN
12/11/2018 15:45:09, 5448, 0.00, NaN
""")
mock_csv.seek(0)
# disregard initial whitespace
df = pd.read_csv(mock_csv, skipinitialspace=True)
assert df[' B '].hasnans
See the pandas documentation for read_csv.
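If the padded column names are a nuisance, they can be stripped after reading. This is a sketch rather than part of the answer above: the two-row CSV text is a shortened stand-in for the sample, and the row containing 75 is an assumption added so the column holds a real number as well.

```python
from io import StringIO
import pandas as pd

csv_text = (
    '" DATE TIME "," A "," C "," B "\n'
    '12/11/2018 15:44:36, 5448, 0.00, NaN\n'
    '12/11/2018 15:45:07, 5448, 0.00, 75\n'
)

# skipinitialspace handles the data cells; the quoted header keeps its padding
df = pd.read_csv(StringIO(csv_text), skipinitialspace=True)

# Strip the padding from the column names so df['B'] works as in the question
df.columns = df.columns.str.strip()

print(list(df.columns))
print(df['B'].dtype)
print(df['B'].hasnans)
```

With the names stripped, the original df['B'].hasnans call from the question can be used unchanged.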
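Answer 2 (score: 0)
I think it shows False because the "NaN" values in your column are the string "NaN" rather than np.nan, so I would guess the column's dtype is object. You therefore have to convert those "NaN" strings to np.nan, so that the column can be converted to a numeric dtype as needed and hasnans returns the correct boolean.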
So first,
df.loc[df["B"] == "NaN", "B"] = np.nan  # convert "NaN" strings to np.nan in column B only, not in whole rows
Now you can use hasnans or isnull().any().
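A minimal sketch of the whole conversion, assuming the cells carry a leading space as in the CSV sample: strip the whitespace, cast to float, then check for missing values. The small DataFrame here is made up for illustration, and astype(float) will raise if the column contains text that is not a number.

```python
import pandas as pd

# Hypothetical column shaped like the one in the question: text cells with padding
df = pd.DataFrame({"B": [" NaN", " NaN", " 75", " 83", " NaN"]})

# Strip whitespace, then cast; the text "NaN" becomes a real float NaN
cleaned = df["B"].str.strip().astype(float)

print(cleaned.dtype)
print(cleaned.hasnans)
print(cleaned.isna().any())
```

Casting to float also makes the 75 and 83 entries usable as numbers, which replacing the strings alone does not.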
Cheers!