Question

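I'm trying to remove a row from my DataFrame in which one of the columns has a null value. Most of the help I can find is about removing NaN values, and none of it has worked for me so far.

Here I create the DataFrame: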

# successfully created data frame
df1 = ut.get_data(symbols, dates) # column heads are 'SPY', 'BBD'

# can't get rid of row containing null val in column BBD
# tried each of these with the others commented out but always had an 
# error or sometimes I was able to get a new column of boolean values
# but i just want to drop the row
df1 = pd.notnull(df1['BBD']) # drops rows with null val, not working
df1 = df1.drop(2010-05-04, axis=0)
df1 = df1[df1.'BBD' != null]
df1 = df1.dropna(subset=['BBD'])
df1 = pd.notnull(df1.BBD)


# I know the date to drop but still wasn't able to drop the row
df1.drop([2015-10-30])
df1.drop(['2015-10-30'])
df1.drop([2015-10-30], axis=0)
df1.drop(['2015-10-30'], axis=0)


with pd.option_context('display.max_row', None):
    print(df1)

Here is my output:

Output

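Can someone tell me how to drop this row, preferably both by identifying the row via its null value and by date?

I haven't been working with pandas for very long, and I've been stuck on this for an hour. Any advice would be much appreciated.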

Answer 1

This should do the job:

df = df.dropna(how='any',axis=0)

It drops every row (axis=0) that contains "any" null value.

Example:

import numpy as np
import pandas as pd

# Recreate random DataFrame with NaN values
df = pd.DataFrame(index=pd.date_range('2017-01-01', '2017-01-10', freq='D'))
# Average speed in miles per hour (stored as float so a cell can hold NaN)
df['A'] = np.random.randint(low=198, high=205, size=len(df.index)).astype(float)
df['B'] = np.random.random(size=len(df.index))*2

# Create dummy NaN value on 2 cells
df.iloc[2,1]=None
df.iloc[5,0]=None

print(df)

                A         B
2017-01-01  203.0  1.175224
2017-01-02  199.0  1.338474
2017-01-03  198.0       NaN
2017-01-04  198.0  0.652318
2017-01-05  199.0  1.577577
2017-01-06    NaN  0.234882
2017-01-07  203.0  1.732908
2017-01-08  204.0  1.473146
2017-01-09  198.0  1.109261
2017-01-10  202.0  1.745309

# Delete row with dummy value
df = df.dropna(how='any',axis=0)

print(df)

                A         B
2017-01-01  203.0  1.175224
2017-01-02  199.0  1.338474
2017-01-04  198.0  0.652318
2017-01-05  199.0  1.577577
2017-01-07  203.0  1.732908
2017-01-08  204.0  1.473146
2017-01-09  198.0  1.109261
2017-01-10  202.0  1.745309

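For details, see the reference.

If everything is fine with your DataFrame, dropping NaNs should be as easy as that. If it still doesn't work, make sure the correct data types are defined for your columns (pd.to_numeric comes to mind...)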
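One way the data type problem can show up: if a column was read in as text, a placeholder such as the string "null" is not a missing value to pandas, so dropna leaves it alone. A minimal sketch of the pd.to_numeric route, assuming the BBD column holds strings (the sample values and dates are made up for illustration):

```python
import pandas as pd

# Hypothetical data: BBD arrived as text, with the literal string "null" in one row.
df = pd.DataFrame(
    {"SPY": [100.0, 101.5, 102.25], "BBD": ["5.1", "null", "5.3"]},
    index=pd.to_datetime(["2015-10-29", "2015-10-30", "2015-11-02"]),
)

# dropna finds nothing to drop here, because "null" is an ordinary string.
unchanged = df.dropna(subset=["BBD"])

# Coerce the column to numbers; anything unparseable becomes NaN.
df["BBD"] = pd.to_numeric(df["BBD"], errors="coerce")

# Now the row counts as missing and can be dropped.
cleaned = df.dropna(subset=["BBD"])
print(cleaned)
```

Whether this applies depends on what df1['BBD'].dtype reports; if it is already float64, the coercion step is unnecessary.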

Answer 2

It looks like the value in your column is the string "null" rather than a true NaN, which is what dropna is meant for. So I would try:

df[df.BBD != 'null']

Or, if the value is actually NaN, then

df[pd.notnull(df.BBD)]
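To tell which of the two cases applies, it can help to count each kind of "missing" value before filtering. A rough sketch on invented data, where the column happens to contain both a real NaN and the string "null":

```python
import numpy as np
import pandas as pd

# Hypothetical column containing both a real NaN and the string "null".
df = pd.DataFrame({"BBD": [5.1, "null", np.nan, 5.3]})

# Count each kind of "missing" separately.
n_string_null = (df["BBD"] == "null").sum()
n_real_nan = df["BBD"].isna().sum()
print(n_string_null, n_real_nan)

# Keep only rows that are neither a real NaN nor the placeholder string.
keep = df["BBD"].notna() & (df["BBD"] != "null")
cleaned = df[keep]
print(cleaned)
```

Combining both conditions in one mask covers either case without needing to know in advance which one the data contains.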

Answer 3

---- Drop rows with a null in any column ----

df = df.dropna(how='any',axis=0)

---- To drop nulls based on a single column ----

df[~df['B'].isnull()]

                A         B
2017-01-01  203.0  1.175224
2017-01-02  199.0  1.338474
2017-01-03  198.0       NaN   <- this row is removed
2017-01-04  198.0  0.652318
2017-01-05  199.0  1.577577
2017-01-06    NaN  0.234882
2017-01-07  203.0  1.732908
2017-01-08  204.0  1.473146
2017-01-09  198.0  1.109261
2017-01-10  202.0  1.745309

Please forgive any mistakes.

Answer 4

To remove all null values, the dropna() method is helpful:

df.dropna(inplace=True)

To drop only the rows that have a null in a specific column, use this code:

df.dropna(subset=['column_name_to_remove'], inplace=True)
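One thing to watch with inplace=True: the call modifies the frame and returns None, so writing df = df.dropna(..., inplace=True) would leave df set to None. A small sketch on invented data showing what subset changes and what the in-place call returns:

```python
import numpy as np
import pandas as pd

# Illustrative frame: one NaN in column A, one NaN in column B.
df = pd.DataFrame({"A": [1.0, np.nan, 3.0], "B": [4.0, 5.0, np.nan]})

# Without subset, a NaN in any column removes the row.
any_col = df.dropna()

# With subset, only NaNs in the listed columns count.
only_b = df.dropna(subset=["B"])

# inplace=True mutates the frame it is called on and returns None.
work = df.copy()
result = work.dropna(subset=["B"], inplace=True)
print(len(any_col), len(only_b), len(work), result)
```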

Answer 5

I suggest you try one of these two lines:

df_clean = df1[df1['BBD'].isnull() == False]
df_clean = df1[df1['BBD'].isna() == False]
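The == False comparison works, but the same mask is more commonly written with ~ or with notna(). A quick check, on made-up data standing in for df1, that the three spellings select the same rows:

```python
import numpy as np
import pandas as pd

# Stand-in for df1 with one missing BBD value.
df1 = pd.DataFrame({"SPY": [1.0, 2.0, 3.0], "BBD": [0.5, np.nan, 0.7]})

mask_eq = df1["BBD"].isnull() == False  # explicit comparison
mask_not = ~df1["BBD"].isnull()         # logical negation
mask_notna = df1["BBD"].notna()         # dedicated method

# All three masks are identical, so any of them can be used to filter.
print(mask_eq.equals(mask_not), mask_not.equals(mask_notna))

df_clean = df1[mask_notna]
print(df_clean)
```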

Answer 6

You can try the following:

df.dropna(inplace=True)

It worked for me.
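The question also asks how to drop a row by its date. Two things likely tripped up the attempts there: 2015-10-30 without quotes is integer arithmetic (2015 minus 10 minus 30, which is 1975), and drop returns a new frame, so the result has to be assigned. A sketch, assuming the frame has a DatetimeIndex (the question does not show the index type, so that is a guess, and the values below are invented):

```python
import pandas as pd

# Stand-in for df1; the real one comes from ut.get_data, which is not shown.
df1 = pd.DataFrame(
    {"SPY": [207.0, 208.0, 209.0], "BBD": [5.1, None, 5.3]},
    index=pd.to_datetime(["2015-10-29", "2015-10-30", "2015-11-02"]),
)

# Drop by label: pass a Timestamp and keep the returned frame.
by_label = df1.drop(pd.Timestamp("2015-10-30"))

# Equivalent boolean-mask form on the index.
by_mask = df1[df1.index != pd.Timestamp("2015-10-30")]

print(by_label)
```

If the index turns out to hold plain strings rather than timestamps, the label passed to drop would need to be the matching string instead.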

Remove rows with null values from a pandas DataFrame

6 answers: