Exclude rows where a cell is empty

Time: 2018-07-31 07:40:44

Tags: python pandas csv numpy

I am trying to exclude rows that have no 'UID'. I tried

temp = pd.read_csv(link)
temp = temp[temp['UID'].notnull()]

It didn't work. I then tried

temp = temp[temp['UID']!='null']

That didn't work either.

Here is the output as shown in the Jupyter notebook.

screenshot of the output

5 answers:

Answer 0: (score: 1)

The problem is that 'nan' is a string here, so a possible fix is:

temp = temp[temp['UID']!='nan']

Or, similarly:

temp = temp.replace('nan', np.nan)
temp = temp[temp['UID'].notnull()]
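
To see why notnull() alone can fail in this situation, here is a minimal sketch. It assumes the UID column holds the literal string 'nan' rather than a real missing value; the sample data is made up for illustration and is not the asker's file.

```python
import numpy as np
import pandas as pd

# Hypothetical data: the first UID is the string 'nan', the last is a real NaN.
temp = pd.DataFrame({'UID': ['nan', 'a1', np.nan], 'col2': [1, 2, 3]})

# notnull() treats the string 'nan' as a valid value, so row 0 survives.
kept_by_notnull = temp[temp['UID'].notnull()]

# Converting the string to a real NaN first lets notnull() drop both rows.
cleaned = temp.replace('nan', np.nan)
cleaned = cleaned[cleaned['UID'].notnull()]

print(kept_by_notnull['UID'].tolist())
print(cleaned['UID'].tolist())
```

If the first list still contains 'nan', the column holds strings, and a comparison or a replace step is needed before any null check.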

Answer 1: (score: 0)

You need:

temp = temp[temp['UID'].notna()]

Answer 2: (score: 0)

You can use pd.notnull

For example:

In:
import pandas as pd
import numpy as np
d = {'col1': [1, 2,np.nan,8], 'col2': [np.nan, 4,7,11]}
df = pd.DataFrame(data=d)
df

Out:
   col1  col2
0   1.0   NaN
1   2.0   4.0
2   NaN   7.0
3   8.0  11.0

In:
df = df[pd.notnull(df['col1'])]
df

Out:
   col1  col2
0   1.0   NaN
1   2.0   4.0
3   8.0  11.0

Answer 3: (score: 0)

Try to resolve the problem when you read the CSV file. It is quite possible, for example, that "nan" is surrounded by whitespace and is therefore not read correctly as NaN. You may want to check the CSV file in a text editor to determine whether this is the case.

You can then handle the unexpected whitespace via the sep parameter of pd.read_csv. For example:

import pandas as pd
from io import StringIO

data = """UID,col2,col3
nan ,1,2
3,4,5
NaN,6,7"""

# "nan" is not read correctly without the sep argument
df1 = pd.read_csv(StringIO(data))
print(df1)

    UID  col2  col3
0  nan      1     2
1     3     4     5
2   NaN     6     7

# "nan" is read correctly as NaN
df = pd.read_csv(StringIO(data), sep=' *, *', engine='python')
print(df)

   UID  col2  col3
0  NaN     1     2
1  3.0     4     5
2  NaN     6     7

For other alternatives, see How can I remove extra whitespace from strings when parsing a csv file in Pandas?. Once the data is read as NaN, you can exclude the relevant rows, for example with notnull or dropna.
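
As one alternative to a regex separator, the whitespace can also be stripped after reading. The following is a minimal sketch under the assumption that the file looks like the small sample in this answer; reading UID as text via dtype is a choice made here for illustration, not something the answer prescribes.

```python
import pandas as pd
from io import StringIO

data = """UID,col2,col3
nan ,1,2
3,4,5
NaN,6,7"""

# Read UID as text so the stray whitespace can be stripped afterwards.
df = pd.read_csv(StringIO(data), dtype={'UID': str})
df['UID'] = df['UID'].str.strip()

# After stripping, drop real NaN values and the leftover string 'nan'.
df = df[df['UID'].notna() & (df['UID'] != 'nan')]
print(df)
```

This keeps UID as a string column, which may be preferable when the identifiers are not purely numeric.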

Answer 4: (score: 0)

pd dropna to the rescue

df.dropna(subset=['optional_column_arg'])

pd.dropna documentation
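
A minimal sketch of dropna with subset, applied to the 'UID' column from the question. The sample data is hypothetical, and this only works once the missing values are real NaN rather than the string 'nan'.

```python
import numpy as np
import pandas as pd

# Hypothetical data with one missing UID and one missing value elsewhere.
df = pd.DataFrame({'UID': ['a1', np.nan, 'c3'], 'col2': [1.0, 2.0, np.nan]})

# Drop only the rows whose UID is missing; NaN in other columns is kept.
result = df.dropna(subset=['UID'])
print(result)
```

Note that dropna returns a new DataFrame, so the result has to be assigned for the rows to actually disappear.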