Question

I am trying to exclude the rows that have no 'UID'. I tried:

temp = pd.read_csv(link)
temp = temp[temp['UID'].notnull()]

It didn't work. I then tried:

temp = temp[temp['UID']!='null']

That didn't work either.

Here is the output as shown in the Jupyter notebook.

screenshot of the output

Answer 1

The problem appears to be that the missing values are stored as the string 'nan' rather than as real NaN values, so a possible solution is:

temp = temp[temp['UID']!='nan']

Similarly:

temp = temp.replace('nan', np.nan)
temp = temp[temp['UID'].notnull()]
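
As a minimal sketch of why the original notnull() filter appeared to do nothing: the sample data below is made up for illustration, and it assumes the UID column holds the literal string 'nan' rather than a real missing value.

```python
import numpy as np
import pandas as pd

# Hypothetical sample: 'nan' here is an ordinary string, not a missing value.
temp = pd.DataFrame({"UID": ["a1", "nan", "b2"], "value": [1, 2, 3]})

# notnull() treats the string 'nan' as present data, so no row is filtered out.
print(temp["UID"].notnull().tolist())

# Convert the string to a real NaN first, then filter on notnull().
cleaned = temp.replace("nan", np.nan)
cleaned = cleaned[cleaned["UID"].notnull()]
print(cleaned)
```

If the strings in the real file carry extra whitespace (for example 'nan '), an exact-match replace like this would miss them.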

Answer 2

You need:

temp = temp[temp['UID'].notna()]
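
Note that notna() is an alias of notnull(), so it only helps when the missing values are real NaN. A small sketch with made-up data (the Series below is hypothetical) shows both points:

```python
import numpy as np
import pandas as pd

# Hypothetical sample with one real NaN and one string 'nan'.
s = pd.Series(["a1", np.nan, "nan"], name="UID")

# notna() and notnull() produce identical boolean masks.
print(s.notna().equals(s.notnull()))

# Only the real NaN is dropped; the string 'nan' survives the filter.
print(s[s.notna()].tolist())
```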

Answer 3

You can use pd.notnull.

For example:

In:
import pandas as pd
import numpy as np
d = {'col1': [1, 2,np.nan,8], 'col2': [np.nan, 4,7,11]}
df = pd.DataFrame(data=d)
df

Out:
   col1  col2
0   1.0   NaN
1   2.0   4.0
2   NaN   7.0
3   8.0  11.0

In:
df = df[pd.notnull(df['col1'])]
df

Out:
   col1  col2
0   1.0   NaN
1   2.0   4.0
3   8.0  11.0

Answer 4

Try to resolve the problem when you read the CSV file. For example, it is quite likely that "nan" is surrounded by whitespace and is therefore not read correctly as NaN. You may need to inspect the CSV file in a text editor to confirm whether this is the case.

You can then handle the unexpected whitespace with the sep parameter of pd.read_csv. For example:

import pandas as pd
from io import StringIO

x = StringIO("""UID,col2,col3
nan ,1,2
3,4,5
NaN,6,7""")

df = pd.read_csv(x, sep=' *, *', engine='python')

print(df)

   UID  col2  col3
0  NaN     1     2
1  3.0     4     5
2  NaN     6     7

Without the sep parameter, "nan" is not read correctly as NaN:

x.seek(0)  # rewind the buffer before reading it a second time
df1 = pd.read_csv(x)

print(df1)

    UID  col2  col3
0  nan      1     2
1     3     4     5
2   NaN     6     7

For other alternatives, see How can I remove extra whitespace from strings when parsing a csv file in Pandas?. Once the data has been read as NaN, you can exclude the relevant rows, for example with notnull() or dropna().
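
If changing the read step is not convenient, the column can also be cleaned after reading. The following is only a sketch under assumptions: it reuses the made-up CSV from this answer and assumes the only problem is whitespace around the literal text "nan".

```python
import pandas as pd
from io import StringIO

# Same made-up CSV as in this answer: the first UID is 'nan ' with a trailing space.
x = StringIO("UID,col2,col3\nnan ,1,2\n3,4,5\nNaN,6,7")
df = pd.read_csv(x)

# Strip surrounding whitespace, then turn the literal string 'nan' into a missing value.
uid = df["UID"].str.strip()
df["UID"] = uid.mask(uid == "nan")

# Drop the rows whose UID is now missing.
df = df.dropna(subset=["UID"])
print(df)
```

Unlike the sep approach, this leaves UID as text, so a numeric UID would still need converting afterwards.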

Answer 5

pandas dropna to the rescue:

df.dropna(subset=['optional_column_arg'])

pd.dropna documentation
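
A short usage sketch, with made-up data in which 'UID' stands in for optional_column_arg:

```python
import numpy as np
import pandas as pd

# Hypothetical sample; 'UID' plays the role of optional_column_arg.
df = pd.DataFrame({"UID": ["a1", np.nan, "b2"], "value": [1.0, 2.0, np.nan]})

# Drop rows only where UID is missing; NaN in other columns is ignored.
result = df.dropna(subset=["UID"])
print(result)

# dropna returns a new DataFrame, so df itself still has all three rows.
print(len(df))
```

As with notnull(), this only removes real NaN values, so string placeholders such as 'nan' need converting first.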

Excluding rows where a cell is empty
