Exclude pandas rows based on a string value

Time: 2014-03-04 14:43:33

Tags: python pandas

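I have a pandas DataFrame with a column of string data type. I need to exclude from the DataFrame every row that has "Not found" as the string in that column. I am trying:

df[df.some_column != "Not found"], but this does not work.

Looking forward to your replies.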

Sample data:

card_number effective_date  expiry_date grouping_name       Ac. Year code
0       1206090    28 Sep 2012  21 Aug 2013    Dummy no.1  201213
1       1206090    21 Feb 2013  21 Aug 2013   Dummy no.2   201213
2       1206090    28 Sep 2012  30 Nov 2012    Dummy no.3  201213
3       1206090    03 Dec 2012  21 Aug 2013    Dummy no.3  201213
4       1206090    23 Apr 2013  31 Aug 2013   Dummy no.4   201213
5       1206090    28 Sep 2012  21 Aug 2013    Dummy no.5  201213
6       1206090    28 Sep 2012  21 Aug 2013    Dummy no.6  201213
7       1206090    24 Oct 2012  07 Aug 2013     Not found  201213
8       1206090    08 Jan 2013  08 Jan 2013     Not found  201213
9       1206090    08 Jan 2013  31 Aug 2013     Not found  201213
10    Not found    03 Jul 2013  21 Aug 2013    Dummy no.1  201213
11    Not found    03 Jul 2013  21 Aug 2013   Dummy no.2   201213

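Additional note: my string matching must be behaving very strangely... when I run df['grouping_name'] != "Not found", I get True for rows 7, 8 and 9... does anyone know why?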

1 Answer:

Answer 0: (score: 1)

Try:

df[df['some_column'] != "Not found"]
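If the comparison still returns True for rows that look like "Not found", one possible cause is stray whitespace around the value. This is only a guess, since the raw file is not shown. The sketch below uses a small made-up DataFrame in which one value has a trailing space, and compares a plain filter with one that strips whitespace first.

```python
import pandas as pd

# Hypothetical data: the last "Not found" carries a trailing space.
# This is an assumption about what might be wrong in the original file.
df = pd.DataFrame({
    "card_number": ["1206090", "1206090", "1206090"],
    "grouping_name": ["Dummy no.1", "Not found", "Not found "],
})

# A plain comparison misses the padded value, so row 2 survives.
naive = df[df["grouping_name"] != "Not found"]

# Stripping surrounding whitespace first makes both variants match.
cleaned = df[df["grouping_name"].str.strip() != "Not found"]

print(naive.index.tolist())    # [0, 2]
print(cleaned.index.tolist())  # [0]
```

Printing df['grouping_name'].tolist() on the real data would show whether any such padding is actually present.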

Solution using the sample data:

import pandas as pd

df = pd.read_csv("data.csv")
df

    card_number effective_date  expiry_date grouping_name   Ac. Year code
0    1206090     28 Sep 2012     21 Aug 2013     Dummy no.1  201213
1    1206090     21 Feb 2013     21 Aug 2013     Dummy no.2  201213
2    1206090     28 Sep 2012     30 Nov 2012     Dummy no.3  201213
3    1206090     03 Dec 2012     21 Aug 2013     Dummy no.3  201213
4    1206090     23 Apr 2013     31 Aug 2013     Dummy no.4  201213
5    1206090     28 Sep 2012     21 Aug 2013     Dummy no.5  201213
6    1206090     28 Sep 2012     21 Aug 2013     Dummy no.6  201213
7    1206090     24 Oct 2012     07 Aug 2013     Not found   201213
8    1206090     08 Jan 2013     08 Jan 2013     Not found   201213
9    1206090     08 Jan 2013     31 Aug 2013     Not found   201213
10   Not found   03 Jul 2013     21 Aug 2013     Dummy no.1  201213
11   Not found   03 Jul 2013     21 Aug 2013     Dummy no.2  201213


df[df['grouping_name'] != 'Not found']

    card_number effective_date  expiry_date grouping_name   Ac. Year code
0    1206090     28 Sep 2012     21 Aug 2013     Dummy no.1  201213
1    1206090     21 Feb 2013     21 Aug 2013     Dummy no.2  201213
2    1206090     28 Sep 2012     30 Nov 2012     Dummy no.3  201213
3    1206090     03 Dec 2012     21 Aug 2013     Dummy no.3  201213
4    1206090     23 Apr 2013     31 Aug 2013     Dummy no.4  201213
5    1206090     28 Sep 2012     21 Aug 2013     Dummy no.5  201213
6    1206090     28 Sep 2012     21 Aug 2013     Dummy no.6  201213
10   Not found   03 Jul 2013     21 Aug 2013     Dummy no.1  201213
11   Not found   03 Jul 2013     21 Aug 2013     Dummy no.2  201213
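Note that rows 10 and 11 remain because the filter only looks at grouping_name, while their "Not found" is in card_number. If the goal is to drop a row when any column contains "Not found" (an assumption about the intent of the question), a minimal sketch could use DataFrame.isin over the whole frame. The data below is a shortened stand-in for the sample, with every value stored as a string.

```python
import pandas as pd

# Shortened stand-in for the sample data; all values are strings here.
df = pd.DataFrame({
    "card_number": ["1206090", "1206090", "Not found"],
    "grouping_name": ["Dummy no.1", "Not found", "Dummy no.2"],
    "Ac. Year code": ["201213", "201213", "201213"],
})

# True for each row where at least one cell equals "Not found".
has_not_found = df.isin(["Not found"]).any(axis=1)

# Keep only the rows with no such cell.
result = df[~has_not_found]

print(result.index.tolist())  # [0]
```

To check only some columns, the same mask can be built from a subset such as df[['card_number', 'grouping_name']].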