Python 3 pandas: filter/extract rows by multiple column values, including <> 0

Time: 2016-10-03 11:49:04

Tags: pandas indexing filtering conditional-statements python-3.5

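I am working with a public CSV file from USASPENDING.gov. I can extract the Navy records, but I don't know the correct syntax for adding a second filter that excludes every record where Dollarsobligated = 0.

The code is: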

import pandas as pd

df = pd.read_csv("2016_DOD_Contracts_Full_20160915.csv")
df.columns = [c.replace(' ','_') for c in df.columns]
new_df = df[(df.mod_agency == '1700: DEPT OF THE NAVY') & (df.dollarsobligated <> 0)]

# Export result to CSV
new_df.to_csv('example15.csv')

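I get an error message saying that <> is invalid syntax. I haven't found any examples online for 'not equal to 0'.

2 answers:

Answer 0 (score: 2)

I think you need to replace <> with != in the boolean indexing, because <> was removed in Python 3. Thanks, unutbu.

You can also use str.replace to rename the columns: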

df.columns = df.columns.str.replace(' ','_')
new_df = df[(df.mod_agency == '1700: DEPT OF THE NAVY') & (df.Dollarsobligated != 0)]

Sample:

df = pd.DataFrame({'mod agency':['1700: DEPT OF THE NAVY',
                                 '1700: DEPT OF THE NAVY',
                                 '1800: DEPT OF THE NAVY'],
                   'Dollarsobligated':[1,0,0],
                   'C':[7,8,9]})

print (df)
   C  Dollarsobligated              mod agency
0  7                 1  1700: DEPT OF THE NAVY
1  8                 0  1700: DEPT OF THE NAVY
2  9                 0  1800: DEPT OF THE NAVY

df.columns = df.columns.str.replace(' ','_')
new_df = df[(df.mod_agency == '1700: DEPT OF THE NAVY') & (df.Dollarsobligated != 0)]

print (new_df)
   C  Dollarsobligated              mod_agency
0  7                 1  1700: DEPT OF THE NAVY
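
For comparison, here is a minimal sketch of the same filter written three ways. It assumes the toy DataFrame from the sample above, not the real USASPENDING.gov file. Series.eq, Series.ne and DataFrame.query are standard pandas calls; the names mask, by_mask, by_methods and by_query are made up for illustration.

```python
import pandas as pd

# Toy data mirroring the sample above; the values are illustrative only.
df = pd.DataFrame({
    "mod agency": ["1700: DEPT OF THE NAVY",
                   "1700: DEPT OF THE NAVY",
                   "1800: DEPT OF THE NAVY"],
    "Dollarsobligated": [1, 0, 0],
    "C": [7, 8, 9],
})
df.columns = df.columns.str.replace(" ", "_")

# Each comparison is parenthesized because & binds tighter than == and !=.
mask = (df["mod_agency"] == "1700: DEPT OF THE NAVY") & (df["Dollarsobligated"] != 0)
by_mask = df.loc[mask]

# Same filter using the method forms eq/ne, which need no extra parentheses.
by_methods = df.loc[
    df["mod_agency"].eq("1700: DEPT OF THE NAVY") & df["Dollarsobligated"].ne(0)
]

# Same filter expressed as a query string.
by_query = df.query("mod_agency == '1700: DEPT OF THE NAVY' and Dollarsobligated != 0")

print(by_mask)
```

All three variants should select only the row where C is 7. Bracket access such as df["mod_agency"] also works when a column name contains spaces, so the rename step is optional with that style.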

Answer 1 (score: 1)

You have to use "!=" instead of "<>".
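
A small sketch to confirm this on a Python 3 interpreter: it compiles both spellings from source text so the SyntaxError can be caught instead of stopping the script. The helper name compiles is hypothetical, introduced only for this illustration.

```python
def compiles(expr):
    """Return True if expr parses as a Python expression, False on SyntaxError."""
    try:
        compile(expr, "<example>", "eval")
        return True
    except SyntaxError:
        return False

old_ok = compiles("1 <> 0")   # Python 2 spelling of "not equal"
new_ok = compiles("1 != 0")   # spelling accepted by Python 3
print(old_ok, new_ok)
```

On a standard Python 3 interpreter this should report that only the != form parses.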