How do I filter column values in a pandas DataFrame under specific conditions?

Asked: 2017-03-07 14:33:54

Tags: python pandas

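I created a pandas DataFrame and want to filter out some values. The DataFrame has 4 columns (currency, port, supplier_id, value), and I want to keep only the rows whose values satisfy the conditions below: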

* port – expressed as a portcode, a 5-letter string uniquely identifying a port. Portcodes consist of 2-letter country code and 3-letter city code.
* supplier_id - integer, uniquely identifying the provider of the information
* currency - 3-letter string identifying the currency
* value - a floating-point number

df =  df[ (len(df['port']) == 5 & isinstance(df['port'], basestring)) & \
  isinstance(df['supplier_id'], int) & \
  (len(df['currency']) == 3 & isinstance(df['currency'], basestring))\
  isinstance(df['value'], float) ]

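The snippet should be self-explanatory: it tries to implement the conditions listed above, but it does not work. A printout of df is shown below: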

     currency   port  supplier_id   value
0         CNY  CNAQG         35.0   820.0
1         CNY  CNAQG         19.0   835.0
2         CNY  CNAQG         49.0   600.0
3         CNY  CNAQG         54.0   775.0
4         CNY  CNAQG        113.0   785.0
5         CNY  CNAQG          5.0   790.0
6         CNY  CNAQG         55.0   770.0
7         CNY  CNAQG         81.0   810.0
8         CNY  CNAQG          2.0   770.0
9         CNY  CNAQG         10.0   825.0


print df[df.supplier_id.isnull()] # prints below 
Empty DataFrame
Columns: [currency, port, supplier_id, value]
Index: []



df.info() # prints below     
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 6661 entries, 0 to 6660
Data columns (total 4 columns):
currency       6661 non-null object
port           6661 non-null object
supplier_id    6661 non-null float64
value          6661 non-null float64
dtypes: float64(2), object(2)
memory usage: 208.2+ KB
None

How do I write this correctly?

1 answer:

Answer 0 (score: 2)

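If a column can contain mixed values (numbers together with strings), check each element rather than the column as a whole. `len(df['port'])` returns the number of rows and `isinstance(df['port'], basestring)` tests the Series object itself, so neither looks at individual values. In addition, `&` binds more tightly than `==`, so each comparison needs its own parentheses.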
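
For the pandas question above, here is a minimal sketch of per-element validation for the four columns. It assumes Python 3 (where `str` replaces `basestring`) and a reasonably recent pandas. The sample rows and the helper `is_str_of_len` are hypothetical, introduced only for illustration. This is one possible approach rather than a definitive implementation.

```python
import pandas as pd

# Hypothetical sample data: rows 1-4 each violate exactly one condition.
df = pd.DataFrame({
    "currency": ["CNY", "CNY", "US", "CNY", 12, "CNY"],
    "port": ["CNAQG", "CNAQG", "CNAQG", "CNA", "CNAQG", "CNAQG"],
    "supplier_id": [35.0, 19.5, 49.0, 54.0, 113.0, 5.0],
    "value": [820.0, 835.0, 600.0, 775.0, 785.0, 790.0],
})


def is_str_of_len(series, n):
    """Return a boolean Series: True where the element is a string of length n."""
    return series.apply(lambda x: isinstance(x, str) and len(x) == n)


# Coerce to numeric; anything that cannot be parsed becomes NaN.
supplier = pd.to_numeric(df["supplier_id"], errors="coerce")
value = pd.to_numeric(df["value"], errors="coerce")

# Each condition is its own boolean Series, combined element-wise with &.
mask = (
    is_str_of_len(df["port"], 5)
    & is_str_of_len(df["currency"], 3)
    & supplier.notna()
    & (supplier % 1 == 0)  # whole number, even though the dtype is float64
    & value.notna()
)

filtered = df[mask]
print(filtered)
```

Because `supplier_id` is stored as float64 in the `df.info()` output, the sketch tests for whole numbers instead of checking for the `int` type, which would reject every row.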