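I am using pandas and trying to count the members who purchased a specific type of contract between two dates. The dataframe I am working with looks like this: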
Member Nbr Contract-Type Date-Joined
20 1 Year Membership 2011-08-01
3128 3 Month Membership 2011-07-22
3535 4 Month Membership 2015-02-18
3760 4 Month Membership 2010-02-28
3762 3 Month Membership 2010-01-31
3882 1 Month Membership 2010-04-24
3892 3 Month Membership 2010-03-24
4116 3 Month Membership 2014-12-02
4700 1 Month Membership 2014-11-11
4802 4 Month Membership 2014-07-26
5004 1 Year Membership 2012-03-12
5020 1 Year Membership 2010-07-28
5022 3 Month Membership 2010-06-25
5130 1 Year Membership 2011-01-04
...
If I am interested in only one contract type, I can get the count using:
print(len(df[(df['Date-Joined'] > '2010-01-01')
& (df['Date-Joined'] < '2012-02-01')
& (df['Member Type'] == '1 Year Membership')]))
When I try something similar for either 1 Year Membership or 4 Month Membership, using the following code:
print(len(df[(df['Date-Joined'] > '2013-01-01')
& (df['Date-Joined'] < '2013-02-01')
& (df['Member Type'] == '1 Year Membership')
or (df['Member Type'] == '4 Month Membership')]))
I get the following error:
ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().
Replacing the or condition with an & condition returns 0.
Answer 0: (score: 5)
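Use | instead of or. Python's or asks the whole Series for a single truth value, which is what raises the ValueError, whereas | compares element by element. Also, & takes precedence over |, so your logic needs an extra set of parentheses around the two contract-type conditions.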
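As a minimal sketch of both points, the toy boolean Series below (a, b and c are made-up names, not columns from the question) show that or raises while | works element-wise, and that an unparenthesized a & b | c groups as (a & b) | c:

```python
import pandas as pd

# Hypothetical boolean masks, standing in for the conditions in the question.
a = pd.Series([True, True, False, False])
b = pd.Series([True, False, True, False])
c = pd.Series([False, False, False, True])

# `or` needs one truth value for the whole Series, which pandas refuses to guess.
try:
    a or b
    raised = False
except ValueError:
    raised = True

# `|` combines the masks element by element instead.
elementwise = (a | b).tolist()

# `&` binds tighter than `|`, so the first two lines below are equivalent
# and the third, with explicit parentheses, gives a different mask.
default_grouping = (a & b | c).tolist()
explicit_left = ((a & b) | c).tolist()
explicit_right = (a & (b | c)).tolist()

print(raised, elementwise)
print(default_grouping, explicit_left, explicit_right)
```

The same grouping is why the unparenthesized filter lets every 4 Month Membership row through regardless of its date.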
import io
import pandas as pd
data = io.StringIO('''\
Member Nbr,Contract-Type,Date-Joined
20,1 Year Membership,2011-08-01
3128,3 Month Membership,2011-07-22
3535,4 Month Membership,2015-02-18
3760,4 Month Membership,2010-02-28
3762,3 Month Membership,2010-01-31
3882,1 Month Membership,2010-04-24
3892,3 Month Membership,2010-03-24
4116,3 Month Membership,2014-12-02
4700,1 Month Membership,2014-11-11
4802,4 Month Membership,2014-07-26
5004,1 Year Membership,2012-03-12
5020,1 Year Membership,2010-07-28
5022,3 Month Membership,2010-06-25
5130,1 Year Membership,2011-01-04
''')
df = pd.read_csv(data)
print(df[
(df['Date-Joined'] > '2010-01-01') &
(df['Date-Joined'] < '2012-02-01') &
(df['Contract-Type'] == '1 Year Membership')
])
# Member Nbr Contract-Type Date-Joined
# 0 20 1 Year Membership 2011-08-01
# 11 5020 1 Year Membership 2010-07-28
# 13 5130 1 Year Membership 2011-01-04
print(df[
(df['Date-Joined'] > '2010-01-01') &
(df['Date-Joined'] < '2012-02-01') &
(df['Contract-Type'] == '1 Year Membership') |
(df['Contract-Type'] == '4 Month Membership')
])
# Member Nbr Contract-Type Date-Joined
# 0 20 1 Year Membership 2011-08-01
# 2 3535 4 Month Membership 2015-02-18 <====== BEWARE!
# 3 3760 4 Month Membership 2010-02-28
# 9 4802 4 Month Membership 2014-07-26 <====== BEWARE!
# 11 5020 1 Year Membership 2010-07-28
# 13 5130 1 Year Membership 2011-01-04
print(df[
(df['Date-Joined'] > '2010-01-01') &
(df['Date-Joined'] < '2012-02-01') &
((df['Contract-Type'] == '1 Year Membership') |
(df['Contract-Type'] == '4 Month Membership'))
])
# Member Nbr Contract-Type Date-Joined
# 0 20 1 Year Membership 2011-08-01
# 3 3760 4 Month Membership 2010-02-28
# 11 5020 1 Year Membership 2010-07-28
# 13 5130 1 Year Membership 2011-01-04
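Since the question asks for a count rather than the rows themselves, one possible variation is sketched below. It is an alternative, not part of the answer above: it assumes the dates are parsed with parse_dates, swaps the two == comparisons for Series.isin (which avoids the precedence issue), and sums the boolean mask to count the matches. The names in_range, wanted and count are illustrative.

```python
import io
import pandas as pd

data = io.StringIO('''\
Member Nbr,Contract-Type,Date-Joined
20,1 Year Membership,2011-08-01
3128,3 Month Membership,2011-07-22
3535,4 Month Membership,2015-02-18
3760,4 Month Membership,2010-02-28
3762,3 Month Membership,2010-01-31
3882,1 Month Membership,2010-04-24
3892,3 Month Membership,2010-03-24
4116,3 Month Membership,2014-12-02
4700,1 Month Membership,2014-11-11
4802,4 Month Membership,2014-07-26
5004,1 Year Membership,2012-03-12
5020,1 Year Membership,2010-07-28
5022,3 Month Membership,2010-06-25
5130,1 Year Membership,2011-01-04
''')
# Assumption: parse the column as real dates instead of comparing strings.
df = pd.read_csv(data, parse_dates=['Date-Joined'])

# Date window, exclusive at both ends like the original filter.
in_range = (
    (df['Date-Joined'] > pd.Timestamp('2010-01-01')) &
    (df['Date-Joined'] < pd.Timestamp('2012-02-01'))
)

# isin replaces the chain of == ... | == ... comparisons.
wanted = df['Contract-Type'].isin(['1 Year Membership', '4 Month Membership'])

# True counts as 1, so summing the combined mask gives the number of matches.
count = int((in_range & wanted).sum())
print(count)
```

On the sample data this selects the same four rows as the parenthesized filter above.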