I am trying to find out whether the month of Date falls within PromoInterval in a dataframe.
print dset1
      Store       Date    PromoInterval
1760      2 2013-05-04  Jan,Apr,Jul,Oct
1761      2 2013-05-03  Jan,Apr,Jul,Oct
1762      2 2013-05-02  Jan,Apr,Jul,Oct
1763      2 2013-05-01  Jan,Apr,Jul,Oct
1764      2 2013-04-30  Jan,Apr,Jul,Oct
def func(a,b):
    y = b.split(",")
    z = {1:'Jan',2:'Feb',3:'Mar', 4:'Apr',5:'May',6:'Jun',7:'Jul',8:'Aug',9:'Sep',
         10:'Oct',11:'Nov',12:'Dec'}
    return (z[a] in y)
dset1.apply(func, axis=1, args = (dset1['Date'].dt.month, dset1['PromoInterval']) )
I am hitting the following error:
dset1.apply(func, axis=1, args=(dset1['Date'].dt.month, dset1['PromoInterval']))
('func() takes exactly 2 arguments (3 given)', u'occurred at index 1760')
Dataset:
{'Date': {1760: Timestamp('2013-05-04 00:00:00'),
1761: Timestamp('2013-05-03 00:00:00'),
1762: Timestamp('2013-05-02 00:00:00'),
1763: Timestamp('2013-05-01 00:00:00'),
1764: Timestamp('2013-04-30 00:00:00')},
'PromoInterval': {1760: 'Jan,Apr,Jul,Oct',
1761: 'Jan,Apr,Jul,Oct',
1762: 'Jan,Apr,Jul,Oct',
1763: 'Jan,Apr,Jul,Oct',
1764: 'Jan,Apr,Jul,Oct'},
'Store': {1760: 2, 1761: 2, 1762: 2, 1763: 2, 1764: 2}}
Answer 0 (score: 3)
I would first build the month's abbreviated text string with a lambda function on the 'Date' column:
df['Month'] = df['Date'].apply(lambda x: x.strftime('%b'))
Then I would apply a lambda function with axis=1, which means it is called once per row of the dataframe. Here I simply check whether 'Month' is in 'PromoInterval':
df[['PromoInterval', 'Month']].apply(lambda x: x['Month'] in x['PromoInterval'], axis=1)
1760 False
1761 False
1762 False
1763 False
1764 True
dtype: bool
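For reference, here is a self-contained sketch of the two steps above, run against the sample rows from the question. Two things in it are my own assumptions rather than part of the answer: it uses `.dt.strftime` instead of `apply` with a lambda, and it splits `PromoInterval` on commas so that only whole month tokens match. Note that `%b` depends on the locale, so this assumes an English (or default C) locale.

```python
import pandas as pd

# Rebuild the sample frame shown in the question.
df = pd.DataFrame(
    {
        "Store": [2, 2, 2, 2, 2],
        "Date": pd.to_datetime(
            ["2013-05-04", "2013-05-03", "2013-05-02", "2013-05-01", "2013-04-30"]
        ),
        "PromoInterval": ["Jan,Apr,Jul,Oct"] * 5,
    },
    index=[1760, 1761, 1762, 1763, 1764],
)

# Abbreviated month name for every row (locale-dependent).
df["Month"] = df["Date"].dt.strftime("%b")

# Row-wise membership test against the comma-separated tokens.
in_promo = df.apply(
    lambda row: row["Month"] in row["PromoInterval"].split(","), axis=1
)
print(in_promo)
```

Splitting first is a little stricter than the plain substring test; for three-letter month abbreviations both give the same answer.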
Answer 1 (score: 2)
The solution is to have your function take a whole row rather than individual elements:
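def func(row):
    # Look the columns up by label; positional row[1] / row[2] is deprecated in recent pandas.
    y = row['PromoInterval'].split(",")
    z = {1:'Jan', 2:'Feb', 3:'Mar', 4:'Apr', 5:'May', 6:'Jun',
         7:'Jul', 8:'Aug', 9:'Sep', 10:'Oct', 11:'Nov', 12:'Dec'}
    return (z[row['Date'].month] in y)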
Then you can apply it directly:
df['Result'] = df.apply(func, axis=1)
Note: the function uses .month because I converted the dates to datetime objects with pd.to_datetime.
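As a rough illustration of that note, the sketch below starts from dates stored as plain strings (an assumption, since the question's frame already holds Timestamps), converts them with `pd.to_datetime`, and then applies a row-wise function. The name `MONTHS` and the three-row frame are made up for the example.

```python
import pandas as pd

# Month number -> abbreviation, same mapping as in the answer.
MONTHS = {1: "Jan", 2: "Feb", 3: "Mar", 4: "Apr", 5: "May", 6: "Jun",
          7: "Jul", 8: "Aug", 9: "Sep", 10: "Oct", 11: "Nov", 12: "Dec"}

# Hypothetical input where Date is still a string column.
df = pd.DataFrame(
    {
        "Store": [2, 2, 2],
        "Date": ["2013-05-02", "2013-05-01", "2013-04-30"],
        "PromoInterval": ["Jan,Apr,Jul,Oct"] * 3,
    }
)

# Without this conversion, row["Date"] would be a str with no .month attribute.
df["Date"] = pd.to_datetime(df["Date"])

def func(row):
    # row is one dataframe row (a Series indexed by column name).
    return MONTHS[row["Date"].month] in row["PromoInterval"].split(",")

df["Result"] = df.apply(func, axis=1)
print(df)
```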
Answer 2 (score: 1)
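Actually, this happens because the function needs to take 3 arguments, not 2: with axis=1, apply passes each row as the first argument, followed by the entries of args. Printing all three shows what arrives: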
def func(df,a,b):
    print('---df----')
    print(df)
    print('---a---')
    print(a)
    print('---b---')
    print(b)
    y = b.split(",")
    z = {1:'Jan',2:'Feb',3:'Mar', 4:'Apr',5:'May',6:'Jun',7:'Jul',8:'Aug',9:'Sep',
         10:'Oct',11:'Nov',12:'Dec'}
    return (z[a] in y)
In [98]:
dset1.apply(func, axis=1, args = (dset1['Date'].dt.month, dset1['PromoInterval']) )
In [99]:
---df----
Store 2
Date 2013-05-04 00:00:00
PromoInterval Jan,Apr,Jul,Oct
Name: 0, dtype: object
---a---
0 5
1 5
2 5
3 5
4 4
dtype: int64
---b---
0 Jan,Apr,Jul,Oct
1 Jan,Apr,Jul,Oct
2 Jan,Apr,Jul,Oct
3 Jan,Apr,Jul,Oct
4 Jan,Apr,Jul,Oct
Name: PromoInterval, dtype: object
Instead, you can do the following:
In [94]:
def func(df):
    y = df['PromoInterval'].split(",")
    z = {1:'Jan',2:'Feb',3:'Mar', 4:'Apr',5:'May',6:'Jun',7:'Jul',8:'Aug',9:'Sep',
         10:'Oct',11:'Nov',12:'Dec'}
    return (z[df.Date.month] in y)
In [95]:
dset1.apply(func, axis=1)
Out[112]:
0 False
1 False
2 False
3 False
4 True
dtype: bool
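If you still want to pass extra arguments, a sketch along these lines may help show how `args` behaves: the row always comes first, and every entry of `args` is handed over unchanged on each call (which is why whole columns arrived as `a` and `b` above). The parameter names `months` and `sep` are my own, chosen only for illustration.

```python
import pandas as pd

MONTHS = {1: "Jan", 2: "Feb", 3: "Mar", 4: "Apr", 5: "May", 6: "Jun",
          7: "Jul", 8: "Aug", 9: "Sep", 10: "Oct", 11: "Nov", 12: "Dec"}

dset1 = pd.DataFrame(
    {
        "Store": [2, 2, 2, 2, 2],
        "Date": pd.to_datetime(
            ["2013-05-04", "2013-05-03", "2013-05-02", "2013-05-01", "2013-04-30"]
        ),
        "PromoInterval": ["Jan,Apr,Jul,Oct"] * 5,
    }
)

def func(row, months, sep):
    # row is supplied by apply; months and sep come from args, the same for every row.
    return months[row["Date"].month] in row["PromoInterval"].split(sep)

# args holds per-call constants, not per-row values.
result = dset1.apply(func, axis=1, args=(MONTHS, ","))
print(result)
```

Per-row values are read from `row` inside the function, so `args` is best kept for constants such as a lookup table or a separator.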