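I want to transform the values in DateWork['Variable'] based on multiple conditions and write the result to DateWork['Date'].
If Frequency = 3 and len(Variable) = 6, replace M with "-0" and update DateWork['Date'].
If Frequency = 3 and len(Variable) = 7, replace M with "-" and update DateWork['Date'].
DataFrame: DateWork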
Frequency Variable Date
3 1950M2 1950-02-01
3 1950M3 1950-03-01
2 1950-07-01 1950-07-01
3 1950M9 1950-09-01
2 1950-10-01 1950-10-01
3 1950M10 1950-10-01
My code:
DateWork.loc[DateWork['Date']] = np.where(((DateWork['Frequency'] == 3) & (DateWork['variable'].str.len() == 6)), 'M', '-0', DateWork['Date'])
DateWork.loc[DateWork['Date']] = np.where(((DateWork['Frequency'] == 3) & (DateWork['variable'].str.len() == 7)), 'M', '-', DateWork['Date'])
DateWork.loc[DateWork['Frequency'] == 3, 'Date'] = DateWork.loc[DateWork['Frequency'] == 3, 'variable'] + '-01'
This raises an error:
TypeError: where() takes at most 3 arguments (4 given)
Answer 0 (score: 2)
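The error you get is raised because you pass an extra argument to np.where; it accepts a condition, a value to use where the condition is true, and a value to use where it is false. The documentation for that function describes this. Also, even once that is fixed, the way your code is written means the last assignment to 'Date' overwrites the results of all the previous ones, so the np.where calls need to be "nested" to work correctly.
In case you want it, I have also provided a solution that does not use np.where.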
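To illustrate the three-argument form and the nesting described above, here is a minimal sketch on a toy Series. The names s, is_short and is_long are made up for this illustration and are not part of the question's data.

```python
import numpy as np
import pandas as pd

# Hypothetical toy values, only meant to show the call shape
s = pd.Series(['1950M2', '1950M10', '1950-07-01'])

is_short = s.str.len() == 6   # e.g. '1950M2'
is_long = s.str.len() == 7    # e.g. '1950M10'

# np.where(condition, value_if_true, value_if_false): exactly three arguments.
# Nesting puts a second np.where in the "else" slot of the first one.
nested = np.where(
    is_short, s.str.replace('M', '-0', regex=False),
    np.where(is_long, s.str.replace('M', '-', regex=False), s),
)

print(list(nested))
```

Because the inner call sits in the "else" position, rows that match neither condition fall through to the original value instead of being overwritten.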
Solution with numpy.where:
# where Frequency == 3 and len(Variable) == 6, take Variable with M replaced by -0;
# otherwise, where Frequency == 3 and len(Variable) == 7, take Variable with M replaced by -;
# otherwise just take Variable unchanged
DateWork['Date'] = np.where((DateWork['Frequency'] == 3) & (DateWork['Variable'].str.len() == 6), DateWork['Variable'].str.replace('M','-0'),
np.where((DateWork['Frequency'] == 3) & (DateWork['Variable'].str.len() == 7), DateWork['Variable'].str.replace('M','-'), DateWork['Variable']))
# append the first day of the month where Frequency == 3
DateWork.loc[DateWork['Frequency'] == 3, 'Date'] = DateWork.loc[DateWork['Frequency'] == 3, 'Date'] + '-01'
Solution with pandas.DataFrame.loc:
# where Frequency == 3 and len(Variable) == 6, in Date we put Variable with M replaced by -0
DateWork.loc[(DateWork['Frequency'] == 3) & (DateWork['Variable'].str.len() == 6),'Date'] = DateWork['Variable'].str.replace('M','-0')
# where frequency == 3 and len(variable) == 7, in date we put variable and replace M with -
DateWork.loc[(DateWork['Frequency'] == 3) & (DateWork['Variable'].str.len() == 7),'Date'] = DateWork['Variable'].str.replace('M','-')
# where frequency == 2, in date we simply put variable
DateWork.loc[DateWork['Frequency'] == 2,'Date'] = DateWork['Variable']
# where Frequency == 3, append the first day of the month to Date
DateWork.loc[DateWork['Frequency'] == 3, 'Date'] = DateWork.loc[DateWork['Frequency'] == 3, 'Date'] + '-01'
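For anyone who wants to run the .loc approach end to end, the following is a minimal sketch that rebuilds the sample rows from the question. Two things here are assumptions rather than part of the answer: the Date column is initialised as a copy of Variable (so the Frequency == 2 rows are already correct), and the helper names monthly and length are made up for readability.

```python
import pandas as pd

# Sample rows copied from the question
DateWork = pd.DataFrame({
    'Frequency': [3, 3, 2, 3, 2, 3],
    'Variable': ['1950M2', '1950M3', '1950-07-01',
                 '1950M9', '1950-10-01', '1950M10'],
})

# Assumption: start Date as a copy of Variable (covers Frequency == 2)
DateWork['Date'] = DateWork['Variable']

monthly = DateWork['Frequency'] == 3        # hypothetical helper mask
length = DateWork['Variable'].str.len()     # hypothetical helper Series

# Single-digit month: '1950M2' -> '1950-02'
DateWork.loc[monthly & (length == 6), 'Date'] = (
    DateWork['Variable'].str.replace('M', '-0', regex=False))
# Two-digit month: '1950M10' -> '1950-10'
DateWork.loc[monthly & (length == 7), 'Date'] = (
    DateWork['Variable'].str.replace('M', '-', regex=False))
# Append the first day of the month to every monthly row
DateWork.loc[monthly, 'Date'] = DateWork.loc[monthly, 'Date'] + '-01'

print(DateWork)
```

Assigning a full-length Series on the right-hand side works because .loc aligns it on the index, so only the rows selected by the mask are written.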
Answer 1 (score: 0)
If the nested np.where is hard to read, the conditions can be applied one at a time. Starting from:
DateWork
Out[32]:
Frequency Variable Date
0 3 1950M2 1950-02-01
1 3 1950M3 1950-03-01
2 2 1950-07-01 1950-07-01
3 3 1950M9 1950-09-01
4 2 1950-10-01 1950-10-01
5 3 1950M10 1950-10-01
First if:
The else branch is the original Date column itself.
DateWork['Date'] = np.where((DateWork['Frequency'] == 3) & (DateWork['Variable'].str.len() == 6), DateWork['Variable'].str.replace('M','-0'), DateWork['Date'])
DateWork
Out[34]:
Frequency Variable Date
0 3 1950M2 1950-02
1 3 1950M3 1950-03
2 2 1950-07-01 1950-07-01
3 3 1950M9 1950-09
4 2 1950-10-01 1950-10-01
5 3 1950M10 1950-10-01
Second if:
Here, the else branch is the Date column produced by the previous step.
DateWork['Date'] = np.where((DateWork['Frequency'] == 3) & (DateWork['Variable'].str.len() == 7), DateWork['Variable'].str.replace('M','-'), DateWork['Date'])
DateWork
Out[36]:
Frequency Variable Date
0 3 1950M2 1950-02
1 3 1950M3 1950-03
2 2 1950-07-01 1950-07-01
3 3 1950M9 1950-09
4 2 1950-10-01 1950-10-01
5 3 1950M10 1950-10
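The output above still lacks the day for the Frequency == 3 rows. A minimal sketch of the two steps plus one possible finishing step is shown below; appending '-01' is borrowed from the other answer and is an assumption about the desired result, and the name monthly is made up for this sketch.

```python
import numpy as np
import pandas as pd

# Sample rows copied from the question, with Date held as strings
DateWork = pd.DataFrame({
    'Frequency': [3, 3, 2, 3, 2, 3],
    'Variable': ['1950M2', '1950M3', '1950-07-01',
                 '1950M9', '1950-10-01', '1950M10'],
    'Date': ['1950-02-01', '1950-03-01', '1950-07-01',
             '1950-09-01', '1950-10-01', '1950-10-01'],
})

monthly = DateWork['Frequency'] == 3  # hypothetical helper mask
length = DateWork['Variable'].str.len()

# First if: the else branch keeps the current Date
DateWork['Date'] = np.where(
    monthly & (length == 6),
    DateWork['Variable'].str.replace('M', '-0', regex=False),
    DateWork['Date'])

# Second if: the else branch keeps the Date from the previous step
DateWork['Date'] = np.where(
    monthly & (length == 7),
    DateWork['Variable'].str.replace('M', '-', regex=False),
    DateWork['Date'])

# Assumed finishing step: append the first day of the month
DateWork.loc[monthly, 'Date'] = DateWork.loc[monthly, 'Date'] + '-01'

print(DateWork)
```

Each step passes the current Date column as the third argument, which is what keeps earlier results from being overwritten.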