Question

I have two DataFrames. The first:

df1 = pd.read_csv('t1.txt',delimiter="\t", parse_dates = True, index_col = 'Date')

Date  4001  4002  4003  4004  4005
2017-01-01  151902  2755  0  0  0
2017-01-02  157271  143598  2343  0  0
2017-01-03  95806  138308  126034  2034  0
2017-01-04  68874  91469  129751  116066  1822

The second:

df2 = pd.read_excel('gg.xlsx', parse_dates = True, index_col='Date')  
Date  num  value
2017-01-01  4001  68
2017-01-02  4002  621
2017-01-03  4003  8
2017-01-04  4004  5
2017-01-05  4005  5

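As you can see, df1.columns and df2['num'] are the same entity. df1 contains errors and I want to fix the data: for each df1 column that matches a df2.num (4001, 4002, etc.), I need to set the df1 values to 0 wherever the df1 Date is earlier than the corresponding df2 Date.

In pseudocode style, it would be something like this: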

for a in df1.columns:
    for b in df1.index:
        # cutoff date for this column: the df2 row whose num equals the column label
        if b < df2.loc[df2['num'] == int(a)].index[0]:
            df1.at[b, a] = 0

How can I compare indexes (datetime type) across DataFrames?

Answer 1

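IIUC, you can reshape df1 to long form, merge it with df2 on the column number, and compare the two Date columns: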

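# Date is the index in both frames, so move it back to a column first
v = df1.reset_index().melt('Date')
v.variable = v.variable.astype(int)
s = v.merge(df2.reset_index(), left_on='variable', right_on='num', how='left')

# Zero every value dated before the cutoff date of its column
v.loc[s.Date_x < s.Date_y, 'value'] = 0

v.set_index(['Date', 'variable']).unstack()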
Out[1211]:
             value
variable      4001    4002    4003    4004 4005
Date
2017-01-01  151902       0       0       0    0
2017-01-02  157271  143598       0       0    0
2017-01-03   95806  138308  126034       0    0
2017-01-04   68874   91469  129751  116066    0
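
If you would rather compare the datetime index directly and skip the reshape, a broadcast mask is one option. This is only a sketch under a few assumptions: both frames are rebuilt inline from the sample in the question instead of being read from t1.txt and gg.xlsx, the df1 column labels are strings that convert cleanly to int, and each num appears once in df2. The names cutoff_by_num, cutoffs, too_early and fixed are made up for illustration.

```python
import pandas as pd

# Sample data copied from the question (Date is the index in both frames).
df1 = pd.DataFrame(
    {
        "4001": [151902, 157271, 95806, 68874],
        "4002": [2755, 143598, 138308, 91469],
        "4003": [0, 2343, 126034, 129751],
        "4004": [0, 0, 2034, 116066],
        "4005": [0, 0, 0, 1822],
    },
    index=pd.Index(
        pd.to_datetime(["2017-01-01", "2017-01-02", "2017-01-03", "2017-01-04"]),
        name="Date",
    ),
)
df2 = pd.DataFrame(
    {"num": [4001, 4002, 4003, 4004, 4005], "value": [68, 621, 8, 5, 5]},
    index=pd.Index(
        pd.to_datetime(
            ["2017-01-01", "2017-01-02", "2017-01-03", "2017-01-04", "2017-01-05"]
        ),
        name="Date",
    ),
)

# Map each num to its cutoff date, then line the cutoffs up with df1's columns.
cutoff_by_num = pd.Series(df2.index.to_numpy(), index=df2["num"].to_numpy())
cutoffs = cutoff_by_num.reindex(df1.columns.astype(int))

# Broadcast row dates (n, 1) against column cutoffs (1, m) -> boolean (n, m).
too_early = df1.index.to_numpy()[:, None] < cutoffs.to_numpy()[None, :]

# Replace the flagged cells with 0; all other cells are left unchanged.
fixed = df1.mask(too_early, 0)
print(fixed)
```

A column with no matching num in df2 gets a NaT cutoff after the reindex, and any comparison against NaT is False, so that column is left untouched. The result stays in the original wide layout, so no unstack is needed afterwards.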
