Duplicate week and datetime values in Python pandas

Time: 2020-07-01 16:01:27

Tags: python pandas python-datetime

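I'm having trouble converting datetime values consistently into year, week, and month. I worked out how to turn a given date into a year/week/month combination, but because week numbers and months overlap, I end up with duplicate combinations. For example: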

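  • 2019 / Week 31 / Aug: August 1 is still part of calendar week 31, but the extracted month is August.
  • 2019 / Week 31 / Jul: July 31 is also part of calendar week 31, but the extracted month is July.

My goal is to avoid these duplicated and incorrect extracted values. Another example:

  • 2019 / Week 01 / Dec: December 31 belongs to week 01 of the new year, but the extracted year is still 2019.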
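
To make the mismatch concrete, here is a small sketch using only the standard library. It compares the calendar year and month of the three dates mentioned above with their ISO 8601 year and week; the variable names `days` and `rows` are illustrative and not part of the original code.

```python
import datetime as dt

# The three dates from the examples above
days = [dt.date(2019, 7, 31), dt.date(2019, 8, 1), dt.date(2019, 12, 31)]

# (date, calendar year, calendar month, ISO year, ISO week)
rows = [(d.isoformat(), d.year, d.month, *d.isocalendar()[:2]) for d in days]

for row in rows:
    print(row)
# ('2019-07-31', 2019, 7, 2019, 31)
# ('2019-08-01', 2019, 8, 2019, 31)
# ('2019-12-31', 2019, 12, 2020, 1)
```

The first two dates share ISO week 31 but have different months, and the last date has calendar year 2019 but ISO year 2020, which is where the duplicated and misleading labels come from.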

Here is my code:

  • req_df is the original DataFrame
  • req_total_grouped filters the rows with loc and groups them by datecol, which holds date values (e.g. 2020-01-01)

import calendar
req_total_grouped = req_df.loc[req_df['datecol'] >= '2019-07-01'].groupby(req_df['datecol'])
req_total_df = req_total_grouped.count()
req_total_df['YEAR'] = req_total_df['datecol'].dt.year
req_total_df['WEEK'] = req_total_df['datecol'].dt.week.map("{:02}".format)
req_total_df['MONTH'] = req_total_df['datecol'].dt.month.apply(lambda x: calendar.month_abbr[x])
req_total_df['YR_WK_MTH'] = req_total_df['YEAR'].astype(str) + \
                            '/ Week ' + \
                            req_total_df['WEEK'].astype(str) + \
                            ' / ' \
                            + req_total_df['MONTH']

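Desired output:

  • When months overlap, I want a single uniform value. I don't mind which month is chosen, as long as it is the same within a week (e.g. '2019 / Week 31 / Aug' and '2019 / Week 31 / Jul' should be merged into the single value '2019 / Week 31 / Aug').
  • If the week rolls over into the next year (e.g. 2019 / Week 01 / Dec), the value should be 2020 / Week 01 / Jan.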
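
One way to express both rules for a single date is sketched below. It assumes ISO 8601 week numbering and picks the month from the Thursday of the date's week, since ISO 8601 assigns a week to the year that contains its Thursday. The helper name `week_label` is hypothetical, and choosing Thursday is only one option among several that keep the month constant within a week.

```python
import calendar
import datetime as dt


def week_label(d: dt.date) -> str:
    """Return a 'YYYY / Week WW / Mon' label that is identical for all days of an ISO week."""
    iso_year, iso_week, iso_day = d.isocalendar()
    # Thursday of the same ISO week (ISO weekday 4); it always lies in the ISO year
    thursday = d + dt.timedelta(days=4 - iso_day)
    return f"{iso_year} / Week {iso_week:02} / {calendar.month_abbr[thursday.month]}"


print(week_label(dt.date(2019, 7, 31)))   # 2019 / Week 31 / Aug
print(week_label(dt.date(2019, 8, 1)))    # 2019 / Week 31 / Aug
print(week_label(dt.date(2019, 12, 31)))  # 2020 / Week 01 / Jan
```

Because the year, week, and month are all derived from the same reference day, no week can end up with two different labels.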

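1 Answer:

Answer 0: (score: 0)

I think grouping the rows by 'year' and 'week' and keeping the last value of each group gives the result you want. Could you try it?

Data (is it the same as yours?)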

import calendar
import pandas as pd

df = pd.DataFrame({'date': pd.date_range('01/01/2019', '12/31/2020', freq='D')})
df['year'] = df['date'].dt.year
df['month'] = df['date'].dt.month.apply(lambda x: calendar.month_abbr[x])
# .dt.week was removed in pandas 2.0; .dt.isocalendar().week returns the same ISO week number
df['week'] = df['date'].dt.isocalendar().week.map("{:02}".format)
df['yr_wk_mth'] = df['year'].astype(str) + ' / Week ' + df['week'] + ' / ' + df['month']

Code:

# Keep the last row of each (year, week) group; this prints the date, month, and yr_wk_mth columns
print(df.groupby(['year', 'week']).last())

Result:

                date month             yr_wk_mth
year week                                       
2019 01   2019-12-31   Dec  2019 / Week 01 / Dec
     02   2019-01-13   Jan  2019 / Week 02 / Jan
     03   2019-01-20   Jan  2019 / Week 03 / Jan
     04   2019-01-27   Jan  2019 / Week 04 / Jan
     05   2019-02-03   Feb  2019 / Week 05 / Feb
     06   2019-02-10   Feb  2019 / Week 06 / Feb
     07   2019-02-17   Feb  2019 / Week 07 / Feb
     08   2019-02-24   Feb  2019 / Week 08 / Feb
     09   2019-03-03   Mar  2019 / Week 09 / Mar
     10   2019-03-10   Mar  2019 / Week 10 / Mar
     11   2019-03-17   Mar  2019 / Week 11 / Mar
     12   2019-03-24   Mar  2019 / Week 12 / Mar
     13   2019-03-31   Mar  2019 / Week 13 / Mar
     14   2019-04-07   Apr  2019 / Week 14 / Apr
     15   2019-04-14   Apr  2019 / Week 15 / Apr
     16   2019-04-21   Apr  2019 / Week 16 / Apr
     17   2019-04-28   Apr  2019 / Week 17 / Apr
     18   2019-05-05   May  2019 / Week 18 / May
     19   2019-05-12   May  2019 / Week 19 / May
     20   2019-05-19   May  2019 / Week 20 / May
     21   2019-05-26   May  2019 / Week 21 / May
     22   2019-06-02   Jun  2019 / Week 22 / Jun
     23   2019-06-09   Jun  2019 / Week 23 / Jun
     24   2019-06-16   Jun  2019 / Week 24 / Jun
     25   2019-06-23   Jun  2019 / Week 25 / Jun
     26   2019-06-30   Jun  2019 / Week 26 / Jun
     27   2019-07-07   Jul  2019 / Week 27 / Jul
     28   2019-07-14   Jul  2019 / Week 28 / Jul
     29   2019-07-21   Jul  2019 / Week 29 / Jul
     30   2019-07-28   Jul  2019 / Week 30 / Jul
     31   2019-08-04   Aug  2019 / Week 31 / Aug
     32   2019-08-11   Aug  2019 / Week 32 / Aug
     33   2019-08-18   Aug  2019 / Week 33 / Aug
     34   2019-08-25   Aug  2019 / Week 34 / Aug
     35   2019-09-01   Sep  2019 / Week 35 / Sep
     36   2019-09-08   Sep  2019 / Week 36 / Sep
     37   2019-09-15   Sep  2019 / Week 37 / Sep
     38   2019-09-22   Sep  2019 / Week 38 / Sep
     39   2019-09-29   Sep  2019 / Week 39 / Sep
     40   2019-10-06   Oct  2019 / Week 40 / Oct
     41   2019-10-13   Oct  2019 / Week 41 / Oct
     42   2019-10-20   Oct  2019 / Week 42 / Oct
     43   2019-10-27   Oct  2019 / Week 43 / Oct
     44   2019-11-03   Nov  2019 / Week 44 / Nov
     45   2019-11-10   Nov  2019 / Week 45 / Nov
     46   2019-11-17   Nov  2019 / Week 46 / Nov
     47   2019-11-24   Nov  2019 / Week 47 / Nov
     48   2019-12-01   Dec  2019 / Week 48 / Dec
     49   2019-12-08   Dec  2019 / Week 49 / Dec
     50   2019-12-15   Dec  2019 / Week 50 / Dec
     51   2019-12-22   Dec  2019 / Week 51 / Dec
     52   2019-12-29   Dec  2019 / Week 52 / Dec
2020 01   2020-01-05   Jan  2020 / Week 01 / Jan
     02   2020-01-12   Jan  2020 / Week 02 / Jan
     03   2020-01-19   Jan  2020 / Week 03 / Jan
     04   2020-01-26   Jan  2020 / Week 04 / Jan
     05   2020-02-02   Feb  2020 / Week 05 / Feb
     06   2020-02-09   Feb  2020 / Week 06 / Feb
     07   2020-02-16   Feb  2020 / Week 07 / Feb
     08   2020-02-23   Feb  2020 / Week 08 / Feb
     09   2020-03-01   Mar  2020 / Week 09 / Mar
     10   2020-03-08   Mar  2020 / Week 10 / Mar
     11   2020-03-15   Mar  2020 / Week 11 / Mar
     12   2020-03-22   Mar  2020 / Week 12 / Mar
     13   2020-03-29   Mar  2020 / Week 13 / Mar
     14   2020-04-05   Apr  2020 / Week 14 / Apr
     15   2020-04-12   Apr  2020 / Week 15 / Apr
     16   2020-04-19   Apr  2020 / Week 16 / Apr
     17   2020-04-26   Apr  2020 / Week 17 / Apr
     18   2020-05-03   May  2020 / Week 18 / May
     19   2020-05-10   May  2020 / Week 19 / May
     20   2020-05-17   May  2020 / Week 20 / May
     21   2020-05-24   May  2020 / Week 21 / May
     22   2020-05-31   May  2020 / Week 22 / May
     23   2020-06-07   Jun  2020 / Week 23 / Jun
     24   2020-06-14   Jun  2020 / Week 24 / Jun
     25   2020-06-21   Jun  2020 / Week 25 / Jun
     26   2020-06-28   Jun  2020 / Week 26 / Jun
     27   2020-07-05   Jul  2020 / Week 27 / Jul
     28   2020-07-12   Jul  2020 / Week 28 / Jul
     29   2020-07-19   Jul  2020 / Week 29 / Jul
     30   2020-07-26   Jul  2020 / Week 30 / Jul
     31   2020-08-02   Aug  2020 / Week 31 / Aug
     32   2020-08-09   Aug  2020 / Week 32 / Aug
     33   2020-08-16   Aug  2020 / Week 33 / Aug
     34   2020-08-23   Aug  2020 / Week 34 / Aug
     35   2020-08-30   Aug  2020 / Week 35 / Aug
     36   2020-09-06   Sep  2020 / Week 36 / Sep
     37   2020-09-13   Sep  2020 / Week 37 / Sep
     38   2020-09-20   Sep  2020 / Week 38 / Sep
     39   2020-09-27   Sep  2020 / Week 39 / Sep
     40   2020-10-04   Oct  2020 / Week 40 / Oct
     41   2020-10-11   Oct  2020 / Week 41 / Oct
     42   2020-10-18   Oct  2020 / Week 42 / Oct
     43   2020-10-25   Oct  2020 / Week 43 / Oct
     44   2020-11-01   Nov  2020 / Week 44 / Nov
     45   2020-11-08   Nov  2020 / Week 45 / Nov
     46   2020-11-15   Nov  2020 / Week 46 / Nov
     47   2020-11-22   Nov  2020 / Week 47 / Nov
     48   2020-11-29   Nov  2020 / Week 48 / Nov
     49   2020-12-06   Dec  2020 / Week 49 / Dec
     50   2020-12-13   Dec  2020 / Week 50 / Dec
     51   2020-12-20   Dec  2020 / Week 51 / Dec
     52   2020-12-27   Dec  2020 / Week 52 / Dec
     53   2020-12-31   Dec  2020 / Week 53 / Dec
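
Note that the first row of this result is still '2019 / Week 01 / Dec', because the year comes from `dt.year` while the week is an ISO week. A possible variation, sketched here under the assumption that ISO 8601 weeks are what is wanted, takes the year from `dt.isocalendar()` and the month from the Thursday of each week. The names `iso`, `thursday`, and `summary` are illustrative, and the month abbreviation from `strftime('%b')` depends on the locale.

```python
import pandas as pd

df = pd.DataFrame({"date": pd.date_range("2019-01-01", "2020-12-31", freq="D")})

# ISO 8601 year, week, and weekday (1 = Monday ... 7 = Sunday) for every date
iso = df["date"].dt.isocalendar()

# Thursday of each date's ISO week; it always falls inside the ISO year
thursday = df["date"] + pd.to_timedelta(4 - iso["day"].astype("int64"), unit="D")

df["yr_wk_mth"] = (
    iso["year"].astype(str)
    + " / Week "
    + iso["week"].astype(str).str.zfill(2)
    + " / "
    + thursday.dt.strftime("%b")
)

# One row per label, with the first and last date and the number of days it covers
summary = df.groupby("yr_wk_mth", sort=False)["date"].agg(["min", "max", "count"])
print(summary.head())
```

With this labeling, 2019-12-30 and 2019-12-31 fall under '2020 / Week 01 / Jan' together with the first days of January 2020, and every date from 2019-07-29 to 2019-08-04 gets '2019 / Week 31 / Aug'.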