Grouping data by week and by year after categorizing in Python

Time: 2021-05-17 15:56:14

Tags: python pandas dataframe time-series frequency

Problem statement: the data in the CSV consists of two columns, date and product:

  Date                  Prod
  1/2/2018  7:43:00 PM     A
  1/1/2018  11:41:00 AM    B
  1/1/2018  7:57:00 AM     C
  1/2/2018  1:56:00 PM     A
  1/5/2018  3:29:00 AM     A
  1/3/2018  7:23:00 AM     C
  1/3/2018  1:26:00 PM     B
  1/5/2018  2:08:00 AM     A
  1/5/2018  3:47:00 PM     B

I need to return two JSON documents keyed by product, where each value is that product's frequency:

  1. per week
  2. per year

For example:

  1. [{"A":{"Week1":"3","Week2":"3","Week3":"5",...}},{"B":{"Week1":"5","Week2":"7","Week3":"4",...}},{"C":{...}}]

  2. [{"A":{"2018":"3","2019":"3","2020":"5",...}},{"B":{"2018":"5","2019":"7","2020":"4",...}},{"C":{...}}]

I tried:

df['Date'] = pd.to_datetime(df['Date'])

weekly_series = df.groupby(pd.Grouper(key='Date', freq='W'))['Date'].count()

weekly_series.index = weekly_series.index.week

1 answer:

Answer 0 (score: 0)

Prepare your DataFrame:

import pandas as pd

# Parse the date strings, then derive the ISO year and ISO week number.
df["Date"] = pd.to_datetime(df["Date"])
df["Year"] = df["Date"].dt.isocalendar().year
df["Week"] = df["Date"].dt.isocalendar().week
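To make the preparation step concrete, here is a minimal sketch that rebuilds the sample rows from the question in memory instead of reading the CSV. It assumes the dates are month/day/year with a 12-hour clock (the explicit `format` string is my assumption, not something the question states). It also illustrates one thing worth knowing: `isocalendar()` returns the ISO year, which can differ from the calendar year around New Year, so `dt.year` may be the better choice if calendar-year counts are wanted.

```python
import pandas as pd

# Sample rows copied from the question.
df = pd.DataFrame({
    "Date": ["1/2/2018 7:43:00 PM", "1/1/2018 11:41:00 AM", "1/1/2018 7:57:00 AM",
             "1/2/2018 1:56:00 PM", "1/5/2018 3:29:00 AM", "1/3/2018 7:23:00 AM",
             "1/3/2018 1:26:00 PM", "1/5/2018 2:08:00 AM", "1/5/2018 3:47:00 PM"],
    "Prod": ["A", "B", "C", "A", "A", "C", "B", "A", "B"],
})

# Assumption: month/day/year with a 12-hour clock and AM/PM marker.
df["Date"] = pd.to_datetime(df["Date"], format="%m/%d/%Y %I:%M:%S %p")

# isocalendar() returns a frame with "year", "week" and "day" columns.
iso = df["Date"].dt.isocalendar()
df["Year"] = iso["year"].astype(int)
df["Week"] = iso["week"].astype(int)

# Edge case: 31 Dec 2018 is a Monday and belongs to ISO week 1 of 2019.
edge = pd.Series(pd.to_datetime(["2018-12-31"]))
edge_iso_year = int(edge.dt.isocalendar()["year"].iloc[0])  # 2019
edge_cal_year = int(edge.dt.year.iloc[0])                   # 2018
```

All nine sample rows fall in ISO week 1 of 2018, so both new columns are constant for this data.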

Build your dictionaries:

# Per product, count the rows in each ISO year.
dyear = df.groupby("Prod")[["Year"]] \
          .apply(lambda x: x.value_counts("Year").to_dict()).to_dict()

# Per product, count the rows in each ISO week; the keys become "Week<N>".
dweek = df.groupby("Prod")[["Week"]] \
          .apply(lambda x: x.value_counts("Week").add_prefix("Week").to_dict()).to_dict()

>>> dyear
{'A': {2018: 4}, 'B': {2018: 3}, 'C': {2018: 2}}

>>> dweek
{'A': {'Week1': 4}, 'B': {'Week1': 3}, 'C': {'Week1': 2}}
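The dictionaries above are not yet in the JSON shape the question asks for (a list with one object per product, and string keys and values). One possible way to get there is sketched below. The helper `frequency_json` and the `YearLabel` / `WeekLabel` columns are hypothetical names introduced for illustration, and the sample rows are rebuilt in memory under the same date-format assumption (month/day/year, 12-hour clock).

```python
import json
import pandas as pd

# Sample rows copied from the question.
df = pd.DataFrame({
    "Date": ["1/2/2018 7:43:00 PM", "1/1/2018 11:41:00 AM", "1/1/2018 7:57:00 AM",
             "1/2/2018 1:56:00 PM", "1/5/2018 3:29:00 AM", "1/3/2018 7:23:00 AM",
             "1/3/2018 1:26:00 PM", "1/5/2018 2:08:00 AM", "1/5/2018 3:47:00 PM"],
    "Prod": ["A", "B", "C", "A", "A", "C", "B", "A", "B"],
})
df["Date"] = pd.to_datetime(df["Date"], format="%m/%d/%Y %I:%M:%S %p")

# Build string labels up front so the JSON keys are plain Python strings.
iso = df["Date"].dt.isocalendar()
df["YearLabel"] = iso["year"].astype(int).astype(str)
df["WeekLabel"] = "Week" + iso["week"].astype(int).astype(str)


def frequency_json(frame, label_col):
    """Hypothetical helper: count rows per (Prod, label) and dump them as JSON."""
    counts = frame.groupby(["Prod", label_col]).size()
    nested = {}
    for (prod, label), n in counts.items():
        # Values are stringified to match the format shown in the question.
        nested.setdefault(prod, {})[str(label)] = str(int(n))
    # One single-key object per product, wrapped in a list.
    return json.dumps([{prod: freq} for prod, freq in nested.items()])


weekly_json = frequency_json(df, "WeekLabel")
yearly_json = frequency_json(df, "YearLabel")
```

Note that week labels carry no year, so with multi-year data "Week1" of different years would be counted together. That matches the format in the question, but it may not be what is wanted in practice.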