Select the rows for the last day of each month

Time: 2019-09-21 15:53:18

Tags: pandas datetime

I want to select only the rows for the last day of each month. For example, given the following dataframe:

date    Sales
0   2015-04-01  2416000
1   2015-04-02  2414000
2   2015-04-03  2416000
3   2015-04-04  2422000
4   2015-04-05  2434000
......

17  2015-05-18  2446000
18  2015-05-19  2454000
19  2015-05-20  2456000
20  2015-05-21  2453000
21  2015-05-22  2461000

Expected output:

 date        Sales
2015-04-05  2434000
2015-05-22  2461000

I tried the following, but it raised an error:

df.iloc[df.reset_index().groupby(df.date.to_period('M'))['index'].idxmax()]

Any help would be appreciated. Thanks.

1 answer:

Answer 0: (score: 2)

This looks like a job for transform combined with boolean indexing:

df[df['date'].eq(df.groupby([df['date'].dt.year,
           df['date'].dt.month])['date'].transform('max'))]

         date      Sales
4  2015-04-05  2434000.0
21 2015-05-22  2461000.0
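
For anyone who wants to run this end to end, here is a minimal sketch. It assumes the date column already has a datetime dtype, and the six-row sample frame is made up to resemble the question's data, so it is not the asker's real dataframe. The attempt in the question most likely fails because it calls to_period directly on the column; on a Series that method converts the index, which here is not a DatetimeIndex, whereas the .dt accessor converts the values. The sketch groups by df['date'].dt.to_period('M') instead of by year and month separately, which should be equivalent for this purpose.

```python
import pandas as pd

# Hypothetical sample data modelled on the question's dataframe.
df = pd.DataFrame({
    "date": pd.to_datetime([
        "2015-04-01", "2015-04-02", "2015-04-05",
        "2015-05-18", "2015-05-21", "2015-05-22",
    ]),
    "Sales": [2416000, 2414000, 2434000, 2446000, 2453000, 2461000],
})

# Monthly period of each row; note the .dt accessor, which works on the
# column's values rather than on the dataframe's index.
month = df["date"].dt.to_period("M")

# Variant 1: boolean mask, keeping rows whose date equals the latest
# date seen in their month (ties on the same date are all kept).
last_by_mask = df[df["date"].eq(df.groupby(month)["date"].transform("max"))]

# Variant 2: idxmax, keeping exactly one row label per month.
last_by_idxmax = df.loc[df.groupby(month)["date"].idxmax()]

print(last_by_mask)
print(last_by_idxmax)
```

Both variants pick the latest date that is present in each month, not necessarily the calendar month end. The mask variant keeps every row that shares that latest date, while the idxmax variant returns a single row per month.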