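I'm trying to get the last available day of each month, but my code currently returns rows for only one year. I need the month-end records for every year in the data. Any suggestions?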
import pandas as pd
import numpy as np
df = pd.read_csv('PETR4_BOV_D_cor.csv', engine='c', skiprows=1, parse_dates=['date'], names=['ticker', 'date', 'trades', 'close', 'low', 'high', 'open', 'vol', 'qty', 'avg'])
df.date = pd.to_datetime(df.date)
df = df.set_index('date')
Result
ticker trades close low high open vol qty avg
date
2015-05-29 PETR4 44895 11.577403 11.577403 11.999936 11.934209 901139500.0 72120400 11.732238
2015-06-01 PETR4 31861 11.614961 11.502286 11.877871 11.671299 489916746.0 39483500 11.650736
2015-06-02 PETR4 47249 12.056274 11.708858 12.159559 11.783975 582467511.0 45754100 11.953363
2015-06-03 PETR4 37454 12.046884 11.943598 12.300404 12.168949 629815443.0 48703400 12.142376
2015-06-05 PETR4 34917 11.793364 11.661910 11.999936 11.812143 452516624.0 36024200 11.794773
2016-12-23 PETR4 23100 13.370821 13.154859 13.474106 13.192418 309168316.0 21776900 13.330539
2016-12-26 PETR4 4840 13.539834 13.398989 13.568003 13.445938 82501537.0 5734300 13.509224
2016-12-27 PETR4 13617 13.530444 13.389600 13.661899 13.614951 215534672.0 14949200 13.537768
2016-12-28 PETR4 20265 13.877860 13.483496 13.906029 13.549223 277762881.0 18979900 13.741335
2016-12-29 PETR4 19721 13.962367 13.633730 13.971756 13.943587 266439891.0 18090600 13.829128
395 rows × 9 columns
Group by
df.groupby(df.index.month).apply(pd.Series.tail, 1).reset_index(level=0, drop=True)
Returns
ticker trades close low high open vol qty avg
date
2016-01-29 PETR4 64685 4.544577 4.244109 4.563356 4.413122 4.398262e+08 93013900 4.439976
2016-02-29 PETR4 36334 4.826265 4.676031 4.910772 4.769928 4.312967e+08 84165000 4.811617
2016-03-31 PETR4 44127 7.840334 7.690100 8.103243 7.849723 5.834259e+08 69529900 7.878831
2016-04-29 PETR4 39767 9.605582 9.399011 9.849713 9.774596 5.482716e+08 53536700 9.615911
2016-05-31 PETR4 56676 7.549255 7.549255 8.046905 7.849723 4.804290e+08 58131400 7.760052
2016-06-30 PETR4 19998 8.845023 8.676010 8.910751 8.845023 4.090867e+08 43553100 8.819483
2016-07-29 PETR4 44681 11.145480 10.901350 11.248766 11.042195 7.142205e+08 60478800 11.088579
2016-08-31 PETR4 45622 12.065663 11.934209 12.413079 12.328573 7.848222e+08 60716500 12.137024
2016-09-30 PETR4 28869 12.741716 12.619651 12.929508 12.704157 5.284275e+08 38771900 12.797209
2016-10-31 PETR4 48694 16.610240 16.535123 17.060942 16.929487 7.480264e+08 42059100 16.699535
2016-11-30 PETR4 57759 15.023394 14.657199 15.201797 14.929498 1.316175e+09 82547300 14.971282
2016-12-29 PETR4 19721 13.962367 13.633730 13.971756 13.943587 2.664399e+08 18090600 13.829128
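Answer 0 (score: 0)
Thanks, @Ch3steR and David.
I changed the code along the lines of David's example, grouping by both month and year, and added the month and year as separate columns so the result can be sorted.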
import pandas as pd
import numpy as np
df = pd.read_csv('PETR4_BOV_D_cor.csv', engine='c', skiprows=1, parse_dates=['date'], names=['ticker', 'date', 'trades', 'close', 'low', 'high', 'open', 'vol', 'qty', 'avg'])
df['year'] = pd.DatetimeIndex(df.date).year
df['month'] = pd.DatetimeIndex(df.date).month
df.date = pd.to_datetime(df.date)
df = df.set_index('date')
result = df.groupby([df.index.month, df.index.year]).apply(pd.Series.tail, 1).reset_index(level=0, drop=True)
result_sorted = result.sort_values(['year', 'month'])  # renamed to avoid shadowing the built-in sorted()
Result
ticker trades close low high open vol qty avg year month
date date
2015 2015-05-29 PETR4 44895 11.577403 11.577403 11.999936 11.934209 9.011395e+08 72120400 11.732238 2015 5
2015-06-30 PETR4 41060 11.934209 11.877871 12.197118 12.112611 4.078387e+08 31922400 11.996086 2015 6
2015-07-31 PETR4 38481 9.859102 9.690089 10.046895 9.840323 3.838792e+08 36465800 9.884548 2015 7
2015-08-31 PETR4 64015 8.629062 7.999957 8.722958 8.178360 6.765446e+08 75379100 8.427373 2015 8
2015-09-30 PETR4 77202 6.798086 6.544566 6.816865 6.807475 7.660222e+08 107245600 6.706725 2015 9
... ... ... ... ... ... ... ... ... ... ... ... ...
2020 2020-02-28 PETR4 109660 25.340000 24.620000 25.560000 25.160000 2.230362e+09 89095300 25.033400 2020 2
2020-03-31 PETR4 169315 13.990000 13.600000 14.540000 13.600000 2.180450e+09 155314800 14.038900 2020 3
2020-04-30 PETR4 80537 18.050000 17.700000 18.420000 17.980000 1.433551e+09 79395500 18.055800 2020 4
2020-05-29 PETR4 85912 20.340000 19.300000 20.340000 19.550000 2.528730e+09 127224200 19.876200 2020 5
2020-06-02 PETR4 99499 21.400000 20.600000 21.400000 20.750000 1.592479e+09 76091600 20.928500 2020 6
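For anyone wondering why the original attempt came back with only 12 rows, here is a minimal sketch on synthetic data (the business-day index and the `close` column are made up for illustration; this is not the PETR4 file). Grouping by `df.index.month` alone pools every January together regardless of year, so `tail(1)` can return at most one row per month number. Grouping by year and month, or by a monthly period, should keep one row per calendar month of each year.

```python
import pandas as pd

# Synthetic business-day data spanning two years (stand-in for the real CSV).
idx = pd.bdate_range("2015-01-01", "2016-12-31", name="date")
df = pd.DataFrame({"close": range(len(idx))}, index=idx)

# Month alone: all Januaries share one group, so at most 12 rows come back.
by_month = df.groupby(df.index.month).tail(1)

# Year and month together: one group per calendar month of each year.
by_year_month = df.groupby([df.index.year, df.index.month]).tail(1)

# Alternative key: a monthly Period for each row gives the same groups.
by_period = df.groupby(df.index.to_period("M")).tail(1)

print(len(by_month), len(by_year_month), len(by_period))
```

Calling `tail(1)` directly on the groupby keeps the original `date` index and row order, so no `reset_index` or extra `year`/`month` columns are needed for sorting in this sketch.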