I am trying to find a way to group my data by day.
Here is a sample of my dataset.
Dates Price1 Price 2
2002-10-15 11:17:03pm 0.6 5.0
2002-10-15 11:20:04pm 1.4 2.4
2002-10-15 11:22:12pm 4.1 9.1
2002-10-16 12:21:03pm 1.6 1.4
2002-10-16 12:22:03pm 7.7 3.7
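Answer 0 (score: 2)
Yes, I would definitely use Pandas. The trickiest part is working out the datetime parser that pandas needs in order to load the data. After that, it is just a resample of the resulting DataFrame.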
In [60]: import datetime
In [61]: import pandas
In [62]: parse = lambda x: datetime.datetime.strptime(x, '%Y-%m-%d %I:%M:%S%p')
In [63]: dframe = pandas.read_table("data.txt", delimiter=",", index_col=0, parse_dates=True, date_parser=parse)
In [64]: print(dframe)
                     Price1  Price 2
Dates
2002-10-15 23:17:03     0.6      5.0
2002-10-15 23:20:04     1.4      2.4
2002-10-15 23:22:12     4.1      9.1
2002-10-16 12:21:03     1.6      1.4
2002-10-16 12:22:03     7.7      3.7
In [78]: means = dframe.resample("D").mean()
In [79]: print(means)
              Price1  Price 2
Dates
2002-10-15  2.033333     5.50
2002-10-16  4.650000     2.55
where data.txt is:
Dates , Price1 , Price 2
2002-10-15 11:17:03pm, 0.6 , 5.0
2002-10-15 11:20:04pm, 1.4 , 2.4
2002-10-15 11:22:12pm, 4.1 , 9.1
2002-10-16 12:21:03pm, 1.6 , 1.4
2002-10-16 12:22:03pm, 7.7 , 3.7
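The session above was written against an older pandas release. Below is a minimal sketch of the same load-then-resample steps for a more recent pandas, under a few assumptions that are mine rather than the answer's: the file contents are inlined through io.StringIO as a stand-in for data.txt, the separator is a regular expression that swallows the spaces around each comma, and the timestamps are parsed with pd.to_datetime after loading instead of through a parser callback.

```python
import io

import pandas as pd

# Inline copy of data.txt so the sketch is self-contained.
raw = """Dates , Price1 , Price 2
2002-10-15 11:17:03pm, 0.6 , 5.0
2002-10-15 11:20:04pm, 1.4 , 2.4
2002-10-15 11:22:12pm, 4.1 , 9.1
2002-10-16 12:21:03pm, 1.6 , 1.4
2002-10-16 12:22:03pm, 7.7 , 3.7
"""

# A regex separator strips the padding around the commas, so the
# column names come out as "Dates", "Price1" and "Price 2".
df = pd.read_csv(io.StringIO(raw), sep=r"\s*,\s*", engine="python")

# Parse the 12-hour timestamps explicitly (%I = 12-hour clock, %p = am/pm).
df["Dates"] = pd.to_datetime(df["Dates"], format="%Y-%m-%d %I:%M:%S%p")

# Index by timestamp, then take the mean of each calendar day.
daily = df.set_index("Dates").resample("D").mean()
print(daily)
```

Swapping io.StringIO(raw) for the path "data.txt" should give the same daily means as in the answer.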
Answer 1 (score: 0)
Use
data.groupby(data['dates'].map(lambda x: x.day))
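One thing to watch with this approach: x.day is the day of the month, so rows from the 15th of different months land in the same group. The sketch below contrasts it with grouping on the full calendar date. The small frame and its values are made up for illustration, and using .dt.normalize() as the key is my suggestion, not part of the answer.

```python
import pandas as pd

# Hypothetical data: two rows on 15 Oct and one on 15 Nov.
data = pd.DataFrame({
    "dates": pd.to_datetime([
        "2002-10-15 23:17:03",
        "2002-10-15 23:20:04",
        "2002-11-15 12:21:03",
    ]),
    "Price1": [0.6, 1.4, 1.6],
})

# Key from the answer: day of month only, so both months share group 15.
by_day_of_month = data.groupby(data["dates"].map(lambda x: x.day))["Price1"].mean()

# Alternative key: the timestamp truncated to midnight, one group per calendar day.
by_calendar_day = data.groupby(data["dates"].dt.normalize())["Price1"].mean()

print(by_day_of_month)
print(by_calendar_day)
```

If the data never spans more than one month, both keys give the same groups.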
Answer 2 (score: 0)
From the pandas documentation: http://pandas.pydata.org/pandas-docs/stable/pandas.pdf
# 72 hours starting with midnight Jan 1st, 2011
In [1073]: rng = date_range('1/1/2011', periods=72, freq='H')
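The quoted snippet stops after building the hourly range. As a rough sketch of how such a range ties back to the question, the code below hangs a series on it and averages per day. The series values (0 through 71) are made up, and the frequency is passed as a pd.Timedelta rather than the 'H' alias purely so the sketch does not depend on which alias spelling a given pandas version accepts.

```python
import numpy as np
import pandas as pd

# 72 hourly timestamps starting at midnight on 1 Jan 2011.
rng = pd.date_range("1/1/2011", periods=72, freq=pd.Timedelta(hours=1))

# Hypothetical values: one float per hour, counting up from 0.
ts = pd.Series(np.arange(72, dtype=float), index=rng)

# Collapse the hourly values into one mean per calendar day.
daily_mean = ts.resample("D").mean()
print(daily_mean)
```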