Question

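I have a list of dates in a JSON file and would like to aggregate them to see how many fall into each 10-minute interval... I think time series in Pandas is what I should be looking at, but I'm confused! Any ideas?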

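[More details] When I load the JSON file with pd.read_json, I get a single column with about 10,000 rows. Each row is a pandas.tslib.Timestamp, e.g. "1970-01-01 20:12:16". Ideally, I'd like to group these timestamps into 10-minute intervals, see how many timestamps fall in each interval, and plot a bar chart (histogram).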

Answer 1

You can do this with resample.

First, index on the timestamp column, if you haven't already:

df.set_index('time', inplace=True)

Add a numeric column (resample needs something to aggregate):

df['count'] = 1

Finally, resample as needed:

df.resample('10min').sum()
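The three steps above can be put together as a minimal sketch. The sample timestamps and the column name `time` are assumptions for illustration; they stand in for whatever pd.read_json returns for the real file.

```python
import pandas as pd

# Hypothetical sample data standing in for the column loaded from the JSON file.
df = pd.DataFrame({
    "time": pd.to_datetime([
        "1970-01-01 20:12:16",
        "1970-01-01 20:15:03",
        "1970-01-01 20:19:59",
        "1970-01-01 20:20:00",
        "1970-01-01 20:41:30",
    ])
})

# Index by timestamp, add a helper column of ones, then sum it per 10-minute bin.
df = df.set_index("time")
df["count"] = 1
counts = df.resample("10min").sum()

print(counts)
```

Bins with no timestamps still appear in the result with a count of 0, which is usually what you want for a histogram over time.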

Answer 2

I use truncation to do this:

import random
import pandas as pd
import datetime as dt

ts = [dt.datetime.now() + dt.timedelta(minutes=random.randrange(1000)) for _ in range(1000)]
df = pd.DataFrame(ts, columns = ['ts'])

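def truncate(t):
    # Floor the minute to the start of its 10-minute bucket.
    return dt.datetime(year=t.year, month=t.month, day=t.day, hour=t.hour, minute=(t.minute // 10) * 10)

df.ts.map(truncate).value_counts()

This gives a count for each 10-minute bucket. The sample output below comes from a run with the minute fixed at 50, so it shows one bucket per hour: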

2016-02-20 00:50:00    79
2016-02-19 23:50:00    75
2016-02-20 08:50:00    72
2016-02-19 21:50:00    70
...
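The same truncation idea can probably be expressed without a hand-written helper by flooring the timestamps with pandas' own `Series.dt.floor`. This is a sketch on made-up timestamps, not the answerer's code:

```python
import pandas as pd

# Hypothetical timestamps for illustration only.
ts = pd.Series(pd.to_datetime([
    "2016-02-19 21:03:10",
    "2016-02-19 21:09:59",
    "2016-02-19 21:10:00",
    "2016-02-19 21:47:25",
    "2016-02-19 21:49:00",
    "2016-02-19 21:49:30",
]), name="ts")

# Round every timestamp down to the start of its 10-minute bucket,
# then count how many rows share each bucket.
buckets = ts.dt.floor("10min")
counts = buckets.value_counts().sort_index()

print(counts)
```

Unlike resample, value_counts only lists buckets that actually contain timestamps, so empty intervals are missing from the result.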

Edit:

A. Leistra's approach is much better, and I learned something too. Applied to the setup above:

df.set_index('ts', inplace=True)
df['count'] = 1
df.resample('10min').sum().head()

                     count
ts
2016-02-19 21:00:00      5
2016-02-19 21:10:00     11
2016-02-19 21:20:00     17
2016-02-19 21:30:00     13
2016-02-19 21:40:00     11
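As a possible variation on the resample approach, the helper column of ones can be skipped by resampling directly on the timestamp column and asking for the group size. The data below is made up, and the column name `ts` simply mirrors the setup above:

```python
import pandas as pd

# Hypothetical timestamps; the column name 'ts' mirrors the setup above.
df = pd.DataFrame({
    "ts": pd.to_datetime([
        "2016-02-19 21:01:00",
        "2016-02-19 21:05:00",
        "2016-02-19 21:12:00",
        "2016-02-19 21:35:00",
    ])
})

# Resample on the 'ts' column without setting it as the index,
# and count the rows that fall into each 10-minute bin.
counts = df.resample("10min", on="ts").size()

print(counts)
```

For the bar chart the question asks about, `counts.plot(kind="bar")` should work on this result, assuming matplotlib is installed.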

python - Aggregate timestamps to see how many fall into each 10-minute interval
