Question

I have a CSV file with millions of rows in the following format:

Amount,Price,Time
0.36,13924.98,2010-01-01 00:00:08
0.01,13900.09,2010-01-01 00:02:04
0.02,13907.59,2010-01-01 00:04:54
0.07,13907.59,2010-01-01 00:05:03
0.03,13925,2010-01-01 00:05:41
0.03,13920,2010-01-01 00:07:02
0.15,13910,2010-01-01 00:09:37
0.03,13909.99,2010-01-01 00:09:58
0.03,13909.99,2010-01-01 00:10:03
0.14,13909.99,2010-01-01 00:10:03

I want to first filter this data and then run some calculations on the filtered data. I import it with pandas using data = pd.read_csv() to get a DataFrame.

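Then I convert the Time column to a TimeDelta column (I'm not sure this is what I want to do), where I store the time elapsed since 2010-01-01 00:00:00, using

data['TimeDelta'] = pd.to_timedelta(pd.to_datetime(data.Date)-pd.Timedelta(days=14610))/np.timedelta64(1, 'm')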

This is the part I am struggling with. I want a function that returns a new DataFrame containing only the first row after every n minutes, where n is a user-defined integer.

For example, if n=5, the expected output of this function for my data is:

Amount,Price,Time
0.36,13924.98,2010-01-01 00:00:08
0.07,13907.59,2010-01-01 00:05:03
0.03,13909.99,2010-01-01 00:10:03

and the output for n=3 is:

Amount,Price,Time
0.36,13924.98,2010-01-01 00:00:08
0.02,13907.59,2010-01-01 00:04:54
0.15,13910,2010-01-01 00:09:37

I tried to do this with the floor and remainder functions, but as a Python beginner I could not get it to work.

Answer 1

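Use pd.Grouper (this assumes the Time column has already been parsed to datetime, for example with pd.to_datetime):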

n=5
df.groupby(pd.Grouper(key = 'Time', freq=f'{n} min')).first()

                      Amount   Price
Time                                 
2010-01-01 00:00:00    0.36  13924.98
2010-01-01 00:05:00    0.07  13907.59
2010-01-01 00:10:00    0.03  13909.99
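The grouped result above is indexed by the start of each bin rather than by the timestamp of the row that was picked. If the original Time values should be kept, and the logic wrapped in a function as the question asks, one possible sketch is shown below. It is not part of the original answer: the function name first_row_every_n_minutes and the time_col parameter are made up for illustration, and it swaps pd.Grouper for Series.dt.floor plus duplicate filtering. It assumes bins aligned to clock boundaries (00:00, 00:05, ...), as in the output above.

```python
import io

import pandas as pd

# Sample rows copied from the question.
CSV_TEXT = """Amount,Price,Time
0.36,13924.98,2010-01-01 00:00:08
0.01,13900.09,2010-01-01 00:02:04
0.02,13907.59,2010-01-01 00:04:54
0.07,13907.59,2010-01-01 00:05:03
0.03,13925,2010-01-01 00:05:41
0.03,13920,2010-01-01 00:07:02
0.15,13910,2010-01-01 00:09:37
0.03,13909.99,2010-01-01 00:09:58
0.03,13909.99,2010-01-01 00:10:03
0.14,13909.99,2010-01-01 00:10:03
"""


def first_row_every_n_minutes(df, n, time_col="Time"):
    """Return the first row of each n-minute bin, keeping its own timestamp.

    Hypothetical helper for illustration; assumes df[time_col] is datetime64.
    """
    # Sort so that "first" means earliest; a stable sort keeps file order for ties.
    ordered = df.sort_values(time_col, kind="stable")
    # Label every row with the start of the n-minute bin it falls into.
    bins = ordered[time_col].dt.floor(f"{n}min")
    # Keep only the first row seen for each bin label.
    return ordered[~bins.duplicated()]


# parse_dates turns the Time column into datetime64 while reading.
df = pd.read_csv(io.StringIO(CSV_TEXT), parse_dates=["Time"])
every_5 = first_row_every_n_minutes(df, 5)
every_3 = first_row_every_n_minutes(df, 3)
print(every_5)
```

On the sample data, n=5 selects the rows at 00:00:08, 00:05:03 and 00:10:03, matching the grouped output above. With n=3 the bins start at 00:00, 00:03, 00:06 and 00:09, so the sketch returns four rows, including the one at 00:07:02.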

Python Pandas: Filtering rows based on a time criterion with pandas
