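Say I have a large tensor of shape (samples, timesteps, features), but I want to flatten it so that I can run a groupby operation in Pandas. After flattening, how can I label every block of elements n:n + size accordingly in a vectorized fashion? My slow solution: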
import pandas as pd
import numpy as np
df = pd.DataFrame(np.random.normal(0, 1, 500))
df["sample"] = np.nan
n_timesteps = 50
n_samples = len(df) // n_timesteps
size = n_timesteps
for i in range(n_samples):
    id0 = i * n_timesteps
    # .loc slicing is label-based and includes the end label, so stop one row early
    id1 = id0 + n_timesteps - 1
    df.loc[id0:id1, "sample"] = i
Answer 0 (score: 2)
Assign the new column by integer division of the index:
# assumes the default RangeIndex (0, 1, 2, ...)
df['sample'] = df.index // n_timesteps
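The index-based version depends on the row labels being 0, 1, 2, ... in order. A minimal sketch of what happens otherwise, using a hypothetical 6-row frame and a hypothetical index shift of 2 (neither is from the question):

```python
import numpy as np
import pandas as pd

n_timesteps = 3
df = pd.DataFrame({"x": np.arange(6.0)})

# Default RangeIndex: labels match row positions, so blocks come out as expected
labels_default = (df.index // n_timesteps).tolist()

# Hypothetical shifted index (2..7): the blocks no longer line up with row positions
shifted = df.copy()
shifted.index = shifted.index + 2
labels_shifted = (shifted.index // n_timesteps).tolist()

# Resetting the index restores position-based labels
restored = shifted.reset_index(drop=True)
labels_restored = (restored.index // n_timesteps).tolist()
```

If the frame has been filtered, sorted, or concatenated, either reset the index first or label by position instead.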
Or use a one-dimensional NumPy array created by arange:
df['sample'] = np.arange(len(df)) // n_timesteps
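To tie this back to the original tensor, here is a rough sketch of the whole round trip: flatten a (samples, timesteps, features) array, label the rows with the arange approach, and group. The tensor sizes, the column names `f0`, `f1`, and the choice of `mean` as the aggregation are assumptions for illustration only.

```python
import numpy as np
import pandas as pd

# Hypothetical small tensor: 4 samples, 5 timesteps, 2 features
n_samples, n_timesteps, n_features = 4, 5, 2
rng = np.random.default_rng(0)
tensor = rng.normal(0, 1, (n_samples, n_timesteps, n_features))

# Merge the first two axes so each row is one (sample, timestep) pair
flat = tensor.reshape(n_samples * n_timesteps, n_features)
df = pd.DataFrame(flat, columns=[f"f{j}" for j in range(n_features)])

# Vectorized labels: the first n_timesteps rows get 0, the next n_timesteps rows get 1, ...
df["sample"] = np.arange(len(df)) // n_timesteps

# Per-sample mean over timesteps via groupby
grouped_mean = df.groupby("sample").mean()

# The same quantity computed directly on the tensor, for comparison
tensor_mean = tensor.mean(axis=1)
```

Because the reshape keeps rows of the same sample contiguous, the groupby result should match the reduction over the timestep axis of the tensor.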