Question

So we have a Pandas DataFrame with specific values at specific times.

For example:

                         @ts  @value  Glucose   Diff  smooth_diff    new      P      N   C1   C2
    135  2021-10-29 11:16:00     167    167.0   -3.0        15.45  15.45  17.95  17.45  NaN  0.0
    155  2021-10-29 12:56:00     162    162.0  -15.0        15.35  15.35  17.95  16.00  NaN  0.0
    243  2021-10-29 20:16:00     133    133.0    0.0        15.25  15.25  19.85  15.75  NaN  0.0
    245  2021-10-29 20:26:00     134    134.0    0.0        15.50  15.50  15.75  15.60  NaN  0.0
    113  2021-10-29 09:26:00     130    130.0    1.0        16.75  16.75   0.00  21.70  NaN  NaN

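Now we want to remove rows that are within 1 hour of each other (based on the @ts column). In this example, we want to remove the row at 2021-10-29 20:26:00, since it falls within 1 hour of the previous row. However, we can't seem to find a way to do this.

Any help would be appreciated.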

Answer 1

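Something like this might work. Note that it groups rows by date and clock hour rather than measuring the gap between rows, so two rows such as 11:50 and 12:10 would both be kept: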

import pandas as pd

# create dataframe (only creating 2 cols for ease)
df = pd.DataFrame({
    '@ts': ['2021-10-29 11:16:00', '2021-10-29 12:56:00', '2021-10-29 20:16:00', 
            '2021-10-29 20:26:00'],
    '@value': [167, 162, 133, 134]
})

# split @ts column into separate columns - date(d) and time(t)
df[["d", "t"]] = df["@ts"].str.split(" ", expand=True)

# split time column into separate parts, hours, mins and secs
df[["h", "m", "s"]] = df["t"].str.split(":", expand=True)
# drop duplicates based on date and hour, keep the first row
df = df.drop_duplicates(subset=["d", "h"], keep="first")
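
The same hour-bucket idea can also be sketched with real datetimes instead of string splitting. This is only an illustration under assumptions: the bucket key format and the extra 11:50 and 12:10 rows are made up for the example and are not from the question. It also shows that two rows only 20 minutes apart survive when they fall in different clock hours.

```python
import pandas as pd

df = pd.DataFrame({
    "@ts": pd.to_datetime([
        "2021-10-29 20:16:00", "2021-10-29 20:26:00",  # same clock hour
        "2021-10-29 11:50:00", "2021-10-29 12:10:00",  # 20 min apart, different hours (made-up rows)
    ]),
    "@value": [133, 134, 1, 2],
})

# One key per (date, hour) bucket, e.g. "2021-10-29 20".
bucket = df["@ts"].dt.strftime("%Y-%m-%d %H")

# Keep only the first row seen in each bucket.
deduped = df[~bucket.duplicated(keep="first")]
print(deduped)
```

Here the 20:26 row is dropped, but both the 11:50 and 12:10 rows remain, which may or may not be what you want.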

Answer 2

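Convert the column to datetime. Subtract the previous row's time from each row's time, then compute the total seconds. Take the absolute value and check whether it is less than 3600 to build a boolean mask (the first row has no predecessor, so its gap is filled with infinity and it is always kept). Then use the inverted mask to keep only the rows you need.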

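import numpy as np
import pandas as pd
df['@ts'] = pd.to_datetime(df['@ts'])
# Gap to the previous row in seconds; rows closer than 3600 s to their predecessor are dropped.
df = df[~(df['@ts'] - df['@ts'].shift()
          ).dt.total_seconds().fillna(np.inf).abs().lt(3600)]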
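
As a rough sketch of the steps above applied to the question's sample rows: it assumes only the `@ts` and `@value` columns matter, sorts by `@ts` first (the sample is not in time order, and the gap to the previous row only makes sense on sorted data), and wraps the logic in a hypothetical helper named `drop_close_rows`, which is not part of pandas.

```python
import numpy as np
import pandas as pd


def drop_close_rows(df, col="@ts", min_gap_seconds=3600):
    """Drop every row whose gap to the previous row (in time order) is below min_gap_seconds.

    Hypothetical helper for illustration; not a pandas function.
    """
    out = df.copy()
    out[col] = pd.to_datetime(out[col])
    out = out.sort_values(col)
    # Seconds since the previous row; the first row has no predecessor,
    # so its gap is treated as infinite and it is always kept.
    gap = out[col].diff().dt.total_seconds().fillna(np.inf)
    return out[gap >= min_gap_seconds]


df = pd.DataFrame({
    "@ts": ["2021-10-29 11:16:00", "2021-10-29 12:56:00", "2021-10-29 20:16:00",
            "2021-10-29 20:26:00", "2021-10-29 09:26:00"],
    "@value": [167, 162, 133, 134, 130],
})
result = drop_close_rows(df)
print(result)
```

With the sample data only the 20:26 row is removed. Because each row is compared with its immediate predecessor, a run of rows spaced 30 minutes apart would lose every row except the first; comparing against the last kept row instead would need an explicit loop.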
