Question

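I have a numpy array that I want to filter by datetime. I currently have functionality that compares input datetimes (start and end) against the dataframe, like so: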

    if trim:
        columns = input_hdf.columns.get_level_values(0)
        print(str(columns))
        print(start)
        print(end)
        if start is not None and end is not None:
            mask = (columns >= start) & (columns <= end)
        elif start is not None:
            mask = (columns >= start)
        elif end is not None:
            mask = (columns <= end)
        else:
            # Should never reach this point, but just in case - mask will not affect the data
            mask = True
        input_hdf = input_hdf.loc[:, mask]

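However, I would like to add the ability to specify start and end as a "day of the year", where the year is irrelevant to the comparison: if the day is later than October 1, it should be excluded whether the year is 2001 or 2021.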

I am currently converting the input value to a datetime via:

    start = datetime.strptime(start, '%d-%m-%Y') if start else None

This defaults the year to 1900, and that year then becomes part of the comparison.

Answer 1

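Pandas has better support for dates and times. This answer relies on the fact that zero-padded datetime strings of the form mm-dd sort in calendar order: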

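    import pandas as pd

    dates = <ndarray of dates>
    # Format every date as 'MM-DD'; the original dates are kept as the index
    s = pd.Series(dates, index=dates).dt.strftime('%m-%d')

    # Select between Oct 1 and Dec 31 of all years
    cond = ('10-01' <= s) & (s <= '12-31')
    # Recover the original dates that satisfy the condition
    selected = s[cond].index.values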
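To tie this back to the question's optional start/end logic, here is a minimal sketch of the same month-day string comparison wrapped in a helper. The name `month_day_mask` and the sample dates are hypothetical, and the sketch assumes `start` and `end` arrive as zero-padded 'MM-DD' strings (or None) and that the dates are datetime-like.

```python
import numpy as np
import pandas as pd


def month_day_mask(dates, start=None, end=None):
    """Return a boolean mask of dates whose month-day lies in [start, end].

    Hypothetical helper: `start` and `end` are zero-padded 'MM-DD'
    strings, or None to leave that side unbounded.
    """
    # Zero-padded 'MM-DD' strings sort in calendar order, so plain
    # string comparison ignores the year entirely.
    month_day = pd.Series(pd.DatetimeIndex(dates).strftime('%m-%d'))
    mask = np.ones(len(month_day), dtype=bool)
    if start is not None:
        mask &= np.asarray(month_day >= start, dtype=bool)
    if end is not None:
        mask &= np.asarray(month_day <= end, dtype=bool)
    return mask


# Made-up sample data spanning two different years
dates = pd.to_datetime(['2001-09-30', '2001-10-01', '2021-12-31', '2021-03-15'])
mask = month_day_mask(dates, start='10-01')
print(dates[mask])
```

Assuming the column level in the question holds datetime-like values, the resulting mask could be passed to `input_hdf.loc[:, mask]` in the same way as the original one. With both bounds set to None the mask is all True, so the data is left unchanged. Note that this compares calendar month and day, not the ordinal day number, which shifts by one after February in leap years.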
