Keep rows only within specific time ranges in a pandas DataFrame

Time: 2021-04-03 13:25:03

Tags: python pandas

I have the following DataFrame, "activity_level2", and I want to use it to select specific rows in another DataFrame, "dfcont2":

activity_level2
                          Date_and_time  ...  walking_frame
Date_and_time                            ...               
2020-07-24 23:00:00 2020-07-24 23:00:00  ...              0
2020-07-24 23:01:00 2020-07-24 23:01:00  ...              0
2020-07-24 23:02:00 2020-07-24 23:02:00  ...              0
2020-07-24 23:03:00 2020-07-24 23:03:00  ...              0
2020-07-24 23:04:00 2020-07-24 23:04:00  ...              0
2020-07-24 23:05:00 2020-07-24 23:05:00  ...              0
2020-07-24 23:06:00 2020-07-24 23:06:00  ...              0
2020-07-24 23:07:00 2020-07-24 23:07:00  ...              0
2020-07-24 23:08:00 2020-07-24 23:08:00  ...              0
2020-07-24 23:09:00 2020-07-24 23:09:00  ...              0
2020-07-24 23:10:00 2020-07-24 23:10:00  ...              0
2020-07-24 23:11:00 2020-07-24 23:11:00  ...              0
2020-07-24 23:12:00 2020-07-24 23:12:00  ...              0
2020-07-24 23:13:00 2020-07-24 23:13:00  ...              0
2020-07-24 23:14:00 2020-07-24 23:14:00  ...              0
2020-07-24 23:15:00 2020-07-24 23:15:00  ...              0
2020-07-24 23:16:00 2020-07-24 23:16:00  ...              0
2020-07-24 23:17:00 2020-07-24 23:17:00  ...              0
2020-07-24 23:18:00 2020-07-24 23:18:00  ...              0
2020-07-24 23:19:00 2020-07-24 23:19:00  ...              0
2020-07-24 23:20:00 2020-07-24 23:20:00  ...              0
2020-07-24 23:21:00 2020-07-24 23:21:00  ...              0
2020-07-24 23:22:00 2020-07-24 23:22:00  ...              0
2020-07-24 23:23:00 2020-07-24 23:23:00  ...              0
2020-07-24 23:24:00 2020-07-24 23:24:00  ...              0
2020-07-24 23:25:00 2020-07-24 23:25:00  ...              0
2020-07-24 23:26:00 2020-07-24 23:26:00  ...              0
2020-07-24 23:27:00 2020-07-24 23:27:00  ...              1
2020-07-24 23:28:00 2020-07-24 23:28:00  ...              1
2020-07-24 23:29:00 2020-07-24 23:29:00  ...              1
2020-07-24 23:30:00 2020-07-24 23:30:00  ...              1
2020-07-24 23:31:00 2020-07-24 23:31:00  ...              1
2020-07-24 23:32:00 2020-07-24 23:32:00  ...              1
2020-07-24 23:33:00 2020-07-24 23:33:00  ...              1
2020-07-24 23:34:00 2020-07-24 23:34:00  ...              1
2020-07-24 23:35:00 2020-07-24 23:35:00  ...              1
2020-07-24 23:36:00 2020-07-24 23:36:00  ...              1
2020-07-24 23:37:00 2020-07-24 23:37:00  ...              1
2020-07-24 23:38:00 2020-07-24 23:38:00  ...              1
2020-07-24 23:39:00 2020-07-24 23:39:00  ...              1
2020-07-24 23:40:00 2020-07-24 23:40:00  ...              1
2020-07-24 23:41:00 2020-07-24 23:41:00  ...              1
2020-07-24 23:42:00 2020-07-24 23:42:00  ...              1
2020-07-24 23:43:00 2020-07-24 23:43:00  ...              1
2020-07-24 23:44:00 2020-07-24 23:44:00  ...              1
2020-07-24 23:45:00 2020-07-24 23:45:00  ...              1
2020-07-24 23:46:00 2020-07-24 23:46:00  ...              1
2020-07-24 23:47:00 2020-07-24 23:47:00  ...              1
2020-07-24 23:48:00 2020-07-24 23:48:00  ...              1
2020-07-24 23:49:00 2020-07-24 23:49:00  ...              1
2020-07-24 23:50:00 2020-07-24 23:50:00  ...              1
2020-07-24 23:51:00 2020-07-24 23:51:00  ...              1
2020-07-24 23:52:00 2020-07-24 23:52:00  ...              1
2020-07-24 23:53:00 2020-07-24 23:53:00  ...              1
2020-07-24 23:54:00 2020-07-24 23:54:00  ...              1
2020-07-24 23:55:00 2020-07-24 23:55:00  ...              1
2020-07-24 23:56:00 2020-07-24 23:56:00  ...              1
2020-07-24 23:57:00 2020-07-24 23:57:00  ...              1
2020-07-24 23:58:00 2020-07-24 23:58:00  ...              1
2020-07-24 23:59:00 2020-07-24 23:59:00  ...              1

[60 rows x 7 columns]

I want to select the rows of dfcont2 that meet the following condition:

dfcont2
                               waddling_count    MP  waddling_frame
Date_and_time                                                      
2020-07-24 23:00:01.065838656           943.0   0.0             0.0
2020-07-24 23:00:01.132505322           943.0   0.0             0.0
2020-07-24 23:00:01.199171988           943.0   0.0             0.0
2020-07-24 23:00:01.265838654           943.0   0.0             0.0
2020-07-24 23:00:01.332505320           943.0   0.0             0.0
                                      ...   ...             ...
2020-07-24 23:59:58.399136016          2160.0   0.0             0.0
2020-07-24 23:59:58.465802682          2160.0   0.0             0.0
2020-07-24 23:59:58.532469348          2160.0   0.0             0.0
2020-07-24 23:59:58.599136014          2160.0   0.0             0.0
2020-07-24 23:59:58.665802680          2160.0  21.0             0.0

[53965 rows x 3 columns]

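I want all the rows of dfcont2 that fall between 2 specific timestamps in 'activity_level2' (so one full minute). I hope this is clear. I don't know how to do this, and any help is much appreciated.

1 answer:

Answer 0 (score: 0)

I suggest trying this: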

import pandas as pd

# Make sure the timestamps are datetimes. In the printouts "Date_and_time" is
# the index of dfcont2, and both the index and a column of activity_level2.
activity_level2["Date_and_time"] = pd.to_datetime(activity_level2["Date_and_time"])
dfcont2.index = pd.to_datetime(dfcont2.index)

# Rows of activity_level2 for which 'walking_frame' is equal to 0
filtered_activity = activity_level2.loc[activity_level2["walking_frame"] == 0, :]

# Rows of dfcont2 that fall inside one of the minutes in 'filtered_activity'.
# The dfcont2 timestamps have sub-second precision, so floor them to the minute
# before comparing; an exact isin() match on the raw timestamps selects nothing.
mask = dfcont2.index.floor("min").isin(filtered_activity["Date_and_time"])
newdf = dfcont2.loc[mask, :]
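
To try the idea without the real data, here is a minimal sketch on small synthetic frames. The values are made up for illustration. It assumes, as in the printouts, that dfcont2 is indexed by high-rate timestamps and that activity_level2 has one row per minute. Each dfcont2 timestamp is floored to its minute and then checked against the minutes where 'walking_frame' is 0.

```python
import pandas as pd

# Synthetic stand-ins for the frames in the question (values are made up).
minutes = pd.date_range("2020-07-24 23:00:00", periods=3, freq="min")
activity_level2 = pd.DataFrame(
    {"Date_and_time": minutes, "walking_frame": [0, 1, 0]},
    index=pd.DatetimeIndex(minutes, name="Date_and_time"),
)

# One sample every 20 seconds, offset by 1 s so that no timestamp
# sits exactly on a minute boundary.
samples = pd.date_range("2020-07-24 23:00:01", periods=9, freq="20s")
dfcont2 = pd.DataFrame(
    {"waddling_count": range(9)},
    index=pd.DatetimeIndex(samples, name="Date_and_time"),
)

# Minutes in which walking_frame is 0.
wanted_minutes = activity_level2.loc[
    activity_level2["walking_frame"] == 0, "Date_and_time"
]

# Floor each high-rate timestamp to its minute, then keep the rows
# whose minute is one of the wanted minutes.
mask = dfcont2.index.floor("min").isin(wanted_minutes)
newdf = dfcont2.loc[mask]
print(newdf)
```

Here the rows from 23:00 and 23:02 are kept and the rows from 23:01 are dropped. Flooring is what makes the comparison work: the raw dfcont2 timestamps never equal a whole-minute timestamp exactly.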