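I wrote the code shown below. I have two Pandas dataframes: df contains the columns timestamp_milli and pressure, and df2 contains the columns timestamp_milli and acceleration_z. They have about 100 and 3900 rows respectively. In the code below, for each row of df I search df2 for the rows whose timestamp lies within a fixed range of that row's timestamp, and of those I keep the ones with the smallest time difference.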
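Unfortunately, the code is very slow. In addition, I get the following message from the line df_temp["timestamp_milli"] = df_temp["timestamp_milli"] - row["timestamp_milli"]:
SettingWithCopyWarning: A value is trying to be set on a copy of a slice from a DataFrame. Try using .loc[row_indexer,col_indexer] = value instead
How can I speed up the code and resolve the warning?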
acceleration = []
pressure = []
for index, row in df.iterrows():
    mask = (df2["timestamp_milli"] >= (row["timestamp_milli"] - 5)) & (df2["timestamp_milli"] <= (row["timestamp_milli"] + 5))
    df_temp = df2[mask]
    # Select closest point
    if len(df_temp) > 0:
        df_temp["timestamp_milli"] = df_temp["timestamp_milli"] - row["timestamp_milli"]
        df_temp["timestamp_milli"] = df_temp["timestamp_milli"].abs()
        df_temp = df_temp.loc[df_temp["timestamp_milli"] == df_temp["timestamp_milli"].min()]
        for index2, row2 in df_temp.iterrows():
            pressure.append(row["pressure"])
            acc = row2["acceleration_z"]
            acceleration.append(acc)
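
One possible direction, offered as a rough sketch rather than a definitive answer, is to replace the row-by-row search with pd.merge_asof, which does a vectorized nearest-key join. The small frames below are made-up stand-ins for df and df2, and the sketch assumes that both timestamp_milli columns are integers of the same dtype. Unlike the loop above, merge_asof keeps only one df2 row per df row when two candidates are equally close. Because no slice of df2 is modified, the SettingWithCopyWarning does not come up; if the loop is kept instead, df_temp = df2[mask].copy() makes df_temp an independent frame.

```python
import pandas as pd

# Made-up sample data standing in for the real df and df2.
df = pd.DataFrame({
    "timestamp_milli": [100, 200, 300],
    "pressure": [1.0, 2.0, 3.0],
})
df2 = pd.DataFrame({
    "timestamp_milli": [97, 102, 150, 204, 320],
    "acceleration_z": [9.7, 9.8, 9.9, 10.0, 10.1],
})

# merge_asof requires both frames to be sorted by the key column.
matched = pd.merge_asof(
    df.sort_values("timestamp_milli"),
    df2.sort_values("timestamp_milli"),
    on="timestamp_milli",
    direction="nearest",  # closest df2 timestamp on either side
    tolerance=5,          # ignore candidates more than 5 ms away
)

# Rows of df with no df2 timestamp in range get NaN; drop them,
# mirroring the `if len(df_temp) > 0` check in the loop.
matched = matched.dropna(subset=["acceleration_z"])

pressure = matched["pressure"].tolist()
acceleration = matched["acceleration_z"].tolist()
```

With this sample data, the row at timestamp 300 is dropped because the closest df2 timestamp (320) is more than 5 ms away.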