Question

I have 2 DataFrames of different lengths.

DataFrame A looks like this:

time_A   | Column_2 | Column_3
00:00:00 |   Type A |   ...
  ...    |   ...    |   ...
23:55:00 |   Type A |   ...
00:00:00 |   Type B |   ...
  ...    |   ...    |   ...
23:55:00 |   Type B |   ...
00:00:00 |   Type C |   ...
  ...    |   ...    |   ...
23:55:00 |   Type C |   ...

time_A is of type string, and the DataFrame contains fewer than 1,000 rows.

DataFrame B looks like this:

time_B           | Column_4 | Column_5
12/04/2019 00:00 |   abc    |   ...
12/04/2019 00:00 |   def    |   ...
  ...            |   ...    |   ...
12/04/2019 23:55 |   ghi    |   ...
12/04/2019 23:55 |   klm    |   ...

time_B is also of type string, and the DataFrame contains about 200,000 rows.

Now I have a time range given as datetime objects. Let's say:

[datetime.time(11, 0), datetime.time(20, 30)]

I want to add a new column (Column_6) for all rows whose time_B falls inside the range above.

The value of each Column_6 cell would be computed like this:

for each_cell_with_time_B_inside_range:
    current_time = current_time_B
    df_B["Column_6"] = each_Column_3_cell_with_current_time / sum(Column_3_of_type_x) * funct(each_Column_3_cell)

Or, as another example:

if total_2025_typeC = 100       --> we added all type C column_3 for 20:25:00
and current_column_3 = 20       --> the only value for type C 20:25:00 is 20
then column_6 = (20/100) * formula(20)
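
One possible reading of the example above is a per-type share of Column_3 multiplied by some function of the same value. The sketch below assumes that reading, and it assumes the total is the sum of Column_3 over all rows of one type. The sample values and the identity `funct` are made up for illustration, since the real function is not given here.

```python
import pandas as pd

def funct(x):
    # Placeholder only: the real funct/formula is not defined in the question.
    return x

# Hypothetical miniature of DataFrame A
df_A = pd.DataFrame({
    "time_A": ["20:20:00", "20:25:00", "20:20:00", "20:25:00"],
    "Column_2": ["Type B", "Type B", "Type C", "Type C"],
    "Column_3": [10.0, 30.0, 80.0, 20.0],
})

# Sum of Column_3 per type, broadcast back onto every row of that type
type_total = df_A.groupby("Column_2")["Column_3"].transform("sum")

# value / total of its type * funct(value), computed row by row
df_A["weight"] = df_A["Column_3"] / type_total * df_A["Column_3"].map(funct)
```

With these numbers the Type C row at 20:25:00 has a total of 100 and a value of 20, matching the (20/100) * formula(20) shape of the example.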

So far I have used:

newdf_index = df_B.index[(df_B["time_B"] >= range[0]) & (df_B["time_B"] <= range[1])].tolist()

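to get the row indexes from df_B for which I will add the new values to the new column. I have also played with loc, iloc, strftime, strptime, and plenty of combinations that got me close but failed for so many reasons that I cannot really write them all down.

My brain is fried at this point. Any help would be much appreciated.

Maybe something like this: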
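
Since time_B is a string and the range holds datetime.time objects, the comparison probably needs both sides in the same form first. A minimal sketch, assuming time_B is day/month/year with hours and minutes (the date order is a guess and does not affect the time part), and using made-up sample rows:

```python
import datetime
import pandas as pd

# Hypothetical miniature of DataFrame B
df_B = pd.DataFrame({
    "time_B": ["12/04/2019 00:00", "12/04/2019 11:00",
               "12/04/2019 20:25", "12/04/2019 23:55"],
    "Column_4": ["abc", "def", "ghi", "klm"],
})

time_range = [datetime.time(11, 0), datetime.time(20, 30)]

# Parse the strings, then render only the time part as zero-padded HH:MM:SS,
# which is the same shape as time_A and sorts correctly as plain text.
parsed = pd.to_datetime(df_B["time_B"], format="%d/%m/%Y %H:%M")
time_key = parsed.dt.strftime("%H:%M:%S")

lower = time_range[0].strftime("%H:%M:%S")
upper = time_range[1].strftime("%H:%M:%S")

# Boolean mask, inclusive on both ends
in_range = time_key.between(lower, upper)
newdf_index = df_B.index[in_range].tolist()
```

The boolean mask can be passed to loc directly, so the index list is only needed if the positions themselves matter.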

df_B.loc[df_B.index[indexes_where_time_B_is_within_range], "Column_6"] = ???
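
One way the ??? might be filled in is to precompute the value per time in df_A and then map it onto df_B by the time string. This is only a sketch under several assumptions: a single type ("Type C") is picked because it is not stated how a df_B row relates to a type, `funct` is an identity placeholder, the date format is day/month/year, and all sample values are invented.

```python
import datetime
import pandas as pd

def funct(x):
    # Placeholder only: the real funct/formula is not defined in the question.
    return x

# Hypothetical miniatures of both DataFrames
df_A = pd.DataFrame({
    "time_A": ["11:00:00", "20:25:00", "23:55:00"],
    "Column_2": ["Type C", "Type C", "Type C"],
    "Column_3": [30.0, 20.0, 50.0],
})
df_B = pd.DataFrame({
    "time_B": ["12/04/2019 00:00", "12/04/2019 11:00", "12/04/2019 20:25",
               "12/04/2019 20:25", "12/04/2019 23:55"],
    "Column_4": ["abc", "def", "ghi", "klm", "nop"],
})
time_range = [datetime.time(11, 0), datetime.time(20, 30)]

# Lookup table: time string -> value / type total * funct(value)
one_type = df_A[df_A["Column_2"] == "Type C"]
total = one_type["Column_3"].sum()
weights = one_type["Column_3"] / total * one_type["Column_3"].map(funct)
lookup = pd.Series(weights.to_numpy(), index=one_type["time_A"])

# Same HH:MM:SS key for df_B, plus the inclusive range mask
time_key = pd.to_datetime(
    df_B["time_B"], format="%d/%m/%Y %H:%M"
).dt.strftime("%H:%M:%S")
in_range = time_key.between(
    time_range[0].strftime("%H:%M:%S"),
    time_range[1].strftime("%H:%M:%S"),
)

# Rows outside the range are left as NaN in the new column
df_B.loc[in_range, "Column_6"] = time_key[in_range].map(lookup)
```

Because the lookup is built once from the small df_A, the roughly 200,000 rows of df_B are filled in a single vectorized map instead of a Python loop.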

Mapping parts of DataFrames and adding new column values using values from both
