I have 3 pandas DataFrames (similar to the ones below). I also have two lists: list ID_1 = ['sdf', 'sdfsdf', ...] and list ID_2 = ['kjdf', 'kldfjs', ...]
Table1:
      ID_1     ID_2   Value
0  PUFPaY9  NdYWqAJ   0.002
1  Iu6AxdB  qANhGcw    0.01
2  auESFwW  jUEUNdw  0.2345
3  LWbYpca  G3uZ_Rg  0.0835
4  8fApIAM  mVHrayg  0.0295
Table2:
      ID_1  weight1  weight2  .....weightN
0  PUFPaY9
1  Iu6AxdB
2  auESFwW
3  LWbYpca
Table3:
      ID_2  weight1  weight2  .....weightN
0  PUFPaY9
1  Iu6AxdB
2  auESFwW
3  LWbYpca
I want to build a DataFrame that is computed like this:
for each x ID_1 in list1:
    for each y ID_2 in list2:
        if x-y exist in Table1:
            temp_row = ( x[weights[i]].* y[weights[i]])
            # here I want one-to-one multiplication: x[weight1]*y[weight1], x[weight2]*y[weight2]
            temp_row.append(value[x-y] in Table1)
            new_dataframe.append(temp_row)
return new_dataframe
The desired new_dataframe should look like Table4:
Table4:
weight1 weight2 weight3 .....weightN value
0
1
2
3
What I can do so far is:
new_df = df[(df.ID_1.isin(list1)) & (df.ID_2.isin(list2))]
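With this I get all the valid ID_1 and ID_2 combinations together with their values. But I don't know how to get the product of the weights from the two DataFrames (without looping over each weight[i])?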
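The task is now easier: I can iterate over new_df, and for each row in new_df look up weight[i to n] for ID_1 from Table2 and weight[i to n] for ID_2 from Table3. Then I can append the one-to-one multiplication, together with the "value" from Table1, to a new FINAL_DF. But I would rather not loop. Is there a smarter way to solve this?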
Answer 0: (score: 0)
Is this what you want?
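import io

import numpy as np
import pandas as pd

# Sample ID_1 values; 'aaaaaaa' has no counterpart in ID_2
data = """\
ID_1
PUFPaY9
aaaaaaa
Iu6AxdB
auESFwW
LWbYpca
"""
# sep=r'\s+' replaces delim_whitespace=True, which is deprecated in recent pandas
id1 = pd.read_csv(io.StringIO(data), sep=r'\s+')

# Sample ID_2 values; 'xxxxxxx' has no counterpart in ID_1
data = """\
ID_2
PUFPaY9
Iu6AxdB
xxxxxxx
auESFwW
LWbYpca
"""
id2 = pd.read_csv(io.StringIO(data), sep=r'\s+')

# Fill weight1..weight4 with random integers from 1 to 9
cols = ['weight{}'.format(i) for i in range(1, 5)]
for c in cols:
    id1[c] = np.random.randint(1, 10, len(id1))
    id2[c] = np.random.randint(1, 10, len(id2))

# Index both frames by ID so the multiplication aligns on the ID labels
id1.set_index('ID_1', inplace=True)
id2.set_index('ID_2', inplace=True)

# Element-wise product; IDs present in only one frame give NaN rows
df_mul = id1 * id2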
Step by step:
In [215]: id1
Out[215]:
         weight1  weight2  weight3  weight4
ID_1
PUFPaY9        8        9        1        1
aaaaaaa        6        1        9        2
Iu6AxdB        8        4        8        5
auESFwW        9        3        4        2
LWbYpca        7        7        1        8
In [216]: id2
Out[216]:
         weight1  weight2  weight3  weight4
ID_2
PUFPaY9        6        5        5        1
Iu6AxdB        1        5        4        5
xxxxxxx        1        2        6        4
auESFwW        3        9        5        5
LWbYpca        3        3        6        7
In [217]: id1 * id2
Out[217]:
         weight1  weight2  weight3  weight4
Iu6AxdB      8.0     20.0     32.0     25.0
LWbYpca     21.0     21.0      6.0     56.0
PUFPaY9     48.0     45.0      5.0      1.0
aaaaaaa      NaN      NaN      NaN      NaN
auESFwW     27.0     27.0     20.0     10.0
xxxxxxx      NaN      NaN      NaN      NaN
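Note that id1 * id2 only multiplies rows that share the same index label. In the question, each row of Table1 pairs an ID_1 with a different ID_2, so label alignment alone may not be enough. One possible way to handle that, as a rough sketch rather than a definitive solution, is to look up the weight rows for each pair and multiply them positionally as NumPy arrays. The small tables, the weight values, and the names table1, table2, table3, pairs, and final_df below are made up for illustration, and the sketch assumes every ID appears at most once in Table2 and Table3.

```python
import pandas as pd

# Hypothetical miniature versions of Table1, Table2 and Table3 (values are made up).
table1 = pd.DataFrame({
    "ID_1": ["PUFPaY9", "Iu6AxdB", "auESFwW"],
    "ID_2": ["NdYWqAJ", "qANhGcw", "jUEUNdw"],
    "Value": [0.002, 0.01, 0.2345],
})
weight_cols = ["weight1", "weight2"]
table2 = pd.DataFrame(
    {"weight1": [1, 2, 3], "weight2": [4, 5, 6]},
    index=pd.Index(["PUFPaY9", "Iu6AxdB", "auESFwW"], name="ID_1"),
)
table3 = pd.DataFrame(
    {"weight1": [10, 20, 30], "weight2": [40, 50, 60]},
    index=pd.Index(["NdYWqAJ", "qANhGcw", "jUEUNdw"], name="ID_2"),
)

list1 = ["PUFPaY9", "auESFwW"]
list2 = ["NdYWqAJ", "qANhGcw", "jUEUNdw"]

# Keep only the Table1 pairs whose IDs are in the two lists (the filter from the question).
pairs = table1[table1.ID_1.isin(list1) & table1.ID_2.isin(list2)]

# Drop pairs for which no weights are available, so the lookups below cannot fail.
pairs = pairs[pairs.ID_1.isin(table2.index) & pairs.ID_2.isin(table3.index)]

# Fetch one weight row per pair. Converting to NumPy discards the index labels,
# so row i of `left` is multiplied with row i of `right`.
left = table2.loc[pairs.ID_1.tolist(), weight_cols].to_numpy()
right = table3.loc[pairs.ID_2.tolist(), weight_cols].to_numpy()

final_df = pd.DataFrame(left * right, columns=weight_cols, index=pairs.index)
final_df["value"] = pairs["Value"].to_numpy()
print(final_df)
```

The loop over pairs is replaced by two label lookups and one array multiplication, so the number of weight columns does not affect the amount of code. If an ID could appear more than once in Table2 or Table3, the lookups would return extra rows and the shapes would no longer match, so that case would need separate handling.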