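Multiply each element of a column by each element of a different column in the same DataFrame

Time: 2017-05-17 04:52:19

Tags: python pandas dataframe

I need to multiply each element of one column by each element of a different column in the same DataFrame. My original dataset looks like this: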

   origin    sum    sum2
    a.        2      1
    b.        4      2
    c.        6      3

The result I expect looks like this:

   origin    dest   result (sum * sum2)
    a.        a.      2
    a.        b.      4
    a.        c.      6
    b.        a.      4
    b.        b.      8
    b.        c.      12
    c.        a.      6
    c.        b.      12
    c.        c.      18

This is the script I am writing, but it does not produce the result I need:

x = 0
numerator = []

for index1, row1 in df.iterrows():
    constant = row1
    numerator.append([])

    for index2, row2 in df.iterrows():
        result = row2*constant

        numerator[x].append(result)

        x = x + 1

3 answers:

Answer 0: (score: 3)

You can use:

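import numpy as np
import pandas as pd

# Label every (origin, dest) pair; the outer product is flattened in the same order
mux = pd.MultiIndex.from_product([df.origin, df.origin], names=['origin','dest'])
data = np.outer(df['sum'], df['sum2']).ravel()
df = pd.DataFrame(data, index=mux, columns=['result']).reset_index()
print(df)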
  origin dest  result
0     a.   a.       2
1     a.   b.       4
2     a.   c.       6
3     b.   a.       4
4     b.   b.       8
5     b.   c.      12
6     c.   a.       6
7     c.   b.      12
8     c.   c.      18
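
The approach above depends on the flattened outer product following the same order as the (origin, dest) labels. The short sketch below checks that on the question's sample data; it rebuilds the columns as plain arrays (the names `a`, `b`, and `pairs` are only illustrative) and uses `itertools.product` as a stand-in for the ordering of `MultiIndex.from_product`, where the last level varies fastest.

```python
import itertools

import numpy as np

origin = ["a.", "b.", "c."]
a = np.array([2, 4, 6])  # the 'sum' column from the question
b = np.array([1, 2, 3])  # the 'sum2' column from the question

# outer[i, j] == a[i] * b[j]
outer = np.outer(a, b)

# ravel() flattens row by row (C order), so the column index j varies fastest
flat = outer.ravel()

# Same ordering as the label pairs: the second element varies fastest
pairs = list(itertools.product(origin, origin))

for (o, d), value in zip(pairs, flat):
    print(o, d, value)
```

Because both sequences iterate the second index fastest, position k in `flat` belongs to pair k in `pairs`, which is why the labels and values line up without any explicit join.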

Answer 1: (score: 2)

You can use np.outer for the multiplication.

np.outer(df['sum'], df['sum2'])
Out: 
array([[ 2,  4,  6],
       [ 4,  8, 12],
       [ 6, 12, 18]])

This can be converted to a labeled Series as follows:

pd.DataFrame(np.outer(df['sum'], df['sum2']), 
             index=df['origin'],
             columns=df['origin']).rename_axis('dest', axis=1).stack()
Out: 
origin  dest
a.      a.       2
        b.       4
        c.       6
b.      a.       4
        b.       8
        c.      12
c.      a.       6
        b.      12
        c.      18
dtype: int64

Chaining to_frame and reset_index then gives a flat DataFrame:

(pd.DataFrame(np.outer(df['sum'], df['sum2']), 
             index=df['origin'],
             columns=df['origin']).rename_axis('dest', axis=1).stack()
             .to_frame('result').reset_index())
Out: 
  origin dest  result
0     a.   a.       2
1     a.   b.       4
2     a.   c.       6
3     b.   a.       4
4     b.   b.       8
5     b.   c.      12
6     c.   a.       6
7     c.   b.      12
8     c.   c.      18
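
As an alternative that is not part of the original answers, the same table can be sketched with a cross join. This assumes a pandas version that supports `how="cross"` in `merge` (1.2 or later); the names `left`, `right`, `pairs`, and `out` are only illustrative.

```python
import pandas as pd

# Sample data from the question
df = pd.DataFrame({
    "origin": ["a.", "b.", "c."],
    "sum": [2, 4, 6],
    "sum2": [1, 2, 3],
})

# One frame per side of the pairing; the right side's labels become 'dest'
left = df[["origin", "sum"]]
right = df[["origin", "sum2"]].rename(columns={"origin": "dest"})

# The cross join pairs every row of left with every row of right
pairs = left.merge(right, how="cross")
pairs["result"] = pairs["sum"] * pairs["sum2"]

out = pairs[["origin", "dest", "result"]]
print(out)
```

This keeps everything in pandas, at the cost of materializing the intermediate `sum` and `sum2` columns for every pair; for large inputs the np.outer version is likely lighter.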

Answer 2: (score: 0)

import pandas as pd
import itertools

# Make data example
df = pd.DataFrame()
df['origin']=['a.','b.','c.']
df['sum'] = [2,4,6]
df['sum2'] = [1,2,3]

# Record sum and sum2 for a. b. c.
df_dict = df.set_index('origin').to_dict()

df_final = pd.DataFrame()
for x,y in itertools.product(df['origin'],df['origin']):
    df_final = pd.concat([df_final,pd.DataFrame([x,y,df_dict['sum'][x]*df_dict['sum2'][y]]).T],axis=0)
df_final.columns =['origin','dest','result (sum * sum2)']

Result

  origin dest result (sum * sum2)
0     a.   a.                   2
0     a.   b.                   4
0     a.   c.                   6
0     b.   a.                   4
0     b.   b.                   8
0     b.   c.                  12
0     c.   a.                   6
0     c.   b.                  12
0     c.   c.                  18
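
The index in the output above is 0 on every row because each one-row frame is concatenated with its own index. A possible variant of the same loop-based idea, sketched below, collects the rows in a list and builds the DataFrame once, which gives a 0 to 8 index and avoids repeated `pd.concat` calls. It assumes the `origin` labels are unique, since they are used as dictionary keys; `sums`, `sums2`, and `rows` are illustrative names.

```python
import itertools

import pandas as pd

# Sample data from the question
df = pd.DataFrame({
    "origin": ["a.", "b.", "c."],
    "sum": [2, 4, 6],
    "sum2": [1, 2, 3],
})

# Lookup tables keyed by origin label (assumes the labels are unique)
sums = df.set_index("origin")["sum"].to_dict()
sums2 = df.set_index("origin")["sum2"].to_dict()

# One dict per (origin, dest) pair, built in a single pass
rows = [
    {"origin": o, "dest": d, "result": sums[o] * sums2[d]}
    for o, d in itertools.product(df["origin"], df["origin"])
]

# Build the frame once; the default RangeIndex runs from 0 to 8
df_final = pd.DataFrame(rows)
print(df_final)
```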